Description of problem:
pcs has strange/inconsistent behaviour and operation naming.

First configuration example:

$ pcs cluster cib vm_cfg
$ pcs -f vm_cfg resource create vm ocf:heartbeat:VirtualDomain config=/etc/libvirt/qemu/vm.xml snapshot=/var/lib/libvirt/qemu/pacemaker
$ pcs -f vm_cfg resource op add vm monitor interval=60s timeout=30s
$ pcs -f vm_cfg resource op add vm start interval=0 timeout=120s
$ pcs -f vm_cfg resource op add vm stop interval=0 timeout=120s
$ pcs -f vm_cfg constraint colocation add vm libvirtd INFINITY
$ pcs -f vm_cfg constraint order libvirtd then vm
$ pcs cluster cib-push vm_cfg

This results in:

$ pcs config
[...]
 Resource: vm (class=ocf provider=heartbeat type=VirtualDomain)
  Attributes: config=/etc/libvirt/qemu/vm.xml snapshot=/var/lib/libvirt/qemu/pacemaker
  Operations: monitor interval=60s (vm-monitor-interval-60s)
              monitor interval=60s timeout=30s (vm-name-monitor-interval-60s-timeout-30s)
              start interval=0 timeout=120s (vm-name-start-interval-0-timeout-120s)
              stop interval=0 timeout=120s (vm-name-stop-interval-0-timeout-120s)
$

Ouch? How did that happen? Yes, it's a new feature of pcs 0.9.90 (RHEL 6.5) to add a default monitor operation if none has been specified. However, pcs doesn't seem to be really transaction safe, so it does not know that a proper monitor operation is added later (in the same CIB!).
Thus the configuration is not valid:

$ crm_verify -L -V
   error: is_op_dup: Operation vm-name-monitor-interval-60s-timeout-30s is a duplicate of vm-monitor-interval-60s
   error: is_op_dup: Do not use the same (name, interval) combination more than once per resource
   error: is_op_dup: Operation vm-name-monitor-interval-60s-timeout-30s is a duplicate of vm-monitor-interval-60s
   error: is_op_dup: Do not use the same (name, interval) combination more than once per resource
Errors found during check: config not valid
$

That brought me to the second configuration example, where I tried to add the monitor operation in the create command rather than afterwards:

$ pcs cluster cib vm_cfg
$ pcs -f vm_cfg resource create vm ocf:heartbeat:VirtualDomain config=/etc/libvirt/qemu/vm.xml snapshot=/var/lib/libvirt/qemu/pacemaker op monitor interval=60s timeout=30s
$ pcs -f vm_cfg resource op add vm start interval=0 timeout=120s
$ pcs -f vm_cfg resource op add vm stop interval=0 timeout=120s
$ pcs -f vm_cfg constraint colocation add vm libvirtd INFINITY
$ pcs -f vm_cfg constraint order libvirtd then vm
$ pcs cluster cib-push vm_cfg

This results in:

$ pcs config
[...]
 Resource: vm (class=ocf provider=heartbeat type=VirtualDomain)
  Attributes: config=/etc/libvirt/qemu/vm.xml snapshot=/var/lib/libvirt/qemu/pacemaker
  Operations: monitor interval=60s timeout=30s (vm-monitor-interval-60s)
              start interval=0 timeout=120s (vm-name-start-interval-0-timeout-120s)
              stop interval=0 timeout=120s (vm-name-stop-interval-0-timeout-120s)
$

Ehm? Why is it named "vm-monitor-interval-60s" rather than "vm-name-monitor-interval-60s-timeout-30s", even though I added "monitor interval=60s timeout=30s" to the create command? I would expect "vm-name-monitor-interval-60s-timeout-30s" both times...

Version-Release number of selected component (if applicable):
pcs-0.9.90-1.el6_4.noarch

How reproducible:
Every time, see above and below.

Actual results:
pcs has strange/inconsistent behaviour and operation naming.
Expected results:
At least proper naming as suggested/expected above. It is a pity that pcs is not really transaction safe, but I guess that is not easy to change?
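
The duplicate that crm_verify complains about in the first example can be reproduced outside the cluster. The following is only a minimal sketch of the "(name, interval)" rule quoted in the error message, not the actual Pacemaker or pcs code. The helper names (interval_ms, find_duplicate_ops) are hypothetical, and the interval parser is deliberately simplified: it assumes a bare number means seconds and knows only a few unit suffixes.

```python
import re

# Simplified unit table (an assumption for this sketch); Pacemaker itself
# accepts more spellings than the handful listed here.
_UNIT_MS = {"": 1000, "s": 1000, "sec": 1000, "ms": 1, "min": 60_000, "h": 3_600_000}


def interval_ms(value):
    """Convert an interval such as '60s', '60' or '0' to milliseconds."""
    match = re.fullmatch(r"\s*(\d+)\s*([a-z]*)\s*", value.lower())
    if match is None or match.group(2) not in _UNIT_MS:
        raise ValueError(f"unsupported interval: {value!r}")
    return int(match.group(1)) * _UNIT_MS[match.group(2)]


def find_duplicate_ops(ops):
    """Return (duplicate_id, first_id) pairs for operations of one resource.

    ops is a list of (op_id, name, interval) tuples. Two operations clash
    when they share the same name and the same normalized interval.
    """
    seen = {}
    duplicates = []
    for op_id, name, interval in ops:
        key = (name, interval_ms(interval))
        if key in seen:
            duplicates.append((op_id, seen[key]))
        else:
            seen[key] = op_id
    return duplicates


# Operations of resource "vm" as listed by "pcs config" in the first example.
vm_ops = [
    ("vm-monitor-interval-60s", "monitor", "60s"),
    ("vm-name-monitor-interval-60s-timeout-30s", "monitor", "60s"),
    ("vm-name-start-interval-0-timeout-120s", "start", "0"),
    ("vm-name-stop-interval-0-timeout-120s", "stop", "0"),
]

for duplicate_id, first_id in find_duplicate_ops(vm_ops):
    print(f"Operation {duplicate_id} is a duplicate of {first_id}")
```

Run against the four operations from the first example, the sketch flags only the second monitor entry. start and stop share interval 0 but have different names, so they do not clash.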
Created attachment 969012: proposed fix
Created attachment 970019: fix for the original proposed fix
Created attachment 970020: proposed fix - make default resource operations unique
Before Fix:

[root@rh66-node1 ~]# rpm -q pcs
pcs-0.9.123-9.el6.x86_64
[root@rh66-node1:~]# pcs resource create dummy Dummy
[root@rh66-node1:~]# pcs resource show dummy
 Resource: dummy (class=ocf provider=heartbeat type=Dummy)
  Operations: start interval=0s timeout=20 (dummy-start-timeout-20)
              stop interval=0s timeout=20 (dummy-stop-timeout-20)
              monitor interval=10 timeout=20 (dummy-monitor-interval-10)
[root@rh66-node1:~]# pcs resource op add dummy monitor interval=10s timeout=20s
[root@rh66-node1:~]# pcs resource op add dummy start interval=0s timeout=20s
[root@rh66-node1:~]# pcs resource op add dummy stop interval=0s timeout=20s
[root@rh66-node1:~]# pcs resource show dummy
 Resource: dummy (class=ocf provider=heartbeat type=Dummy)
  Operations: start interval=0s timeout=20 (dummy-start-timeout-20)
              stop interval=0s timeout=20 (dummy-stop-timeout-20)
              monitor interval=10 timeout=20 (dummy-monitor-interval-10)
              monitor interval=10s timeout=20s (dummy-name-monitor-interval-10s-timeout-20s)
              start interval=0s timeout=20s (dummy-name-start-interval-0s-timeout-20s)
              stop interval=0s timeout=20s (dummy-name-stop-interval-0s-timeout-20s)
[root@rh66-node1:~]# pcs resource create dummy1 Dummy op monitor interval=10s timeout=20s
[root@rh66-node1:~]# pcs resource show dummy1
 Resource: dummy1 (class=ocf provider=heartbeat type=Dummy)
  Operations: start interval=0s timeout=20 (dummy1-start-timeout-20)
              stop interval=0s timeout=20 (dummy1-stop-timeout-20)
              monitor interval=10s timeout=20s (dummy1-monitor-interval-10s)

After Fix:

[root@rh66-node1:~]# rpm -q pcs
pcs-0.9.138-1.el6.x86_64
[root@rh66-node1:~]# pcs resource create dummy Dummy
[root@rh66-node1:~]# pcs resource show dummy
 Resource: dummy (class=ocf provider=heartbeat type=Dummy)
  Operations: start interval=0s timeout=20 (dummy-start-interval-0s)
              stop interval=0s timeout=20 (dummy-stop-interval-0s)
              monitor interval=10 timeout=20 (dummy-monitor-interval-10)
[root@rh66-node1:~]# pcs resource op add dummy monitor interval=10s timeout=20s
Error: operation monitor with interval 10s already specified for dummy:
monitor interval=10 timeout=20 (dummy-monitor-interval-10)
[root@rh66-node1:~]# echo $?
1
[root@rh66-node1:~]# pcs resource op add dummy start interval=0s timeout=20s
Error: operation start with interval 0s already specified for dummy:
start interval=0s timeout=20 (dummy-start-interval-0s)
[root@rh66-node1:~]# echo $?
1
[root@rh66-node1:~]# pcs resource op add dummy stop interval=0s timeout=20s
Error: operation stop with interval 0s already specified for dummy:
stop interval=0s timeout=20 (dummy-stop-interval-0s)
[root@rh66-node1:~]# echo $?
1
[root@rh66-node1:~]# pcs resource show dummy
 Resource: dummy (class=ocf provider=heartbeat type=Dummy)
  Operations: start interval=0s timeout=20 (dummy-start-interval-0s)
              stop interval=0s timeout=20 (dummy-stop-interval-0s)
              monitor interval=10 timeout=20 (dummy-monitor-interval-10)
[root@rh66-node1:~]# pcs resource create dummy1 Dummy op monitor interval=10s timeout=20s
[root@rh66-node1:~]# pcs resource show dummy1
 Resource: dummy1 (class=ocf provider=heartbeat type=Dummy)
  Operations: start interval=0s timeout=20 (dummy1-start-interval-0s)
              stop interval=0s timeout=20 (dummy1-stop-interval-0s)
              monitor interval=10s timeout=20s (dummy1-monitor-interval-10s)
[root@rh66-node1:~]# pcs resource create dummy2 Dummy op monitor interval=20s timeout=30s
[root@rh66-node1:~]# pcs resource show dummy2
 Resource: dummy2 (class=ocf provider=heartbeat type=Dummy)
  Operations: start interval=0s timeout=20 (dummy2-start-interval-0s)
              stop interval=0s timeout=20 (dummy2-stop-interval-0s)
              monitor interval=20s timeout=30s (dummy2-monitor-interval-20s)
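
The behaviour seen after the fix can be summarized as two rules: every operation ID follows the pattern <resource>-<operation>-interval-<interval>, and adding an operation whose name and interval match an existing one is refused. The sketch below models those two rules as inferred from the transcript above. It is not taken from the pcs source, and the function names (seconds, op_id, add_op) are hypothetical. The interval comparison assumes only bare seconds or an "s" suffix.

```python
def seconds(interval):
    """Simplified normalization: accepts '10' or '10s' only (an assumption)."""
    return int(interval[:-1] if interval.endswith("s") else interval)


def op_id(resource, name, interval):
    """Build an ID in the form observed after the fix."""
    return f"{resource}-{name}-interval-{interval}"


def add_op(resource, ops, name, interval, **attrs):
    """Append an operation to ops unless (name, interval) is already present."""
    for existing in ops:
        same_name = existing["name"] == name
        if same_name and seconds(existing["interval"]) == seconds(interval):
            raise ValueError(
                f"operation {name} with interval {interval} already "
                f"specified for {resource}: ({existing['id']})"
            )
    op = {"id": op_id(resource, name, interval), "name": name,
          "interval": interval, **attrs}
    ops.append(op)
    return op


# Default operations of "dummy", as shown by "pcs resource show dummy".
dummy_ops = []
add_op("dummy", dummy_ops, "start", "0s", timeout="20")
add_op("dummy", dummy_ops, "stop", "0s", timeout="20")
add_op("dummy", dummy_ops, "monitor", "10", timeout="20")
print([op["id"] for op in dummy_ops])

# Mirrors "pcs resource op add dummy monitor interval=10s timeout=20s".
try:
    add_op("dummy", dummy_ops, "monitor", "10s", timeout="20s")
except ValueError as err:
    print("Error:", err)
```

The three generated IDs match the ones listed for dummy after the fix, and the final call is rejected because interval 10s is treated as equal to the existing interval 10.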
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://rhn.redhat.com/errata/RHBA-2015-1446.html