1031141 – pcs has strange/inconsistent behaviour and operation namings

Keywords:
Status:	CLOSED ERRATA
Alias:	None
Product:	Red Hat Enterprise Linux 6
Classification:	Red Hat
Component:	pcs
Sub Component:
Version:	6.5
Hardware:	All
OS:	Linux
Priority:	medium
Severity:	medium
Target Milestone:	rc
Target Release:	---
Assignee:	Tomas Jelinek
QA Contact:
Docs Contact:
URL:
Whiteboard:
Depends On:
Blocks:

Reported:	2013-11-15 17:18 UTC by Robert Scheck
Modified:	2015-07-22 06:15 UTC
CC List:	5 users
Fixed In Version:	pcs-0.9.138-1.el6
Doc Type:	Bug Fix
Doc Text:	* After the user added a duplicate resource operation, Pacemaker configuration became invalid. With this update, pcs does not add the operation and instead informs the user that the same operation already exists. (BZ#1031141)
Clone Of:
Environment:
Last Closed:	2015-07-22 06:15:31 UTC
Target Upstream Version:
Embargoed:

Attachments
proposed fix (57.31 KB, patch) 2014-12-15 13:58 UTC, Tomas Jelinek
fix for the original proposed fix (1.26 KB, patch) 2014-12-17 10:36 UTC, Tomas Jelinek
proposed fix - make default resource operations unique (4.19 KB, patch) 2014-12-17 10:37 UTC, Tomas Jelinek

Links
System	ID	Private	Priority	Status	Summary	Last Updated
Red Hat Product Errata	RHBA-2015:1446	0	normal	SHIPPED_LIVE	pcs bug fix and enhancement update	2015-07-20 18:43:57 UTC

Description Robert Scheck 2013-11-15 17:18:59 UTC

Description of problem:
pcs shows strange/inconsistent behaviour and operation naming. First
configuration example:

$ pcs cluster cib vm_cfg
$ pcs -f vm_cfg resource create vm ocf:heartbeat:VirtualDomain config=/etc/libvirt/qemu/vm.xml snapshot=/var/lib/libvirt/qemu/pacemaker
$ pcs -f vm_cfg resource op add vm monitor interval=60s timeout=30s
$ pcs -f vm_cfg resource op add vm start interval=0 timeout=120s
$ pcs -f vm_cfg resource op add vm stop interval=0 timeout=120s
$ pcs -f vm_cfg constraint colocation add vm libvirtd INFINITY
$ pcs -f vm_cfg constraint order libvirtd then vm
$ pcs cluster cib-push vm_cfg

This results in:

$ pcs config
[...]
 Resource: vm (class=ocf provider=heartbeat type=VirtualDomain)
  Attributes: config=/etc/libvirt/qemu/vm.xml snapshot=/var/lib/libvirt/qemu/pacemaker 
  Operations: monitor interval=60s (vm-monitor-interval-60s)
              monitor interval=60s timeout=30s (vm-name-monitor-interval-60s-timeout-30s)
              start interval=0 timeout=120s (vm-name-start-interval-0-timeout-120s)
              stop interval=0 timeout=120s (vm-name-stop-interval-0-timeout-120s)
$

Ouch? How did that happen? Yes, it's a new feature of pcs 0.9.90 (RHEL 6.5)
to add a default monitor operation if none has been specified. However, pcs
does not seem to be really transaction-safe, so it does not know that a
proper monitor operation is added later (in the same CIB!). Thus the
configuration is not valid:

$ crm_verify -L -V
   error: is_op_dup: 	Operation vm-name-monitor-interval-60s-timeout-30s is a duplicate of vm-monitor-interval-60s
   error: is_op_dup: 	Do not use the same (name, interval) combination more than once per resource
   error: is_op_dup: 	Operation vm-name-monitor-interval-60s-timeout-30s is a duplicate of vm-monitor-interval-60s
   error: is_op_dup: 	Do not use the same (name, interval) combination more than once per resource
Errors found during check: config not valid
$ 
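
The rule that crm_verify reports above (one (name, interval) combination
per resource) can be illustrated with a small Python sketch. This is not
Pacemaker or pcs code: the names interval_ms and find_duplicate_ops are
hypothetical, and the interval parser is an assumption that handles only a
few unit suffixes and treats a bare number as seconds.

```python
import re
from collections import defaultdict

# Simplified unit table (an assumption; Pacemaker accepts more formats).
_UNIT_MS = {"": 1000, "s": 1000, "ms": 1, "m": 60_000, "min": 60_000, "h": 3_600_000}


def interval_ms(text):
    """Convert an interval such as '60s', '10' or '2min' to milliseconds."""
    match = re.fullmatch(r"(\d+)\s*([a-z]*)", text.strip().lower())
    if not match or match.group(2) not in _UNIT_MS:
        raise ValueError(f"unsupported interval: {text!r}")
    return int(match.group(1)) * _UNIT_MS[match.group(2)]


def find_duplicate_ops(ops):
    """Return lists of operation ids that share one (name, interval) key.

    `ops` is an iterable of (op_id, name, interval) tuples.
    """
    groups = defaultdict(list)
    for op_id, name, interval in ops:
        groups[(name, interval_ms(interval))].append(op_id)
    return [ids for ids in groups.values() if len(ids) > 1]


# Operations of resource "vm" from the first configuration example.
ops = [
    ("vm-monitor-interval-60s", "monitor", "60s"),
    ("vm-name-monitor-interval-60s-timeout-30s", "monitor", "60s"),
    ("vm-name-start-interval-0-timeout-120s", "start", "0"),
    ("vm-name-stop-interval-0-timeout-120s", "stop", "0"),
]
print(find_duplicate_ops(ops))
```

Only the two monitor operations collide; the differing timeout does not
make them distinct, because timeout is not part of the key.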

That brought me to the second configuration example, where I tried to add
the monitor operation via create rather than afterwards:

$ pcs cluster cib vm_cfg
$ pcs -f vm_cfg resource create vm ocf:heartbeat:VirtualDomain config=/etc/libvirt/qemu/vm.xml snapshot=/var/lib/libvirt/qemu/pacemaker op monitor interval=60s timeout=30s
$ pcs -f vm_cfg resource op add vm start interval=0 timeout=120s
$ pcs -f vm_cfg resource op add vm stop interval=0 timeout=120s
$ pcs -f vm_cfg constraint colocation add vm libvirtd INFINITY
$ pcs -f vm_cfg constraint order libvirtd then vm
$ pcs cluster cib-push vm_cfg

This results in:

$ pcs config
[...]
 Resource: vm (class=ocf provider=heartbeat type=VirtualDomain)
  Attributes: config=/etc/libvirt/qemu/vm.xml snapshot=/var/lib/libvirt/qemu/pacemaker 
  Operations: monitor interval=60s timeout=30s (vm-monitor-interval-60s)
              start interval=0 timeout=120s (vm-name-start-interval-0-timeout-120s)
              stop interval=0 timeout=120s (vm-name-stop-interval-0-timeout-120s)
$ 

Ehm? Why is it named "vm-monitor-interval-60s" rather than
"vm-name-monitor-interval-60s-timeout-30s", even though I added
"monitor interval=60s timeout=30s" to the create command?

I would expect "vm-name-monitor-interval-60s-timeout-30s" both times...
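
Consistent naming amounts to deriving the id from the same fields no matter
which command created the operation. A minimal Python sketch under that
assumption follows. The pattern <resource>-<name>-interval-<interval> is
inferred from ids such as "vm-monitor-interval-60s" above; make_op_id and
its numeric suffix for collisions are hypothetical and not taken from pcs.

```python
def make_op_id(resource_id, op_name, interval, existing_ids):
    """Build an operation id from resource, operation name and interval.

    Hypothetical helper: if the plain id is already taken, a numeric
    suffix is appended until the id is unique within `existing_ids`.
    """
    base = f"{resource_id}-{op_name}-interval-{interval}"
    candidate = base
    counter = 1
    while candidate in existing_ids:
        candidate = f"{base}-{counter}"
        counter += 1
    return candidate


existing = set()
for name, interval in [("monitor", "60s"), ("start", "0"), ("stop", "0")]:
    op_id = make_op_id("vm", name, interval, existing)
    existing.add(op_id)
    print(op_id)
```

Because timeout is left out of the id, "create ... op monitor" and a later
"op add" would produce ids of the same shape.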

Version-Release number of selected component (if applicable):
pcs-0.9.90-1.el6_4.noarch

How reproducible:
Every time, see above and below.

Actual results:
pcs has strange/inconsistent behaviour and operation namings.

Expected results:
At least consistent naming as suggested/expected above. That pcs is not
really transaction-safe is a pity, but obviously not easy to change, I guess?

Comment 4 Tomas Jelinek 2014-12-15 13:58:03 UTC

Created attachment 969012
proposed fix

Comment 5 Tomas Jelinek 2014-12-17 10:36:40 UTC

Created attachment 970019
fix for the original proposed fix

Comment 6 Tomas Jelinek 2014-12-17 10:37:32 UTC

Created attachment 970020
proposed fix - make default resource operations unique

Comment 7 Tomas Jelinek 2015-01-27 13:57:36 UTC

Before Fix:
[root@rh66-node1 ~]# rpm -q pcs
pcs-0.9.123-9.el6.x86_64

[root@rh66-node1:~]# pcs resource create dummy Dummy
[root@rh66-node1:~]# pcs resource show dummy
 Resource: dummy (class=ocf provider=heartbeat type=Dummy)
  Operations: start interval=0s timeout=20 (dummy-start-timeout-20)
              stop interval=0s timeout=20 (dummy-stop-timeout-20)
              monitor interval=10 timeout=20 (dummy-monitor-interval-10)
[root@rh66-node1:~]# pcs resource op add dummy monitor interval=10s timeout=20s
[root@rh66-node1:~]# pcs resource op add dummy start interval=0s timeout=20s
[root@rh66-node1:~]# pcs resource op add dummy stop interval=0s timeout=20s
[root@rh66-node1:~]# pcs resource show dummy
 Resource: dummy (class=ocf provider=heartbeat type=Dummy)
  Operations: start interval=0s timeout=20 (dummy-start-timeout-20)
              stop interval=0s timeout=20 (dummy-stop-timeout-20)
              monitor interval=10 timeout=20 (dummy-monitor-interval-10)
              monitor interval=10s timeout=20s (dummy-name-monitor-interval-10s-timeout-20s)
              start interval=0s timeout=20s (dummy-name-start-interval-0s-timeout-20s)
              stop interval=0s timeout=20s (dummy-name-stop-interval-0s-timeout-20s)

[root@rh66-node1:~]# pcs resource create dummy1 Dummy op monitor interval=10s timeout=20s
[root@rh66-node1:~]# pcs resource show dummy1
 Resource: dummy1 (class=ocf provider=heartbeat type=Dummy)
  Operations: start interval=0s timeout=20 (dummy1-start-timeout-20)
              stop interval=0s timeout=20 (dummy1-stop-timeout-20)
              monitor interval=10s timeout=20s (dummy1-monitor-interval-10s)



After Fix:
[root@rh66-node1:~]# rpm -q pcs
pcs-0.9.138-1.el6.x86_64

[root@rh66-node1:~]# pcs resource create dummy Dummy
[root@rh66-node1:~]# pcs resource show dummy
 Resource: dummy (class=ocf provider=heartbeat type=Dummy)
  Operations: start interval=0s timeout=20 (dummy-start-interval-0s)
              stop interval=0s timeout=20 (dummy-stop-interval-0s)
              monitor interval=10 timeout=20 (dummy-monitor-interval-10)
[root@rh66-node1:~]# pcs resource op add dummy monitor interval=10s timeout=20s
Error: operation monitor with interval 10s already specified for dummy:
monitor interval=10 timeout=20 (dummy-monitor-interval-10)
[root@rh66-node1:~]# echo $?
1
[root@rh66-node1:~]# pcs resource op add dummy start interval=0s timeout=20s
Error: operation start with interval 0s already specified for dummy:
start interval=0s timeout=20 (dummy-start-interval-0s)
[root@rh66-node1:~]# echo $?
1
[root@rh66-node1:~]# pcs resource op add dummy stop interval=0s timeout=20s
Error: operation stop with interval 0s already specified for dummy:
stop interval=0s timeout=20 (dummy-stop-interval-0s)
[root@rh66-node1:~]# echo $?
1
[root@rh66-node1:~]# pcs resource show dummy
 Resource: dummy (class=ocf provider=heartbeat type=Dummy)
  Operations: start interval=0s timeout=20 (dummy-start-interval-0s)
              stop interval=0s timeout=20 (dummy-stop-interval-0s)
              monitor interval=10 timeout=20 (dummy-monitor-interval-10)

[root@rh66-node1:~]# pcs resource create dummy1 Dummy op monitor interval=10s timeout=20s
[root@rh66-node1:~]# pcs resource show dummy1
 Resource: dummy1 (class=ocf provider=heartbeat type=Dummy)
  Operations: start interval=0s timeout=20 (dummy1-start-interval-0s)
              stop interval=0s timeout=20 (dummy1-stop-interval-0s)
              monitor interval=10s timeout=20s (dummy1-monitor-interval-10s)
[root@rh66-node1:~]# pcs resource create dummy2 Dummy op monitor interval=20s timeout=30s
[root@rh66-node1:~]# pcs resource show dummy2
 Resource: dummy2 (class=ocf provider=heartbeat type=Dummy)
  Operations: start interval=0s timeout=20 (dummy2-start-interval-0s)
              stop interval=0s timeout=20 (dummy2-stop-interval-0s)
              monitor interval=20s timeout=30s (dummy2-monitor-interval-20s)
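
The After Fix behaviour (refuse the add and report the existing operation)
could be modelled roughly as below. This Python sketch is an illustration
under assumptions, not the code from the attached patches: add_operation,
describe, to_seconds and DuplicateOperationError are hypothetical names,
and to_seconds understands only bare numbers and an "s" suffix.

```python
class DuplicateOperationError(Exception):
    """Raised when an operation with the same name and interval exists."""


def to_seconds(interval):
    # Simplification: only bare numbers and an 's' suffix are handled here.
    return int(interval[:-1] if interval.endswith("s") else interval)


def describe(op):
    """Format an operation the way the transcript above lists it."""
    return f"{op['name']} interval={op['interval']} timeout={op['timeout']} ({op['id']})"


def add_operation(resource_id, ops, name, interval, timeout):
    """Append a new operation to `ops` unless (name, interval) is taken."""
    for op in ops:
        if op["name"] == name and to_seconds(op["interval"]) == to_seconds(interval):
            raise DuplicateOperationError(
                f"operation {name} with interval {interval} "
                f"already specified for {resource_id}:\n{describe(op)}"
            )
    new_op = {
        "id": f"{resource_id}-{name}-interval-{interval}",
        "name": name,
        "interval": interval,
        "timeout": timeout,
    }
    ops.append(new_op)
    return new_op


# Default monitor operation of "dummy", as listed in the transcript.
ops = [{"id": "dummy-monitor-interval-10", "name": "monitor",
        "interval": "10", "timeout": "20"}]
try:
    add_operation("dummy", ops, "monitor", "10s", "20s")
except DuplicateOperationError as err:
    print(f"Error: {err}")
```

Comparing normalized intervals is what lets "10s" be recognised as a
duplicate of the default "10"; a string comparison would miss it.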

Comment 11 errata-xmlrpc 2015-07-22 06:15:31 UTC

Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://rhn.redhat.com/errata/RHBA-2015-1446.html
