Bug 1539113

Summary: crm_master does not work unless --lifetime is specified
Product: Red Hat Enterprise Linux 7
Reporter: Ken Gaillot <kgaillot>
Component: pacemaker
Assignee: Ken Gaillot <kgaillot>
Status: CLOSED ERRATA
QA Contact: cluster-qe <cluster-qe>
Severity: urgent
Docs Contact:
Priority: urgent
Version: 7.4
CC: abeekhof, aherr, cfeist, cluster-maint, phagara
Target Milestone: rc
Keywords: Regression
Target Release: 7.5
Hardware: All
OS: All
Whiteboard:
Fixed In Version: pacemaker-1.1.18-11.el7
Doc Type: Bug Fix
Doc Text:
(This affects a small subset of customers, and thus does not need to be in the release notes.)
Cause: pacemaker-1.1.16-12.el7_4.3 introduced a regression in the crm_master tool such that it would not always pass an option to crm_attribute indicating a change in node attributes.
Consequence: Promotion of master resources would not occur when using custom resource agents that did not pass such an option (such as --lifetime) themselves when calling crm_master.
Fix: crm_master now always passes --lifetime to crm_attribute.
Result: Promotion of master resources works correctly, even when used with custom resource agents that do not use the --lifetime option.
Last Closed: 2018-04-10 15:34:42 UTC
Type: Bug

Description Ken Gaillot 2018-01-26 17:18:13 UTC
Description of problem: As part of the fix for Bug 1497602, pacemaker-1.1.16-12.el7_4.3 introduced a bug in crm_master. If crm_master is called without --lifetime/-l, it will mistakenly look for the master score in the cluster properties (crm_config scope) rather than node attributes.


Version-Release number of selected component (if applicable): 1.1.16-12.el7_4.3


How reproducible: Consistently


Steps to Reproduce:
1. Create a custom resource agent (ocf:pacemaker:Stateful would be a good template) that calls "crm_master -v" without any "-l" or "--lifetime" option.
2. Configure and start a pacemaker cluster with a master/slave resource using that agent.
3. Check where the "master-*" name/value pair is set in the CIB.

Actual results: The value is set in the cluster properties (crm_config).


Expected results: The value is set in the permanent node attributes.


Additional info: The workaround is to modify the agent to pass "-l" to crm_master (with either "reboot" or "forever" as desired).
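
As a rough illustration of the workaround, the sketch below builds a crm_master command line that always carries an explicit lifetime. It is written in Python only for illustration (real agents such as ocf:pacemaker:Stateful are shell scripts); the helper name `crm_master_update_cmd` and the score of 10 are hypothetical, and the command is only printed, never executed.

```python
import shlex

# The two lifetime values named in this report.
VALID_LIFETIMES = ("reboot", "forever")


def crm_master_update_cmd(score, lifetime="reboot"):
    """Return an argv list for crm_master with an explicit lifetime.

    Hypothetical helper: it only assembles the arguments so that the
    "-l" option is never omitted; it does not run anything.
    """
    if lifetime not in VALID_LIFETIMES:
        raise ValueError("lifetime must be one of %r" % (VALID_LIFETIMES,))
    return ["crm_master", "-l", lifetime, "-v", str(score)]


# Show the command an agent would run with the workaround applied.
print(shlex.join(crm_master_update_cmd(10)))
```

With "-l" always present, the affected crm_master build never reaches the code path that falls back to the crm_config scope.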

Comment 2 Ken Gaillot 2018-01-26 20:04:37 UTC
Fixed upstream by commit a8b1ded

Comment 7 Patrik Hagara 2018-02-06 14:02:03 UTC
before the fix (1.1.16-12.el7_4.7-94ff4df):
* modified /usr/lib/ocf/resource.d/pacemaker/Stateful to NOT include the "-l reboot" option in crm_master call
* created a master/slave "test" resource using the modified resource agent
* all nodes remain slaves as the "master-test" nvpair is located under /cib/configuration/crm_config/cluster_property_set in the cib xml (whilst the cluster expects it to be stored under /cib/configuration/nodes/node/instance_attributes for master election purposes)


implementing the workaround (i.e. adding "-l reboot" to the ocf:pacemaker:Stateful RA again) results in the master/slave resource working as expected: the "master-test" nvpair gets stored either under /cib/configuration/nodes/node/instance_attributes (with "-l forever") or under /cib/status/node_state/transient_attributes/instance_attributes (with "-l reboot"), and one of the nodes is properly promoted to master according to its score value
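
The three CIB locations mentioned above can be checked mechanically. The following is a minimal sketch, assuming a CIB XML dump is available as a string; the sample CIB, its ids, and the helper `locate_nvpair` are hypothetical and heavily trimmed compared with a real CIB.

```python
import xml.etree.ElementTree as ET

# Hypothetical, heavily trimmed CIB. Here the score sits in the
# transient attributes, as it would with "-l reboot".
SAMPLE_CIB = """
<cib>
  <configuration>
    <crm_config>
      <cluster_property_set id="cib-bootstrap-options"/>
    </crm_config>
    <nodes>
      <node id="1" uname="node1">
        <instance_attributes id="nodes-1"/>
      </node>
    </nodes>
  </configuration>
  <status>
    <node_state id="1" uname="node1">
      <transient_attributes id="1">
        <instance_attributes id="status-1">
          <nvpair id="status-1-master-test" name="master-test" value="10"/>
        </instance_attributes>
      </transient_attributes>
    </node_state>
  </status>
</cib>
"""

# Paths relative to the <cib> root, mirroring the locations named above.
SECTIONS = {
    "cluster property (crm_config)":
        "configuration/crm_config/cluster_property_set",
    "permanent node attribute":
        "configuration/nodes/node/instance_attributes",
    "transient node attribute":
        "status/node_state/transient_attributes/instance_attributes",
}


def locate_nvpair(cib_xml, name):
    """Return the labels of every section holding an nvpair called `name`."""
    root = ET.fromstring(cib_xml.strip())
    found = []
    for label, path in SECTIONS.items():
        for container in root.findall(path):
            for nvpair in container.findall("nvpair"):
                if nvpair.get("name") == name:
                    found.append(label)
    return found


print(locate_nvpair(SAMPLE_CIB, "master-test"))
```

On an affected build without the workaround, the same check would be expected to report the cluster property section instead, which is why no node gets promoted.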


after the fix (1.1.18-11.el7-2b07d5c5a9):
* using neither "-l reboot" nor "-l forever" in the master/slave RA results in crm_master (and subsequently crm_attribute) defaulting to setting a permanent node attribute
* promotion to master works as expected
* there is no negative side-effect of the crm_master script always prepending "-l forever" to the options array passed to crm_attribute, as this gets overridden correctly if the RA passes a custom lifetime option (cmdline args are parsed in order, so a later arg replaces an earlier one)
* (as can be inferred from the previous point) the node score attribute is stored either as a permanent node attribute (by default & with "-l forever") or a transient one (if "-l reboot" is used)
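
The "later arg replaces earlier" behaviour described above can be modelled with a small sketch. This uses Python's argparse as a stand-in and is not the actual crm_master or crm_attribute parsing code; the function `effective_lifetime` is hypothetical and only models the two options discussed in this report.

```python
import argparse


def effective_lifetime(agent_args):
    """Model the fixed wrapper: prepend "-l forever", then parse in order.

    argparse's default "store" action keeps the last value it sees, so an
    explicit lifetime from the agent overrides the prepended default.
    """
    parser = argparse.ArgumentParser(add_help=False)
    parser.add_argument("-l", "--lifetime", choices=["reboot", "forever"])
    parser.add_argument("-v", "--update")
    argv = ["-l", "forever"] + list(agent_args)
    return parser.parse_args(argv).lifetime


# Agent passes no lifetime: the prepended default applies.
print(effective_lifetime(["-v", "10"]))
# Agent passes "-l reboot": the later option wins.
print(effective_lifetime(["-l", "reboot", "-v", "10"]))
```

Under this model, the first call yields a permanent node attribute and the second a transient one, matching the verification results listed above.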

Comment 10 errata-xmlrpc 2018-04-10 15:34:42 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHEA-2018:0860