Bug 1474463 - fencing-device not properly registered after disable/enable cycle
Status: VERIFIED
Product: Red Hat Enterprise Linux 7
Classification: Red Hat
Component: pacemaker
Version: 7.4
Hardware: Unspecified
OS: Unspecified
Priority: urgent
Severity: urgent
Target Milestone: rc
Target Release: 7.5
Assigned To: Klaus Wenninger
QA Contact: cluster-qe@redhat.com
Keywords: ZStream
Depends On:
Blocks: 1481142
Reported: 2017-07-24 12:57 EDT by Klaus Wenninger
Modified: 2017-12-05 10:36 EST

See Also:
Fixed In Version: pacemaker-1.1.18-1.el7
Doc Type: Bug Fix
Doc Text:
Previously, Pacemaker's stonithd service used an incorrect search pattern when checking the configuration for re-enabled fence devices. Consequently, re-enabled fence devices were shown to be available only on the node where they had been started. With this update, stonithd now uses the correct search pattern, and the described problem no longer occurs.
Story Points: ---
Clone Of:
Clones: 1481142
Environment:
Last Closed:
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---


Attachments: None
Description Klaus Wenninger 2017-07-24 12:57:01 EDT
Description of problem:

Fencing devices are intended to be usable even on nodes where they are not shown as started.
They can be disabled either via location rules (but not rules whose result changes dynamically, such as those based on score attributes, because stonithd does not re-evaluate them) or by explicitly setting target-role to Stopped.
After a disable/enable cycle, stonith devices are used only on a node where they have been explicitly started.


Version-Release number of selected component (if applicable):
1.1.16-12.el7

How reproducible:
100%

Steps to Reproduce:
1. create a fencing-resource-primitive without any rules restricting it to a node
2. restart cluster
3. use crm_mon to confirm that the fencing-resource is shown as started on one of the cluster nodes
4. use 'stonith_admin -L' to verify that the fencing-resource is available both on the node where it is claimed to be started and on at least one other node
5. issue 'pcs stonith disable {your-fencing-resource}' and then 'pcs stonith enable {your-fencing-resource}'
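
The disable/enable cycle from the last step can be scripted. The following is a minimal Python sketch, not part of the original report. The helper names `cycle_commands` and `run_cycle` and the injectable `runner` parameter are hypothetical. The `--wait` flag is borrowed from the verification comment further down, and the script assumes it is run on a cluster node where `pcs` and `stonith_admin` are installed.

```python
import subprocess


def cycle_commands(device_id):
    """Return the argv lists for one disable/enable cycle plus the final check."""
    return [
        ["pcs", "stonith", "disable", "--wait", device_id],
        ["pcs", "stonith", "enable", "--wait", device_id],
        ["stonith_admin", "-L"],
    ]


def run_cycle(device_id, runner=None):
    """Run the cycle on the local node and return the output of the last command.

    `runner` takes an argv list and returns the command's stdout as text.
    It is injectable so the sequence can be exercised without a cluster.
    """
    if runner is None:
        def runner(argv):
            result = subprocess.run(
                argv, check=True, capture_output=True, text=True
            )
            return result.stdout

    output = ""
    for argv in cycle_commands(device_id):
        output = runner(argv)
    return output
```

Calling `run_cycle("my-fence-device")` on each node (the device ID here is a placeholder) returns that node's `stonith_admin -L` output after the cycle, which can then be compared across nodes.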

Actual results:
'stonith_admin -L' shows the fencing-resource only on the node where it is claimed to be started

Expected results:
'stonith_admin -L' should show the fencing-resource to be available on all nodes

Additional info:
Comment 2 Klaus Wenninger 2017-07-24 13:01:19 EDT
stonithd does not check CIB diffs properly when deciding whether a full parse of the configuration should be triggered.

https://github.com/ClusterLabs/pacemaker/pull/1314/commits/5e3cd2614e1db60a14d5615c9c175575409b56d6

seems to solve the issue.
Comment 4 michal novacek 2017-08-04 05:56:14 EDT
qa-ack+: clear reproducer in initial comment
Comment 7 Patrik Hagara 2017-12-05 10:36:42 EST
Reproduced as outlined in the first comment:
  * "pcs cluster setup --start --wait ..."
  * "pcs stonith create ..."
  * "pcs cluster stop --all"
  * "pcs cluster start --all --wait"
  * "crm_mon -X"; verify fence resource seen as started on 1 node from all nodes
  * "stonith_admin -L"; verify fence device registered/available on all nodes
  * "pcs stonith disable --wait ..."
  * "pcs stonith enable --wait ..."
  * "crm_mon -X"; verify fence resource seen as started on 1 node from all nodes
  * "stonith_admin -L"; verify fence device still registered/available on all nodes

Before the fix (1.1.16-12.el7) only the node on which the fence resource is in "Started" role sees the corresponding fence device as registered/available, other nodes do not (as reported by "stonith_admin -L"). After the fix (1.1.18-1.el7) all nodes see the fence device as registered/available. Marking verified.
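
The per-node comparison described above can be automated once the `stonith_admin -L` output has been collected from every node. The following Python sketch is only an illustration. It assumes the output lists one device ID per line followed by a summary line ending in "found" (for example "1 devices found"); that format is an assumption and should be checked against the installed version. The function names, node names, and device ID are hypothetical.

```python
def parse_device_list(output):
    """Extract device IDs from text assumed to look like 'stonith_admin -L' output."""
    devices = set()
    for line in output.splitlines():
        line = line.strip()
        # Skip blank lines and the assumed trailing summary ("N devices found").
        if not line or line.endswith("found"):
            continue
        devices.add(line.split()[0])
    return devices


def nodes_missing_device(outputs_by_node, device_id):
    """Return the sorted names of nodes whose output does not list device_id."""
    return sorted(
        node
        for node, output in outputs_by_node.items()
        if device_id not in parse_device_list(output)
    )


# Hypothetical outputs resembling the behaviour before the fix:
before_fix = {
    "node1": " fence1\n1 devices found\n",
    "node2": "0 devices found\n",
}
print(nodes_missing_device(before_fix, "fence1"))
```

With the fixed package, the same check would be expected to return an empty list, since every node should report the device.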
