Bug 1297564 - service pacemaker_remote stop causes node to be fenced
Summary: service pacemaker_remote stop causes node to be fenced
Alias: None
Product: Red Hat Enterprise Linux 6
Classification: Red Hat
Component: pacemaker
Version: 6.7
Hardware: Unspecified
OS: Unspecified
Target Milestone: rc
Target Release: 6.8
Assignee: Ken Gaillot
QA Contact: cluster-qe@redhat.com
Docs Contact: Steven J. Levine
Depends On: 1288929
Blocks: 1185030 1323259 1325009
Reported: 2016-01-11 21:39 UTC by Ken Gaillot
Modified: 2016-10-24 13:36 UTC (History)

Fixed In Version: pacemaker-1.1.14-5.el6
Doc Type: Release Note
Doc Text:
Graceful migration of resources when the *pacemaker_remote* service is stopped on an active Pacemaker Remote node

If the *pacemaker_remote* service is stopped on an active Pacemaker Remote node, the cluster will gracefully migrate resources off the node before stopping the node. Previously, Pacemaker Remote nodes were fenced when the service was stopped (including by commands such as "yum update"), unless the node was first explicitly taken out of the cluster. Software upgrades and other routine maintenance procedures are now much easier to perform on Pacemaker Remote nodes.

Note: All nodes in the cluster must be upgraded to a version supporting this feature before it can be used on any node.
Clone Of: 1288929
: 1323259
Last Closed: 2016-05-10 23:52:33 UTC
Target Upstream Version:


System                  ID              Priority  Status        Summary                                   Last Updated
Red Hat Bugzilla        1388102         None      None          None                                      Never
Red Hat Product Errata  RHBA-2016:0856  normal    SHIPPED_LIVE  pacemaker bug fix and enhancement update  2016-05-10 22:44:25 UTC

Internal Links: 1388102

Description Ken Gaillot 2016-01-11 21:39:17 UTC
+++ This bug was initially created as a clone of Bug #1288929 +++

Description of problem:

Graceful shutdown of pacemaker_remote doesn't work: stopping the service causes the node to be fenced.

Version-Release number of selected component (if applicable):

How reproducible:


Steps to Reproduce:
1. Run: service pacemaker_remote stop 

Actual results:


Expected results:

1. pacemaker_remote contacts/notifies the peer cluster
2. The peer cluster stops all services running on the remote node
3. The peer cluster tells pacemaker_remote it can shut down
4. The peer cluster recognises that the remote node was expected to shut down
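
One rough way to check the outcome of these steps from a cluster node is to confirm that nothing is still reported as running on the remote node after the service has been stopped. The sketch below is only an illustration: the function name node_is_drained is hypothetical, and it assumes that the one-shot output of crm_mon -1 lists each running resource on a line ending in "Started <node>", which may not hold for every Pacemaker version or resource type.

```shell
# Sketch only, not part of Pacemaker.
# ASSUMPTION: "crm_mon -1" prints one line per running resource that ends
# with "Started <node>". node_is_drained is a hypothetical helper name.
node_is_drained() {
    node=$1    # name of the Pacemaker Remote node to check

    # Exit status 0 when no resource line ends in "Started <node>",
    # 1 when at least one resource is still reported on that node.
    crm_mon -1 | awk -v n="$node" '
        NF >= 2 && $NF == n && $(NF - 1) == "Started" { found = 1 }
        END { exit (found ? 1 : 0) }
    '
}
```

For example, after running service pacemaker_remote stop on the remote node, "node_is_drained <nodename> && echo drained" on a cluster node would print "drained" only if no resource is still listed there. It does not by itself show whether the node was fenced.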

Additional info:

Especially relevant for OSP (OpenStack Platform) upgrades

--- Additional comment from Raoul Scarazzini on 2015-12-24 05:28:02 EST ---

As a side note, the sequence of commands we are using as a workaround to avoid fencing is:

1) Reboot the compute node from the console
2) Run nova stop <computenodeid> from the undercloud
3) Run nova start <computenodeid> from the undercloud
4) Run a loop like this on one of the controller nodes:
$ while true; do sudo pcs resource cleanup overcloud-novacompute-0; sleep 5; done
5) Once the machine is up, stop the loop from step 4
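
The cleanup loop from step 4 could also be written so that it stops on its own instead of being interrupted by hand. The sketch below is a minimal illustration under stated assumptions: the function name cleanup_until_up and its arguments are hypothetical, and "the machine is up" is approximated by a single successful ping, which may answer before pacemaker_remote itself is running.

```shell
# Sketch only: repeat the cleanup from step 4 until the node answers a ping.
# ASSUMPTION: one successful ICMP ping is treated as "the machine is up".
# cleanup_until_up and its arguments are hypothetical, not part of pcs.
cleanup_until_up() {
    resource=$1         # cluster resource name, e.g. overcloud-novacompute-0
    host=$2             # address of the compute node (hypothetical argument)
    interval=${3:-5}    # seconds between attempts; 5 matches the original loop

    until ping -c 1 "$host" >/dev/null 2>&1; do
        sudo pcs resource cleanup "$resource"
        sleep "$interval"
    done
}
```

It would be called from a controller node as "cleanup_until_up overcloud-novacompute-0 <compute-node-address>". If the ping check proves too optimistic, a stricter readiness test can be substituted for the ping line.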

--- Additional comment from Ken Gaillot on 2016-01-08 16:29:33 EST ---

Fixed upstream as of commit da17fd0

Comment 3 Klaus Wenninger 2016-01-29 15:12:37 UTC
QA on the RHEL-7.2 z-stream, with the same fixes as used here, found
some issues (see rhbz#1299348).
Fixed In Version: pacemaker-1.1.14-2.0.el6

Comment 6 Ken Gaillot 2016-03-18 17:38:26 UTC
Upstream commit cd10f0b is needed to support this feature on RHEL6 due to the use of legacy attrd, and will be backported.

Comment 7 Ken Gaillot 2016-03-18 19:19:16 UTC
The build has been updated.

Comment 18 errata-xmlrpc 2016-05-10 23:52:33 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

