Graceful migration of resources when the *pacemaker_remote* service is stopped on an active Pacemaker Remote node
If the *pacemaker_remote* service is stopped on an active Pacemaker Remote node, the cluster will gracefully migrate resources off the node before stopping the node. Previously, Pacemaker Remote nodes were fenced when the service was stopped (including by commands such as "yum update"), unless the node was first explicitly taken out of the cluster. Software upgrades and other routine maintenance procedures are now much easier to perform on Pacemaker Remote nodes.
Note: All nodes in the cluster must be upgraded to a version supporting this feature before it can be used on any node.
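As a rough illustration of the maintenance flow this enables, the sketch below wraps a package update between a stop and a start of the service. The function name update_remote_node and the RUN dry-run variable are hypothetical, and the sketch assumes every cluster node already runs a version that supports the feature.

```shell
#!/bin/sh
# Hypothetical helper: update packages on a Pacemaker Remote node without
# first taking it out of the cluster. Assumes all cluster nodes already run
# a version that supports graceful stop of pacemaker_remote.
# Set RUN=echo to print the commands instead of executing them (dry run).
update_remote_node() {
    # Per the feature text, the cluster migrates resources off the node
    # before the node is stopped.
    ${RUN:-} systemctl stop pacemaker_remote || return 1
    ${RUN:-} yum -y update || return 1
    ${RUN:-} systemctl start pacemaker_remote
}
```

Calling the function with RUN=echo set prints the three commands in order, which is a cheap way to review the sequence before running it on a real node.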
Description of problem:
Graceful shutdown doesn't work
Version-Release number of selected component (if applicable):
Steps to Reproduce:
1. Run: service pacemaker_remote stop
Expected results:
1. Contacts/notifies peer cluster
2. Peer cluster stops all services
3. Peer cluster tells pacemaker_remote it can shut down
4. Peer cluster recognises that the remote node was expected to shut down
This is especially relevant for OSP upgrades.
As a side note, the workaround we currently use to avoid fencing is this sequence of commands:
1) Reboot the compute node from the console
2) Run nova stop <computenodeid> from the undercloud
3) Run nova start <computenodeid> from the undercloud
4) Run a loop like this on one of the controller nodes:
$ while true; do sudo pcs resource cleanup overcloud-novacompute-0; sleep 5; done
5) Once the machine is up, stop the loop from step 4
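The loop in step 4 runs until someone interrupts it by hand. A bounded variant could look like the sketch below. The function name cleanup_until_up, the retry limit, and the ping-based "is the machine up" check are assumptions made for illustration; only the pcs resource cleanup call comes from the workaround itself.

```shell
#!/bin/sh
# Hypothetical bounded version of the cleanup loop from step 4.
# $1 = remote node resource name
# $2 = maximum number of attempts (default 60)
# $3 = seconds to wait between attempts (default 5)
# CLEANUP and IS_UP may be overridden. The IS_UP default (a single ping)
# is an assumption and requires the resource name to resolve as a host name.
cleanup_until_up() {
    node=$1; max=${2:-60}; delay=${3:-5}
    cleanup=${CLEANUP:-"sudo pcs resource cleanup"}
    is_up=${IS_UP:-"ping -c 1 -W 2"}
    i=0
    while [ "$i" -lt "$max" ]; do
        $cleanup "$node"
        # Stop as soon as the machine answers, as in step 5.
        if $is_up "$node" >/dev/null 2>&1; then
            return 0
        fi
        i=$((i + 1))
        sleep "$delay"
    done
    return 1    # gave up: the node never came back within the limit
}
```

For example, cleanup_until_up overcloud-novacompute-0 would retry for roughly five minutes with the defaults and return non-zero if the node never responds.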
Fixed upstream as of commit da17fd0
This bug was accidentally moved from POST to MODIFIED by an error in automation. Please see email@example.com with any questions.
Verified on RHEL-OSP director 9.0 puddle - 2016-06-03.1
Using "systemctl stop pacemaker_remote.service" stops the service, and Pacemaker changes the resource status to Stopped:
overcloud-novacompute-1 (ocf::pacemaker:remote): Stopped
When the kill command is used instead, the status changes to FAILED and the compute node is fenced:
overcloud-novacompute-1 (ocf::pacemaker:remote): FAILED
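The two outcomes can also be told apart mechanically. The sketch below reads status text on standard input and prints the state reported for a given remote resource. The function name remote_state is hypothetical, and the line format is assumed to match the two status lines quoted in this comment.

```shell
#!/bin/sh
# Hypothetical helper: print the state (for example Stopped or FAILED) that
# the status text on stdin reports for the remote resource named in $1.
# Assumes lines shaped like: "<name> (ocf::pacemaker:remote): <State>".
remote_state() {
    awk -v name="$1" \
        '$1 == name && $2 == "(ocf::pacemaker:remote):" { print $3; exit }'
}
```

A typical use would be to pipe the output of pcs status into it, as in sudo pcs status | remote_state overcloud-novacompute-1, and compare the result after a systemctl stop with the result after a kill.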
I'm putting in the same doc text that we had for this feature for RHEL 6.8 for the release notes.
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.
For information on the advisory, and where to find the updated
files, follow the link below.
If the solution does not work for you, open a new bug report.