Bug 1288929
| Summary: | service pacemaker_remote stop causes node to be fenced | |||
|---|---|---|---|---|
| Product: | Red Hat Enterprise Linux 7 | Reporter: | Andrew Beekhof <abeekhof> | |
| Component: | pacemaker | Assignee: | Ken Gaillot <kgaillot> | |
| Status: | CLOSED ERRATA | QA Contact: | Asaf Hirshberg <ahirshbe> | |
| Severity: | urgent | Docs Contact: | Steven J. Levine <slevine> | |
| Priority: | urgent | |||
| Version: | 7.2 | CC: | abeekhof, cfeist, cluster-maint, michele, oblaut, royoung, rscarazz, tlavigne, ushkalim | |
| Target Milestone: | rc | Keywords: | ZStream | |
| Target Release: | 7.3 | |||
| Hardware: | Unspecified | |||
| OS: | Unspecified | |||
| Whiteboard: | ||||
| Fixed In Version: | pacemaker-1.1.15-1.2c148ac.git.el7 | Doc Type: | Release Note | |
| Doc Text: |
Graceful migration of resources when the *pacemaker_remote* service is stopped on an active Pacemaker Remote node
If the *pacemaker_remote* service is stopped on an active Pacemaker Remote node, the cluster will gracefully migrate resources off the node before stopping the node. Previously, Pacemaker Remote nodes were fenced when the service was stopped (including by commands such as "yum update"), unless the node was first explicitly taken out of the cluster. Software upgrades and other routine maintenance procedures are now much easier to perform on Pacemaker Remote nodes.
Note: All nodes in the cluster must be upgraded to a version supporting this feature before it can be used on any node.
|
Story Points: | --- | |
| Clone Of: | ||||
| : | 1297564 1299348 (view as bug list) | Environment: | ||
| Last Closed: | 2016-11-03 18:57:31 UTC | Type: | Bug | |
| Regression: | --- | Mount Type: | --- | |
| Documentation: | --- | CRM: | ||
| Verified Versions: | Category: | --- | ||
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | ||
| Cloudforms Team: | --- | Target Upstream Version: | ||
| Embargoed: | ||||
| Bug Depends On: | 1304771 | |||
| Bug Blocks: | 1185030, 1297564, 1299348, 1323259, 1325009 | |||
|
Description
Andrew Beekhof
2015-12-07 03:37:24 UTC
As a side note, and as a workaround, the sequence of commands we are using to avoid fencing is this one:

1) Reboot the compute node from the console
2) Do a nova stop <computenodeid> from the undercloud
3) Do a nova start <computenodeid> from the undercloud
4) Run a loop like this on one of the controller nodes:
   $ while true; do sudo pcs resource cleanup overcloud-novacompute-0; sleep 5; done
5) Once the machine is up, stop the loop from step 4

Fixed upstream as of commit da17fd0

This bug was accidentally moved from POST to MODIFIED via an error in automation, please see mmccune with any questions

Verified on RHEL-OSP director 9.0 puddle - 2016-06-03.1

Using "systemctl stop pacemaker_remote.service" stops the service, and pacemaker changes the status to Stopped:
overcloud-novacompute-1 (ocf::pacemaker:remote): Stopped

When using the kill command, the status is changed to FAILED and the compute node is fenced:
overcloud-novacompute-1 (ocf::pacemaker:remote): FAILED

rpm:
pacemaker-debuginfo-1.1.13-10.el7_2.2.x86_64
pacemaker-1.1.13-10.el7_2.2.x86_64
pacemaker-libs-1.1.13-10.el7_2.2.x86_64
pacemaker-cluster-libs-1.1.13-10.el7_2.2.x86_64
pacemaker-cli-1.1.13-10.el7_2.2.x86_64
pacemaker-nagios-plugins-metadata-1.1.13-10.el7_2.2.x86_64
pacemaker-doc-1.1.13-10.el7_2.2.x86_64
pacemaker-remote-1.1.13-10.el7_2.2.x86_64

I'm putting in the same doc text that we had for this feature for RHEL 6.8 for the release notes.

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.
https://rhn.redhat.com/errata/RHSA-2016-2578.html
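
The cleanup loop in step 4 of the workaround runs until someone interrupts it by hand (step 5). A minimal sketch of a bounded variant follows, assuming some command is available that succeeds once the compute node is back up. The function name `cleanup_until_up`, its parameters, and the idea of passing the check as a command string are all hypothetical and not part of pacemaker or pcs; only the `pcs resource cleanup` invocation in the usage note comes from this report.

```shell
#!/bin/sh
# Hypothetical helper: repeat ACTION_CMD every INTERVAL seconds until
# CHECK_CMD succeeds, giving up after MAX_TRIES attempts.
# usage: cleanup_until_up CHECK_CMD ACTION_CMD [INTERVAL] [MAX_TRIES]
cleanup_until_up() {
    check_cmd=$1
    action_cmd=$2
    interval=${3:-5}      # seconds between attempts (5, as in the workaround)
    max_tries=${4:-60}    # assumed upper bound; tune for your reboot time
    tries=0
    while [ "$tries" -lt "$max_tries" ]; do
        # Stop as soon as the check reports that the node is up.
        if sh -c "$check_cmd" >/dev/null 2>&1; then
            return 0
        fi
        # Run the cleanup; a failed cleanup is ignored and retried.
        sh -c "$action_cmd" >/dev/null 2>&1 || true
        tries=$((tries + 1))
        sleep "$interval"
    done
    # The node never came up within the allowed number of attempts.
    return 1
}
```

It might be called as `cleanup_until_up 'ping -c 1 overcloud-novacompute-0' 'sudo pcs resource cleanup overcloud-novacompute-0' 5 120`. The `ping` check is only an assumption about what "the machine is up" means here; substitute whatever test fits the deployment. With the fix from this bug in place, the workaround should not be needed for a graceful service stop.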