Bug 1297564

Summary:	service pacemaker_remote stop causes node to be fenced
Product:	Red Hat Enterprise Linux 6	Reporter:	Ken Gaillot <kgaillot>
Component:	pacemaker	Assignee:	Ken Gaillot <kgaillot>
Status:	CLOSED ERRATA	QA Contact:	cluster-qe <cluster-qe>
Severity:	medium	Docs Contact:	Steven J. Levine <slevine>
Priority:	high
Version:	6.7	CC:	abeekhof, cfeist, cluster-maint, cluster-qe, kwenning, michele, royoung, rscarazz, slevine, tlavigne
Target Milestone:	rc
Target Release:	6.8
Hardware:	Unspecified
OS:	Unspecified
Whiteboard:
Fixed In Version:	pacemaker-1.1.14-5.el6	Doc Type:	Release Note
Doc Text:	Graceful migration of resources when the pacemaker_remote service is stopped on an active Pacemaker Remote node. If the pacemaker_remote service is stopped on an active Pacemaker Remote node, the cluster will gracefully migrate resources off the node before stopping the node. Previously, Pacemaker Remote nodes were fenced when the service was stopped (including by commands such as "yum update"), unless the node was first explicitly taken out of the cluster. Software upgrades and other routine maintenance procedures are now much easier to perform on Pacemaker Remote nodes. Note: All nodes in the cluster must be upgraded to a version supporting this feature before it can be used on any node.	Story Points:	---
Clone Of:	1288929
Clones:	1323259 (view as bug list)		Environment:
Last Closed:	2016-05-10 23:52:33 UTC	Type:	Bug
Regression:	---	Mount Type:	---
Documentation:	---	CRM:
Verified Versions:		Category:	---
oVirt Team:	---	RHEL 7.3 requirements from Atomic Host:
Cloudforms Team:	---	Target Upstream Version:
Embargoed:
Bug Depends On:	1288929
Bug Blocks:	1185030, 1323259, 1325009

Description Ken Gaillot 2016-01-11 21:39:17 UTC

+++ This bug was initially created as a clone of Bug #1288929 +++

Description of problem:

Graceful shutdown of pacemaker_remote doesn't work: stopping the service gets the node fenced.

Version-Release number of selected component (if applicable):


How reproducible:

100%

Steps to Reproduce:
1. Run: service pacemaker_remote stop 

Actual results:

Fencing!

Expected results:

1. pacemaker_remote contacts/notifies the peer cluster
2. Peer cluster stops all services on the remote node
3. Peer cluster tells pacemaker_remote it can shut down
4. Peer cluster recognises that the remote node was expected to shut down

Additional info:

Especially relevant for OSP (OpenStack Platform) upgrades
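
On clusters where not every node has the fix yet, taking the remote node out of the cluster by hand before maintenance might look roughly like the sketch below. It is not taken from this report: it assumes the remote node's connection resource is managed with pcs, that the script runs on a full cluster node, and that the maintenance command reaches the remote node on its own (for example over ssh). remote_maintenance and PCS are illustrative names, and the sketch does not wait for resources to finish moving after the disable.

```shell
# Sketch only. PCS may be overridden (for example PCS=echo) for a dry run.
PCS="${PCS:-pcs}"

# remote_maintenance RESOURCE COMMAND [ARGS...]
# RESOURCE is the remote node's connection resource (name supplied by caller).
# COMMAND is whatever maintenance should run while the node is out of the
# cluster; it is an assumption of this sketch, not part of pcs.
remote_maintenance() {
    resource="$1"
    shift

    # Stop the connection resource so the cluster no longer manages the node.
    $PCS resource disable "$resource" || return 1

    # Run the caller's maintenance command. On failure, return early and
    # leave the resource disabled so the node stays out of the cluster.
    "$@" || return 1

    # Let the cluster reconnect to the remote node.
    $PCS resource enable "$resource"
}
```

A call could look like "remote_maintenance remote1 ssh remote1 yum -y update", where remote1 is a hypothetical resource and host name.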

--- Additional comment from Raoul Scarazzini on 2015-12-24 05:28:02 EST ---

As a side note, the workaround we are using to avoid fencing is this sequence of commands:

1) Reboot the compute node from the console
2) Run nova stop <computenodeid> from the undercloud
3) Run nova start <computenodeid> from the undercloud
4) Run a loop like this on one of the controller nodes:
$ while true; do sudo pcs resource cleanup overcloud-novacompute-0; sleep 5; done
5) Once the machine is up, stop the loop from step 4
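
The cleanup loop in step 4 has to be stopped by hand. A bounded variant that stops on its own could be sketched as follows. cleanup_until, PCS, and the readiness check are illustrative names and assumptions, not something used in this report; the caller supplies the check command that decides when the machine counts as up.

```shell
# Sketch only. PCS may be overridden (for example PCS=echo) for a dry run.
PCS="${PCS:-sudo pcs}"

# cleanup_until RESOURCE MAX_TRIES INTERVAL CHECK [ARGS...]
# Repeats "pcs resource cleanup RESOURCE" until CHECK succeeds or MAX_TRIES
# attempts have been made. Returns 0 if CHECK succeeded, 1 otherwise.
cleanup_until() {
    resource="$1"
    max_tries="$2"
    interval="$3"
    shift 3

    tries=0
    while [ "$tries" -lt "$max_tries" ]; do
        $PCS resource cleanup "$resource"
        # CHECK is the caller's test for "the machine is up".
        if "$@"; then
            return 0
        fi
        tries=$((tries + 1))
        sleep "$interval"
    done
    return 1
}
```

For example, "cleanup_until overcloud-novacompute-0 60 5 ping -c 1 overcloud-novacompute-0" would retry every 5 seconds for up to 60 attempts, assuming the resource name also resolves as a host name and that a ping reply is an acceptable stand-in for the machine being up.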

--- Additional comment from Ken Gaillot on 2016-01-08 16:29:33 EST ---

Fixed upstream as of commit da17fd0

Comment 3 Klaus Wenninger 2016-01-29 15:12:37 UTC

QA on the RHEL-7.2 z-stream, with the same fixes as used here, found
some issues (see rhbz#1299348).
Fixed in Version: pacemaker-1.1.14-2.0.el6

Comment 6 Ken Gaillot 2016-03-18 17:38:26 UTC

Upstream commit cd10f0b is needed to support this feature on RHEL 6 due to the use of legacy attrd, and will be backported.

Comment 7 Ken Gaillot 2016-03-18 19:19:16 UTC

Build has been updated.

Comment 18 errata-xmlrpc 2016-05-10 23:52:33 UTC

Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://rhn.redhat.com/errata/RHBA-2016-0856.html