Bug 1646347

Summary: Rebooting controller nodes hangs for 20 minutes when running 'pcs cluster stop' [rhel-7.6.z]
Product: Red Hat Enterprise Linux 7 Reporter: Oneata Mircea Teodor <toneata>
Component: pacemaker Assignee: Ken Gaillot <kgaillot>
Status: CLOSED ERRATA QA Contact: Marian Krcmarik <mkrcmari>
Severity: urgent Docs Contact:
Priority: urgent    
Version: 7.6 CC: abeekhof, aherr, autobot-eus-copy, cfeist, cluster-maint, ctowsley, dbecker, dciabrin, jeckersb, jpokorny, kgaillot, lmanasko, mburns, mcornea, mjuricek, mkrcmari, morazi, nwahl, yprokule
Target Milestone: rc Keywords: Triaged, ZStream
Target Release: 7.6   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version: pacemaker-1.1.19-8.el7_6.1 Doc Type: Bug Fix
Doc Text:
Previously, cloned notify actions on the Pacemaker Remote node were routed through the wrong cluster node when the Pacemaker Remote node was moving. As a consequence, the cluster looped indefinitely when trying to move the connection. With this update, notify actions are now routed through a proper cluster node when the Pacemaker Remote connection is moving. As a result, the described problem no longer occurs.
Story Points: ---
Clone Of: 1644076 Environment:
Last Closed: 2018-11-27 01:21:58 UTC Type: ---
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Bug Depends On: 1644076    
Bug Blocks:    

Description Oneata Mircea Teodor 2018-11-05 12:35:23 UTC
This bug has been copied from bug #1644076 and has been proposed to be backported to 7.6 z-stream (EUS).

Comment 2 Ken Gaillot 2018-11-06 01:17:30 UTC
This has been fixed in the upstream 1.1 branch by commit 2ba3fffc

Comment 3 Ken Gaillot 2018-11-07 21:30:21 UTC
QA: To reproduce, configure a cluster with at least three cluster nodes, and a guest node and/or bundle. Run "pcs cluster stop" on one of the cluster nodes that is running the guest and/or bundle. (I expect the problem would also appear if you ban the guest node and/or bundle from one of the cluster nodes running it.)

Before the fix, the operation will not complete, and the node will eventually force-exit after a timeout. After the fix, shutdown proceeds normally.
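
A rough way to tell the two outcomes apart during verification is to run the stop command under a deadline and record whether it returned in time. The following is only a sketch, not part of the fix or of any QA tooling: the helper name run_with_deadline and the deadline value are hypothetical, and only the "pcs cluster stop" command itself comes from this report.

```python
import subprocess
import time


def run_with_deadline(cmd, deadline_s):
    """Run cmd and report whether it finished before deadline_s seconds.

    Returns (finished, elapsed_seconds). finished is False when the
    command was still running at the deadline.
    """
    start = time.monotonic()
    try:
        # check=False: a non-zero exit status still counts as "finished";
        # this helper only distinguishes a hang from a completed run.
        subprocess.run(cmd, timeout=deadline_s, check=False)
        finished = True
    except subprocess.TimeoutExpired:
        # subprocess.run() kills the child process before raising. That
        # only ends the command-line client; it is an assumption here
        # that a shutdown already requested from the cluster continues
        # on its own.
        finished = False
    return finished, time.monotonic() - start
```

On an affected node, calling run_with_deadline(["pcs", "cluster", "stop"], 300) would be expected to return finished=False, since the summary reports a hang of about 20 minutes; the 300-second deadline is an arbitrary choice and should be set well above a normal shutdown time for the cluster under test.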

Comment 6 errata-xmlrpc 2018-11-27 01:21:58 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2018:3667

Comment 7 Ken Gaillot 2018-11-29 15:22:44 UTC
*** Bug 1654601 has been marked as a duplicate of this bug. ***

Comment 8 Ken Gaillot 2021-01-06 21:17:25 UTC
*** Bug 1910166 has been marked as a duplicate of this bug. ***