Red Hat Bugzilla – Bug 504195
cpg confchg's delivered in different order
Last modified: 2016-04-26 10:26:06 EDT
Description of problem:
Nate found this bug running revolver on 5 nodes, when two or three nodes were killed at the same time.
The groupd logs show that cpg delivers the confchg's for the killed nodes in different orders.
The first time, nodes 1,3,4 were killed, leaving 2,5.
nodeid 2 got confchg order: 4,3,1
nodeid 5 got confchg order: 1,3,4
The second time, nodes 2,5 were killed, leaving 1,3,4.
nodeid 1 got confchg order: 2,5
nodeid 3 got confchg order: 5,2
nodeid 4 got confchg order: 5,2
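The mismatch can be checked mechanically. The following is only a minimal sketch, assuming the per-node orders have already been extracted by hand from the groupd logs. It does not use the cpg API, and the names orders_agree and groups_by_order are hypothetical helpers made up for illustration. The data is copied from the two failures described above.

```python
# Confchg orders as reported above.
# Key: surviving nodeid. Value: order in which that node saw the killed nodes removed.
first_failure = {2: [4, 3, 1], 5: [1, 3, 4]}
second_failure = {1: [2, 5], 3: [5, 2], 4: [5, 2]}


def orders_agree(orders_by_node):
    """Return True if every surviving node observed the same delivery order."""
    distinct = {tuple(order) for order in orders_by_node.values()}
    return len(distinct) <= 1


def groups_by_order(orders_by_node):
    """Group surviving nodeids by the order they observed (hypothetical helper)."""
    groups = {}
    for nodeid, order in sorted(orders_by_node.items()):
        groups.setdefault(tuple(order), []).append(nodeid)
    return groups


if __name__ == "__main__":
    for name, data in (("first", first_failure), ("second", second_failure)):
        print(name, "failure: orders agree =", orders_agree(data))
        for order, nodes in groups_by_order(data).items():
            print("  order", list(order), "seen by nodeids", nodes)
```

In both failures every surviving node sees the same set of removed nodes, but not the same sequence, which is the inconsistency this bug is about.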
Version-Release number of selected component (if applicable):
Steps to Reproduce:
The result of this bug is that after a cluster failure, all cluster services are stuck because recovery cannot complete. The entire cluster must be rebooted to recover from this scenario.
Changed to version 5.4, all archs, priority urgent, severity urgent.
A one-line patch is in testing now. Assuming it fixes the problem, this is a serious regression in the 5.4 version.
An advisory has been issued which should help the problem
described in this bug report. This report is therefore being
closed with a resolution of ERRATA. For more information
on the solution and/or where to find the updated files,
please follow the link below. You may reopen this bug report
if the solution does not work for you.