Bug 150237

Summary: Simultaneous stopping of clurgmgrd on multiple nodes wedges in update state.
Product: [Retired] Red Hat Cluster Suite
Reporter: Derek Anderson <danderso>
Component: rgmanager
Assignee: Lon Hohberger <lhh>
Status: CLOSED CURRENTRELEASE
QA Contact: Cluster QE <mspqa-list>
Severity: medium
Priority: medium
Version: 4
CC: cluster-maint
Target Milestone: ---   
Target Release: ---   
Hardware: All   
OS: Linux   
Doc Type: Bug Fix
Last Closed: 2005-05-06 20:42:38 UTC
Attachments:
Patch fixes problem (flags: none)

Description Derek Anderson 2005-03-03 21:25:20 UTC
Description of problem:
I have a 4-node cluster, with every node running the ccsd, cman, and
fenced services.  Start clurgmgrd on all nodes and wait for rgmanager
quorum.  Then stop them all at the same time with 'killall clurgmgrd'
(I use the Send to All Sessions feature of Konsole to do this).  They
all hang while trying to exit.

link-08:
========
User:            "usrm::manager"                     3   4 update    SU-10,280,011,9,11
[8 10 12]

 3374 ?        S<s    0:00 clurgmgrd
 3399 ?        D      0:00 [cman_userleave]

link-10:
========
User:            "usrm::manager"                     3   4 update    SU-10,280,011,9,11
[10 12 8]

 3400 ?        S<s    0:00 clurgmgrd
 3425 ?        D      0:00 [cman_userleave]

link-11:
========
User:            "usrm::manager"                     3   4 run       S-15,200,4
[11 8 10 12]

 3277 ?        S<s    0:00 clurgmgrd
 3302 ?        D      0:00 [cman_userleave]

link-12:
========
User:            "usrm::manager"                     3   4 update    SU-10,280,011,9,11
[12 10 8]

 3280 ?        S<s    0:00 clurgmgrd
 3305 ?        D      0:00 [cman_userleave]

Version-Release number of selected component (if applicable):
rgmanager-1.9.20-0

How reproducible:
Every time on this cluster.
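
The simultaneous stop from the description could also be scripted
instead of typed into Konsole. The following is only a sketch: the
hostnames come from the output above, while passwordless root ssh to
each node and the helper names build_commands and stop_all are
assumptions made for illustration.

```python
import subprocess

# Hostnames taken from the report; ssh access to them is an assumption.
NODES = ["link-08", "link-10", "link-11", "link-12"]


def build_commands(nodes, daemon="clurgmgrd"):
    """Return one ssh argv per node that runs 'killall <daemon>' there."""
    return [["ssh", node, "killall", daemon] for node in nodes]


def stop_all(nodes):
    """Launch every killall before waiting on any, so the stops overlap."""
    procs = [subprocess.Popen(cmd) for cmd in build_commands(nodes)]
    return [p.wait() for p in procs]
```

Calling stop_all(NODES) sends the kill to all four nodes at nearly the
same moment; whether the daemons then wedge still has to be checked on
each node with ps, as in the output above.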

Comment 1 Lon Hohberger 2005-03-04 16:07:20 UTC
This happens because rgmanager stops processing events while it's
shutting down.

So, node A says "I wanna leave this service group".
Node B says "sure".
Node C says "I wanna leave this service group", which is never
acknowledged by node A, because A has already stopped processing
events.

etc.

What I need to do is continue to process events during a logout so
that we can handle this.  This might be a magma-plugin problem, not an
rgmanager problem.
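
The exchange above can be sketched as a toy model. This is not
rgmanager or magma code: the simulate function, the inbox queues, and
the rule that a leave completes once every other member has
acknowledged it are all simplifying assumptions for illustration.

```python
from collections import deque


def simulate(nodes, process_events_while_leaving):
    """Return the set of nodes whose leave request was fully acknowledged."""
    inbox = {n: deque() for n in nodes}
    acks = {n: set() for n in nodes}
    leaving = set()

    # Every node asks to leave at (roughly) the same time.
    for n in nodes:
        leaving.add(n)
        for peer in nodes:
            if peer != n:
                inbox[peer].append(n)  # "n wants to leave"

    # Keep delivering events until no node can make progress.
    progress = True
    while progress:
        progress = False
        for n in nodes:
            if n in leaving and not process_events_while_leaving:
                continue  # node ignores its inbox while shutting down
            while inbox[n]:
                requester = inbox[n].popleft()
                acks[requester].add(n)  # "sure"
                progress = True

    return {n for n in nodes if acks[n] == set(nodes) - {n}}
```

In this model, simulate([8, 10, 11, 12], False) returns an empty set
(nobody's leave is acknowledged, so every node stays wedged), while
passing True lets all four nodes finish leaving.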

Comment 2 Lon Hohberger 2005-03-04 16:23:26 UTC
Created attachment 111664
Patch fixes problem

Comment 3 Lon Hohberger 2005-03-04 16:25:49 UTC
The fix is in CVS too.