Bug 150237 - Simultaneous stopping of clurgmgrd on multiple nodes wedges in update state.
Status: CLOSED CURRENTRELEASE
Product: Red Hat Cluster Suite
Classification: Red Hat
Component: rgmanager
Version: 4
Hardware: All
OS: Linux
Priority: medium
Severity: medium
Assigned To: Lon Hohberger
QA Contact: Cluster QE
 
Reported: 2005-03-03 16:25 EST by Derek Anderson
Modified: 2009-04-16 16:16 EDT
Doc Type: Bug Fix
Last Closed: 2005-05-06 16:42:38 EDT

Attachments
Patch fixes problem (909 bytes, patch)
2005-03-04 11:23 EST, Lon Hohberger

Description Derek Anderson 2005-03-03 16:25:20 EST
Description of problem:
I have a 4-node cluster with every node running the ccsd, cman, and
fenced services.  Start clurgmgrd on all nodes and attain rgmanager
quorum.  Then stop them all at the same time with 'killall clurgmgrd';
I use Konsole's Send to All Sessions feature to do this.  Every
clurgmgrd hangs while trying to exit.

link-08:
========
User:            "usrm::manager"                     3   4 update   
SU-10,280,011,9,11
[8 10 12]

 3374 ?        S<s    0:00 clurgmgrd
 3399 ?        D      0:00 [cman_userleave]

link-10:
========
User:            "usrm::manager"                     3   4 update   
SU-10,280,011,9,11
[10 12 8]

 3400 ?        S<s    0:00 clurgmgrd
 3425 ?        D      0:00 [cman_userleave]

link-11:
========
User:            "usrm::manager"                     3   4 run      
S-15,200,4
[11 8 10 12]

 3277 ?        S<s    0:00 clurgmgrd
 3302 ?        D      0:00 [cman_userleave]

link-12:
========
User:            "usrm::manager"                     3   4 update   
SU-10,280,011,9,11
[12 10 8]

 3280 ?        S<s    0:00 clurgmgrd
 3305 ?        D      0:00 [cman_userleave]

Version-Release number of selected component (if applicable):
rgmanager-1.9.20-0

How reproducible:
Every time on this cluster.

Comment 1 Lon Hohberger 2005-03-04 11:07:20 EST
This happens because rgmanager stops processing events while it's
shutting down.

So, node A says "I wanna leave this service group".
Node B says "sure".
Node C says "I wanna leave this service group", which is not
acknowledged by node A, because A has already stopped processing
events.

And so on for the remaining nodes.

What I need to do is continue to process events during a logout so
that we can handle this.  This might be a magma-plugin problem, not an
rgmanager problem.
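
For illustration only, here is a toy model of the race described above,
written in Python. It is not rgmanager, magma, or cman code. The message
names (LEAVE, ACK), the simulate() function, and the rule that a leaver
must be acknowledged by every other member are assumptions made for this
sketch. It only shows why leavers that stop handling events can wedge each
other, and why continuing to handle events during logout lets them all exit.

```python
from collections import deque


class Node:
    """One cluster member in the toy model (hypothetical, not rgmanager)."""

    def __init__(self, name):
        self.name = name
        self.inbox = deque()       # queued (kind, sender) messages
        self.leaving = False       # True once the node asked to leave
        self.pending_acks = set()  # peers that have not acknowledged yet
        self.exited = False


def simulate(names, leavers, process_events_while_leaving, max_rounds=10):
    """Return the sorted names of leavers that are still stuck at the end."""
    nodes = {name: Node(name) for name in names}

    # All leavers announce their departure at the same moment.
    for name in leavers:
        node = nodes[name]
        node.leaving = True
        node.pending_acks = {peer for peer in names if peer != name}
        for peer in node.pending_acks:
            nodes[peer].inbox.append(("LEAVE", name))

    for _ in range(max_rounds):
        for node in nodes.values():
            if node.exited:
                continue
            deferred = deque()
            while node.inbox:
                kind, sender = node.inbox.popleft()
                if kind == "ACK":
                    node.pending_acks.discard(sender)
                elif node.leaving and not process_events_while_leaving:
                    # Modelled bug: a node that is shutting down no longer
                    # answers other nodes' leave requests.
                    deferred.append((kind, sender))
                else:
                    nodes[sender].inbox.append(("ACK", node.name))
            node.inbox = deferred
            if node.leaving and not node.pending_acks:
                node.exited = True

    return sorted(n.name for n in nodes.values() if n.leaving and not n.exited)


if __name__ == "__main__":
    members = ["link-08", "link-10", "link-11", "link-12"]
    print("stuck without event processing:", simulate(members, members, False))
    print("stuck with event processing:   ", simulate(members, members, True))
```

In this model a single leaver always gets out, because the remaining
members still answer it. The hang only appears once two or more nodes
leave at the same time and each waits on a peer that has gone quiet.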
Comment 2 Lon Hohberger 2005-03-04 11:23:26 EST
Created attachment 111664
Patch fixes problem
Comment 3 Lon Hohberger 2005-03-04 11:25:49 EST
The fix is in CVS too.
