Bug 150237 - Simultaneous stopping of clurgmgrd on multiple nodes wedges in update state.
Status: CLOSED CURRENTRELEASE
Alias: None
Product: Red Hat Cluster Suite
Classification: Retired
Component: rgmanager
Version: 4
Hardware: All Linux
Priority: medium
Severity: medium
Target Milestone: ---
Assignee: Lon Hohberger
QA Contact: Cluster QE
 
Reported: 2005-03-03 21:25 UTC by Derek Anderson
Modified: 2009-04-16 20:16 UTC (History)

Doc Type: Bug Fix
Last Closed: 2005-05-06 20:42:38 UTC

Attachments
Patch fixes problem (909 bytes, patch)
2005-03-04 16:23 UTC, Lon Hohberger

Description Derek Anderson 2005-03-03 21:25:20 UTC
Description of problem:
I have a 4-node cluster, with every node running the ccsd, cman, and
fenced services. Start clurgmgrd on all nodes and attain rgmanager
quorum. Then stop them all at the same time with 'killall clurgmgrd'
(I use the Send to All Sessions feature of Konsole to do this). Every
clurgmgrd hangs while trying to exit.

link-08:
========
User:            "usrm::manager"                     3   4 update    SU-10,280,011,9,11
[8 10 12]

 3374 ?        S<s    0:00 clurgmgrd
 3399 ?        D      0:00 [cman_userleave]

link-10:
========
User:            "usrm::manager"                     3   4 update    SU-10,280,011,9,11
[10 12 8]

 3400 ?        S<s    0:00 clurgmgrd
 3425 ?        D      0:00 [cman_userleave]

link-11:
========
User:            "usrm::manager"                     3   4 run       S-15,200,4
[11 8 10 12]

 3277 ?        S<s    0:00 clurgmgrd
 3302 ?        D      0:00 [cman_userleave]

link-12:
========
User:            "usrm::manager"                     3   4 update    SU-10,280,011,9,11
[12 10 8]

 3280 ?        S<s    0:00 clurgmgrd
 3305 ?        D      0:00 [cman_userleave]

Version-Release number of selected component (if applicable):
rgmanager-1.9.20-0

How reproducible:
Every time on this cluster.

Comment 1 Lon Hohberger 2005-03-04 16:07:20 UTC
This happens because rgmanager stops processing events while it's
shutting down.

So, node A says "I wanna leave this service group".
Node B says "sure".
Node C says "I wanna leave this service group", but node A is already
shutting down and no longer processing events, so it never acknowledges
the request.

etc.

What I need to do is continue to process events during a logout so
that we can handle this.  This might be a magma-plugin problem, not an
rgmanager problem.
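
For anyone following along, here is a toy model of the hang described above. It is only a sketch and not the rgmanager or magma code: it assumes a leave request completes once every other current member has acknowledged it, and that a node which has started leaving ignores events unless told otherwise. The function name stuck_nodes and the flag process_events_while_leaving are made up for illustration; the node IDs are the ones from the report.

```python
def stuck_nodes(nodes, leaving, process_events_while_leaving):
    """Toy model: return the leaving nodes that never finish leaving.

    Assumption (not taken from rgmanager): a leave completes only when
    every other current group member has acknowledged the request.
    """
    members = set(nodes)
    leaving = set(leaving)
    # acks[n] holds the peers that have acknowledged n's leave request.
    acks = {n: set() for n in leaving}

    progress = True
    while progress:
        progress = False
        for requester in sorted(leaving & members):
            peers = members - {requester}
            for peer in sorted(peers):
                # A peer that is itself shutting down drops the event
                # unless it keeps processing events during logout.
                if peer in leaving and not process_events_while_leaving:
                    continue
                if peer not in acks[requester]:
                    acks[requester].add(peer)
                    progress = True
            if acks[requester] >= peers:
                # All remaining members acknowledged: the leave completes.
                members.discard(requester)
                progress = True
    return leaving & members


cluster = [8, 10, 11, 12]
# Everyone leaves at once and ignores events while leaving: all four wedge.
print(stuck_nodes(cluster, cluster, process_events_while_leaving=False))
# Same scenario, but events are still handled during logout: nobody wedges.
print(stuck_nodes(cluster, cluster, process_events_while_leaving=True))
# A single node leaving is fine either way.
print(stuck_nodes(cluster, [8], process_events_while_leaving=False))
```

In this model the wedge only shows up when two or more nodes leave together, which matches the "killall on every node at once" reproduction. The real service-manager state machine is more involved, so treat this as an illustration of the ordering problem only.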

Comment 2 Lon Hohberger 2005-03-04 16:23:26 UTC
Created attachment 111664
Patch fixes problem

Comment 3 Lon Hohberger 2005-03-04 16:25:49 UTC
The fix is in CVS too.


