Bug 163168 - clustat hangs when member is fenced off.
Status: CLOSED WORKSFORME
Product: Red Hat Cluster Suite
Classification: Red Hat
Component: dlm
Version: 4
Hardware: All Linux
Priority: medium Severity: high
Assigned To: Christine Caulfield
QA Contact: Cluster QE
Depends On: 171153
Blocks:
Reported: 2005-07-13 13:43 EDT by jim wilcox
Modified: 2009-04-16 15:59 EDT

Doc Type: Bug Fix
Last Closed: 2006-05-05 03:27:57 EDT

Attachments
strace of a clustat hang (19.24 KB, text/plain)
2005-09-09 14:07 EDT, Henry Harris
strace of a clustat hang (19.24 KB, text/plain)
2005-09-09 14:07 EDT, Henry Harris
strace of clusvcadm hang (18.09 KB, text/plain)
2005-09-09 14:09 EDT, Henry Harris
strace of clusvcadm hang (18.09 KB, text/plain)
2005-09-09 14:09 EDT, Henry Harris
This is a DLM hang, not an rgmanager/clustat problem per se. (74.02 KB, text/plain)
2005-12-16 13:46 EST, Lon Hohberger

Description jim wilcox 2005-07-13 13:43:32 EDT
Description of problem:

- With both 2-node and 3-node clusters (and we suspect this is true for any 
number of nodes in a cluster), when a member is manually fenced off, clustat 
hangs on the nodes that still have quorum.
- All GFS-related operations also hang, which we would expect, but clustat 
should still function and report accurate status.

Version-Release number of selected component (if applicable):

kernel - 2.6.9-11.EL_smp
gfs - 6.1
cluster suite - 4

How reproducible:

Every time

Steps to Reproduce:
1. Configure a 2- or 3-node cluster (we believe the result is the same for any 
number of nodes) for manual fencing.
2. Pull the heartbeat, or otherwise stop heartbeat communication to a 
member.
3. Verify that the member was fenced off.
4. Go to another node that should still have quorum and try executing clustat.

Actual results:

clustat hangs

Expected results:

- clustat should not hang, and should show that the fenced-off node is no 
longer in the cluster.

Additional info:

Please let me know if any additional info is required to reproduce. This is a 
high-priority item for us. 

Thanks in advance.

Jim
Comment 1 Lon Hohberger 2005-07-13 14:39:35 EDT
clustat is really a piece of rgmanager, which will stop during transitions if
GFS is in use.

clustat should probably time out after a few seconds of trying to reach clurgmgrd.
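
A minimal sketch of the kind of bounded wait suggested above, written in Python
purely for illustration. It is not clustat's actual code: the function name
read_status, the 5-second default, and the use of a plain socket are all
assumptions. The idea is only that a status client gives up after a few seconds
instead of blocking forever when the daemon does not answer.

```python
import socket


def read_status(sock, timeout_s=5.0, bufsize=4096):
    """Wait up to timeout_s seconds for a reply on sock.

    Returns the bytes received, or None if nothing arrived in time.
    (Hypothetical helper; the real clustat/clurgmgrd protocol is not shown.)
    """
    sock.settimeout(timeout_s)
    try:
        return sock.recv(bufsize)
    except socket.timeout:
        # The daemon did not respond; report that instead of hanging.
        return None
```

A caller could treat a None result as "resource group manager not responding"
and still print the membership information it was able to gather.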



Comment 2 Henry Harris 2005-09-09 14:07:26 EDT
Created attachment 118651 [details]
strace of a clustat hang

This clustat hang occurred while running the test described in bug #166701.
Comment 3 Henry Harris 2005-09-09 14:07:39 EDT
Created attachment 118652 [details]
strace of a clustat hang

This clustat hang occurred while running the test described in bug #166701.
Comment 4 Henry Harris 2005-09-09 14:09:09 EDT
Created attachment 118653 [details]
strace of clusvcadm hang

This clusvcadm hang occurred while running the test described in bug #166701.
Comment 5 Henry Harris 2005-09-09 14:09:17 EDT
Created attachment 118654 [details]
strace of clusvcadm hang

This clusvcadm hang occurred while running the test described in bug #166701.
Comment 8 Lon Hohberger 2005-12-16 13:46:17 EST
Created attachment 122339 [details]
This is a DLM hang, not an rgmanager/clustat problem per se. 

Rgmanager goes into D (disk-wait/task-uninterruptible state) waiting on the
DLM; here's a Sysrq-T when this happens.
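
For anyone trying to confirm the same symptom, a rough Python sketch for
spotting tasks in D state follows. It assumes the input is the text produced by
`ps -eo pid,stat,comm`; the sample listing is made up for illustration and is
not taken from the attached Sysrq-T output.

```python
def find_uninterruptible(ps_output):
    """Return (pid, command) pairs whose STAT column starts with 'D'."""
    blocked = []
    for line in ps_output.splitlines()[1:]:  # skip the header row
        fields = line.split(None, 2)
        if len(fields) < 3:
            continue  # ignore blank or malformed lines
        pid, stat, comm = fields
        if stat.startswith("D"):
            blocked.append((int(pid), comm.strip()))
    return blocked


# Hypothetical sample output, for illustration only.
SAMPLE = """\
  PID STAT COMMAND
    1 Ss   init
 4321 D    clurgmgrd
 4400 S+   clustat
"""
```

On a live node the same function could be fed the real output of the ps
command above; a clurgmgrd entry in the result would match the state described
here.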
Comment 10 Christine Caulfield 2005-12-22 05:21:11 EST
With luck this will turn out to be the same bug as #175805.

Has anybody tested with this fix in place?
