|Summary:||clustat hangs when member is fenced off.|
|Product:||[Retired] Red Hat Cluster Suite||Reporter:||jim wilcox <jim.wilcox>|
|Component:||dlm||Assignee:||Christine Caulfield <ccaulfie>|
|Status:||CLOSED WORKSFORME||QA Contact:||Cluster QE <mspqa-list>|
|Version:||4||CC:||cluster-maint, henry.harris, kanderso|
|Fixed In Version:||Doc Type:||Bug Fix|
|Doc Text:||Story Points:||---|
|Last Closed:||2006-05-05 07:27:57 UTC||Type:||---|
|Bug Depends On:||171153|
Description jim wilcox 2005-07-13 17:43:32 UTC
Description of problem:
- With both 2- and 3-node clusters (and we suspect this is true for any number of nodes in a cluster), clustat hangs on the nodes still in quorum when a member is manually fenced off.
- All GFS-related operations also hang, which we would expect, but clustat should still function and give accurate status.

Version-Release number of selected component (if applicable):
kernel - 2.6.9-11.EL_smp
gfs - 6.1
cluster suite - 4

How reproducible:
Every time

Steps to Reproduce:
1. Configure a 2- or 3-node cluster (we believe it will be the same with any number of nodes) for manual fencing.
2. Pull the heartbeat, or do something else to stop heartbeat communication to a member.
3. Verify that the member was fenced off.
4. Go to another node that should still have quorum and try executing clustat.

Actual results:
clustat hangs

Expected results:
- clustat should not hang, and should show the fenced-off node as no longer in the cluster.

Additional info:
Please let me know if any additional info is required to reproduce. This is a high-priority item for us. Thanks in advance.
Jim
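For anyone scripting step 4 of the reproduction, the following is a minimal sketch (not part of the original report) of how a hang could be detected without blocking the test script. The helper name `run_with_timeout` is hypothetical, and it assumes clustat is on the PATH and is run with no arguments.

```python
import subprocess


def run_with_timeout(cmd, timeout_s):
    """Run cmd; return its output as text, or None if it does not finish in time."""
    try:
        result = subprocess.run(
            cmd,
            stdout=subprocess.PIPE,
            stderr=subprocess.STDOUT,
            timeout=timeout_s,
        )
    except subprocess.TimeoutExpired:
        # subprocess.run kills the child before re-raising, so a command
        # that is merely blocked does not linger after the check.
        return None
    return result.stdout.decode(errors="replace")
```

On a surviving node, `run_with_timeout(["clustat"], 10)` returning `None` would indicate that clustat did not answer within 10 seconds. The 10-second limit is an arbitrary choice for illustration.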
Comment 1 Lon Hohberger 2005-07-13 18:39:35 UTC
clustat is really a piece of rgmanager, which will stop during transitions if GFS is in use. clustat should probably time out after a few seconds of trying to reach clurgmgrd.
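The timeout idea above can be sketched generically. This is an illustration only, not clustat's actual code (clustat is not written in Python, and how it talks to clurgmgrd is not described in this report). The function name `query_with_timeout` and the request bytes are hypothetical.

```python
import socket


def query_with_timeout(sock, request, timeout_s, bufsize=4096):
    """Send request on a connected socket; return the reply, or None on timeout."""
    sock.settimeout(timeout_s)  # applies to the send and recv calls below
    try:
        sock.sendall(request)
        return sock.recv(bufsize)
    except socket.timeout:
        # The peer did not answer in time; let the caller report
        # "status unavailable" instead of blocking forever.
        return None
```

With a bound like this, a status tool could print a "resource manager not responding" message after a few seconds rather than hanging while the daemon is stalled.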
Comment 2 Henry Harris 2005-09-09 18:07:26 UTC
Created attachment 118651: strace of a clustat hang
This clustat hang occurred while running the test described in bug #166701.
Comment 3 Henry Harris 2005-09-09 18:07:39 UTC
Created attachment 118652: strace of a clustat hang
This clustat hang occurred while running the test described in bug #166701.
Comment 4 Henry Harris 2005-09-09 18:09:09 UTC
Created attachment 118653: strace of clusvcadm hang
This clusvcadm hang occurred while running the test described in bug #166701.
Comment 5 Henry Harris 2005-09-09 18:09:17 UTC
Created attachment 118654: strace of clusvcadm hang
This clusvcadm hang occurred while running the test described in bug #166701.
Comment 8 Lon Hohberger 2005-12-16 18:46:17 UTC
Created attachment 122339
This is a DLM hang, not an rgmanager/clustat problem per se. Rgmanager goes into D state (uninterruptible sleep, shown as disk wait) waiting on the DLM; here's a SysRq-T from when this happens.
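As a lighter-weight check than a full SysRq-T dump, processes stuck in D state can be listed by reading the state field of /proc/<pid>/stat. The following is a minimal sketch with hypothetical helper names, assuming a Linux /proc filesystem; it shows which processes are in D state but not what they are waiting on, so it complements the task dump rather than replacing it.

```python
import os


def parse_state(stat_text):
    """Return the one-letter state field from the text of /proc/<pid>/stat."""
    # Format is "pid (comm) state ..."; comm may itself contain spaces
    # or ')', so split after the last closing parenthesis.
    return stat_text[stat_text.rindex(")") + 1:].split()[0]


def find_uninterruptible(proc_root="/proc"):
    """Return (pid, comm) for each process currently in state D."""
    found = []
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue  # not a per-process directory
        try:
            with open(os.path.join(proc_root, entry, "stat"),
                      errors="replace") as f:
                text = f.read()
        except OSError:
            continue  # process exited while we were scanning
        if parse_state(text) == "D":
            comm = text[text.index("(") + 1:text.rindex(")")]
            found.append((int(entry), comm))
    return found
```

If the diagnosis above holds, running `find_uninterruptible()` on a surviving node during the hang would be expected to list the rgmanager daemon.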
Comment 10 Christine Caulfield 2005-12-22 10:21:11 UTC
With luck this will turn out to be the same bug as #175805. Has anybody tested with this fix in place?