Description of problem:
On my 5-node cluster, I can mount my GFS filesystem on 4 nodes, but on sam56 the mount blocks ("ps" shows it in state "D"). My nodes are named sam12, sam16, sam17, sam56 and sam57.

Version-Release number of selected component (if applicable):
GFS-6.0.2.27-0 on 5 nodes (2 nodes are x86_64, 3 nodes are i686)

How reproducible:
Just one time. When I reboot the whole cluster, it works again.

Steps to Reproduce:
1. Start GFS on 5 nodes.

Actual results:
With "gulm_tool nodelist localhost", I see:
- all nodes except sam56 think that
  1/ sam12 is the master
  2/ sam16, sam17 and sam57 are slaves
  3/ sam56 is not in the list
- sam56 thinks that sam16 is arbitrating.
sam56 is unable to mount the GFS filesystem.

Expected results:
sam56 is able to mount the GFS filesystem like the other nodes.

Additional info:
In cluster.ccs:
lock_gulm { servers=["sam12.toulouse","sam16.toulouse","sam17.toulouse","sam56.toulouse","sam57.toulouse"]
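For reference, a minimal sketch (not part of the original report) of how each node's view of the lock servers can be compared. It only uses "gulm_tool nodelist localhost", the command quoted above, run on every node over ssh:

  # query each node's own view of the gulm lock server membership
  for node in sam12 sam16 sam17 sam56 sam57; do
      echo "=== $node ==="
      ssh "$node" gulm_tool nodelist localhost
  done

On a healthy cluster all five nodes should agree on who the master and slaves are; in this report sam56's output disagrees with the other four.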
Created attachment 126886 [details] syslog-gulm-sam12
Created attachment 126887 [details] syslog-gulm-sam16
Created attachment 126888 [details] syslog-gulm-sam56
It appears the problem occurred because sam56 thought that sam16 was the master, while sam16 had actually become a slave to sam12. A workaround is to simply restart lock_gulmd on sam56. I'll work on a patch to fix this issue.
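A minimal sketch of that workaround, assuming lock_gulmd is managed by its standard init script (the "service" invocation is an assumption, not taken from this report):

  # restart the lock daemon on the node with the stale view of the master
  ssh sam56 'service lock_gulmd stop && service lock_gulmd start'
  # then confirm sam56 now reports the same master/slave list as the other nodes
  ssh sam56 gulm_tool nodelist localhost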
This should not be occurring any more with the latest GFS. Please re-open if this becomes an issue again.