Bug 187073

Summary: lock_gulmd GenerationID mismatch at startup, cannot mount the GFS filesystem
Product: [Retired] Red Hat Cluster Suite
Reporter: Alban Crequy <alban.crequy>
Component: gulm
Assignee: Chris Feist <cfeist>
Status: CLOSED CURRENTRELEASE
QA Contact: Cluster QE <mspqa-list>
Severity: medium
Docs Contact:
Priority: medium
Version: 3
CC: cluster-maint
Target Milestone: ---   
Target Release: ---   
Hardware: All   
OS: Linux   
Whiteboard:
Fixed In Version: GFS-6.0.2.35-1
Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2006-09-29 20:57:30 UTC
Type: ---
Attachments:
Description         Flags
syslog-gulm-sam12   none
syslog-gulm-sam16   none
syslog-gulm-sam56   none

Description Alban Crequy 2006-03-28 08:45:28 UTC
Description of problem:
On my 5-node cluster, I'm able to mount my GFS filesystem on 4 nodes, but on
sam56, mount blocks ("ps" shows it is in state "D").

My nodes are named sam12, sam16, sam17, sam56 and sam57.

Version-Release number of selected component (if applicable):
GFS-6.0.2.27-0 on 5 nodes (2 nodes are x86_64, 3 nodes are i686)

How reproducible:
Happened only once. After rebooting the whole cluster, it works again.

Steps to Reproduce:
1. Start GFS on all 5 nodes
  
Actual results:
With "gulm_tool nodelist localhost", I see:
 - that all nodes except sam56 think that
   1/ sam12 is the master
   2/ sam16, sam17, sam57 are slaves
   3/ sam56 is not in the list
 - that sam56 thinks that sam16 is arbitrating.
sam56 is unable to mount the GFS filesystem.
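
The disagreement above can be sketched as a small consistency check. This is only an illustration: it assumes the master each node reports has already been read by hand from the "gulm_tool nodelist localhost" output on that node, and the function name and dict layout are hypothetical, not part of gulm.

```python
from collections import Counter


def find_disagreeing_nodes(reported_master):
    """Return (majority_master, sorted list of nodes whose view differs).

    reported_master maps each node name to the node it believes is the
    master (hypothetical layout, filled in by hand from each node's
    nodelist output).
    """
    counts = Counter(reported_master.values())
    majority_master, _ = counts.most_common(1)[0]
    dissenters = sorted(
        node for node, master in reported_master.items()
        if master != majority_master
    )
    return majority_master, dissenters


# Views as described in this report: four nodes agree on sam12,
# while sam56 believes sam16 is in charge (arbitrating).
views = {
    "sam12": "sam12",
    "sam16": "sam12",
    "sam17": "sam12",
    "sam57": "sam12",
    "sam56": "sam16",
}

majority, dissenters = find_disagreeing_nodes(views)
print(majority, dissenters)  # prints: sam12 ['sam56']
```

A node that shows up as a dissenter here is the one whose lock_gulmd has a stale view of the cluster.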

Expected results:
sam56 is able to mount the GFS filesystem like the other nodes.

Additional info:
In cluster.ccs:
lock_gulm {
servers=["sam12.toulouse","sam16.toulouse","sam17.toulouse","sam56.toulouse","sam57.toulouse"]

Comment 1 Alban Crequy 2006-03-28 08:45:28 UTC
Created attachment 126886 [details]
syslog-gulm-sam12

Comment 2 Alban Crequy 2006-03-28 08:46:56 UTC
Created attachment 126887 [details]
syslog-gulm-sam16

Comment 3 Alban Crequy 2006-03-28 08:48:22 UTC
Created attachment 126888 [details]
syslog-gulm-sam56

Comment 4 Chris Feist 2006-03-28 16:49:08 UTC
It appears the problem occurred because sam56 thought that sam16 was the master,
but sam16 actually became a slave to sam12.  A workaround to this problem is to
just restart lock_gulmd on sam56.  I'll work on a patch to fix this issue.

Comment 5 Chris Feist 2006-09-29 20:57:30 UTC
This should no longer occur with the latest GFS.  Please re-open if this
becomes an issue again.