Bug 187073 - lock_gulmd GenerationID mismatch at startup, cannot mount the GFS filesystem
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: Red Hat Cluster Suite
Classification: Retired
Component: gulm
Version: 3
Hardware: All
OS: Linux
Priority: medium
Severity: medium
Target Milestone: ---
Assignee: Chris Feist
QA Contact: Cluster QE
URL:
Whiteboard:
Depends On:
Blocks:
 
Reported: 2006-03-28 08:45 UTC by Alban Crequy
Modified: 2009-04-16 20:02 UTC (History)

Fixed In Version: GFS-6.0.2.35-1
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2006-09-29 20:57:30 UTC
Embargoed:


Attachments
syslog-gulm-sam12 (2.74 KB, application/octet-stream), 2006-03-28 08:45 UTC, Alban Crequy
syslog-gulm-sam16 (2.79 KB, application/octet-stream), 2006-03-28 08:46 UTC, Alban Crequy
syslog-gulm-sam56 (4.24 KB, application/octet-stream), 2006-03-28 08:48 UTC, Alban Crequy

Description Alban Crequy 2006-03-28 08:45:28 UTC
Description of problem:
On my 5-node cluster, I am able to mount my GFS filesystem on 4 nodes, but on
sam56 the mount blocks ("ps" shows it in state "D").

My nodes are named sam12, sam16, sam17, sam56 and sam57.

Version-Release number of selected component (if applicable):
GFS-6.0.2.27-0 on 5 nodes (2 nodes are x86_64, 3 nodes are i686)

How reproducible:
It happened only once. After rebooting the whole cluster, it works again.

Steps to Reproduce:
1. start GFS on 5 nodes
  
Actual results:
With "gulm_tool nodelist localhost", I see:
 - all nodes except sam56 think that
   1/ sam12 is the master
   2/ sam16, sam17, sam57 are slaves
   3/ sam56 is not in the list
 - sam56 thinks that sam16 is arbitrating.
sam56 is unable to mount the GFS filesystem.

Expected results:
sam56 is able to mount the GFS filesystem like the other nodes.

Additional info:
In cluster.ccs:
lock_gulm {
servers=["sam12.toulouse","sam16.toulouse","sam17.toulouse","sam56.toulouse","sam57.toulouse"]

Comment 1 Alban Crequy 2006-03-28 08:45:28 UTC
Created attachment 126886 [details]
syslog-gulm-sam12

Comment 2 Alban Crequy 2006-03-28 08:46:56 UTC
Created attachment 126887 [details]
syslog-gulm-sam16

Comment 3 Alban Crequy 2006-03-28 08:48:22 UTC
Created attachment 126888 [details]
syslog-gulm-sam56

Comment 4 Chris Feist 2006-03-28 16:49:08 UTC
It appears the problem occurred because sam56 thought that sam16 was the master,
but sam16 had actually become a slave to sam12.  A workaround for this problem is
to restart lock_gulmd on sam56.  I'll work on a patch to fix this issue.
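
To illustrate the mismatch described above, here is a minimal sketch that flags
nodes whose idea of the master differs from the majority view. It does not call
gulm_tool or parse its output. The believed_master table is a hypothetical,
hand-transcribed summary of what the reporter saw on each node, and
find_stale_nodes is an illustrative helper, not part of GULM.

```python
from collections import Counter

# Hypothetical table: node -> the node it believes is master (or arbitrating).
# Values are transcribed by hand from the "Actual results" in this report.
believed_master = {
    "sam12": "sam12",
    "sam16": "sam12",
    "sam17": "sam12",
    "sam57": "sam12",
    "sam56": "sam16",  # sam56 reports sam16 as arbitrating
}


def find_stale_nodes(views):
    """Return (majority_master, sorted list of nodes that disagree with it)."""
    counts = Counter(views.values())
    # The master named by the largest number of nodes is taken as the real one.
    majority_master, _ = counts.most_common(1)[0]
    stale = sorted(node for node, master in views.items()
                   if master != majority_master)
    return majority_master, stale


master, stale = find_stale_nodes(believed_master)
print("majority master:", master)
print("nodes to restart lock_gulmd on:", stale)
```

With the table above, the only node flagged is sam56, which matches the
workaround of restarting lock_gulmd on that node.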

Comment 5 Chris Feist 2006-09-29 20:57:30 UTC
This should no longer occur with the latest GFS.  Please re-open if this
becomes an issue.

