Bug 221629

Summary: groupd can create duplicate global ids
Product: Red Hat Enterprise Linux 5
Component: cman
Version: 5.0
Status: CLOSED CURRENTRELEASE
Severity: medium
Priority: medium
Reporter: David Teigland <teigland>
Assignee: David Teigland <teigland>
QA Contact: Cluster QE <mspqa-list>
CC: ccaulfie, cluster-maint
Hardware: All
OS: Linux
Fixed In Version: RC
Doc Type: Bug Fix
Last Closed: 2007-02-08 02:09:42 UTC

Description David Teigland 2007-01-05 18:47:08 UTC
Description of problem:

groupd creates a uint32 global id (gid) for each group.  It doesn't
use the gids itself, but provides them to each registered app to use
if it wants.  (The dlm and gfs each use the global id in messages
to distinguish between different lockspaces/fs's.)  groupd's
method of creating these gids (local counter | local nodeid)
can result in duplicate gids in the cluster given a somewhat
uncommon sequence of events.
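
To illustrate why an id built this way is not unique over time, here is a
minimal Python sketch. The 16-bit shift, the names make_gid and Daemon, and
the assumption that the counter lives only in daemon memory and starts over
after a restart are illustrative guesses, not taken from the groupd source.

```python
MASK32 = 0xFFFFFFFF

def make_gid(counter: int, nodeid: int) -> int:
    """Combine a per-node counter with the node id into a 32-bit value.

    Hypothetical layout: counter in the high 16 bits, nodeid in the low 16.
    """
    return ((counter << 16) | (nodeid & 0xFFFF)) & MASK32

class Daemon:
    """Stand-in for one running daemon instance (assumed in-memory counter)."""

    def __init__(self, nodeid: int) -> None:
        self.nodeid = nodeid
        self.counter = 0

    def new_gid(self) -> int:
        self.counter += 1
        return make_gid(self.counter, self.nodeid)

first_run = Daemon(nodeid=1)
gid_before = first_run.new_gid()

# Same node after a daemon restart: the counter state is gone.
restarted = Daemon(nodeid=1)
gid_after = restarted.new_gid()

print(hex(gid_before), hex(gid_after), gid_before == gid_after)
```

Under these assumptions the restarted daemon hands out the same value again,
even though another node may still be using the earlier one.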

This has been sitting on my todo list for a long time; I have finally
fixed it.

Version-Release number of selected component (if applicable):


How reproducible:

uncommon

Steps to Reproduce:
1. mount fsX on nodeA
2. mount fsX on nodeB
3. umount fsX on nodeA
4. stop cluster stuff on nodeA
5. start cluster stuff on nodeA
6. mount fsY on nodeA
7. mount fsY on nodeB
  
Actual results:

dlm messages get mixed up between the X and Y lockspaces, causing
dlm recovery to get stuck
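
The reproduction steps can be replayed in a small Python model to see where
the collision comes from. This is only a sketch: the Node class, the rule
that the first member of a group creates the gid and later members copy it,
the bit layout in make_gid, and the counter reset on restart are all
assumptions made for illustration, not the actual groupd logic.

```python
def make_gid(counter, nodeid):
    # Hypothetical layout: counter in the high bits, nodeid in the low bits.
    return ((counter << 16) | nodeid) & 0xFFFFFFFF

class Node:
    def __init__(self, nodeid):
        self.nodeid = nodeid
        self.counter = 0   # assumed to live only in daemon memory
        self.groups = {}   # group name -> gid, for currently mounted fs's

    def restart(self):
        """Steps 4-5: stop and start the cluster daemons on this node."""
        self.counter = 0
        self.groups.clear()

    def mount(self, name, cluster):
        """Join a group: reuse the gid if a member has one, else create it."""
        for other in cluster:
            if name in other.groups:
                self.groups[name] = other.groups[name]
                return
        self.counter += 1
        self.groups[name] = make_gid(self.counter, self.nodeid)

    def umount(self, name):
        del self.groups[name]

node_a, node_b = Node(1), Node(2)
cluster = [node_a, node_b]

node_a.mount("fsX", cluster)   # step 1
node_b.mount("fsX", cluster)   # step 2
node_a.umount("fsX")           # step 3
node_a.restart()               # steps 4-5
node_a.mount("fsY", cluster)   # step 6
node_b.mount("fsY", cluster)   # step 7

print(node_b.groups)
```

In this model nodeB ends up with fsX and fsY mapped to the same gid, so a
message tagged only with that gid cannot be attributed to one lockspace.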

Expected results:


Additional info:

Comment 1 David Teigland 2007-01-05 18:50:31 UTC
Fix checked into HEAD and RHEL5 branches.


Comment 2 Kiersten (Kerri) Anderson 2007-01-05 19:43:39 UTC
Requesting blocker status for this defect.  Closes a hole discovered during dlm
recovery unit testing.  Fix is available and impact is minimal.  We are
rebuilding these packages in the next week, so can pick up this fix as well.

Comment 4 David Teigland 2007-01-05 19:57:21 UTC
Checked into the RHEL50 branch.

Checking in app.c;
/cvs/cluster/cluster/group/daemon/app.c,v  <--  app.c
new revision: 1.52.4.2; previous revision: 1.52.4.1
done
Checking in cpg.c;
/cvs/cluster/cluster/group/daemon/cpg.c,v  <--  cpg.c
new revision: 1.36.4.2; previous revision: 1.36.4.1
done
Checking in gd_internal.h;
/cvs/cluster/cluster/group/daemon/gd_internal.h,v  <--  gd_internal.h
new revision: 1.44.4.2; previous revision: 1.44.4.1
done


Comment 5 RHEL Program Management 2007-02-08 02:09:42 UTC
A package has been built which should help the problem described in 
this bug report. This report is therefore being closed with a resolution 
of CURRENTRELEASE. You may reopen this bug report if the solution does 
not work for you.