Bug 130691

Summary: filesystem gets hung if a node holding the listlock expires
Product: [Retired] Red Hat Cluster Suite
Reporter: Adam "mantis" Manthei <amanthei>
Component: gfs
Assignee: Brian Stevens <bstevens>
Status: CLOSED CURRENTRELEASE
Severity: high
Priority: medium
Version: 3
Target Milestone: ---
Target Release: ---
Hardware: All
OS: Linux
Doc Type: Bug Fix
Last Closed: 2010-10-29 21:49:09 UTC
Bug Blocks: 137219
Attachments:
  insert a couple breakpoints into lock_gulm.o for testing (flags: none)
  make jid mappings ignore expired state of locks (flags: none)

Description Adam "mantis" Manthei 2004-08-23 18:59:52 UTC
Description of problem:
If a node holding the listlock for a filesystem expires, recovery and
mounting of that filesystem will hang.  This is because in order to
clean up the lock space for GFS, you must first access the lock space
for GFS (which is where the jid mappings are stored).  Since these
jid-specific locks are being acquired without the IgnoreExpire flag set,
they will block until the holder of the listlock is replayed, which
cannot happen unless the listlock can first be acquired, so the two
wait on each other.
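
For illustration, the circular wait can be modeled with a tiny standalone
program.  This is a hedged sketch only: the function names and the boolean
"blocking" model are made up and are not the actual lock_gulm code paths.

/* Toy model of the deadlock described above; illustrative only. */
#include <stdio.h>
#include <stdbool.h>

static bool listlock_holder_expired = true;   /* the holder node has died    */
static bool expired_holder_replayed = false;  /* its journal is not replayed */

/* jid-mapping locks are requested without IgnoreExpire, so the request
 * waits until the expired listlock holder has been replayed */
static bool acquire_jid_mapping_lock(void)
{
    return !(listlock_holder_expired && !expired_holder_replayed);
}

/* replaying the expired holder first needs its jid mapping, i.e. the very
 * locks that are waiting on that replay */
static bool replay_expired_holder(void)
{
    if (!acquire_jid_mapping_lock())
        return false;             /* blocks forever in the real code */
    expired_holder_replayed = true;
    return true;
}

int main(void)
{
    if (!replay_expired_holder())
        printf("deadlock: recovery needs the lock space it must clean up\n");
    return 0;
}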

Version-Release number of selected component (if applicable):
GFS-6.0.0-1.2

How reproducible:
Very rare.

Steps to Reproduce:
1. Mount a GFS filesystem on Node1
2. Cause Node2 to expire
3. Crash Node1 while it holds the listlock
4. The GFS filesystem is now hung.  No other node can mount or recover that
filesystem until the entire lockspace is cleaned (all lock servers are
shut down and then restarted) or the filesystem is given a new name.


Additional info:
I ran into this problem twice while running some recovery tests.  Both
cases involved the nodes being bounced rather frequently (every couple
of minutes).

Comment 1 Adam "mantis" Manthei 2004-08-23 19:18:28 UTC
Created attachment 102993 [details]
insert a couple breakpoints into lock_gulm.o for testing

The attached patch inserts a couple of breakpoints for reproducing the bug
(a sketch of what such a breakpoint amounts to follows the steps).  To
reproduce the bug with this patch:

1. Mount GFS on Node1
2. Load lock_gulm.o with the breakpoint number on Node2
   `insmod gulm_breakpoint=1 lock_gulm.o`
3. Crash Node1
4. Node2 will now panic when trying to recover Node1.  Once this happens, no
   new nodes can mount (any other node that may have been mounted at the time
   will not be able to replay the journal for Node1 or Node2 either).
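
For context, a breakpoint of this kind can amount to little more than a module
parameter that is checked in the recovery path.  The sketch below is
hypothetical (only the parameter name gulm_breakpoint comes from the insmod
line above; everything else is an assumption) and is not the attached patch.

/* Hypothetical 2.4-style sketch of a recovery breakpoint. */
#include <linux/module.h>
#include <linux/kernel.h>

static int gulm_breakpoint = 0;
MODULE_PARM(gulm_breakpoint, "i");
MODULE_PARM_DESC(gulm_breakpoint, "panic at the selected recovery breakpoint");

/* called from a (hypothetical) point in the journal-replay path */
static void gulm_test_breakpoint(int point)
{
    if (gulm_breakpoint == point)
        panic("gulm: hit test breakpoint %d during recovery", point);
}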

Comment 2 Adam "mantis" Manthei 2004-08-23 20:58:08 UTC
Created attachment 103002 [details]
make jid mappings ignore expired state of locks

The attached patch makes the jid mapping requests use the
lg_lock_flag_IgnoreExp (ignore expired) flag when acquiring the listlock and
journal locks.  I've done some rather basic testing and it seems to work.  I'm
waiting for Mike Tilstra to review the code once he returns from vacation.


(Does this also need to be set when dealing with unlock requests?  I don't
think it does, since that would mean the holder of the lock is trying to
unlock its locks while expired, which is simply not allowed.)
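
The shape of the change is roughly the following.  This is a hedged sketch,
not the actual patch: only the flag name lg_lock_flag_IgnoreExp appears above;
the wrapper, its signature, and the flag value are assumptions.

/* Hedged sketch of the fix; names other than the flag are hypothetical. */
#include <stdint.h>
#include <stdio.h>

#define lg_lock_flag_IgnoreExp 0x04   /* value is an assumption */

/* stand-in for the real gulm lock-state request */
static int gulm_lock_state_request(const char *key, uint32_t flags)
{
    printf("requesting %s with flags 0x%x\n", key, (unsigned int)flags);
    return 0;
}

/* jid-mapping lookups (listlock and journal locks) now pass IgnoreExp so
 * they are granted even while the previous holder is still marked expired */
static int jid_mapping_lock(const char *key)
{
    return gulm_lock_state_request(key, lg_lock_flag_IgnoreExp);
}

int main(void)
{
    return jid_mapping_lock("listlock");
}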

Comment 3 michael conrad tadpol tilstra 2004-08-30 14:42:11 UTC
Unlocks always work.  The only thing that will block an unlock request
is a state update to the slave servers, which happens pretty quickly.

Looking at the patch now...

Comment 4 michael conrad tadpol tilstra 2004-08-30 15:34:05 UTC
looks ok to me.

Comment 7 Lon Hohberger 2010-10-29 21:49:09 UTC
This bugzilla is reported to have been fixed years ago.