Bug 206470 - gfs recovery mixed with unmounting can hang
Keywords:
Status: CLOSED WONTFIX
Alias: None
Product: Red Hat Cluster Suite
Classification: Retired
Component: gfs
Version: 4
Hardware: All
OS: Linux
Priority: medium
Severity: medium
Target Milestone: ---
Assignee: David Teigland
QA Contact: GFS Bugs
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2006-09-14 16:16 UTC by David Teigland
Modified: 2010-01-12 03:13 UTC (History)
CC List: 0 users

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2007-08-13 20:14:29 UTC
Embargoed:


Description David Teigland 2006-09-14 16:16:12 UTC
Description of problem:

gfs/lock_dlm gets a callback to do recovery at the same time that
a local gfs unmount happens.

lock_dlm prints "pr_start 31060 skip for umount/wd"
and then tries to do a kcl_service_leave().  The leave can't
complete because the service (in SM) is still in recovery state 2
and is waiting for a start_done() ack from lock_dlm, which
skipped the start.  In this case, the node that got the unmount
and recovery callback at the same time was the only node with
the fs mounted.
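
The dependency described above can be sketched as a toy state machine.
This is only an illustrative model, not the SM or lock_dlm code: the
class, the state names, and the ack_before_leave flag are hypothetical,
and only the names start_done and "recovery state 2" come from this
report.  The flag exists to show why the leave blocks; it is not a
proposed fix.

```python
from enum import Enum, auto


class ServiceState(Enum):
    RUNNING = auto()
    RECOVERY_WAIT_ACK = auto()  # stands in for "recovery state 2" (assumption)
    LEFT = auto()


class ToyService:
    """Hypothetical stand-in for a service tracked by SM."""

    def __init__(self):
        self.state = ServiceState.RUNNING

    def start_recovery(self):
        # A recovery callback arrives; the service now waits for an ack.
        self.state = ServiceState.RECOVERY_WAIT_ACK

    def start_done(self):
        # The ack that the report says lock_dlm still owes the service.
        if self.state is ServiceState.RECOVERY_WAIT_ACK:
            self.state = ServiceState.RUNNING

    def try_leave(self):
        # Returns True if the leave can proceed, False if it would block.
        if self.state is ServiceState.RUNNING:
            self.state = ServiceState.LEFT
            return True
        return False


def unmount_during_recovery(ack_before_leave):
    svc = ToyService()
    svc.start_recovery()      # recovery callback races with the unmount
    if ack_before_leave:
        svc.start_done()      # hypothetical: ack sent before leaving
    # Otherwise the start is skipped ("skip for umount/wd") and no ack is sent.
    return svc.try_leave()


if __name__ == "__main__":
    print("leave without ack proceeds:", unmount_during_recovery(False))
    print("leave after ack proceeds:", unmount_during_recovery(True))
```

In this model the leave never proceeds while the service is waiting for
the ack, which is the hang described above.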

Version-Release number of selected component (if applicable):


How reproducible:

Run a test with lots of mounting/unmounting, throw in some node
failures, and you'll run into this.
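
A randomized schedule for that kind of test could be generated along
these lines.  This is a minimal sketch under stated assumptions: the
function name, the node names, and the failure probability are all
hypothetical, and it only produces a list of (action, node) steps.  How
each step is carried out on a real cluster is left to the test harness.

```python
import random


def make_schedule(nodes, steps, fail_prob=0.1, seed=0):
    """Build a reproducible list of ("mount" | "umount" | "fail", node) steps."""
    rng = random.Random(seed)   # seeded so a failing run can be replayed
    mounted = set()             # nodes that currently have the fs mounted
    schedule = []
    for _ in range(steps):
        node = rng.choice(nodes)
        if rng.random() < fail_prob:
            # A node failure; a failed node no longer has the fs mounted.
            schedule.append(("fail", node))
            mounted.discard(node)
        elif node in mounted:
            schedule.append(("umount", node))
            mounted.remove(node)
        else:
            schedule.append(("mount", node))
            mounted.add(node)
    return schedule


if __name__ == "__main__":
    # Hypothetical node names, for illustration only.
    for action, node in make_schedule(["node1", "node2", "node3"], 20):
        print(action, node)
```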

Steps to Reproduce:
1.
2.
3.
  
Actual results:


Expected results:


Additional info:

Comment 2 David Teigland 2006-10-17 16:48:42 UTC
Fixing this will require a lot of work, so I'll put it off until it
becomes an issue for someone.


Comment 3 Kiersten (Kerri) Anderson 2006-11-10 16:24:46 UTC
Moving out for consideration for 4.6

Comment 4 David Teigland 2007-08-13 20:14:29 UTC
This issue has never actually been seen, so not planning on changing it
(which would be a high regression risk).


