Bug 382671 - mount hung after recovery, lock_gulmd_LT in busy wait
Status: CLOSED DUPLICATE of bug 252209
Product: Red Hat Cluster Suite
Classification: Red Hat
Component: gulm
Version: 4
Hardware: All Linux
Priority: low
Severity: low
Assigned To: Chris Feist
QA Contact: Cluster QE
Reported: 2007-11-14 10:52 EST by Nate Straz
Modified: 2009-04-16 16:33 EDT
Doc Type: Bug Fix
Last Closed: 2008-07-17 11:21:17 EDT

Attachments
gzipped tcpdump of port 41040 from morph-01 (8.79 MB, application/x-gzip)
2007-11-14 10:55 EST, Nate Straz

Description Nate Straz 2007-11-14 10:52:03 EST
Description of problem:

While doing GFS recovery testing with lock_gulm, a mount hung after shooting only
one node.  The lock_gulmd_LT threads on all nodes are very busy.  I captured
packets on port 41040 on the master in the hope that it sheds some light on what
is going on.


Version-Release number of selected component (if applicable):
gulm-1.0.10-0
kernel-hugemem-2.6.9-67.EL
GFS-6.1.15-1
GFS-kernel-hugemem-2.6.9-75.9


How reproducible:
Unknown

Actual results:

Senario iteration 1.1 started at Tue Nov 13 16:09:51 CST 2007
Sleeping 5 minute(s) to let the I/O get its lock count up...
        Gulm Status
        ===========
        morph-02: Client
        morph-04: Client
        morph-05: Master
        morph-03: Slave
        morph-01: Slave
Senario: GULM kill Master

Those picked to face the revolver... morph-05 
...
checking Gulm recovery...
Verifying that clvmd was started properly on the dueler(s)
mounting /dev/mapper/morph--cluster-morph--cluster0 on /mnt/morph-cluster0 on morph-05
mounting /dev/mapper/morph--cluster-morph--cluster1 on /mnt/morph-cluster1 on morph-05
(hung)

Expected results:
The mount should not hang.

Additional info:

The recovery was done with a load on each node.
Comment 1 Nate Straz 2007-11-14 10:55:52 EST
Created attachment 258231
gzipped tcpdump of port 41040 from morph-01
Comment 2 Nate Straz 2007-11-15 09:30:35 EST
I think I've hit this twice now.  The most recent time, I thought it was a hung
mount after losing quorum, but after re-fencing the nodes that had been fenced,
the mount still did not continue.
Comment 3 Nate Straz 2008-07-17 11:21:17 EDT

*** This bug has been marked as a duplicate of 252209 ***
