Bug 485336
| Summary: | GFS2 mount hung after recovery | ||||||
|---|---|---|---|---|---|---|---|
| Product: | Red Hat Enterprise Linux 5 | Reporter: | Nate Straz <nstraz> | ||||
| Component: | kernel | Assignee: | Steve Whitehouse <swhiteho> | ||||
| Status: | CLOSED DUPLICATE | QA Contact: | Cluster QE <mspqa-list> | ||||
| Severity: | medium | Docs Contact: | |||||
| Priority: | low | ||||||
| Version: | 5.3 | CC: | cluster-maint, edamato | ||||
| Target Milestone: | rc | ||||||
| Target Release: | --- | ||||||
| Hardware: | All | ||||||
| OS: | Linux | ||||||
| Whiteboard: | |||||||
| Fixed In Version: | Doc Type: | Bug Fix | |||||
| Doc Text: | Story Points: | --- | |||||
| Clone Of: | Environment: | ||||||
| Last Closed: | 2009-02-17 14:57:51 UTC | Type: | --- | ||||
| Regression: | --- | Mount Type: | --- | ||||
| Documentation: | --- | CRM: | |||||
| Verified Versions: | Category: | --- | |||||
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |||||
| Cloudforms Team: | --- | Target Upstream Version: | |||||
| Embargoed: | |||||||
| Attachments: |
The attachment seems to be corrupt:

    [steve@dolmen hang]$ tar -zxf ./z-mount-hang.tar.gz
    tar: This does not look like a tar archive
    tar: Skipping to next header
    tar: Error exit delayed from previous errors

(In reply to comment #1)
> The attachment seems to be corrupt:
>
> [steve@dolmen hang]$ tar -zxf ./z-mount-hang.tar.gz
> tar: This does not look like a tar archive
> tar: Skipping to next header
> tar: Error exit delayed from previous errors

When I downloaded it, it was double-gzipped. I changed the mime-type to application/octet-stream and now it downloads as a gzipped tarball.

Looking at the hangalyzer output, I've spotted this:

    z2 : Z_Cluster1: G: s:SH n:1/2 f:lsDpr t:UN d:UN/3644766000 l:0 a:0 r:6
    z2 : (locked, sticky, demote, demote in prog, reply pending)
    z2 : Z_Cluster1: H: s:SH f:W e:0 p:5725 [glock_workqueue] gfs2_do_trans_begin+0xce/0x144 [gfs2]

i.e. the glock_workqueue is trying to get the transaction lock. This looks like a dup of #483541 to me. We already have a patch which should solve that problem, but it's currently untested so far as I know.

If you have no objections, I'll close this as a dup of #483541.

*** This bug has been marked as a duplicate of bug 483541 ***
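For anyone reading the glock line quoted above, here is a minimal sketch of how its fields and flag letters could be pulled apart. The flag table is an assumption taken only from the pairing of `f:lsDpr` with the names printed under it in that one sample, so it is not a complete list. The helper names `parse_glock_line` and `decode_flags` are hypothetical and are not part of gfs2_hangalyzer.

```python
# Flag letters paired with the names printed in the quoted hangalyzer output.
# Assumption: the letters and names appear in the same order in that sample.
GLOCK_FLAGS = {
    "l": "locked",
    "s": "sticky",
    "D": "demote",
    "p": "demote in prog",
    "r": "reply pending",
}


def parse_glock_line(line):
    """Split a 'G:' glock line into a dict of its key:value fields."""
    _, _, rest = line.partition("G:")
    fields = {}
    for token in rest.split():
        key, sep, value = token.partition(":")
        if sep:
            fields[key] = value
    return fields


def decode_flags(flags):
    """Translate each flag letter, keeping unknown letters visible."""
    return [GLOCK_FLAGS.get(ch, "unknown(%s)" % ch) for ch in flags]


sample = "G: s:SH n:1/2 f:lsDpr t:UN d:UN/3644766000 l:0 a:0 r:6"
glock = parse_glock_line(sample)
print(glock["n"], decode_flags(glock["f"]))
```

Unknown letters are reported rather than dropped, so a flag missing from this small table still shows up in the output.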
Created attachment 331778
collected debug information for mount hang

Description of problem:
While running revolver during 5.3.z testing of GFS2, I hit a state where the mount on one node hung. Running gfs2_hangalyzer found four locks with waiters but no holders. Attached is the output of gfs2_hangalyzer, SysRq-T, dlm lock dump and glock dump.

Version-Release number of selected component (if applicable):
kernel-2.6.18-128.1.1.el5
gfs2-utils-0.1.53-1.el5_3.1

How reproducible:
Unknown

Steps to Reproduce:
1. run revolver and pray

Actual results:

Expected results:

Additional info:
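The "waiters but no holders" condition from the description can be illustrated with a rough sketch over a raw glock dump. This is not how gfs2_hangalyzer is implemented. It assumes that each glock starts with a `G:` line followed by its `H:` holder lines, and that in a holder's `f:` field `H` marks a granted holder and `W` a waiter. The function names are hypothetical, and the second glock in the sample is made up for contrast.

```python
import re


def group_holders(dump_text):
    """Map each glock name (its n: field) to the flag strings of its H: lines."""
    glocks = {}
    current = None
    for line in dump_text.splitlines():
        line = line.strip()
        if line.startswith("G:"):
            match = re.search(r"\bn:(\S+)", line)
            current = match.group(1) if match else line
            glocks[current] = []
        elif line.startswith("H:") and current is not None:
            match = re.search(r"\bf:(\S*)", line)
            glocks[current].append(match.group(1) if match else "")
    return glocks


def waiters_without_holders(dump_text):
    """Return glock names that have a waiting holder but no granted one."""
    stuck = []
    for name, holder_flags in group_holders(dump_text).items():
        granted = any("H" in flags for flags in holder_flags)
        waiting = any("W" in flags for flags in holder_flags)
        if waiting and not granted:
            stuck.append(name)
    return stuck


# First glock is taken from this report; the second is a hypothetical
# granted holder added only to show the contrast.
sample_dump = """\
G:  s:SH n:1/2 f:lsDpr t:UN d:UN/3644766000 l:0 a:0 r:6
 H: s:SH f:W e:0 p:5725 [glock_workqueue] gfs2_do_trans_begin+0xce/0x144 [gfs2]
G:  s:SH n:2/17 f: t:SH d:EX/0 l:0 a:0 r:4
 H: s:SH f:H e:0 p:4466 [example] example_fn+0x0/0x0 [gfs2]
"""
print(waiters_without_holders(sample_dump))
```

A real hang analysis would also need to compare glock state across all nodes and against the dlm lock dump. This sketch only looks at one node's dump.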