Bug 487608 - GFS2: gfs2_tool unfreeze hangs
Summary: GFS2: gfs2_tool unfreeze hangs
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Enterprise Linux 5
Classification: Red Hat
Component: gfs2-utils
Version: 5.3
Hardware: All
OS: Linux
Priority: high
Severity: high
Target Milestone: rc
Target Release: ---
Assignee: Robert Peterson
QA Contact: Cluster QE
URL:
Whiteboard:
Depends On:
Blocks:
 
Reported: 2009-02-26 22:52 UTC by Robert Peterson
Modified: 2010-01-12 03:42 UTC (History)
CC List: 8 users

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2009-09-02 11:02:12 UTC
Target Upstream Version:
Embargoed:


Attachments
Don't let gfs2_tool touch the mountpoint (2.96 KB, patch)
2009-03-08 17:20 UTC, Abhijith Das


Links
System: Red Hat Product Errata
ID: RHSA-2009:1337
Private: 0
Priority: normal
Status: SHIPPED_LIVE
Summary: Low: gfs2-utils security and bug fix update
Last Updated: 2009-09-01 10:41:56 UTC

Description Robert Peterson 2009-02-26 22:52:41 UTC
Description of problem:
gfs2_tool unfreeze hangs

Version-Release number of selected component (if applicable):
RHEL5 with the 2.6.18-132 kernel

How reproducible:
Always

Steps to Reproduce:
1. mount -tgfs2 /dev/exxon_vg/exxon_lv /mnt/gfs2
2. gfs2_tool freeze /mnt/gfs2
3. echo foo > /mnt/gfs2/gronk &
4. gfs2_tool unfreeze /mnt/gfs2
  
Actual results:
Unfreeze hangs permanently

Expected results:
Unfreeze should work

Additional info:

Comment 1 Eric Sandeen 2009-02-27 06:03:28 UTC
Did a little debugging; it never even makes it to ->unfreeze_fs, some other lock above must be holding on.

Also xfs_io (which I was using to try the unfreeze) can't open the mountpoint:

open("/mnt/test", O_RDONLY <HANG>

gfs2_tool also hangs up at:

stat("/mnt/test", <HANG>

but a direct twiddle of the sysfs file works fine.

gfs2_tool is stuck down a path like this:

gfs2_tool     D ffff81000101d480     0  3442   3441                     (NOTLB)
 ffff81012a4f1cf8 0000000000000086 0007810100000007 ffffffff801248d7
 000001a400000005 0000000000000007 ffff81013a9770c0 ffff810104796100
 0000012dfdcbec62 0000000000002a04 ffff81013a9772a8 00000003000001a4
Call Trace:
 [<ffffffff801248d7>] avc_has_perm+0x43/0x55
 [<ffffffff88650150>] :gfs2:just_schedule+0x0/0xe
 [<ffffffff88650159>] :gfs2:just_schedule+0x9/0xe
 [<ffffffff80063ac7>] __wait_on_bit+0x40/0x6e
 [<ffffffff88650150>] :gfs2:just_schedule+0x0/0xe
 [<ffffffff80063b61>] out_of_line_wait_on_bit+0x6c/0x78
 [<ffffffff8009dbe3>] wake_bit_function+0x0/0x23
 [<ffffffff8865014b>] :gfs2:gfs2_glock_wait+0x2b/0x30
 [<ffffffff8865d861>] :gfs2:gfs2_getattr+0x85/0xc4
 [<ffffffff8865d859>] :gfs2:gfs2_getattr+0x7d/0xc4
 [<ffffffff8000dfe9>] vfs_getattr+0x2d/0xa9
 [<ffffffff80027fe7>] vfs_stat_fd+0x32/0x4a
 [<ffffffff800bd0b0>] utrace_quiescent+0x20f/0x256
 [<ffffffff80022d97>] sys_newstat+0x19/0x31
 [<ffffffff8005d229>] tracesys+0x71/0xe0
 [<ffffffff8005d28d>] tracesys+0xd5/0xe0

and xfs_io like this:

xfs_io        D ffff810001004400     0  3618   3615                     (NOTLB)
 ffff810128925c88 0000000000000082 000001a400000071 ffff810128f4f100
 ffff81013ebc0c80 0000000000000004 ffff810128f4f100 ffffffff802f0ae0
 0000014672a7fb54 00000000000913bb ffff810128f4f2e8 0000000000100000
Call Trace:
 [<ffffffff801248d7>] avc_has_perm+0x43/0x55
 [<ffffffff88650150>] :gfs2:just_schedule+0x0/0xe
 [<ffffffff88650159>] :gfs2:just_schedule+0x9/0xe
 [<ffffffff80063ac7>] __wait_on_bit+0x40/0x6e
 [<ffffffff88650150>] :gfs2:just_schedule+0x0/0xe
 [<ffffffff80063b61>] out_of_line_wait_on_bit+0x6c/0x78
 [<ffffffff8009dbe3>] wake_bit_function+0x0/0x23
 [<ffffffff8865014b>] :gfs2:gfs2_glock_wait+0x2b/0x30
 [<ffffffff8865f250>] :gfs2:gfs2_permission+0x83/0xd3
 [<ffffffff8865f248>] :gfs2:gfs2_permission+0x7b/0xd3
 [<ffffffff8000d5da>] permission+0x81/0xc8
 [<ffffffff80011f05>] may_open+0x65/0x22f
 [<ffffffff8001aafb>] open_namei+0x2c4/0x6d5
 [<ffffffff80066bcd>] do_page_fault+0x4fe/0x830
 [<ffffffff80026cdb>] do_filp_open+0x1c/0x38
 [<ffffffff80019676>] do_sys_open+0x44/0xbe
 [<ffffffff8005d28d>] tracesys+0xd5/0xe0

and the "echo" process is this:

bash          R  running task       0  3383   2927                     (NOTLB)
bash          D ffff810104634338     0  3614   2999          3615       (NOTLB)
 ffff8101289e3ae8 0000000000000086 0000000000000000 0000000000000000
 0000000000000000 0000000000000007 ffff81013f590860 ffff810128f4f860
 00000146725cc4fe 00000000000402e2 ffff81013f590a48 00000001800133e6
Call Trace:
 [<ffffffff80025390>] find_or_create_page+0x22/0x72
 [<ffffffff88650150>] :gfs2:just_schedule+0x0/0xe
 [<ffffffff88650159>] :gfs2:just_schedule+0x9/0xe
 [<ffffffff80063ac7>] __wait_on_bit+0x40/0x6e
 [<ffffffff88650150>] :gfs2:just_schedule+0x0/0xe
 [<ffffffff80063b61>] out_of_line_wait_on_bit+0x6c/0x78
 [<ffffffff8009dbe3>] wake_bit_function+0x0/0x23
 [<ffffffff8865014b>] :gfs2:gfs2_glock_wait+0x2b/0x30
 [<ffffffff88666f0d>] :gfs2:gfs2_do_trans_begin+0xd6/0x144
 [<ffffffff886536bf>] :gfs2:gfs2_createi+0x114/0xd28
 [<ffffffff8012d391>] sidtab_context_to_sid+0x93/0x1d9
 [<ffffffff801312b8>] security_compute_sid+0x307/0x329
 [<ffffffff801248d7>] avc_has_perm+0x43/0x55
 [<ffffffff8865e80f>] :gfs2:gfs2_create+0x65/0x143
 [<ffffffff8865360e>] :gfs2:gfs2_createi+0x63/0xd28
 [<ffffffff80039e74>] vfs_create+0xe6/0x158
 [<ffffffff8001a9d4>] open_namei+0x19d/0x6d5
 [<ffffffff80026cdb>] do_filp_open+0x1c/0x38
 [<ffffffff80019676>] do_sys_open+0x44/0xbe
 [<ffffffff8005d28d>] tracesys+0xd5/0xe0

I suppose it's holding something that the stat & open need, but it's blocked by the freeze.

-Eric

Comment 2 Abhijith Das 2009-03-08 17:20:45 UTC
Created attachment 334446
Don't let gfs2_tool touch the mountpoint

gfs2_tool tries to stat() the mountpoint in order to obtain the device number and, from that, locate the sysfs freeze tunable.

When the filesystem is frozen and another process then takes an exclusive lock on the mountpoint (e.g. touch creating a new file in the root directory), gfs2_tool hangs behind that lock at the stat() call.

This patch makes freeze/unfreeze stat the block device instead of the mountpoint.
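
The difference between the two stat() targets can be sketched as follows. This is a minimal Python illustration, not the gfs2_tool code (which is C) and not the attached patch. The helper names find_device and device_numbers are hypothetical, and looking the device up in a /proc/mounts-style table is an assumption about how the mountpoint could be mapped to its block device without touching it.

```python
import os
import stat


def find_device(mountpoint, mounts_text):
    """Return the device mounted as gfs2 at `mountpoint`, or None.

    `mounts_text` is the content of a /proc/mounts-style table
    (fields: device, mountpoint, fstype, ...). Only string handling
    is done here, so the mountpoint itself is never stat()ed or opened.
    Escaped characters in mount paths (e.g. \\040) are not handled.
    """
    wanted = os.path.normpath(mountpoint)  # pure string normalisation
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) >= 3 and fields[1] == wanted and fields[2] == "gfs2":
            return fields[0]
    return None


def device_numbers(path):
    """Return (major, minor) for `path`.

    For a block device node the numbers come from st_rdev, which is
    read from the node's own inode (normally under /dev), not from the
    filesystem mounted on that device. For any other path the numbers
    come from st_dev, i.e. the device holding the file; this is the
    kind of lookup that blocks when `path` is a frozen, locked mountpoint.
    """
    st = os.stat(path)
    dev = st.st_rdev if stat.S_ISBLK(st.st_mode) else st.st_dev
    return os.major(dev), os.minor(dev)
```

With the reproduction steps from the description, find_device("/mnt/gfs2", open("/proc/mounts").read()) would be expected to name the logical volume, and device_numbers() on that path would then stat only the device node.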

Comment 3 Steve Whitehouse 2009-03-09 17:03:34 UTC
Patch looks good to me.

Comment 4 Abhijith Das 2009-03-16 15:33:35 UTC
Checked in patch to RHEL5, STABLE3 and master.

Comment 6 Ray Van Dolson 2009-05-08 01:16:24 UTC
What version of gfs2-utils is this fixed in?  Is there a scratch build somewhere?  Using gfs2-utils-0.1.53-1.el5_3.2 and this appears to still be happening.

Comment 7 Nate Straz 2009-05-08 02:31:45 UTC
(In reply to comment #6)
> What version of gfs2-utils is this fixed in? 

You could check gfs2-utils-0.1.55-1.el5.  This bug is not listed in the changelog for that version, but it was checked into the tree before it was built.

Comment 8 Ray Van Dolson 2009-05-08 03:46:38 UTC
(In reply to comment #7)
> (In reply to comment #6)
> > What version of gfs2-utils is this fixed in? 
> 
> You could check gfs2-utils-0.1.55-1.el5.  This bug is not listed in the
> changelog for that version, but it was checked into the tree before it was
> built.  

I don't know where to find that.  I'll just be patient and wait for the errata release. :-)

Thanks.

Comment 11 Jan Stodola 2009-07-24 08:25:03 UTC
Tested on RHEL5.4 Snapshot 3 with gfs2-utils-0.1.61-1.el5, arch: x86_64 in kvm

# mount -tgfs2 /dev/VolGroup00/gfs2 /mnt/gfs2
# gfs2_tool freeze /mnt/gfs2
# echo foo > /mnt/gfs2/gronk &
[1] 2379
# gfs2_tool unfreeze /mnt/gfs2
[1]+  Done                 echo foo > /mnt/gfs2/gronk
# cat /mnt/gfs2/gronk
foo

Moving to VERIFIED

Comment 12 Ray Van Dolson 2009-07-24 18:49:55 UTC
Should we be able to take cLVM snapshots now as a result of this being fixed?

Comment 14 errata-xmlrpc 2009-09-02 11:02:12 UTC
An advisory has been issued which should help the problem
described in this bug report. This report is therefore being
closed with a resolution of ERRATA. For more information
on the solution and/or where to find the updated files,
please follow the link below. You may reopen this bug report
if the solution does not work for you.

http://rhn.redhat.com/errata/RHSA-2009-1337.html

