Bug 461290 - GFS2: mount during fsck protections not working
Product: Red Hat Enterprise Linux 5
Classification: Red Hat
Component: gfs2-utils
Hardware: All
OS: Linux
Priority: medium
Severity: medium
Target Milestone: rc
Target Release: ---
Assigned To: Robert Peterson
QA Contact: Cluster QE
Reported: 2008-09-05 12:41 EDT by Nate Straz
Modified: 2010-01-11 22:41 EST

Doc Type: Bug Fix
Last Closed: 2009-01-20 15:52:05 EST

Attachments
Proposed patch to fix the problem (397 bytes, patch)
2008-09-09 13:19 EDT, Robert Peterson

External Trackers
Tracker: Red Hat Product Errata
ID: RHBA-2009:0087
Priority: normal
Status: SHIPPED_LIVE
Summary: gfs2-utils bug-fix update
Last Updated: 2009-01-20 11:04:24 EST

Description Nate Straz 2008-09-05 12:41:45 EDT
Description of problem:

While running gfs2_fsck_stress I found that the test case for mounting a file system while an fsck is running hung.  The test does the following:

1. start gfs2_fsck on node A
2. check for "fsck" in sb_lockproto on node B

The check in step 2 never succeeded.
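
For illustration, the check in step 2 could be sketched as below. This is not the actual gfs2_fsck_stress code; it is a minimal Python sketch that reads the superblock straight from the shared device and polls sb_lockproto with a timeout, so a test gives up instead of hanging. The superblock offset, the field offset, and the magic number are assumptions taken from my reading of the GFS2 on-disk header, and read_lockproto and wait_for_fsck are hypothetical helper names.

```python
import struct
import time

# Assumed on-disk layout; verify against gfs2_ondisk.h before relying on it.
SB_OFFSET = 128 * 512      # superblock assumed to start 64 KiB into the device
LOCKPROTO_OFFSET = 96      # assumed offset of sb_lockproto inside the superblock
LOCKNAME_LEN = 64          # assumed size of the sb_lockproto field
GFS2_MAGIC = 0x01161970    # assumed magic number, stored big-endian


def read_lockproto(device):
    """Return the sb_lockproto string read directly from the device or image."""
    with open(device, "rb", buffering=0) as dev:
        dev.seek(SB_OFFSET)
        sb = dev.read(LOCKPROTO_OFFSET + LOCKNAME_LEN)
    if len(sb) < LOCKPROTO_OFFSET + LOCKNAME_LEN:
        raise ValueError("short read of superblock")
    (magic,) = struct.unpack_from(">I", sb, 0)
    if magic != GFS2_MAGIC:
        raise ValueError("no GFS2 magic number at the expected offset")
    raw = sb[LOCKPROTO_OFFSET:LOCKPROTO_OFFSET + LOCKNAME_LEN]
    # The field is NUL-padded; keep only the bytes before the first NUL.
    return raw.split(b"\0", 1)[0].decode("ascii", "replace")


def wait_for_fsck(device, timeout=60.0, interval=1.0):
    """Poll until sb_lockproto starts with "fsck"; return False on timeout."""
    deadline = time.monotonic() + timeout
    while True:
        if read_lockproto(device).startswith("fsck"):
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval)
```

A caller on node B would treat a False return as a test failure. Note that a plain read on node B may be served from that node's own cache, so a real test might need direct I/O to be sure it sees what is on the shared storage.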

I checked a few other conditions to see what would happen.

Test: mount on node B while fsck is running on node A
Result: mount succeeds

Test: check sb_lockproto on node A while fsck is running on node A
Result: sb_lockproto = fsck_dlm

Test: mount on node A while fsck is running on node A
Result: panic

BUG: unable to handle kernel NULL pointer dereference at virtual address 0000061c
 printing eip:
*pde = 30672001
Oops: 0002 [#1]
last sysfs file: /devices/pci0000:00/0000:00:00.0/irq
Modules linked in: gnbd(U) lock_nolock gfs(U) lock_dlm gfs2 dlm configfs autofs4 hidp rfcomm l2cap bluetooth sunrpc ipv6 xfrm_nalgo crypto_api dm_multipath video sbs backlight i2c_ec button battery asus_acpi ac lp floppy ide_cd e7xxx_edac intel_rng e1000 edac_mc i2c_i801 cdrom parport_pc pcspkr sg i2c_core parport dm_snapshot dm_zero dm_mirror dm_mod qla2xxx scsi_transport_fc ata_piix libata sd_mod scsi_mod ext3 jbd uhci_hcd ohci_hcd ehci_hcd
CPU:    1
EIP:    0060:[<c0438daf>]    Tainted: G      VLI
EFLAGS: 00010246   (2.6.18-107.el5PAE #1) 
EIP is at down_write+0xf/0x19
eax: 0000061c   ebx: 0000061c   ecx: 00000001   edx: ffff0001
esi: 00000000   edi: 0000061c   ebp: f291d4c0   esp: f057fd50
ds: 007b   es: 007b   ss: 0068
Process mount.gfs2 (pid: 7400, ti=f057f000 task=f387d550 task.ti=f057f000)
Stack: 00000000 f8d9264f 00000000 c062778d 00000000 f057fda4 f0fc7000 f2fc37c0 
       00000000 f0fc7000 f0fc7000 f291d4c0 f8d92a47 00000000 f8d9798d f0fc7000 
       f8db2ee0 c0475ddb fffffffe 00000000 c04763bb 322d6d64 c0684e00 000080d0 
Call Trace:
 [<f8d9264f>] gfs2_log_flush+0x18/0x406 [gfs2]
 [<f8d92a47>] gfs2_meta_syncfs+0xa/0x31 [gfs2]
 [<f8d9798d>] gfs2_kill_sb+0x11/0x52 [gfs2]
 [<c0475ddb>] deactivate_super+0x52/0x65
 [<c04763bb>] get_sb_bdev+0xdb/0x110
 [<c0457e76>] __alloc_pages+0x57/0x297
 [<f8d979e0>] gfs2_get_sb+0x12/0x16 [gfs2]
 [<f8d988ac>] fill_super+0x0/0xaac [gfs2]
 [<c0475e6b>] vfs_kern_mount+0x7d/0xf2
 [<c0475f12>] do_kern_mount+0x25/0x36
 [<c0488ad9>] do_mount+0x5f5/0x665
 [<c04541fd>] do_generic_mapping_read+0x3d0/0x3d8
 [<c045381d>] find_get_page+0x18/0x3f
 [<c04561dc>] filemap_nopage+0x192/0x312
 [<c045ff8d>] __handle_mm_fault+0x453/0xb7b
 [<c0457b82>] get_page_from_freelist+0x96/0x333
 [<c04879cb>] copy_mount_options+0x26/0x109
 [<c0488bb6>] sys_mount+0x6d/0xa5
 [<c0404f17>] syscall_call+0x7/0xb

Version-Release number of selected component (if applicable):

How reproducible:

Steps to Reproduce:

See tests above
Comment 2 Robert Peterson 2008-09-09 13:19:10 EDT
Created attachment 316214
Proposed patch to fix the problem

The gfs2_fsck is trying to overwrite the superblock with the temporary
value as required, but I think the timing is off because the writes
were not synced to disk.  This patch adds the necessary fsync.
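
The attached patch itself is not reproduced here. As a rough illustration of the idea only, the Python sketch below overwrites the lock protocol field and then calls fsync so the new value reaches the disk before anything else proceeds. The offsets are assumptions based on my reading of the GFS2 on-disk header, set_lockproto is a hypothetical helper, and the real gfs2_fsck is C code that may write the superblock differently.

```python
import os

# Assumed on-disk layout; verify against gfs2_ondisk.h before relying on it.
SB_OFFSET = 128 * 512      # superblock assumed to start 64 KiB into the device
LOCKPROTO_OFFSET = 96      # assumed offset of sb_lockproto inside the superblock
LOCKNAME_LEN = 64          # assumed size of the sb_lockproto field


def set_lockproto(device, proto):
    """Overwrite sb_lockproto and force the write out to stable storage."""
    name = proto.encode("ascii")
    if len(name) >= LOCKNAME_LEN:
        raise ValueError("lock protocol name too long")
    field = name.ljust(LOCKNAME_LEN, b"\0")   # NUL-pad to the full field width
    fd = os.open(device, os.O_RDWR)
    try:
        written = os.pwrite(fd, field, SB_OFFSET + LOCKPROTO_OFFSET)
        if written != LOCKNAME_LEN:
            raise OSError("short write to superblock")
        # Without this the new value can sit in the local page cache, where
        # only the writing node sees it; other nodes still read the old value.
        os.fsync(fd)
    finally:
        os.close(fd)
```

This matches the symptoms in the description: node A reported fsck_dlm while node B never saw "fsck" and was able to mount.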
Comment 3 Robert Peterson 2008-09-09 18:19:18 EDT
The patch was tested on system roth-01 and pushed to the RHEL5,
STABLE2 and master branches of the cluster git tree.  Setting status
to modified.
Comment 5 Nate Straz 2008-11-04 17:01:54 EST
Verified with gfs2-utils-0.1.49-1.el5.
Comment 7 errata-xmlrpc 2009-01-20 15:52:05 EST
An advisory has been issued which should help the problem
described in this bug report. This report is therefore being
closed with a resolution of ERRATA. For more information
on the solution and/or where to find the updated files,
please follow the link below. You may reopen this bug report
if the solution does not work for you.

