This is really fun, do something like mount -o noatim and watch gfs2 oops:

GFS2: can't parse mount arguments
BUG: unable to handle kernel NULL pointer dereference at virtual address 000009d8
 printing eip:
f8c215cb
*pde = 00000000
Oops: 0000 [#1]
SMP
Modules linked in: autofs4 hidp rfcomm l2cap bluetooth lock_dlm gfs2 dlm configfs sunrpc sg iptable_filter ip_tables ip6t_REJECT xt_tcpudp ip6table_filter ip6_tables x_tables ipv6 dm_multipath video sbs i2c_ec button battery asus_acpi ac parport_pc lp parport floppy pcspkr i2c_piix4 i2c_core cfi_probe gen_probe scb2_flash mtdcore chipreg tg3 serio_raw ide_cd cdrom dm_snapshot dm_zero dm_mirror dm_mod qla2xxx scsi_transport_fc sd_mod scsi_mod ext3 jbd ehci_hcd ohci_hcd uhci_hcd
CPU:    0
EIP:    0060:[<f8c215cb>]    Not tainted VLI
EFLAGS: 00010246   (2.6.21-rc1 #2)
EIP is at gfs2_delete_debugfs_file+0x0/0x10 [gfs2]
eax: 00000000   ebx: ef554000   ecx: 00000000   edx: 00000000
esi: f8c47e00   edi: ef554000   ebp: f19d3e40   esp: ef5c1d78
ds: 007b   es: 007b   fs: 00d8  gs: 0033  ss: 0068
Process mount.gfs2 (pid: 4292, ti=ef5c1000 task=f19ac030 task.ti=ef5c1000)
Stack: f8c2c6fd ef554000 c04701b9 ffffffea 00000000 c04707a2 f19d3e40 322d6d64
       c0457600 00000044 f7da8ec0 00000246 f1819000 00000000 000280d0 f181a000
       f02c1740 f8c47e00 f181a000 f8c2c726 f181a000 f8c2d533 f02c1740 00000000
Call Trace:
 [<f8c2c6fd>] gfs2_kill_sb+0xe/0x16 [gfs2]
 [<c04701b9>] deactivate_super+0x52/0x65
 [<c04707a2>] get_sb_bdev+0xe6/0x11b
 [<c0457600>] get_page_from_freelist+0x26e/0x2a5
 [<f8c2c726>] gfs2_get_sb+0x21/0x3e [gfs2]
 [<f8c2d533>] fill_super+0x0/0x51d [gfs2]
 [<c047024f>] vfs_kern_mount+0x83/0xf6
 [<c0470304>] do_kern_mount+0x2d/0x3e
 [<c0481a2a>] do_mount+0x608/0x67b
 [<c04376d5>] autoremove_wake_function+0x0/0x35
 [<c05a9575>] sock_aio_read+0xfc/0x108
 [<c0453610>] find_get_page+0x18/0x38
 [<c060efcf>] __sched_text_start+0x937/0x9e7
 [<c0450b47>] handle_fasteoi_irq+0x0/0xa6
 [<c0406fdb>] do_IRQ+0xc5/0xda
 [<c048059f>] copy_mount_options+0x8e/0x109
 [<c04224d7>] __cond_resched+0x16/0x34
 [<c060f0a5>] cond_resched+0x26/0x31
 [<c0481b14>] sys_mount+0x77/0xae
 [<c0404e4c>] syscall_call+0x7/0xb
 =======================
Code: 8b 41 08 8b 00 85 c0 75 0b ff 01 c7 41 0c 00 00 00 00 eb ba 89 41 08 83 79 08 00 74 b1 31 c0 c3 c3 a1 80 96 c4 f8 e9 2d 27 89 c7 <8b> 80 d8 09 00 00 85 c0 74 05 e9 1e 27 89 c7 c3 31 d2 b8 1b 87
EIP: [<f8c215cb>] gfs2_delete_debugfs_file+0x0/0x10 [gfs2] SS:ESP 0068:ef5c1d78
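A note on reading the oops above: the faulting instruction (the bytes starting at <8b> in the Code: line) is 8b 80 d8 09 00 00, which on 32-bit x86 is a load from eax plus a 32-bit displacement. With eax = 00000000 in the register dump, base plus displacement gives the reported fault address 000009d8, so gfs2_delete_debugfs_file appears to have been handed a NULL pointer and read a field at that offset. The sketch below only redoes that arithmetic; the function and variable names are made up for illustration and are not from any kernel source, and which structure field sits at that offset is not stated in this report.

```c
#include <stdint.h>

/* Bytes of the faulting instruction as printed after "<8b>" in the oops.
 * 0x8b = mov r32, r/m32; 0x80 = ModRM selecting [eax + disp32] into eax;
 * the remaining four bytes are the displacement, little-endian. */
const unsigned char oops_insn[6] = { 0x8b, 0x80, 0xd8, 0x09, 0x00, 0x00 };

/* Hypothetical helper: read a 32-bit little-endian value from p. */
uint32_t read_disp32_le(const unsigned char *p)
{
    return (uint32_t)p[0]
         | ((uint32_t)p[1] << 8)
         | ((uint32_t)p[2] << 16)
         | ((uint32_t)p[3] << 24);
}

/* Hypothetical helper: address touched by "mov disp32(%eax), %eax"
 * given the value of eax at the time of the fault. */
uint32_t fault_address(uint32_t eax, const unsigned char *insn)
{
    /* insn[0] is the opcode, insn[1] the ModRM byte, insn[2..5] the disp32 */
    return eax + read_disp32_le(insn + 2);
}
```

Calling fault_address(0, oops_insn) with the eax value from the register dump reproduces the address in the BUG: line.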
Created attachment 149502: patch that resolves the problem
This patch resolves the problem; it has been submitted to the cluster-devel list.
This patch is now upstream. Josef, can you post this patch on rhkernel-list too?
Sure thing. Did you mean to assign the bugzilla to me?
Actually, since this is a bug in the lockdump patch that Bob did, and since Bob hasn't submitted that patch to rhkl yet, should we just change his lockdump patch before he submits it to rhkl?
I'd prefer to keep them as separate patches for now, but I'll make sure that they are submitted together when the time comes.
This is queued for RHEL 5.1 as part of bz #228540 so we don't need this bug any more. *** This bug has been marked as a duplicate of 228540 ***
I hit this during mount_stress on kernel -107.el5.

Reproducible: Every time

Steps to Reproduce:
1. mount -t gfs2 -o garbage /dev/foo /mnt/foo
2. *panic*

GFS2: can't parse mount arguments
BUG: unable to handle kernel NULL pointer dereference at virtual address 0000061c
 printing eip:
c04389b7
*pde = 00000000
Oops: 0002 [#1]
SMP
last sysfs file: /devices/pci0000:00/0000:00:02.0/0000:01:1f.0/0000:03:02.1/irq
Modules linked in: lock_nolock gfs(U) lock_dlm gfs2 dlm gnbd(U) sctp configfs autofs4 hidp rfcomm l2cap bluetooth sunrpc ipv6 xfrm_nalgo crypto_api dm_multipath video sbs backlight i2c_ec button battery asus_acpi ac lp sg floppy i2c_i801 ide_cd intel_rng i2c_core parport_pc e7xxx_edac pcspkr e1000 cdrom parport edac_mc dm_snapshot dm_zero dm_mirror dm_mod qla2xxx scsi_transport_fc ata_piix libata sd_mod scsi_mod ext3 jbd uhci_hcd ohci_hcd ehci_hcd
CPU:    0
EIP:    0060:[<c04389b7>]    Tainted: G      VLI
EFLAGS: 00010246   (2.6.18-107.el5 #1)
EIP is at down_write+0xf/0x19
eax: 0000061c   ebx: 0000061c   ecx: c200c8e0   edx: ffff0001
esi: 00000000   edi: 0000061c   ebp: f22e6500   esp: f017cd50
ds: 007b   es: 007b   ss: 0068
Process mount.gfs2 (pid: 14754, ti=f017c000 task=f2ec1000 task.ti=f017c000)
Stack: 00000000 f8e0164f 00000020 00000000 00000000 f017cda4 f7ded400 f7ded43c
       00000000 f7ded400 f7ded400 f22e6500 f8e01a47 00000000 f8e06981 f7ded400
       f8e21ee0 c0477137 ffffffea 00000000 c0477717 372d6d64 c0686000 000080d0
Call Trace:
 [<f8e0164f>] gfs2_log_flush+0x18/0x406 [gfs2]
 [<f8e01a47>] gfs2_meta_syncfs+0xa/0x31 [gfs2]
 [<f8e06981>] gfs2_kill_sb+0x11/0x52 [gfs2]
 [<c0477137>] deactivate_super+0x52/0x65
 [<c0477717>] get_sb_bdev+0xdb/0x110
 [<c045a046>] __alloc_pages+0x57/0x297
 [<f8e069d4>] gfs2_get_sb+0x12/0x16 [gfs2]
 [<f8e078a0>] fill_super+0x0/0xaac [gfs2]
 [<c04771c7>] vfs_kern_mount+0x7d/0xf2
 [<c047726e>] do_kern_mount+0x25/0x36
 [<c0489e0d>] do_mount+0x5f5/0x665
 [<c04360ab>] autoremove_wake_function+0x0/0x2d
 [<c05ab1f8>] do_sock_read+0xae/0xb7
 [<c05ab78b>] sock_aio_read+0x53/0x61
 [<c05af11d>] sock_def_readable+0x31/0x5b
 [<c0459d52>] get_page_from_freelist+0x96/0x333
 [<c0488cff>] copy_mount_options+0x26/0x109
 [<c0489eea>] sys_mount+0x6d/0xa5
 [<c0404f17>] syscall_call+0x7/0xb
 =======================
Making this a release candidate blocker since it requires kernel changes. If we get the patch soon, we will consider fighting to get it into the beta kernel builds.
Created attachment 316457: Patch to fix problem
This bug is not exactly the same as before, but the scenario is similar. The gfs2 superblock pointer is NULL after a failed mount. When control eventually goes to gfs2_kill_sb, we dereference this NULL pointer. This patch ensures that the gfs2 superblock pointer is not NULL before being dereferenced in gfs2_kill_sb.
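For readers without the attachment, the shape of the fix described above can be sketched in plain userspace C. This is not the attached patch: every type and function name below is a hypothetical stand-in, and it assumes the filesystem-private pointer hangs off the superblock the way s_fs_info does in the VFS. The point is only that the filesystem-specific teardown is skipped when the private pointer was never set, while the generic teardown still runs.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the per-mount GFS2 state (not struct gfs2_sbd). */
struct sketch_sbd {
    int log_flushed;            /* set when fs-specific teardown has run */
};

/* Hypothetical stand-in for the VFS superblock; only what the sketch needs. */
struct sketch_super_block {
    struct sketch_sbd *s_fs_info;   /* NULL if the mount failed early */
    int generic_teardown_done;
};

/* Stand-in for fs-specific work that dereferences the private pointer. */
void sketch_log_flush(struct sketch_sbd *sdp)
{
    sdp->log_flushed = 1;
}

/* Stand-in for the generic teardown that must happen in every case. */
void sketch_generic_kill(struct sketch_super_block *sb)
{
    sb->generic_teardown_done = 1;
}

/* Guarded teardown: returns 1 if fs-specific teardown ran, 0 if skipped. */
int sketch_kill_sb(struct sketch_super_block *sb)
{
    struct sketch_sbd *sdp;
    int ran = 0;

    assert(sb != NULL);         /* the caller always passes a superblock */
    sdp = sb->s_fs_info;

    if (sdp != NULL) {          /* the check the failed-mount path needs */
        sketch_log_flush(sdp);
        ran = 1;
    }

    sketch_generic_kill(sb);    /* always release the generic state */
    return ran;
}
```

With s_fs_info left NULL, as after a failed option parse, sketch_kill_sb skips the flush instead of faulting. Without the check, the call chain in the trace (gfs2_kill_sb, gfs2_meta_syncfs, gfs2_log_flush, down_write) ends up operating on a small offset from NULL, which is consistent with the 0000061c fault address.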
Patch in comment#10 posted to rhkernel-list. Marking POST.
Created attachment 317092: Updated patch
in kernel-2.6.18-116.el5
You can download this test kernel from http://people.redhat.com/dzickus/el5
An advisory has been issued which should help the problem described in this bug report. This report is therefore being closed with a resolution of ERRATA. For more information on the solution and/or where to find the updated files, please follow the link below. You may reopen this bug report if the solution does not work for you.
http://rhn.redhat.com/errata/RHSA-2009-0225.html