Easily reproducible panic when unmounting a GFS2 filesystem. It seems like I saw this panic in rawhide a few months ago, so I expect that this is a known issue.

general protection fault: 0000 [1] SMP
last sysfs file: /kernel/dlm/lt1/event_done
CPU 1
Modules linked in: autofs4 hidp rfcomm l2cap bluetooth lock_dlm gfs2 dlm configfs rpcsec_gss_krb5 auth_rpcgss testmgr_cipher testmgr aead crypto_blkcipher crypto_algapi des sunrpc ip6t_REJECT xt_tcpudp ip6table_filter ip6_tables x_tables ipv6 xfrm_nalgo crypto_api dm_multipath scsi_dh video hwmon backlight sbs i2c_ec button battery asus_acpi acpi_memhotplug ac parport_pc lp parport floppy xen_vbd 8139too i2c_piix4 xen_platform_pci 8139cp i2c_core mii pcspkr serio_raw dm_snapshot dm_zero dm_mirror dm_log dm_mod ata_piix libata sd_mod scsi_mod ext3 jbd uhci_hcd ohci_hcd ehci_hcd
Pid: 2639, comm: umount.gfs2 Not tainted 2.6.18-128.el5debug #1
RIP: 0010:[<ffffffff8012304c>]  [<ffffffff8012304c>] debugfs_remove+0x12/0xc2
RSP: 0018:ffff8100081ede68  EFLAGS: 00010202
RAX: 0000000000000000 RBX: 6b6b6b6b6b6b6b6b RCX: 0000000000000001
RDX: ffff81001e7858c0 RSI: 0000000000000000 RDI: 6b6b6b6b6b6b6b6b
RBP: ffffffff8848c880 R08: ffff81001e7858c0 R09: 0000000000000001
R10: 0000000000000246 R11: 0000000000000000 R12: 0000000000000000
R13: 0000000000000000 R14: 00007fff35b0b250 R15: 0000000000000000
FS:  00002b9274fb0210(0000) GS:ffff81001ffea430(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
CR2: 00002b6672dbb51c CR3: 00000000081d2000 CR4: 00000000000006e0
Process umount.gfs2 (pid: 2639, threadinfo ffff8100081ec000, task ffff810006afa240)
Stack:  00007fff35b0b250 ffff81000e85a000 ffffffff8848c880 ffffffff8845fddd
 ffff8100180ae2c8 ffffffff800ea763 ffff8100180ae2c8 ffff81001c7ef5f0
 ffff8100180ae2c8 ffffffff800f4225 ffff81000f3ac410 ffff81001c7ef5f0
Call Trace:
 [<ffffffff8845fddd>] :gfs2:gfs2_delete_debugfs_file+0x24/0x48
 [<ffffffff800ea763>] deactivate_super+0x6c/0x84
 [<ffffffff800f4225>] sys_umount+0x246/0x28a
 [<ffffffff800be99f>] audit_syscall_entry+0x16e/0x1a1
 [<ffffffff800602a6>] tracesys+0xd5/0xdf

Code: 48 8b 6f 58 48 85 ed 0f 84 9f 00 00 00 48 8b 45 40 48 85 c0
RIP  [<ffffffff8012304c>] debugfs_remove+0x12/0xc2
 RSP <ffff8100081ede68>
 <0>Kernel panic - not syncing: Fatal exception
Created attachment 330912
This should fix the problem.

This is the RHEL5 version of the following upstream fix:
http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commitdiff;h=88a19ad066c1aab2f9713beb670525fcc06e1c09
Posted above patch to rhkernel-list
Do we need to clone this for 5.3.z?
This request was evaluated by Red Hat Product Management for inclusion in a Red Hat Enterprise Linux maintenance release. Product Management has requested further review of this request by Red Hat Engineering, for potential inclusion in a Red Hat Enterprise Linux Update release for currently deployed products. This request is not yet committed for inclusion in an Update release.
Confirmed. Patch seems to fix the panic.
Subhendu, I'd like to have this cloned for z-stream for 5.3. Can I do that, or do I have to ask you to do it, or ...?
What was the procedure to reproduce this panic?
Just umount the fs. Some people always seem to see it, others never see it. It's a simple use-after-free bug.
It's easily reproducible if you're using kernel-debug since the memory poisoning helps trigger the panic.
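To illustrate why the debug kernel makes this so visible, here is a minimal user-space sketch of slab poisoning. It is not the GFS2 code: `struct demo_sbd` and `stale_pointer_after_poisoned_free()` are hypothetical names made up for this example. The only thing taken from the kernel is the poison byte 0x6b (POISON_FREE), which debug kernels write over freed slab objects. A pointer field read back from such an object comes out as 0x6b6b6b6b6b6b6b6b on a 64-bit machine, which is the value in RBX and RDI in the oops above. That is not a canonical x86-64 address, so dereferencing it raises a general protection fault instead of quietly reading stale data.

```c
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Byte that debug kernels write over freed slab objects. */
#define POISON_FREE 0x6b

/* Hypothetical stand-in for a per-mount structure holding a debugfs handle. */
struct demo_sbd {
    void *debugfs_dentry;
};

/*
 * Return the value a stale reader would see in the pointer field after the
 * object was freed on a poisoning kernel. The object is poisoned *before*
 * the real free(), so this sketch never touches freed memory itself.
 */
uintptr_t stale_pointer_after_poisoned_free(void)
{
    struct demo_sbd *sbd = malloc(sizeof(*sbd));
    uintptr_t seen = 0;

    if (!sbd)
        return 0;

    sbd->debugfs_dentry = sbd;                     /* any valid-looking pointer */
    memset(sbd, POISON_FREE, sizeof(*sbd));        /* simulated poisoning at free time */
    memcpy(&seen, &sbd->debugfs_dentry, sizeof(seen)); /* the "stale" read */
    free(sbd);
    return seen;
}
```

On a non-debug kernel the freed memory usually still holds its old contents until something reuses it, which would explain why some people never see the panic.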
Updating PM score.
in kernel-2.6.18-132.el5

You can download this test kernel from http://people.redhat.com/dzickus/el5

Please do NOT transition this bugzilla state to VERIFIED until our QE team has sent specific instructions indicating when to do so. However, feel free to provide a comment indicating that this fix has been verified.
I tried the scenario in comment #11 with the debug version of 2.6.18-154.el5 and was not able to hit the panic.
An advisory has been issued which should help the problem described in this bug report. This report is therefore being closed with a resolution of ERRATA. For more information on the solution and/or where to find the updated files, please follow the link below. You may reopen this bug report if the solution does not work for you.

http://rhn.redhat.com/errata/RHSA-2009-1243.html