Bug 614957
Summary: | ext4: mount error path corrupts slab memory | ||
---|---|---|---|
Product: | Red Hat Enterprise Linux 5 | Reporter: | Karsten Weiss <knweiss> |
Component: | kernel | Assignee: | Eric Sandeen <esandeen> |
Status: | CLOSED ERRATA | QA Contact: | Boris Ranto <branto> |
Severity: | medium | Docs Contact: | |
Priority: | low | ||
Version: | 5.5 | CC: | bj+bugzilla, branto, esandeen, frans, green, ihok, rwheeler |
Target Milestone: | rc | ||
Target Release: | --- | ||
Hardware: | All | ||
OS: | Linux | ||
Whiteboard: | |||
Fixed In Version: | Doc Type: | Bug Fix | |
Doc Text: | Story Points: | --- | |
Clone Of: | Environment: | ||
Last Closed: | 2011-01-13 21:43:30 UTC | Type: | --- |
Regression: | --- | Mount Type: | --- |
Documentation: | --- | CRM: | |
Verified Versions: | Category: | --- | |
oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
Cloudforms Team: | --- | Target Upstream Version: | |
Embargoed: |
Description
Karsten Weiss
2010-07-15 16:08:33 UTC
Thanks, I can reproduce this on rhel5 but not rhel6 or upstream, I'll take a look at it. -Eric

Ok this is probably due to a stray kfree(&sbi->s_blockgroup_lock) in the error path of mount; upstream that matches an allocation, but in rhel5.5 it was a mistaken backport. Thanks to Johann @ lustre for pointing that out to me ....

I fixed this by backporting

commit 705895b61133ef43d106fe6a6bbdb2eec923867e
Author: Pekka Enberg <penberg.fi>
Date: Sun Feb 15 18:07:52 2009 -0500

    ext4: allocate ->s_blockgroup_lock separately

rather than by removing the extraneous kfree. -Eric

This request was evaluated by Red Hat Product Management for inclusion in a Red Hat Enterprise Linux maintenance release. Product Management has requested further review of this request by Red Hat Engineering, for potential inclusion in a Red Hat Enterprise Linux Update release for currently deployed products. This request is not yet committed for inclusion in an Update release.

*** Bug 594446 has been marked as a duplicate of this bug. ***

in kernel-2.6.18-219.el5

You can download this test kernel from http://people.redhat.com/jwilson/el5

Detailed testing feedback is always welcomed.

Verified in kernel-2.6.18-219.el5. The mount failed all the times as expected.
With kernel-2.6.18-219.el5, the bug was hit with second attempt to mount:

NMI Watchdog detected LOCKUP on CPU 1
Call Trace:
[<ffffffff8005c6b4>] cache_alloc_refill+0xf1/0x186
[<ffffffff800dc9e3>] kmem_cache_zalloc+0x6f/0x94
[<ffffffff8851d56f>] :ext4:ext4_fill_super+0xd5/0x20a5
[<ffffffff8851d49a>] :ext4:ext4_fill_super+0x0/0x20a5
[<ffffffff80153cb1>] snprintf+0x44/0x4c
[<ffffffff800655ab>] __down_write_nested+0x12/0x92
[<ffffffff8012cb3a>] selinux_sb_alloc_security+0x3e/0x82
[<ffffffff800ed9be>] get_filesystem+0x12/0x3b
[<ffffffff800e4490>] test_bdev_super+0x0/0xd
[<ffffffff8851d49a>] :ext4:ext4_fill_super+0x0/0x20a5
[<ffffffff800e544f>] get_sb_bdev+0x10a/0x16c
[<ffffffff8012d73d>] selinux_sb_copy_data+0x1a1/0x1c5
[<ffffffff800e4dec>] vfs_kern_mount+0x93/0x11a
[<ffffffff800e4eb5>] do_kern_mount+0x36/0x4d
[<ffffffff800ef2ed>] do_mount+0x6a9/0x719
[<ffffffff80009101>] __handle_mm_fault+0x96f/0xfaa
[<ffffffff8002cd2c>] mntput_no_expire+0x19/0x89
[<ffffffff8000a759>] __link_path_walk+0xf1e/0xf42
[<ffffffff80022127>] __up_read+0x19/0x7f
[<ffffffff80067b88>] do_page_fault+0x4fe/0x874
[<ffffffff8002cd2c>] mntput_no_expire+0x19/0x89
[<ffffffff8000ea75>] link_path_walk+0xa6/0xb2
[<ffffffff800cd378>] zone_statistics+0x3e/0x6d
[<ffffffff8000f2ff>] __alloc_pages+0x78/0x308
[<ffffffff8004c9fd>] sys_mount+0x8a/0xcd
[<ffffffff8005e28d>] tracesys+0xd5/0xe0

Hi, still unfixed in 2.6.18-194.26.1.el5?
I have a similar crash here: EXT4-fs (dm-1): Unrecognized mount option "uid=100" or missing value NMI Watchdog detected LOCKUP on CPU 1 CPU 1 Modules linked in: mptctl mptbase ipmi_watchdog ipmi_devintf ipmi_si ipmi_msghandler ipv6 xfrm_nalgo crypto_api ext4 jbd2 crc16 dm_mirror dm_round_robin dm_multipath scsi_dh video backlight sbs power_meter hwmon i2c_ec i2c_core dell_wmi wmi button battery asus_acpi acpi_memhotplug ac parport_pc lp parport sr_mod cdrom hpilo serio_raw pcspkr sg bnx2 dm_raid45 dm_message dm_region_hash dm_log dm_mod dm_mem_cache qla2xxx scsi_transport_fc ata_piix libata shpchp cciss sd_mod scsi_mod ext3 jbd uhci_hcd ohci_hcd ehci_hcd Pid: 3501, comm: cmaidad Not tainted 2.6.18-194.26.1.el5 #1 RIP: 0010:[<ffffffff801543fb>] [<ffffffff801543fb>] list_del+0xb/0x71 RSP: 0018:ffff810191ebfc38 EFLAGS: 00003082 RAX: ffff810105b4b9c0 RBX: ffff81019d04e4c0 RCX: 0000000000000000 RDX: ffff81019d04e4c0 RSI: ffff810105b4b9c0 RDI: ffff81019d04e4c0 RBP: ffff81019d04e4c0 R08: ffff81019ff11cc0 R09: ffff81019fffd460 R10: 0000000000000000 R11: 0000000000000000 R12: ffff81019ff11cc0 R13: ffff810105b4b9c0 R14: 0000000000000004 R15: ffff81019ffe1540 FS: 0000000000000000(0000) GS:ffff81019ff11840(0063) knlGS:00000000f7decac0 CS: 0010 DS: 002b ES: 002b CR0: 000000008005003b CR2: 00002b78084260a0 CR3: 000000019183f000 CR4: 00000000000006e0 Process cmaidad (pid: 3501, threadinfo ffff810191ebe000, task ffff810191f91100) Stack: ffff81019ffe1540 ffffffff8005c130 000000d000003286 ffff81019ffe1540 0000000000003246 00000000000000d0 ffff81019e550000 00000000ff9b57f0 00000000ff9b589c ffffffff800dbbc3 00000000ff9b57f0 0000000000000000 Call Trace: [<ffffffff8005c130>] cache_alloc_refill+0xf1/0x186 [<ffffffff800dbbc3>] __kmalloc+0x95/0x9f [<ffffffff880bad04>] :cciss:cciss_ioctl+0x50a/0xc58 [<ffffffff8000cf57>] do_lookup+0x65/0x1e6 [<ffffffff8000d47a>] dput+0x2c/0x114 [<ffffffff80057e46>] kobject_get+0x12/0x17 [<ffffffff8005ab33>] exact_lock+0xc/0x14 [<ffffffff880bb47c>] 
:cciss:do_ioctl+0x2a/0x39 [<ffffffff880bb69f>] :cciss:cciss_compat_ioctl+0x214/0x249 [<ffffffff800e5952>] blkdev_open+0x0/0x4f [<ffffffff800e5975>] blkdev_open+0x23/0x4f [<ffffffff80146b1c>] compat_blkdev_ioctl+0x4c/0x5f [<ffffffff800fb0b8>] compat_sys_ioctl+0xc5/0x2b2 [<ffffffff8006149d>] sysenter_do_call+0x1e/0x76 Code: 48 39 fa 74 1b 48 89 fe 31 c0 48 c7 c7 56 bb 2b 80 e8 0c e1 Kernel panic - not syncing: nmi watchdog <0>Rebooting in 60 seconds..BUG: warning at kernel/panic.c:113/panic() (Not tainted) Call Trace: <NMI> [<ffffffff80091c1f>] panic+0x146/0x1eb [<ffffffff8006bad1>] _show_stack+0xdb/0xea [<ffffffff8006bbc4>] show_registers+0xe4/0x100 [<ffffffff800652c5>] die_nmi+0x66/0xa3 [<ffffffff80065a0b>] nmi_watchdog_tick+0x157/0x1d3 [<ffffffff80065629>] default_do_nmi+0x81/0x225 [<ffffffff80065896>] do_nmi+0x43/0x61 [<ffffffff80064eef>] nmi+0x7f/0x88 [<ffffffff801543fb>] list_del+0xb/0x71 <<EOE>> [<ffffffff8005c130>] cache_alloc_refill+0xf1/0x186 [<ffffffff800dbbc3>] __kmalloc+0x95/0x9f [<ffffffff880bad04>] :cciss:cciss_ioctl+0x50a/0xc58 [<ffffffff8000cf57>] do_lookup+0x65/0x1e6 [<ffffffff8000d47a>] dput+0x2c/0x114 [<ffffffff80057e46>] kobject_get+0x12/0x17 [<ffffffff8005ab33>] exact_lock+0xc/0x14 [<ffffffff880bb47c>] :cciss:do_ioctl+0x2a/0x39 [<ffffffff880bb69f>] :cciss:cciss_compat_ioctl+0x214/0x249 [<ffffffff800e5952>] blkdev_open+0x0/0x4f [<ffffffff800e5975>] blkdev_open+0x23/0x4f [<ffffffff80146b1c>] compat_blkdev_ioctl+0x4c/0x5f [<ffffffff800fb0b8>] compat_sys_ioctl+0xc5/0x2b2 [<ffffffff8006149d>] sysenter_do_call+0x1e/0x76 BUG: warning at drivers/input/serio/i8042.c:846/i8042_panic_blink() (Not tainted) Call Trace: <NMI> [<ffffffff8020b0df>] i8042_panic_blink+0x112/0x2a5 [<ffffffff80091bc5>] panic+0xec/0x1eb [<ffffffff8006bad1>] _show_stack+0xdb/0xea [<ffffffff8006bbc4>] show_registers+0xe4/0x100 [<ffffffff800652c5>] die_nmi+0x66/0xa3 [<ffffffff80065a0b>] nmi_watchdog_tick+0x157/0x1d3 [<ffffffff80065629>] default_do_nmi+0x81/0x225 [<ffffffff80065896>] 
do_nmi+0x43/0x61 [<ffffffff80064eef>] nmi+0x7f/0x88 [<ffffffff801543fb>] list_del+0xb/0x71 <<EOE>> [<ffffffff8005c130>] cache_alloc_refill+0xf1/0x186 [<ffffffff800dbbc3>] __kmalloc+0x95/0x9f [<ffffffff880bad04>] :cciss:cciss_ioctl+0x50a/0xc58 [<ffffffff8000cf57>] do_lookup+0x65/0x1e6 [<ffffffff8000d47a>] dput+0x2c/0x114 [<ffffffff80057e46>] kobject_get+0x12/0x17 [<ffffffff8005ab33>] exact_lock+0xc/0x14 [<ffffffff880bb47c>] :cciss:do_ioctl+0x2a/0x39 [<ffffffff880bb69f>] :cciss:cciss_compat_ioctl+0x214/0x249 [<ffffffff800e5952>] blkdev_open+0x0/0x4f [<ffffffff800e5975>] blkdev_open+0x23/0x4f [<ffffffff80146b1c>] compat_blkdev_ioctl+0x4c/0x5f [<ffffffff800fb0b8>] compat_sys_ioctl+0xc5/0x2b2 [<ffffffff8006149d>] sysenter_do_call+0x1e/0x76 BUG: warning at drivers/input/serio/i8042.c:849/i8042_panic_blink() (Not tainted) Call Trace: <NMI> [<ffffffff8020b1c8>] i8042_panic_blink+0x1fb/0x2a5 [<ffffffff80091bc5>] panic+0xec/0x1eb [<ffffffff8006bad1>] _show_stack+0xdb/0xea [<ffffffff8006bbc4>] show_registers+0xe4/0x100 [<ffffffff800652c5>] die_nmi+0x66/0xa3 [<ffffffff80065a0b>] nmi_watchdog_tick+0x157/0x1d3 [<ffffffff80065629>] default_do_nmi+0x81/0x225 [<ffffffff80065896>] do_nmi+0x43/0x61 [<ffffffff80064eef>] nmi+0x7f/0x88 [<ffffffff801543fb>] list_del+0xb/0x71 <<EOE>> [<ffffffff8005c130>] cache_alloc_refill+0xf1/0x186 [<ffffffff800dbbc3>] __kmalloc+0x95/0x9f [<ffffffff880bad04>] :cciss:cciss_ioctl+0x50a/0xc58 [<ffffffff8000cf57>] do_lookup+0x65/0x1e6 [<ffffffff8000d47a>] dput+0x2c/0x114 [<ffffffff80057e46>] kobject_get+0x12/0x17 [<ffffffff8005ab33>] exact_lock+0xc/0x14 [<ffffffff880bb47c>] :cciss:do_ioctl+0x2a/0x39 [<ffffffff880bb69f>] :cciss:cciss_compat_ioctl+0x214/0x249 [<ffffffff800e5952>] blkdev_open+0x0/0x4f [<ffffffff800e5975>] blkdev_open+0x23/0x4f [<ffffffff80146b1c>] compat_blkdev_ioctl+0x4c/0x5f [<ffffffff800fb0b8>] compat_sys_ioctl+0xc5/0x2b2 [<ffffffff8006149d>] sysenter_do_call+0x1e/0x76 BUG: warning at drivers/input/serio/i8042.c:851/i8042_panic_blink() (Not 
tainted) Call Trace: <NMI> [<ffffffff8020b245>] i8042_panic_blink+0x278/0x2a5 [<ffffffff80091bc5>] panic+0xec/0x1eb [<ffffffff8006bad1>] _show_stack+0xdb/0xea [<ffffffff8006bbc4>] show_registers+0xe4/0x100 [<ffffffff800652c5>] die_nmi+0x66/0xa3 [<ffffffff80065a0b>] nmi_watchdog_tick+0x157/0x1d3 Call Trace: <NMI> [<ffffffff8020b1c8>] i8042_panic_blink+0x1fb/0x2a5 [<ffffffff80091bc5>] panic+0xec/0x1eb [<ffffffff8006bad1>] _show_stack+0xdb/0xea [<ffffffff8006bbc4>] show_registers+0xe4/0x100 [<ffffffff800652c5>] die_nmi+0x66/0xa3 [<ffffffff80065a0b>] nmi_watchdog_tick+0x157/0x1d3 [<ffffffff80065629>] default_do_nmi+0x81/0x225 [<ffffffff80065896>] do_nmi+0x43/0x61 [<ffffffff80064eef>] nmi+0x7f/0x88 [<ffffffff801543fb>] list_del+0xb/0x71 <<EOE>> [<ffffffff8005c130>] cache_alloc_refill+0xf1/0x186 [<ffffffff800dbbc3>] __kmalloc+0x95/0x9f [<ffffffff880bad04>] :cciss:cciss_ioctl+0x50a/0xc58 [<ffffffff8000cf57>] do_lookup+0x65/0x1e6 [<ffffffff8000d47a>] dput+0x2c/0x114 [<ffffffff80057e46>] kobject_get+0x12/0x17 [<ffffffff8005ab33>] exact_lock+0xc/0x14 [<ffffffff880bb47c>] :cciss:do_ioctl+0x2a/0x39 [<ffffffff880bb69f>] :cciss:cciss_compat_ioctl+0x214/0x249 [<ffffffff800e5952>] blkdev_open+0x0/0x4f [<ffffffff800e5975>] blkdev_open+0x23/0x4f [<ffffffff80146b1c>] compat_blkdev_ioctl+0x4c/0x5f [<ffffffff800fb0b8>] compat_sys_ioctl+0xc5/0x2b2 [<ffffffff8006149d>] sysenter_do_call+0x1e/0x76 BUG: warning at drivers/input/serio/i8042.c:851/i8042_panic_blink() (Not tainted) Call Trace: <NMI> [<ffffffff8020b245>] i8042_panic_blink+0x278/0x2a5 [<ffffffff80091bc5>] panic+0xec/0x1eb [<ffffffff8006bad1>] _show_stack+0xdb/0xea [<ffffffff8006bbc4>] show_registers+0xe4/0x100 [<ffffffff800652c5>] die_nmi+0x66/0xa3 [<ffffffff80065a0b>] nmi_watchdog_tick+0x157/0x1d3 [<ffffffff80065629>] default_do_nmi+0x81/0x225 [<ffffffff80065896>] do_nmi+0x43/0x61 [<ffffffff80064eef>] nmi+0x7f/0x88 [<ffffffff801543fb>] list_del+0xb/0x71 <<EOE>> [<ffffffff8005c130>] cache_alloc_refill+0xf1/0x186 
[<ffffffff800dbbc3>] __kmalloc+0x95/0x9f [<ffffffff880bad04>] :cciss:cciss_ioctl+0x50a/0xc58 [<ffffffff8000cf57>] do_lookup+0x65/0x1e6 [<ffffffff8000d47a>] dput+0x2c/0x114 [<ffffffff80057e46>] kobject_get+0x12/0x17 [<ffffffff8005ab33>] exact_lock+0xc/0x14 [<ffffffff880bb47c>] :cciss:do_ioctl+0x2a/0x39 [<ffffffff880bb69f>] :cciss:cciss_compat_ioctl+0x214/0x249 [<ffffffff800e5952>] blkdev_open+0x0/0x4f [<ffffffff800e5975>] blkdev_open+0x23/0x4f [<ffffffff80146b1c>] compat_blkdev_ioctl+0x4c/0x5f [<ffffffff800fb0b8>] compat_sys_ioctl+0xc5/0x2b2 [<ffffffff8006149d>] sysenter_do_call+0x1e/0x76

Bjorn, I do not see ext4 calls in your trace - why do you think this is the same issue?

If not, please open an issue via your Red Hat support channel so we can get our field people to help gather information. Thanks!

At any rate, the bug is not fixed in the kernel you are testing. See comment #8.

Ric,

(In reply to comment #12)
> Bjorn, I do not see ext4 calls in your trace - why do you think this is the
> same issue?
>
> If not, please open an issue via your Red Hat support channel so we can get
> our field people to help gather information. Thanks!

I searched Bugzilla and found Bug #594446, in which the OP used almost exactly the same command to crash the kernel. (I used "mount -o uid=foo,gid=foo -t ext4 /dev/mapper/foop1 /home/foo".) Eric replied to #594446 and marked it as a duplicate of this bug, #614957. So if this looks like a new bug I'll open an issue via RH support. Thanks!

An advisory has been issued which should help the problem described in this bug report. This report is therefore being closed with a resolution of ERRATA. For more information on the solution and/or where to find the updated files, please follow the link below. You may reopen this bug report if the solution does not work for you.

http://rhn.redhat.com/errata/RHSA-2011-0017.html

*** Bug 684048 has been marked as a duplicate of this bug. ***