Description of problem:

Customer reported this panic after remounting their xfs filesystem after a forced shutdown.

Sep 27 21:08:44 r05b16 kernel: XFS mounting filesystem sdm1
Sep 27 21:08:45 r05b16 kernel: Starting XFS recovery on filesystem: sdm1 (logdev: internal)
Sep 27 21:08:45 r05b16 kernel: Ending XFS recovery on filesystem: sdm1 (logdev: internal)
Sep 27 21:08:46 r05b16 hotswap[5834]: Mount of c2d66959-52fb-4354-90ad-5114897037b6 successful.
Sep 27 21:08:46 r05b16 signal_video-server[5835]: Signalling Storage Server that c2d66959-52fb-4354-90ad-5114897037b6 is added back
Sep 27 21:09:46 r05b16 kernel: ----------- [cut here ] --------- [please bite here ] ---------
Sep 27 21:09:46 r05b16 kernel: Kernel BUG at mm/slab.c:3114
Sep 27 21:09:46 r05b16 kernel: invalid opcode: 0000 [1] SMP
Sep 27 21:09:46 r05b16 kernel: last sysfs file: /block/sdl/stat
Sep 27 21:09:46 r05b16 kernel: CPU 7
Sep 27 21:09:46 r05b16 kernel: Modules linked in: xfs ses(FU) enclosure(FU) ipv6 xfrm_nalgo crypto_api autofs4 lockd sunrpc video backlight sbs power_meter hwmon i2c_ec dell_wmi wmi button battery asus_acpi acpi_memhotplug ac parport_pc lp parport cdc_ether usbnet sd_mod sg shpchp cxgb3 ata_piix i2c_i801 i2c_core uhci_hcd ehci_hcd pcspkr libata mptsas(U) mptscsih(U) mptbase(U) scsi_transport_sas scsi_mod igb 8021q dca
Sep 27 21:09:46 r05b16 kernel: Pid: 5832, comm: xfssyncd Tainted: GF 2.6.18-194.el5 #1
Sep 27 21:09:46 r05b16 kernel: RIP: 0010:[<ffffffff800dc5fd>] [<ffffffff800dc5fd>] __cache_alloc_node+0x61/0xd2
Sep 27 21:09:46 r05b16 kernel: RSP: 0018:ffff8105e7c11d00 EFLAGS: 00010046
Sep 27 21:09:46 r05b16 kernel: RAX: 0000000000000013 RBX: ffff8101747db000 RCX: 0000000000000001
Sep 27 21:09:46 r05b16 kernel: RDX: 0000000000000000 RSI: 0000000000000250 RDI: ffff81032a60f580
Sep 27 21:09:47 r05b16 kernel: RBP: ffff81032a60f540 R08: 0000000000000000 R09: ffff8101749f77a0
Sep 27 21:09:47 r05b16 kernel: R10: 0000000000000000 R11: 000002d000000000 R12: ffff81032a604500
Sep 27 21:09:47 r05b16 kernel: R13: 0000000000000000 R14: 0000000000000250 R15: 0000000000000000
Sep 27 21:09:47 r05b16 kernel: FS: 0000000000000000(0000) GS:ffff81038aaff3c0(0000) knlGS:0000000000000000
Sep 27 21:09:47 r05b16 kernel: CS: 0010 DS: 0018 ES: 0018 CR0: 000000008005003b
Sep 27 21:09:47 r05b16 kernel: CR2: 00002aac8370a000 CR3: 000000037f5e0000 CR4: 00000000000006e0
Sep 27 21:09:47 r05b16 kernel: Process xfssyncd (pid: 5832, threadinfo ffff8105e7c10000, task ffff8101749f77a0)
Sep 27 21:09:47 r05b16 kernel: Stack: 0000000000000246 0000000000000250 ffff81032a604500 ffff81032a604500
Sep 27 21:09:47 r05b16 kernel: 0000000000000069 ffffffff8000abeb 0000000000000009 0000000000000000
Sep 27 21:09:47 r05b16 kernel: 0000000000000250 ffffffff883f711a 00000000000002d0 0000000000000000
Sep 27 21:09:47 r05b16 kernel: Call Trace:
Sep 27 21:09:47 r05b16 kernel: [<ffffffff8000abeb>] kmem_cache_alloc+0x34/0x76
Sep 27 21:09:47 r05b16 kernel: [<ffffffff883f711a>] :xfs:kmem_zone_alloc+0x56/0xa3
Sep 27 21:09:47 r05b16 kernel: [<ffffffff883f7175>] :xfs:kmem_zone_zalloc+0xe/0x2f
Sep 27 21:09:48 r05b16 kernel: [<ffffffff883e6e8f>] :xfs:xlog_ticket_get+0x30/0xe6
Sep 27 21:09:48 r05b16 kernel: [<ffffffff883e6fcd>] :xfs:xfs_log_reserve+0x88/0xc9
Sep 27 21:09:48 r05b16 kernel: [<ffffffff883ef2e1>] :xfs:xfs_trans_reserve+0xe4/0x1c5
Sep 27 21:09:48 r05b16 kernel: [<ffffffff883f241f>] :xfs:xfs_syncsub+0x167/0x226
Sep 27 21:09:48 r05b16 kernel: [<ffffffff800a198c>] keventd_create_kthread+0x0/0xc4
Sep 27 21:09:48 r05b16 kernel: [<ffffffff883ffc8c>] :xfs:xfs_sync_worker+0x17/0x36
Sep 27 21:09:48 r05b16 kernel: [<ffffffff88400ba2>] :xfs:xfssyncd+0xfe/0x138
Sep 27 21:09:48 r05b16 kernel: [<ffffffff88400aa4>] :xfs:xfssyncd+0x0/0x138
Sep 27 21:09:48 r05b16 kernel: [<ffffffff80032bdc>] kthread+0xfe/0x132
Sep 27 21:09:48 r05b16 kernel: [<ffffffff8005efb1>] child_rip+0xa/0x11
Sep 27 21:09:48 r05b16 kernel: [<ffffffff800a198c>] keventd_create_kthread+0x0/0xc4
Sep 27 21:09:48 r05b16 kernel: [<ffffffff80032ade>] kthread+0x0/0x132
Sep 27 21:09:49 r05b16 kernel: [<ffffffff8005efa7>] child_rip+0x0/0x11
Sep 27 21:09:49 r05b16 kernel:
Sep 27 21:09:49 r05b16 kernel:
Sep 27 21:09:49 r05b16 kernel: Code: 0f 0b 68 84 2b 2b 80 c2 2a 0c 48 89 de 4c 89 e7 44 89 ea e8
Sep 27 21:09:49 r05b16 kernel: RIP [<ffffffff800dc5fd>] __cache_alloc_node+0x61/0xd2
Sep 27 21:09:49 r05b16 kernel: RSP <ffff8105e7c11d00>
Sep 27 21:09:49 r05b16 kernel: <0>Kernel panic - not syncing: Fatal exception

Version-Release number of selected component (if applicable):

kernel-2.6.18-194.el5

How reproducible:

Every time they run their test.

Steps to Reproduce:

From the customer:

"The setup we are talking about is a 12 disk machine that has a disk with HW problems. We are using fio tool to fill up the bad disk, we get scsi errors during the test that leads the controller to remove the device and then attach it back after few seconds. What we are doing in this case is unmount and mount to the new device. The kernel panic will occurred after a 1 minute. The scenario above occurred few times but now using the same scenario we get different behavior, we do not get kernel panic but we do see that the cpu get stuck for more than 10 sec."
This request was evaluated by Red Hat Product Management for inclusion in a Red Hat Enterprise Linux maintenance release. Product Management has requested further review of this request by Red Hat Engineering, for potential inclusion in a Red Hat Enterprise Linux Update release for currently deployed products. This request is not yet committed for inclusion in an Update release.
The fix is in kernel-2.6.18-245.el5.

You can download this test kernel (or newer) from http://people.redhat.com/jwilson/el5

Detailed testing feedback is always welcome.
Confirmed that the patch is in the kernel git tree.
An advisory has been issued which should help the problem described in this bug report. This report is therefore being closed with a resolution of ERRATA. For more information on the solution and/or where to find the updated files, please follow the link below. You may reopen this bug report if the solution does not work for you.

http://rhn.redhat.com/errata/RHSA-2011-1065.html