Description of problem:
After installing Red Hat Enterprise Linux 6.1 on an x86_64 machine, the system panics on 4-5 out of every 10 boots, with the following call trace printed:

BUG: unable to handle kernel NULL pointer dereference at 0000000000000018
IP: [<ffffffff8110fb2c>] mempool_alloc+0x5c/0x140
PGD 0
Oops: 0000 [#1] SMP
last sysfs file: /sys/module/lpfc/initstate
CPU 0
Modules linked in: sd_mod crc_t10dif usb_storage lpfc ahci qla2xxx scsi_transport_fc scsi_tgt mpt2sas scsi_transport_sas raid_class dm_mod
Modules linked in: sd_mod crc_t10dif usb_storage lpfc ahci qla2xxx scsi_transport_fc scsi_tgt mpt2sas scsi_transport_sas raid_class dm_mod
Pid: 1549, comm: async/2 Not tainted 2.6.32-131.0.15.el6.x86_64 #1 SUN FIRE X4470 M2 SERVER
RIP: 0010:[<ffffffff8110fb2c>] [<ffffffff8110fb2c>] mempool_alloc+0x5c/0x140
RSP: 0018:ffff883f641eb770 EFLAGS: 00010002
RAX: ffff883f64ffb540 RBX: 0000000000000000 RCX: 0000000000000002
RDX: 0000000000000002 RSI: 0000000000000020 RDI: 0000000000011220
RBP: ffff883f641eb7f0 R08: 0000000000000000 R09: ffff883f664d1180
R10: 0000000000000000 R11: ffff883f6441dbc0 R12: 0000000000011220
R13: ffff883f641eb790 R14: ffff883f641eb7a8 R15: 0000000000000030
FS: 0000000000000000(0000) GS:ffff88018a600000(0000) knlGS:0000000000000000
CS: 0010 DS: 0018 ES: 0018 CR0: 000000008005003b
CR2: 0000000000000018 CR3: 0000003f642e5000 CR4: 00000000000006f0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Process async/2 (pid: 1549, threadinfo ffff883f641ea000, task ffff883f64ffb540)
Stack:
ffff883f641eb790 ffffffff81356790 ffff883f64ffb540 0000000062951800
<0> ffff883f641eb7d0 ffffffff81356a8a ffff883f641eb7e0 ffff883f6447e5e0
<0> ffff883f62951800 ffff883f639a2aa0 ffff883f62951800 ffff883f6447e5e0
Call Trace:
[<ffffffff81356790>] ? scsi_init_sgtable+0x40/0x70
[<ffffffff81356a8a>] ? scsi_init_io+0x3a/0x170
[<ffffffffa018c371>] sd_prep_fn+0x761/0xea0 [sd_mod]
[<ffffffff8110d3d0>] ? sync_page+0x0/0x50
[<ffffffff81249dd3>] blk_peek_request+0xd3/0x210
[<ffffffff81355f93>] scsi_request_fn+0x63/0x590
[<ffffffff8110d3d0>] ? sync_page+0x0/0x50
[<ffffffff81247b52>] __generic_unplug_device+0x32/0x40
[<ffffffff81247b8e>] generic_unplug_device+0x2e/0x50
[<ffffffff81242914>] blk_unplug+0x34/0x70
[<ffffffff81242962>] blk_backing_dev_unplug+0x12/0x20
[<ffffffff811a21be>] block_sync_page+0x3e/0x50
[<ffffffff8110d408>] sync_page+0x38/0x50
[<ffffffff814dc1ea>] __wait_on_bit_lock+0x5a/0xc0
[<ffffffff811a8d70>] ? blkdev_get_block+0x0/0x70
[<ffffffff8110d3a7>] __lock_page+0x67/0x70
[<ffffffff8108e1a0>] ? wake_bit_function+0x0/0x50
[<ffffffff8110f2aa>] do_read_cache_page+0xca/0x180
[<ffffffff811a9cf0>] ? blkdev_readpage+0x0/0x20
[<ffffffff8110f3a9>] read_cache_page_async+0x19/0x20
[<ffffffff8110f3be>] read_cache_page+0xe/0x20
[<ffffffff811e0910>] read_dev_sector+0x30/0xa0
[<ffffffff811e3651>] read_lba+0x101/0x110
[<ffffffff811e3a85>] find_valid_gpt+0xd5/0x6b0
[<ffffffff811e40df>] efi_partition+0x7f/0x360
[<ffffffff814dad34>] ? printk+0x41/0x45
[<ffffffff811e1635>] rescan_partitions+0x1a5/0x470
[<ffffffffa018b281>] ? sd_open+0x81/0x1f0 [sd_mod]
[<ffffffff811aa456>] __blkdev_get+0x1b6/0x3c0
[<ffffffff811aa670>] blkdev_get+0x10/0x20
[<ffffffff811e0ad5>] register_disk+0x155/0x170
[<ffffffff81251526>] add_disk+0xa6/0x160
[<ffffffffa018e80b>] sd_probe_async+0x13b/0x210 [sd_mod]
[<ffffffff8108e4c6>] ? add_wait_queue+0x46/0x60
[<ffffffff81096302>] async_thread+0x102/0x250
[<ffffffff8105dc60>] ? default_wake_function+0x0/0x20
[<ffffffff81096200>] ? async_thread+0x0/0x250
[<ffffffff8108ddf6>] kthread+0x96/0xa0
[<ffffffff8100c1ca>] child_rip+0xa/0x20
[<ffffffff8108dd60>] ? kthread+0x0/0xa0
[<ffffffff8100c1c0>] ? child_rip+0x0/0x20
Code: 12 01 00 4c 8d 6d a0 4c 8d 7b 30 44 89 e0 44 89 e7 83 e0 10 4d 8d 75 18 83 e7 af 89 45 9c 65 48 8b 04 25 00 cc 00 00 48 89 45 90 <48> 8b 73 18 ff 53 20 48 85 c0 48 89 c2 74 24 48 89 d0 48 8b 5d
RIP [<ffffffff8110fb2c>] mempool_alloc+0x5c/0x140
RSP <ffff883f641eb770>
CR2: 0000000000000018
---[ end trace 81629a40a9f90dca ]---
Kernel panic - not syncing: Fatal exception
Pid: 1549, comm: async/2 Tainted: G D ---------------- 2.6.32-131.0.15.el6.x86_64 #1
Call Trace:
[<ffffffff814dac28>] ? panic+0x78/0x143
[<ffffffff814dec74>] ? oops_end+0xe4/0x100
[<ffffffff81040cdb>] ? no_context+0xfb/0x260
[<ffffffff81040f65>] ? __bad_area_nosemaphore+0x125/0x1e0
[<ffffffff8112f009>] ? zone_statistics+0x99/0xc0
[<ffffffff81041033>] ? bad_area_nosemaphore+0x13/0x20
[<ffffffff8104170d>] ? __do_page_fault+0x31d/0x480
[<ffffffff8110fa25>] ? mempool_alloc_slab+0x15/0x20
[<ffffffff8110fb33>] ? mempool_alloc+0x63/0x140
[<ffffffff8125d53d>] ? cfq_service_tree_add+0x43d/0x530
[<ffffffff814e0c3e>] ? do_page_fault+0x3e/0xa0
[<ffffffff814ddfe5>] ? page_fault+0x25/0x30
[<ffffffff8110fb2c>] ? mempool_alloc+0x5c/0x140
[<ffffffff81356790>] ? scsi_init_sgtable+0x40/0x70
[<ffffffff81356a8a>] ? scsi_init_io+0x3a/0x170
[<ffffffffa018c371>] ? sd_prep_fn+0x761/0xea0 [sd_mod]
[<ffffffff8110d3d0>] ? sync_page+0x0/0x50
[<ffffffff81249dd3>] ? blk_peek_request+0xd3/0x210
[<ffffffff81355f93>] ? scsi_request_fn+0x63/0x590
[<ffffffff8110d3d0>] ? sync_page+0x0/0x50
[<ffffffff81247b52>] ? __generic_unplug_device+0x32/0x40
[<ffffffff81247b8e>] ? generic_unplug_device+0x2e/0x50
[<ffffffff81242914>] ? blk_unplug+0x34/0x70
[<ffffffff81242962>] ? blk_backing_dev_unplug+0x12/0x20
[<ffffffff811a21be>] ? block_sync_page+0x3e/0x50
[<ffffffff8110d408>] ? sync_page+0x38/0x50
[<ffffffff814dc1ea>] ? __wait_on_bit_lock+0x5a/0xc0
[<ffffffff811a8d70>] ? blkdev_get_block+0x0/0x70
[<ffffffff8110d3a7>] ? __lock_page+0x67/0x70
[<ffffffff8108e1a0>] ? wake_bit_function+0x0/0x50
[<ffffffff8110f2aa>] ? do_read_cache_page+0xca/0x180
[<ffffffff811a9cf0>] ? blkdev_readpage+0x0/0x20
[<ffffffff8110f3a9>] ? read_cache_page_async+0x19/0x20
[<ffffffff8110f3be>] ? read_cache_page+0xe/0x20
[<ffffffff811e0910>] ? read_dev_sector+0x30/0xa0
[<ffffffff811e3651>] ? read_lba+0x101/0x110
[<ffffffff811e3a85>] ? find_valid_gpt+0xd5/0x6b0
[<ffffffff811e40df>] ? efi_partition+0x7f/0x360
[<ffffffff814dad34>] ? printk+0x41/0x45
[<ffffffff811e1635>] ? rescan_partitions+0x1a5/0x470
[<ffffffffa018b281>] ? sd_open+0x81/0x1f0 [sd_mod]
[<ffffffff811aa456>] ? __blkdev_get+0x1b6/0x3c0
[<ffffffff811aa670>] ? blkdev_get+0x10/0x20
[<ffffffff811e0ad5>] ? register_disk+0x155/0x170
[<ffffffff81251526>] ? add_disk+0xa6/0x160
[<ffffffffa018e80b>] ? sd_probe_async+0x13b/0x210 [sd_mod]
[<ffffffff8108e4c6>] ? add_wait_queue+0x46/0x60
[<ffffffff81096302>] ? async_thread+0x102/0x250
[<ffffffff8105dc60>] ? default_wake_function+0x0/0x20
[<ffffffff81096200>] ? async_thread+0x0/0x250
[<ffffffff8108ddf6>] ? kthread+0x96/0xa0
[<ffffffff8100c1ca>] ? child_rip+0xa/0x20
[<ffffffff8108dd60>] ? kthread+0x0/0xa0
[<ffffffff8100c1c0>] ? child_rip+0x0/0x20

Version-Release number of selected component (if applicable):
Only Red Hat Enterprise Linux 6.1 panics; 6.0 does not. 6.0 with the 6.1 kernel also panics.

How reproducible:
So far it can be reproduced on only one machine.

Steps to Reproduce:
1. Install the option cards on the SUT.
2. Install RHEL 6.1 on the SUT.
3. Boot the OS.

Configuration:
Platform: x86_64
CPU: 4 x E7-4860
DIMM: 304 GB
OS: RHEL 6.1
Option cards:
slot0: x7281
slot1: Erie-Ext
slot2: Erie-Int
slot3: x4446
slot4: Pallene-Q
slot6: CX2
slot8: Niantic
slot9: Pallene-E

Actual results:
Panic and hang.

Expected results:
No panic.

Additional info:
Since RHEL 6.2 External Beta has begun, and this bug remains unresolved, it has been rejected as it is not proposed as exception or blocker. Red Hat invites you to ask your support representative to propose this request, if appropriate and relevant, in the next release of Red Hat Enterprise Linux.
Hi all, I googled and found this report. Unfortunately, we have hit a similar bug, with a similar call trace at panic. Because there is no module dependency between mpt2sas.ko and sd_mod.ko, the kernel loads the two modules concurrently. If we manually declare the dependency as mpt2sas.ko : sd_mod.ko, the bug disappears. The root cause is still under investigation.
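For anyone wanting to check whether such a dependency is declared on their own system, here is a minimal Python sketch. It assumes the "mpt2sas.ko : sd_mod.ko" notation above refers to a modules.dep-style line ("module: dependencies"); that reading is an assumption, not something confirmed in this report. The sample paths are illustrative only, and the helper names parse_modules_dep and depends_on are hypothetical.

```python
import os


def _name(path):
    """Reduce a module path to a bare module name (hypothetical helper)."""
    base = os.path.basename(path.strip())
    if base.endswith(".ko"):
        base = base[:-3]
    # File names may use '-', while loaded module names use '_'.
    return base.replace("-", "_")


def parse_modules_dep(text):
    """Map each module name to the set of module names listed after its ':'."""
    deps = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        target, _, rest = line.partition(":")
        deps[_name(target)] = {_name(p) for p in rest.split()}
    return deps


def depends_on(deps, module, other):
    """True if `module` directly lists `other` as a dependency."""
    return other in deps.get(module, set())


# Illustrative content only; real paths on a given system may differ.
SAMPLE = (
    "kernel/drivers/scsi/sd_mod.ko: kernel/lib/crc-t10dif.ko\n"
    "kernel/drivers/scsi/mpt2sas/mpt2sas.ko: "
    "kernel/drivers/scsi/scsi_transport_sas.ko kernel/drivers/scsi/raid_class.ko\n"
)

# The same sample with the manual dependency from the comment above appended.
WORKAROUND = SAMPLE.rstrip("\n") + " kernel/drivers/scsi/sd_mod.ko\n"

before = depends_on(parse_modules_dep(SAMPLE), "mpt2sas", "sd_mod")
after = depends_on(parse_modules_dep(WORKAROUND), "mpt2sas", "sd_mod")
print("dependency declared before:", before)
print("dependency declared after:", after)
```

To run it against a real system, read the modules.dep file for the running kernel and pass its contents to parse_modules_dep instead of SAMPLE. The check only looks at direct dependencies, which is enough to confirm whether the manual entry is present.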
I see exactly the same problem after upgrading my systems to 2.6.32-279.5.2.el6.x86_64. With the stock 6.3 kernel, 2.6.32-279.el6.x86_64, the problem doesn't exist.
I think this is a duplicate of 888417. Would the reporter of this problem please try the test kernel with a fix available at: http://people.redhat.com/emilne/RPMS/.bz888417/ and update this bug with whether the test kernel fixes the problem.
Sorry, I have moved to another BU, so I couldn't test the fix.
Closing per comment 11.