Bug 710382

Summary: Multiple "kernel BUG"s which seem to be related
Product: Red Hat Enterprise Linux 6
Reporter: Jiri Pirko <jpirko>
Component: kernel
Assignee: Alex Williamson <alex.williamson>
Status: CLOSED DUPLICATE
QA Contact: Red Hat Kernel QE team <kernel-qe>
Severity: high
Priority: high
Version: 6.2
CC: dhoward, fhrbata, jpirko, rkhan
Target Milestone: rc
Keywords: Regression
Target Release: 6.2
Hardware: Unspecified
OS: Unspecified
Doc Type: Bug Fix
Last Closed: 2011-06-03 15:04:07 UTC

Description Jiri Pirko 2011-06-03 09:17:15 UTC
Description of problem:
I'm hitting interesting panics on my Z400 test machine. They seem to be related.

Version-Release number of selected component (if applicable):
2.6.32-155.el6.x86_64

How reproducible:
always

Steps to Reproduce:
1. boot system
2. wait
3. kaboom
  
Actual results:
panic

Expected results:
no panic

Additional info:

----------------------------------------------------------------
kernel BUG at mm/slab.c:3067!
invalid opcode: 0000 [#1] SMP
last sysfs file: /sys/devices/pci0000:00/0000:00:1d.0/usb6/6-2/devnum
CPU 2
Modules linked in: ebtable_nat ebtables ipt_MASQUERADE iptable_nat nf_nat nf_conntrack_ipv4 nf_defrag_ipv4 ipt_REJECT xt_CHECKSUM iptable_mangle iptable_filter ip_tables autofs4 sunrpc cpufreq_ondemand acpi_cpufreq freq_table mperf      bridge stp llc ip6t_REJECT nf_conntrack_ipv6 nf_defrag_ipv6 xt_state nf_conntrack ip6table_filter ip6_tables ipv6 ext3 jbd dm_mirror dm_region_hash dm_log vhost_net macvtap macvlan tun kvm_intel kvm uinput wmi 8139too 8139cp r8169 mii   sky2 tg3 sg pl2303 usbserial microcode serio_raw i7core_edac edac_core iTCO_wdt iTCO_vendor_support snd_hda_codec_realtek snd_hda_intel snd_hda_codec snd_hwdep snd_seq snd_seq_device snd_pcm snd_timer snd soundcore snd_page_alloc shpchp ext4 mbcache jbd2 sr_mod cdrom sd_mod crc_t10dif firewire_ohci firewire_core crc_itu_t ahci nouveau ttm drm_kms_helper drm i2c_algo_bit i2c_core video output dm_mod [last unloaded: scsi_wait_scan]

Pid: 2411, comm: bash Not tainted 2.6.32-155.el6.x86_64 #1 Hewlett-Packard HP Z400 Workstation/0B4Ch 
RIP: 0010:[<ffffffff8115a664>]  [<ffffffff8115a664>] cache_alloc_refill+0x1e4/0x240
RSP: 0018:ffff880120a43508  EFLAGS: 00010046
RAX: 0000000000000034 RBX: ffff880216ba1f00 RCX: 000000000000003b
RDX: ffff88021a093140 RSI: ffff88021a093140 RDI: ffff8802163b7000
RBP: ffff880120a43568 R08: ffff88021a093140 R09: 0000000000000000
R10: 0000000000000020 R11: 0000000000000000 R12: ffff880219490000
R13: ffff88021a093140 R14: 0000000000000034 R15: ffff8802163b7000
FS:  00007fb59fb16700(0000) GS:ffff880028240000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000000001abe128 CR3: 0000000120b00000 CR4: 00000000000026e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Process bash (pid: 2411, threadinfo ffff880120a42000, task ffff880120a3f580)
Stack:
 ffff880120a43568 000000008129a782 ffff88021a093180 00041220153419d0
<0> ffff88021a093160 ffff88021a093150 0000000000000000 ffff880120a3f580
<0> 0000000000000020 ffff880216ba1f00 0000000000000020 0000000000000046
Call Trace:
 [<ffffffff8115b55f>] kmem_cache_alloc+0x15f/0x190
 [<ffffffff8129a379>] alloc_iova_mem+0x49/0x60
 [<ffffffff81297867>] alloc_iova+0x27/0x240
 [<ffffffff81271f80>] ? sg_init_table+0x30/0x50
 [<ffffffff812997b5>] intel_alloc_iova+0xb5/0xe0
 [<ffffffff8129c22c>] intel_map_sg+0x14c/0x2a0
 [<ffffffff81363cec>] ata_qc_issue+0x13c/0x340
 [<ffffffff8136be00>] ? ata_scsi_rw_xlat+0x0/0x1f0
 [<ffffffff81369817>] ata_scsi_translate+0xa7/0x180
 [<ffffffff8134eef0>] ? scsi_done+0x0/0x60
 [<ffffffff8134eef0>] ? scsi_done+0x0/0x60
 [<ffffffff8136cead>] ata_scsi_queuecmd+0xbd/0x2d0
 [<ffffffff8134f22c>] scsi_dispatch_cmd+0x1ac/0x340
 [<ffffffff81356c25>] scsi_request_fn+0x415/0x590
 [<ffffffff8125dc86>] ? cfq_service_tree_add+0x226/0x530
 [<ffffffff8124b817>] __blk_run_queue+0x77/0x160
 [<ffffffff8125fb3b>] cfq_insert_request+0x31b/0x600
 [<ffffffff81242689>] elv_insert+0x109/0x1a0
 [<ffffffff8124276a>] __elv_add_request+0x4a/0x90
 [<ffffffff8124b282>] __make_request+0x122/0x500
 [<ffffffff81249b3e>] generic_make_request+0x21e/0x5b0
 [<ffffffff811a8596>] ? bio_add_page+0x36/0x40
 [<ffffffff811ad4b0>] ? do_mpage_readpage+0x310/0x5f0
 [<ffffffff81249f5f>] submit_bio+0x8f/0x120
 [<ffffffff811ad017>] mpage_bio_submit+0x27/0x30
 [<ffffffff811ad915>] mpage_readpages+0x115/0x130
 [<ffffffffa01a52f0>] ? ext4_get_block+0x0/0x120 [ext4]
 [<ffffffffa01a52f0>] ? ext4_get_block+0x0/0x120 [ext4]
 [<ffffffff81154ada>] ? alloc_pages_current+0xaa/0x110
 [<ffffffffa01a085d>] ext4_readpages+0x1d/0x20 [ext4]
 [<ffffffff81122ea5>] __do_page_cache_readahead+0x185/0x210
 [<ffffffff81122f51>] ra_submit+0x21/0x30
 [<ffffffff811232c5>] ondemand_readahead+0x115/0x240
 [<ffffffff81180f0d>] ? do_lookup+0x7d/0x220
 [<ffffffff811234e3>] page_cache_sync_readahead+0x33/0x50
 [<ffffffff8110f288>] generic_file_aio_read+0x558/0x700
 [<ffffffff81172f0a>] do_sync_read+0xfa/0x140
 [<ffffffff8108e360>] ? autoremove_wake_function+0x0/0x40
 [<ffffffff811783d4>] ? cp_new_stat+0xe4/0x100
 [<ffffffff81205c76>] ? security_file_permission+0x16/0x20
 [<ffffffff81173905>] vfs_read+0xb5/0x1a0
 [<ffffffff81173a41>] sys_read+0x51/0x90
 [<ffffffff8100b0b2>] system_call_fastpath+0x16/0x1b
Code: 89 ff e8 c0 92 11 00 eb 99 66 0f 1f 44 00 00 41 c7 45 60 01 00 00 00 4d 8b 7d 20 4c 39 7d c0 0f 85 f2 fe ff ff eb 84 0f 0b eb fe <0f> 0b 66 2e 0f 1f 84 00 00 00 00 00 eb f4 8b 55 ac 8b 75 bc 31
RIP  [<ffffffff8115a664>] cache_alloc_refill+0x1e4/0x240
 RSP <ffff880120a43508>
----------------------------------------------------------------
kernel BUG at drivers/pci/iova.c:155!
invalid opcode: 0000 [#1] SMP
last sysfs file: /sys/devices/virtual/net/vnet0/flags
CPU 2
Modules linked in: ebtable_nat ebtables ipt_MASQUERADE iptable_nat nf_nat nf_conntrack_ipv4 nf_defrag_ipv4 ipt_REJECT xt_CHECKSUM iptable_mangle iptable_filter ip_tables autofs4 sunrpc cpufreq_ondemand acpi_cpufreq freq_table mperf      bridge stp llc ip6t_REJECT nf_conntrack_ipv6 nf_defrag_ipv6 xt_state nf_conntrack ip6table_filter ip6_tables ipv6 ext3 jbd dm_mirror dm_region_hash dm_log vhost_net macvtap macvlan tun kvm_intel kvm uinput wmi 8139too 8139cp r8169 mii   sky2 tg3 sg pl2303 usbserial microcode serio_raw snd_hda_codec_realtek snd_hda_intel snd_hda_codec snd_hwdep snd_seq snd_seq_device snd_pcm snd_timer snd soundcore snd_page_alloc iTCO_wdt iTCO_vendor_support i7core_edac edac_core shpchp ext4 mbcache jbd2 sr_mod cdrom sd_mod crc_t10dif firewire_ohci firewire_core crc_itu_t ahci nouveau ttm drm_kms_helper drm i2c_algo_bit i2c_core video output dm_mod [last unloaded: scsi_wait_scan]

Pid: 482, comm: jbd2/dm-0-8 Not tainted 2.6.32-155.el6.x86_64 #1 Hewlett-Packard HP Z400 Workstation/0B4Ch
RIP: 0010:[<ffffffff81297a04>]  [<ffffffff81297a04>] alloc_iova+0x1c4/0x240
RSP: 0018:ffff8802143b58a0  EFLAGS: 00010046
RAX: ffff8802163108c0 RBX: ffff880219128e80 RCX: 00000000000fff3a
RDX: ffff8802163108d0 RSI: ffff880216b9a368 RDI: 0000000000000001
RBP: ffff8802143b5900 R08: 0000000000000000 R09: 0000000000000028
R10: 00000000000000c0 R11: ffff880216d6f780 R12: ffff880216b9a360
R13: ffff8802163108c0 R14: 00000000000fffff R15: 0000000000000001
FS:  0000000000000000(0000) GS:ffff880028240000(0000) knlGS:0000000000000000
CS:  0010 DS: 0018 ES: 0018 CR0: 000000008005003b
CR2: 0000000001505620 CR3: 0000000218c34000 CR4: 00000000000006e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Process jbd2/dm-0-8 (pid: 482, threadinfo ffff8802143b4000, task ffff88021412b4c0)
Stack:
 ffff880216310380 00000000000fff3a 0000000000000046 0000000000000020
<0> 0000000000000000 ffff880216310380 ffff880216d6f7d8 0000fffffffff000
<0> 0000000000000001 ffff880216b9a360 ffff88021a66a090 0000000000000001
Call Trace:
 [<ffffffff812997b5>] intel_alloc_iova+0xb5/0xe0
 [<ffffffff8129c22c>] intel_map_sg+0x14c/0x2a0
 [<ffffffff81363cec>] ata_qc_issue+0x13c/0x340
 [<ffffffff8136be00>] ? ata_scsi_rw_xlat+0x0/0x1f0
 [<ffffffff81369817>] ata_scsi_translate+0xa7/0x180
 [<ffffffff8134eef0>] ? scsi_done+0x0/0x60
 [<ffffffff8134eef0>] ? scsi_done+0x0/0x60
 [<ffffffff8136cead>] ata_scsi_queuecmd+0xbd/0x2d0
 [<ffffffff8134f22c>] scsi_dispatch_cmd+0x1ac/0x340
 [<ffffffff81356c25>] scsi_request_fn+0x415/0x590
 [<ffffffff81242689>] ? elv_insert+0x109/0x1a0
 [<ffffffff81248592>] __generic_unplug_device+0x32/0x40
 [<ffffffff8124b2d0>] __make_request+0x170/0x500
 [<ffffffff81249b3e>] generic_make_request+0x21e/0x5b0
 [<ffffffffa017253e>] ? jbd2_journal_file_buffer+0x4e/0x90 [jbd2]
 [<ffffffff81249f5f>] submit_bio+0x8f/0x120
 [<ffffffff811a33e6>] submit_bh+0xf6/0x150
 [<ffffffffa0173c59>] jbd2_journal_commit_transaction+0x5a9/0x1490 [jbd2]
 [<ffffffff8107a31b>] ? try_to_del_timer_sync+0x7b/0xe0
 [<ffffffffa0179948>] kjournald2+0xb8/0x220 [jbd2]
 [<ffffffff8108e360>] ? autoremove_wake_function+0x0/0x40
 [<ffffffffa0179890>] ? kjournald2+0x0/0x220 [jbd2]
 [<ffffffff8108dff6>] kthread+0x96/0xa0
 [<ffffffff8100c10a>] child_rip+0xa/0x20
 [<ffffffff8108df60>] ? kthread+0x0/0xa0
 [<ffffffff8100c100>] ? child_rip+0x0/0x20
Code: ff 4d 3b 74 24 18 75 05 4d 89 6c 24 10 48 8b 75 b0 4c 89 e7 e8 ee 67 24 00 48 83 c4 38 4c 89 e8 5b 41 5c 41 5d 41 5e 41 5f c9 c3 <0f> 0b eb fe 48 0f bd c9 bb 01 00 00 00 83 c1 01 48 d3 e3 e9 6e
RIP  [<ffffffff81297a04>] alloc_iova+0x1c4/0x240
 RSP <ffff8802143b58a0>
----------------------------------------------------------------
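
To make the "seem to be related" observation concrete, here is a minimal Python sketch that lists the reliable frames (those not marked with "?") the two call traces have in common. The helper names `reliable_frames` and `shared_frames` are hypothetical and only for illustration; the trace excerpts are copied from the two oopses above. It is a rough triage aid, not a substitute for analysing the dumps.

```python
import re

# Matches oops frames such as " [<ffffffff812997b5>] ? intel_alloc_iova+0xb5/0xe0".
# Group 1 is the optional "?" marker (unreliable frame), group 2 the symbol name.
FRAME_RE = re.compile(
    r"\[<[0-9a-f]+>\]\s+(\?\s+)?([A-Za-z0-9_.]+)\+0x[0-9a-f]+/0x[0-9a-f]+"
)


def reliable_frames(trace_text):
    """Return the symbol names of frames not marked '?', in trace order."""
    frames = []
    for line in trace_text.splitlines():
        match = FRAME_RE.search(line)
        if match and match.group(1) is None:
            frames.append(match.group(2))
    return frames


def shared_frames(trace_a, trace_b):
    """Return reliable frames of trace_a that also appear in trace_b."""
    in_b = set(reliable_frames(trace_b))
    return [name for name in reliable_frames(trace_a) if name in in_b]


# Top of the call trace from the mm/slab.c:3067 oops.
SLAB_TRACE = """\
 [<ffffffff8115b55f>] kmem_cache_alloc+0x15f/0x190
 [<ffffffff8129a379>] alloc_iova_mem+0x49/0x60
 [<ffffffff81297867>] alloc_iova+0x27/0x240
 [<ffffffff81271f80>] ? sg_init_table+0x30/0x50
 [<ffffffff812997b5>] intel_alloc_iova+0xb5/0xe0
 [<ffffffff8129c22c>] intel_map_sg+0x14c/0x2a0
 [<ffffffff81363cec>] ata_qc_issue+0x13c/0x340
"""

# Top of the call trace from the drivers/pci/iova.c:155 oops.
IOVA_TRACE = """\
 [<ffffffff812997b5>] intel_alloc_iova+0xb5/0xe0
 [<ffffffff8129c22c>] intel_map_sg+0x14c/0x2a0
 [<ffffffff81363cec>] ata_qc_issue+0x13c/0x340
 [<ffffffff8136be00>] ? ata_scsi_rw_xlat+0x0/0x1f0
"""

common = shared_frames(SLAB_TRACE, IOVA_TRACE)
print(common)
```

Both traces reach the crash through intel_map_sg and intel_alloc_iova, and the second oops has its RIP in alloc_iova itself, which the first trace also passes through. That is consistent with both BUGs coming from the Intel IOMMU IOVA allocation path.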

Comment 1 Jiri Pirko 2011-06-03 14:12:51 UTC
OK, so this is connected to the Intel IOMMU. I have a guest on that machine and I'm passing the following two NICs through to it:
1c:00.0 Ethernet controller: Realtek Semiconductor Co., Ltd. RTL8111/8168B PCI Express Gigabit Ethernet controller (rev 01)
28:00.0 Ethernet controller: Marvell Technology Group Ltd. 88E8053 PCI-E Gigabit Ethernet Controller (rev 19)

Reverting the following patch fixes my problem:
http://patchwork.usersys.redhat.com/patch/34142/

Comment 3 Alex Williamson 2011-06-03 15:04:07 UTC

*** This bug has been marked as a duplicate of bug 705441 ***