Bug 619455
Summary: | Host kernel oops after a series of virsh {attach,detach}-device | | |
---|---|---|---|
Product: | Red Hat Enterprise Linux 6 | Reporter: | Jiri Denemark <jdenemar> |
Component: | kernel | Assignee: | Alex Williamson <alex.williamson> |
Status: | CLOSED ERRATA | QA Contact: | Virtualization Bugs <virt-bugs> |
Severity: | medium | Docs Contact: | |
Priority: | low | | |
Version: | 6.0 | CC: | alex.williamson, chayang, ddugger, ddutile, dwmw2, gcosta, jpirko, jyang, kzhang, michen, tburke, yang.z.zhang |
Target Milestone: | rc | Keywords: | Triaged |
Target Release: | 6.1 | | |
Hardware: | All | | |
OS: | Linux | | |
Whiteboard: | | | |
Fixed In Version: | kernel-2.6.32-128.el6 | Doc Type: | Bug Fix |
Doc Text: | | Story Points: | --- |
Clone Of: | | | |
: | 688646 (view as bug list) | Environment: | |
Last Closed: | 2011-05-23 20:43:41 UTC | Type: | --- |
Regression: | --- | Mount Type: | --- |
Documentation: | --- | CRM: | |
Verified Versions: | | Category: | --- |
oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
Cloudforms Team: | --- | Target Upstream Version: | |
Embargoed: | | | |
Bug Depends On: | | | |
Bug Blocks: | 580566, 580951, 580954, 635500 | | |
Attachments: | | | |
Description
Jiri Denemark
2010-07-29 15:06:40 UTC
oops info coming...

RBX = RDI = R9 = R11 = R15 = 0x0000000000000000
didn't take note of other registers

RIP 0010:[<ffffffff812640ec>] list_del+0xc/0xa0

stack frame:
[<ffffffff81288110>] domain_remove_dev_info+0x40/0xe0
[<ffffffff81289167>] domain_exit+0x27/0x190
[<ffffffff81288659>] ? iommu_attach_domain+0xb9/0xc0
[<ffffffff8128bada>] get_domain_for_dev.clone.3+0x31e/0x5d0
[<ffffffffa01e64d2>] ? tg3_nvram_read+0xc2/0x170 [tg3]
[<ffffffff8128c38c>] __intel_map_single+0x19c/0x210
[<ffffffff8114e937>] ? alloc_pages_current+0x87/0xd0
[<ffffffff8128c4fe>] intel_alloc_coherent+0xae/0x120
[<ffffffffa01ea371>] ? tg3_read_mem+0xa1/0x120 [tg3]

Code: 55 48 89 e5 53 48 89 fb 48 83 ec 08 <48> 8b 47 08

<ffffffff812640e0> 55 pushq %rbp
<ffffffff812640e1> 48 89 e5 movq %rsp, %rbp
<ffffffff812640e4> 53 pushq %rbx
<ffffffff812640e5> 48 89 fb movq %rdi, %rbx
<ffffffff812640e8> 48 83 ec 08 subq $8, %rsp
<ffffffff812640ec> 48 8b 47 08 movq $8(%rdi), %rax

This issue has been proposed when we are only considering blocker issues in the current Red Hat Enterprise Linux release.

** If you would still like this issue considered for the current release, ask your support representative to file as a blocker on your behalf. Otherwise ask that it be considered for the next Red Hat Enterprise Linux release. **

This time 82 iterations were enough to trigger the bug.

Thank you for your bug report. This issue was evaluated for inclusion in the current release of Red Hat Enterprise Linux. Unfortunately, we are unable to address this request in the current release. Because we are in the final stage of Red Hat Enterprise Linux 6 development, only significant, release-blocking issues involving serious regressions and data corruption can be considered. If you believe this issue meets the release blocking criteria as defined and communicated to you by your Red Hat Support representative, please ask your representative to file this issue as a blocker for the current release.
Otherwise, ask that it be evaluated for inclusion in the next minor release of Red Hat Enterprise Linux.

Is this still reproducible? I'm using:

kernel-2.6.32-71.4.1.el6.x86_64
qemu-kvm-0.12.1.2-2.128.el6.x86_64
libvirt-0.8.1-27.el6.x86_64

I've done well over 600 attach/detach cycles and haven't seen any issues.

Does the device you're testing with perhaps have a PCI option ROM? (Please provide lspci -vvv of the host device being assigned.) It looks like there may be a memory leak when dealing with option ROMs.

If you can still reproduce this, please include the guest xml, the xml for the added device, and the actual host oops message.

I reproduced it after something like 150 iterations with

kernel-2.6.32-72.el6.x86_64
qemu-kvm-0.12.1.2-2.113.el6.x86_64
libvirt-0.8.6-1.el6.x86_64

and after 255 iterations with

kernel-2.6.32-94.el6.x86_64.rpm
qemu-kvm-0.12.1.2-2.128.el6.x86_64.rpm
libvirt-0.8.6-1.el6.x86_64

pci.xml:

<hostdev mode='subsystem' type='pci' managed='yes'>
  <source>
    <address bus='1' slot='0' function='0'/>
  </source>
</hostdev>

Domain XML and output of lspci -vvv will come as attachments. Unfortunately, I can't provide the actual oops message since it scrolls off my screen and I don't have a serial cable to redirect the output elsewhere.

Created attachment 472062 [details]
lspci -vvvs 1:0.0
Created attachment 472063 [details]
rhel6.xml
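
The attach/detach cycle discussed in the comments above can be sketched as a small shell function. This is only an illustration, not the reporter's actual script, which is not included in this report. The function name `attach_detach_loop`, the `VIRSH` and `DELAY` variables, and the domain name `rhel6` (guessed from the attachment name) are assumptions; `pci.xml` is the device file quoted above.

```shell
# Hypothetical reproduction helper (not the reporter's script).
# Usage: attach_detach_loop DOMAIN DEVICE_XML COUNT
# VIRSH and DELAY are illustrative overrides, not libvirt settings.
attach_detach_loop() {
    domain=$1
    xml=$2
    count=$3
    virsh_cmd=${VIRSH:-virsh}
    i=1
    while [ "$i" -le "$count" ]; do
        echo "iteration $i"
        # Hot-plug the host PCI device into the running guest.
        "$virsh_cmd" attach-device "$domain" "$xml" || return 1
        sleep "${DELAY:-1}"
        # Hot-unplug it again; the report says the host crashed on a detach.
        "$virsh_cmd" detach-device "$domain" "$xml" || return 1
        sleep "${DELAY:-1}"
        i=$((i + 1))
    done
}
```

For example, `attach_detach_loop rhel6 pci.xml 300` would run 300 cycles and stop early if either virsh command fails. Printing the counter before each cycle shows how far the loop got even if the host goes down.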
Thanks Jiri. One difference I see between our testing is that I specify the guest PCI slot, which is I think what libvirt would do too. My xml file looks like this:

<hostdev mode='subsystem' type='pci' managed='yes'>
  <source>
    <address domain='0x0000' bus='0x01' slot='0x00' function='0x0'/>
  </source>
  <address type='pci' domain='0x0000' bus='0x00' slot='0x05' function='0x0'/>
</hostdev>

Do you notice in your test if the assigned device in the guest changes its slot on each iteration? Could you do something similar to the above and see if you still get an oops? I also note that the tested device is a tg3, which really doesn't even work with device assignment until qemu-kvm-0.12.1.2-2.127.el6. This could have something to do with older versions failing with fewer iterations. It also has an option ROM, albeit small, so the guest process size will grow due to bz667188. Thanks.

This request was evaluated by Red Hat Product Management for inclusion in the current release of Red Hat Enterprise Linux. Because the affected component is not scheduled to be updated in the current release, Red Hat is unfortunately unable to address this request at this time. Red Hat invites you to ask your support representative to propose this request, if appropriate and relevant, in the next release of Red Hat Enterprise Linux. If you would like it considered as an exception in the current release, please ask your support representative.

This request was erroneously denied for the current release of Red Hat Enterprise Linux. The error has been fixed and this request has been re-proposed for the current release.

I modified the pci.xml to explicitly specify the guest's PCI slot the device should be hotplugged to. The oops is still reproducible after 255 iterations. To be specific, the 255th detach causes the host to crash. I'm starting to get suspicious about this 255 magic number.
Previously, the number of iterations needed to reproduce the issue varied quite a lot, but now it seems to be consistently 255.

Jiri, can you confirm what your host system is? I see intel_alloc_coherent in the backtrace Paolo provided, which implies an Intel VT-d system. I just want to make sure that I shouldn't be looking for AMD IOMMU specific issues.

Aha, this is a tg3 bug, I can reproduce using the following modified script (tg3 is 0000:05:00.0 on my system):

i=1
while echo 0000:05:00.0 > /sys/bus/pci/drivers/tg3/unbind; do
    echo $i
    i=$((i + 1))
    sleep 0.5
    echo 0000:05:00.0 > /sys/bus/pci/drivers/tg3/bind
    sleep 0.5
done

Panic:

tg3 0000:05:00.0: PME# enabled
tg3 0000:05:00.0: PCI INT A disabled
tg3 0000:05:00.0: PCI INT A -> GSI 16 (level, low) -> IRQ 16
tg3 0000:05:00.0: PME# disabled
IOMMU: no free domain ids
BUG: unable to handle kernel NULL pointer dereference at 0000000000000008
IP: [<ffffffff81263bac>] list_del+0xc/0xa0
PGD 36ec79067 PUD 36ec6c067 PMD 0
Oops: 0000 [#1] SMP
last sysfs file: /sys/bus/pci/drivers/tg3/bind
CPU 3
Modules linked in: ip6table_filter ip6_tables ebtable_nat ebtables nfs lockd fscache nfs_acl auth_rpcgss xt_CHECKSUM iptable_mangle ipt_MASQUERADE iptable_nat nf_nat nf_conntrack_ipv4 nf_defrag_ipv4 xt_state nf_conntrack ipt_REJECT iptable_filter ip_tables bridge stp llc autofs4 sunrpc cpufreq_ondemand acpi_cpufreq freq_table ipv6 dm_mirror dm_region_hash dm_log kvm_intel kvm uinput sg serio_raw i2c_i801 i2c_core iTCO_wdt iTCO_vendor_support tg3 i7core_edac edac_core ioatdma igb dca ext4 mbcache jbd2 raid1 raid456 async_raid6_recov async_pq raid6_pq async_xor xor async_memcpy async_tx sr_mod cdrom sd_mod crc_t10dif ahci dm_mod [last unloaded: microcode]
Modules linked in: ip6table_filter ip6_tables ebtable_nat ebtables nfs lockd fscache nfs_acl auth_rpcgss xt_CHECKSUM iptable_mangle ipt_MASQUERADE iptable_nat nf_nat nf_conntrack_ipv4 nf_defrag_ipv4 xt_state nf_conntrack ipt_REJECT iptable_filter ip_tables bridge stp llc autofs4
sunrpc cpufreq_ondemand acpi_cpufreq freq_table ipv6 dm_mirror dm_region_hash dm_log kvm_intel kvm uinput sg serio_raw i2c_i801 i2c_core iTCO_wdt iTCO_vendor_support tg3 i7core_edac edac_core ioatdma igb dca ext4 mbcache jbd2 raid1 raid456 async_raid6_recov async_pq raid6_pq async_xor xor async_memcpy async_tx sr_mod cdrom sd_mod crc_t10dif ahci dm_mod [last unloaded: microcode]
Pid: 3208, comm: bash Not tainted 2.6.32-94.el6.x86_64 #1 4157CTO
RIP: 0010:[<ffffffff81263bac>] [<ffffffff81263bac>] list_del+0xc/0xa0
RSP: 0018:ffff88036a0e9b08 EFLAGS: 00010092
RAX: 0000000000000282 RBX: 0000000000000000 RCX: 0000000000003385
RDX: 0000000000000282 RSI: 0000000000000046 RDI: 0000000000000000
RBP: ffff88036a0e9b18 R08: ffffffff81b9f920 R09: 0000000000000000
R10: 0000000000000038 R11: 0000000000000000 R12: ffff8803587e9f40
R13: ffff8803587e9f50 R14: 0000000000000282 R15: 0000000000000000
FS: 00007f2c2718c700(0000) GS:ffff8800282c0000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b
CR2: 0000003a86c744b0 CR3: 000000036b5ec000 CR4: 00000000000026e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Process bash (pid: 3208, threadinfo ffff88036a0e8000, task ffff88036ba034e0)
Stack:
ffff88036a0e9b68 0000000000000000 ffff88036a0e9b48 ffffffff81287f30
<0> ffff8803587e9f40 ffff880371a5e000 ffff8803587e9f40 0000000000000000
<0> ffff88036a0e9ba8 ffffffff81288f87 ffffffffffffffff 0000000000000000
Call Trace:
[<ffffffff81287f30>] domain_remove_dev_info+0x40/0xe0
[<ffffffff81288f87>] domain_exit+0x27/0x190
[<ffffffff81288479>] ? iommu_attach_domain+0xb9/0xc0
[<ffffffff8128b8fa>] get_domain_for_dev.clone.3+0x31a/0x5d0
[<ffffffffa01754d2>] ? tg3_nvram_read+0xc2/0x170 [tg3]
[<ffffffff8128c1ac>] __intel_map_single+0x19c/0x210
[<ffffffff8114975a>] ? alloc_pages_current+0x9a/0x100
[<ffffffff8128c31e>] intel_alloc_coherent+0xae/0x120
[<ffffffffa0179371>] ? tg3_read_mem+0xa1/0x120 [tg3]
[<ffffffffa018c52c>] tg3_init_one+0xa9a/0x1564 [tg3]
[<ffffffff811da04e>] ? sysfs_addrm_finish+0x4e/0x290
[<ffffffff81271817>] local_pci_probe+0x17/0x20
[<ffffffff81272a01>] pci_device_probe+0x101/0x120
[<ffffffff8132a0a2>] ? driver_sysfs_add+0x62/0x90
[<ffffffff8132a240>] driver_probe_device+0xa0/0x2a0
[<ffffffff813293fa>] driver_bind+0xca/0x110
[<ffffffff8132877c>] drv_attr_store+0x2c/0x30
[<ffffffff811d84a5>] sysfs_write_file+0xe5/0x170
[<ffffffff81165e48>] vfs_write+0xb8/0x1a0
[<ffffffff810cca12>] ? audit_syscall_entry+0x272/0x2a0
[<ffffffff81166881>] sys_write+0x51/0x90
[<ffffffff8100b172>] system_call_fastpath+0x16/0x1b
Code: 00 ff ff ff 89 95 fc fe ff ff e9 ab fd ff ff 4c 8b ad e8 fe ff ff e9 db fd ff ff 90 90 90 90 55 48 89 e5 53 48 89 fb 48 83 ec 08 <48> 8b 47 08 4c 8b 00 4c 39 c7 75 39 48 8b 03 4c 8b 40 08 4c 39
RIP [<ffffffff81263bac>] list_del+0xc/0xa0
RSP <ffff88036a0e9b08>
CR2: 0000000000000008
---[ end trace 0a9a95e2e4fa5fbc ]---

*** Bug 635682 has been marked as a duplicate of this bug. ***

Adding David Woodhouse. It's become clear that this is an intel-iommu bug. Any device that does DMA will allocate a domain ID from the iommu. When the device is unbound from the driver, the domain ID is never freed and we eventually hit the limit of supported domain IDs.

This request was evaluated by Red Hat Product Management for inclusion in the current release of Red Hat Enterprise Linux. Because the affected component is not scheduled to be updated in the current release, Red Hat is unfortunately unable to address this request at this time. Red Hat invites you to ask your support representative to propose this request, if appropriate and relevant, in the next release of Red Hat Enterprise Linux. If you would like it considered as an exception in the current release, please ask your support representative.

This request was erroneously denied for the current release of Red Hat Enterprise Linux.
The error has been fixed and this request has been re-proposed for the current release.

This request was evaluated by Red Hat Product Management for inclusion in a Red Hat Enterprise Linux maintenance release. Product Management has requested further review of this request by Red Hat Engineering, for potential inclusion in a Red Hat Enterprise Linux Update release for currently deployed products. This request is not yet committed for inclusion in an Update release.

Hit the same issue with BCM5764M (whose driver is tg3) when using the script in comment #17.

Patch(es) available on kernel-2.6.32-128.el6

Verified on kernel 2.6.32-130.el6.x86_64 with BCM5764M using the script in comment #17: unbind/bind over 1000 times, did not hit a kernel panic.

This bug has been fixed according to comment #25 and comment #29.

An advisory has been issued which should help the problem described in this bug report. This report is therefore being closed with a resolution of ERRATA. For more information on the solution and/or where to find the updated files, please follow the link below. You may reopen this bug report if the solution does not work for you.

http://rhn.redhat.com/errata/RHSA-2011-0542.html

This introduced a regression. See bug 710382 for details.
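
The root cause identified in this report (a domain ID is allocated when a device that does DMA is bound, and is never freed on unbind) can be illustrated with a toy model of a fixed-size ID pool. This is a sketch for intuition only, not the intel-iommu code. The function names, the `FREE_IDS`, `ALLOCATED_ID` and `CYCLES` variables, and the pool size of 256 used in the example are all assumptions; the report itself only says the failure appeared after roughly 255 iterations and that the kernel logged "IOMMU: no free domain ids" just before the oops.

```shell
# Toy model of a fixed pool of IOMMU domain IDs (illustrative only).

# Fill the pool with IDs 0 .. size-1.
pool_init() {
    FREE_IDS=""
    n=0
    while [ "$n" -lt "$1" ]; do
        FREE_IDS="$FREE_IDS $n"
        n=$((n + 1))
    done
}

# Take the first free ID; sets ALLOCATED_ID, returns 1 if the pool is empty.
pool_alloc() {
    # Positional parameters are local to the function, so this is safe.
    set -- $FREE_IDS
    [ "$#" -gt 0 ] || return 1
    ALLOCATED_ID=$1
    shift
    FREE_IDS="$*"
}

# Return an ID to the pool.
pool_free() {
    FREE_IDS="$FREE_IDS $1"
}

# Simulate bind/unbind cycles and store the number completed in CYCLES.
# release_on_unbind=0 models the leak; 1 models freeing the ID on unbind.
# Usage: count_cycles RELEASE_ON_UNBIND MAX_CYCLES POOL_SIZE
count_cycles() {
    release_on_unbind=$1
    max_cycles=$2
    pool_size=$3
    pool_init "$pool_size"
    CYCLES=0
    while [ "$CYCLES" -lt "$max_cycles" ]; do
        # Bind: the driver's first DMA mapping needs a domain ID.
        pool_alloc || break
        # Unbind: only the non-leaking variant gives the ID back.
        if [ "$release_on_unbind" -eq 1 ]; then
            pool_free "$ALLOCATED_ID"
        fi
        CYCLES=$((CYCLES + 1))
    done
}
```

With these assumptions, `count_cycles 0 1000 256` stops with `CYCLES` at 256 because every bind consumes an ID that is never returned, while `count_cycles 1 1000 256` completes all 1000 cycles. If some IDs were already in use before the loop started, the leaking variant would fail correspondingly sooner, which would be consistent with the roughly constant iteration count reported above.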