Bug 972381 - kernel panic when attach device to pcie switch
Status: CLOSED WONTFIX
Product: Red Hat Enterprise Linux 7
Classification: Red Hat
Component: qemu-kvm
Version: 7.0
Hardware: Unspecified
OS: Unspecified
Priority: medium
Severity: low
Target Milestone: rc
Target Release: ---
Assigned To: Radim Krčmář
QA Contact: Virtualization Bugs
Reported: 2013-06-08 23:16 EDT by Suqin Huang
Modified: 2013-08-26 10:21 EDT
Doc Type: Bug Fix
Last Closed: 2013-08-26 10:21:18 EDT
Type: Bug


Description Suqin Huang 2013-06-08 23:16:30 EDT
Description of problem:
The guest kernel panics at boot when a device is attached to a PCIe switch.

Version-Release number of selected component (if applicable):
qemu-kvm-1.5.0-2.el7.x86_64

How reproducible:
100%

Steps to Reproduce:
1. Boot the guest with the following command:

/usr/libexec/qemu-kvm -M q35 -monitor stdio -vnc :0 \
-drive file=/root/RHEL-Server-7.0-64-virtio.qcow2,id=disk1,if=none,format=qcow2,media=disk,cache=none \
-device virtio-blk-pci,bus=pcie.0,id=virtio-disk1,addr=0x4,drive=disk1 \
-chardev socket,id=serial_info,path=/tmp/serial-rhel7,server,nowait \
-device isa-serial,chardev=serial_info \
-device x3130-upstream,bus=pcie.0,id=upstream,addr=0x5 \
-device xio3130-downstream,bus=upstream,id=downstream0,chassis=1 \
-device nec-usb-xhci,bus=downstream0,id=usb_controller \
-drive file=/root/usb-s.qcow2,if=none,format=qcow2,media=disk,id=usb_disk \
-device usb-storage,drive=usb_disk,id=usb_d,bus=usb_controller.0 \
-device virtio-net-pci,netdev=idmbEdhe,mac=9a:20:d8:63:50:40,id=ndev00idmbEdhe,bus=pcie.0,addr=0x3 \
-netdev tap,id=idmbEdhe,vhost=on,script=/etc/qemu-ifup \
-m 2048 -smp 2,cores=1,threads=1,sockets=2 -cpu SandyBridge -vga std \
-rtc base=utc,clock=host,driftfix=slew \
-boot order=cdn,once=d,menu=off \
-no-kvm-pit-reinjection -no-shutdown -enable-kvm



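Actual results:
The guest kernel panics during boot (see the serial output under "Additional info").

Expected results:
The guest boots without a kernel panic.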

Additional info:

Serial console output:

[    0.078750] BUG: unable to handle kernel NULL pointer dereference at 0000000000000088
[    0.079000] IP: [<ffffffff8131dbfa>] pcie_aspm_init_link_state+0x30a/0x7b0
[    0.079000] PGD 0 
[    0.079000] Oops: 0000 [#1] SMP 
[    0.079000] Modules linked in:
[    0.079000] CPU 0 
[    0.079000] Pid: 1, comm: swapper/0 Not tainted 3.7.0-0.36.el7.x86_64 #1 Bochs Bochs
[    0.079000] RIP: 0010:[<ffffffff8131dbfa>]  [<ffffffff8131dbfa>] pcie_aspm_init_link_state+0x30a/0x7b0
[    0.079000] RSP: 0000:ffff880005d25928  EFLAGS: 00010246
[    0.079000] RAX: 0000000000000000 RBX: 0000000000000001 RCX: 0000000000000000
[    0.079000] RDX: 0000000000000000 RSI: 0000000000000000 RDI: ffff880005f0ecf8
[    0.079000] RBP: ffff880005d259b8 R08: 0000000000016de0 R09: ffff880005f0ecc0
[    0.079000] R10: 0000000000000000 R11: 00000000000000c9 R12: ffff880005f0ecc0
[    0.079000] R13: ffff880005f20000 R14: ffff880005f0ecd8 R15: 0000000000000000
[    0.079000] FS:  0000000000000000(0000) GS:ffff880007c00000(0000) knlGS:0000000000000000
[    0.079000] CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
[    0.079000] CR2: 0000000000000088 CR3: 00000000018c3000 CR4: 00000000000006f0
[    0.079000] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[    0.079000] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
[    0.079000] Process swapper/0 (pid: 1, threadinfo ffff880005d24000, task ffff880005d80000)
[    0.079000] Stack:
[    0.079000]  ffff880005d25958 ffff880005f22000 ffff880005f28000 ffff880005f22000
[    0.079000]  ffff880005f28000 0000000000000000 ffff880005d25978 ffffffff81310143
[    0.079000]  ffff880005f22000 ffff880005f28000 ffff880005d259b8 ffffffff815cd287
[    0.079000] Call Trace:
[    0.079000]  [<ffffffff81310143>] ? pci_device_add+0xf3/0x100
[    0.079000]  [<ffffffff815cd287>] ? pci_scan_single_device+0xa7/0xc0
[    0.079000]  [<ffffffff8130efb0>] ? next_trad_fn+0x20/0x20
[    0.079000]  [<ffffffff81310295>] pci_scan_slot+0x145/0x160
[    0.079000]  [<ffffffff815cfe90>] pci_scan_child_bus+0x4d/0x123
[    0.079000]  [<ffffffff815cfadf>] pci_scan_bridge+0x1c1/0x525
[    0.079000]  [<ffffffff815cff1a>] pci_scan_child_bus+0xd7/0x123
[    0.079000]  [<ffffffff815cfadf>] pci_scan_bridge+0x1c1/0x525
[    0.079000]  [<ffffffff815cd244>] ? pci_scan_single_device+0x64/0xc0
[    0.079000]  [<ffffffff81310776>] ? pci_create_root_bus+0x326/0x3f0
[    0.079000]  [<ffffffff815cff1a>] pci_scan_child_bus+0xd7/0x123
[    0.079000]  [<ffffffff815d5ac7>] pci_acpi_scan_root+0x43c/0x4e6
[    0.079000]  [<ffffffff815d21e9>] acpi_pci_root_add+0x19d/0x45d
[    0.079000]  [<ffffffff8134bde3>] acpi_device_probe+0x50/0x11d
[    0.079000]  [<ffffffff813d463b>] driver_probe_device+0x8b/0x390
[    0.079000]  [<ffffffff813d4940>] ? driver_probe_device+0x390/0x390
[    0.079000]  [<ffffffff813d49eb>] __driver_attach+0xab/0xb0
[    0.079000]  [<ffffffff813d4940>] ? driver_probe_device+0x390/0x390
[    0.079000]  [<ffffffff813d26c5>] bus_for_each_dev+0x55/0x90
[    0.079000]  [<ffffffff813d3fae>] driver_attach+0x1e/0x20
[    0.079000]  [<ffffffff813d3be0>] bus_add_driver+0x1a0/0x290
[    0.079000]  [<ffffffff81a0bd2a>] ? find_dock+0x22/0x22
[    0.079000]  [<ffffffff81a0bd2a>] ? find_dock+0x22/0x22
[    0.079000]  [<ffffffff813d50b7>] driver_register+0x77/0x170
[    0.079000]  [<ffffffff81a0bd2a>] ? find_dock+0x22/0x22
[    0.079000]  [<ffffffff8134c58d>] acpi_bus_register_driver+0x3e/0x48
[    0.079000]  [<ffffffff81a0bd4f>] acpi_pci_root_init+0x25/0x2d
[    0.079000]  [<ffffffff8100216a>] do_one_initcall+0x12a/0x180
[    0.079000]  [<ffffffff815c9d8c>] kernel_init+0x2cc/0x450
[    0.079000]  [<ffffffff819d8614>] ? do_early_param+0x8c/0x8c
[    0.079000]  [<ffffffff815c9ac0>] ? rest_init+0x80/0x80
[    0.079000]  [<ffffffff815fc1ac>] ret_from_fork+0x7c/0xb0
[    0.079000]  [<ffffffff815c9ac0>] ? rest_init+0x80/0x80
[    0.079000] Code: ff e9 02 fe ff ff 41 83 e6 01 b9 01 00 00 00 e9 53 ff ff ff 41 bc 10 00 00 00 e9 3e fe ff ff 49 8b 45 10 48 8b 40 10 48 8b 40 38 <48> 8b 80 88 00 00 00 48 85 c0 0f 84 84 04 00 00 49 89 44 24 10 
[    0.079000] RIP  [<ffffffff8131dbfa>] pcie_aspm_init_link_state+0x30a/0x7b0
[    0.079000]  RSP <ffff880005d25928>
[    0.079000] CR2: 0000000000000088
[    0.079005] ---[ end trace a0ff03ecb1cf5882 ]---
[    0.080008] Kernel panic - not syncing: Attempted to kill init! exitcode=0x00000009
[    0.080008]
Comment 2 Radim Krčmář 2013-06-25 10:02:17 EDT
The PCIe specification does not allow an upstream port to be connected directly to the root complex.

We have to create a root port and connect through it:
  -M q35 -device ioh3420,bus=pcie.0,id=root.0 \
  -device x3130-upstream,bus=root.0,id=upstream \
  -device xio3130-downstream,bus=upstream,id=downstream,chassis=1

The upstream kernel is not happy about adding a check for a misconfigured qemu, so this configuration should be prevented in userspace.
(Qemu allows even more nonsensical topologies, where a downstream port is not connected to an upstream port.)

Was this command generated by libvirt?
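
The topology rule described in this comment could be checked in userspace before the guest is started. The following is only a minimal sketch of that idea, not code from qemu or libvirt: the function names `parse_devices` and `check_topology` are hypothetical, and the sets of port models contain only the device names that appear in this report.

```python
import shlex

# Port device models that appear in this report (not an exhaustive list).
ROOT_PORTS = {"ioh3420"}
UPSTREAM_PORTS = {"x3130-upstream"}
DOWNSTREAM_PORTS = {"xio3130-downstream"}


def parse_devices(cmdline):
    """Return {id: (model, bus)} for every -device option that sets an id."""
    # Join backslash-continued lines before tokenizing.
    args = shlex.split(cmdline.replace("\\\n", " "))
    devices = {}
    for flag, value in zip(args, args[1:]):
        if flag != "-device":
            continue
        model, _, rest = value.partition(",")
        props = dict(p.split("=", 1) for p in rest.split(",") if "=" in p)
        if "id" in props:
            devices[props["id"]] = (model, props.get("bus"))
    return devices


def check_topology(cmdline):
    """Return a list of problem descriptions; the list is empty if none are found."""
    devices = parse_devices(cmdline)
    problems = []
    for dev_id, (model, bus) in devices.items():
        # Model of the device that provides this bus, or None for e.g. pcie.0.
        parent_model = devices.get(bus, (None, None))[0]
        if model in UPSTREAM_PORTS and parent_model not in ROOT_PORTS | DOWNSTREAM_PORTS:
            problems.append(
                f"{dev_id}: upstream port on bus {bus!r} is not behind a root or downstream port"
            )
        if model in DOWNSTREAM_PORTS and parent_model not in UPSTREAM_PORTS:
            problems.append(
                f"{dev_id}: downstream port on bus {bus!r} is not behind an upstream port"
            )
    return problems
```

Run against the reporter's original command line, this sketch would flag the `upstream` device because its bus is `pcie.0`; with the `ioh3420` root port inserted as shown in this comment, it reports nothing.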
Comment 3 zhonglinzhang 2013-07-04 06:37:26 EDT
(In reply to Radim Krčmář from comment #2)
> The PCIe specification does not allow an upstream port to be connected
> directly to the root complex.
> 
> We have to create a root port and connect through it:
>   -M q35 -device ioh3420,bus=pcie.0,id=root.0 \
>   -device x3130-upstream,bus=root.0,id=upstream \
>   -device xio3130-downstream,bus=upstream,id=downstream,chassis=1
> 
I re-tested this issue using the command you provided and hit the same panic.

---snip: my command line---
/usr/libexec/qemu-kvm -M q35 \
-device ioh3420,bus=pcie.0,id=root.0 \
-device x3130-upstream,bus=root.0,addr=0x4,id=upstream \
-device xio3130-downstream,bus=upstream,id=downstream0,chassis=1 \
-drive file=/home/rhel7_switch.qcow,if=none,id=drive-system-disk,media=disk,format=qcow2,aio=native,werror=stop,rerror=stop \
-device virtio-blk-pci,bus=downstream0,drive=drive-system-disk,id=system-disk,bootindex=1

Hi Radim,

Could you please take another look? If any further testing is needed, please let me know.

> The upstream kernel is not happy about adding a check for a misconfigured
> qemu, so this configuration should be prevented in userspace.
> (Qemu allows even more nonsensical topologies, where a downstream port is
> not connected to an upstream port.)
> 
> Was this command generated by libvirt?
Comment 4 Radim Krčmář 2013-07-04 10:07:34 EDT
The kernel boots without "addr=0x4", or with "addr=0x0".
Also the backtrace now goes through "pci_subsys_init" and not "acpi_init", so the problem is a bit different.

How are the addresses chosen?
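
The observation in this comment (the guest boots when the upstream port has no "addr" or "addr=0x0", and panics with "addr=0x4") suggests a second userspace check. The sketch below assumes that a device plugged directly into a root port or a downstream port should sit at slot 0; that assumption comes from this comment and from the point-to-point nature of a PCIe link, not from qemu documentation. The function names are hypothetical, and the port model set lists only names used in this report. The internal bus of an upstream port is deliberately excluded, because several downstream ports may share it at different slots.

```python
import shlex

# Ports whose secondary bus is a point-to-point link (models from this report).
LINK_PORTS = {"ioh3420", "xio3130-downstream"}


def device_props(cmdline):
    """Yield (model, properties) for each -device option on the command line."""
    args = shlex.split(cmdline.replace("\\\n", " "))
    for flag, value in zip(args, args[1:]):
        if flag == "-device":
            model, _, rest = value.partition(",")
            props = dict(p.split("=", 1) for p in rest.split(",") if "=" in p)
            yield model, props


def nonzero_slot_behind_port(cmdline):
    """Return [(device id, addr)] for devices behind a link port with slot != 0."""
    devs = list(device_props(cmdline))
    link_bus_ids = {p["id"] for m, p in devs if m in LINK_PORTS and "id" in p}
    flagged = []
    for model, props in devs:
        addr = props.get("addr")
        if addr is None or props.get("bus") not in link_bus_ids:
            continue
        # Assumes addr is "slot" or "slot.function", with the slot in hex.
        slot = int(addr.split(".")[0], 16)
        if slot != 0:
            flagged.append((props.get("id", model), addr))
    return flagged
```

For the command line in comment 3, this would flag the `upstream` device with `addr=0x4` and nothing else.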
Comment 5 zhonglinzhang 2013-07-04 22:24:47 EDT
I re-tested this issue without "addr=0x4" and with "addr=0x0". In both cases the guest boots successfully and there is no kernel panic.

Regarding comment 3: is it a new issue? Do I need to open a new bug to track it?
Comment 6 Radim Krčmář 2013-08-26 10:21:18 EDT
The upstream kernel decided to drop a simple fix for this issue, hoping that someone will rewrite ASPM support instead.

Modeling hardware configurations is a qemu feature, so we won't be fixing this.
(I don't have enough information on the source of these parameters to open new bugs.)
