Bug 1030952 - unable to handle kernel NULL pointer dereference when booting qemu guest with switch [NEEDINFO]
Summary: unable to handle kernel NULL pointer dereference when booting qemu guest with...
Keywords:
Status: CLOSED CANTFIX
Alias: None
Product: Fedora
Classification: Fedora
Component: kernel
Version: 19
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: unspecified
Target Milestone: ---
Assignee: Kernel Maintainer List
QA Contact: Fedora Extras Quality Assurance
URL:
Whiteboard:
Depends On:
Blocks:
 
Reported: 2013-11-15 11:48 UTC by Lukáš Doktor
Modified: 2013-11-29 12:43 UTC (History)

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2013-11-18 15:09:49 UTC
Type: Bug
ldoktor: needinfo?



Description Lukáš Doktor 2013-11-15 11:48:47 UTC
Description of problem:
When I try to boot a VM with an x3130 switch and a device inside it at addr=0, the kernel raises:
BUG: unable to handle kernel NULL pointer dereference at 0000000000000088
[    0.152000] IP: [<ffffffff81335471>] pcie_aspm_init_link_state+0x6f1/0x7c0
[    0.152000] PGD 0 
[    0.152000] Oops: 0000 [#1] SMP 
[    0.152000] Modules linked in:
[    0.152000] CPU: 0 PID: 1 Comm: swapper/0 Not tainted 3.11.7-200.fc19.x86_64 #1
[    0.152000] Hardware name: Bochs Bochs, BIOS Bochs 01/01/2011
[    0.152000] task: ffff88003da68000 ti: ffff88003da62000 task.ti: ffff88003da62000
[    0.152000] RIP: 0010:[<ffffffff81335471>]  [<ffffffff81335471>] pcie_aspm_init_link_state+0x6f1/0x7c0
[    0.152000] RSP: 0000:ffff88003da639c8  EFLAGS: 00010246
[    0.152000] RAX: 0000000000000000 RBX: 0000000000000000 RCX: 0000000000000000
[    0.152000] RDX: 0000000000000000 RSI: 0000000000000000 RDI: ffff88003d4a2578
[    0.152000] RBP: ffff88003da63a20 R08: 0000000000016f40 R09: ffff88003e001800
[    0.152000] R10: ffffffff81334eb2 R11: 0000000000000000 R12: ffff88003d4a2540
[    0.152000] R13: ffff88003d4c6000 R14: ffff88003d4a2558 R15: ffff88003d439000
[    0.152000] FS:  0000000000000000(0000) GS:ffff88003fc00000(0000) knlGS:0000000000000000
[    0.152000] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[    0.152000] CR2: 0000000000000088 CR3: 0000000001c0c000 CR4: 00000000000406f0
[    0.152000] Stack:
[    0.152000]  ffff88003da639f0 ffffffff81323870 ffff88003d4c7000 ffff88003d439400
[    0.152000]  0000000000008000 ffff88003da63a20 0000000000000000 ffff88003d439400
[    0.152000]  0000000000000001 0000000000000000 ffff88003d439000 ffff88003da63a50
[    0.152000] Call Trace:
[    0.152000]  [<ffffffff81323870>] ? pci_device_add+0x120/0x150
[    0.152000]  [<ffffffff813239ad>] pci_scan_slot+0x10d/0x150
[    0.152000]  [<ffffffff8132473d>] pci_scan_child_bus+0x3d/0x150
[    0.152000]  [<ffffffff8132453b>] pci_scan_bridge+0x46b/0x630
[    0.152000]  [<ffffffff813247b6>] pci_scan_child_bus+0xb6/0x150
[    0.152000]  [<ffffffff8132453b>] pci_scan_bridge+0x46b/0x630
[    0.152000]  [<ffffffff8163fb54>] ? pci_scan_single_device+0x54/0xc0
[    0.152000]  [<ffffffff813247b6>] pci_scan_child_bus+0xb6/0x150
[    0.152000]  [<ffffffff8152e3dd>] pci_acpi_scan_root+0x37d/0x540
[    0.152000]  [<ffffffff8136894a>] acpi_pci_root_add+0x2d7/0x3c5
[    0.152000]  [<ffffffff813641f0>] acpi_bus_device_attach+0x7d/0xcd
[    0.152000]  [<ffffffff8137e491>] acpi_ns_walk_namespace+0xc8/0x17f
[    0.152000]  [<ffffffff81364173>] ? acpi_bus_type_and_status+0x91/0x91
[    0.152000]  [<ffffffff81364173>] ? acpi_bus_type_and_status+0x91/0x91
[    0.152000]  [<ffffffff8137e980>] acpi_walk_namespace+0x95/0xc5
[    0.152000]  [<ffffffff81364ec7>] acpi_bus_scan+0x8b/0x9e
[    0.152000]  [<ffffffff81d4fe10>] acpi_scan_init+0x5e/0x15b
[    0.152000]  [<ffffffff81d4fc30>] acpi_init+0x25d/0x2a6
[    0.152000]  [<ffffffff81d4f9d3>] ? acpi_sleep_proc_init+0x2a/0x2a
[    0.152000]  [<ffffffff810020fa>] do_one_initcall+0xfa/0x1b0
[    0.152000]  [<ffffffff81086795>] ? parse_args+0x225/0x400
[    0.152000]  [<ffffffff81d0f078>] kernel_init_freeable+0x177/0x1ff
[    0.152000]  [<ffffffff81d0e898>] ? do_early_param+0x88/0x88
[    0.152000]  [<ffffffff8163da70>] ? rest_init+0x80/0x80
[    0.152000]  [<ffffffff8163da7e>] kernel_init+0xe/0x190
[    0.152000]  [<ffffffff816568ec>] ret_from_fork+0x7c/0xb0
[    0.152000]  [<ffffffff8163da70>] ? rest_init+0x80/0x80
[    0.152000] Code: 80 4c 24 49 70 48 8b 45 b0 4c 8b 68 28 4d 39 f5 0f 85 51 ff ff ff 4d 8b 2c 24 e9 ed fa ff ff 49 8b 45 10 48 8b 40 10 48 8b 40 38 <48> 8b 80 88 00 00 00 48 85 c0 0f 84 af 00 00 00 49 89 44 24 10 
[    0.152000] RIP  [<ffffffff81335471>] pcie_aspm_init_link_state+0x6f1/0x7c0
[    0.152000]  RSP <ffff88003da639c8>
[    0.152000] CR2: 0000000000000088
[    0.152021] ---[ end trace b7bd76ee625d0cdb ]---



Version-Release number of selected component (if applicable):
Host: Fedora 19, qemu-kvm-1.4.2-11.fc19.x86_64 and upstream qemu v1.7.0-rc0
Guest: Fedora 19, kernel-3.9.5-301 and kernel-3.11.7-200

How reproducible:
Always

Steps to Reproduce:
1. Run qemu with an x3130 switch:
MALLOC_PERTURB_=1  /usr/local/bin/qemu-system-x86_64  \
    -name 'virt-tests-vm1'  \
    -sandbox off  \
    -M q35  \
    -nodefaults  \
    -vga std \
    -device x3130-upstream,id=pci_switch1,bus=pcie.0,addr=02  \
    -chardev socket,id=hmp_id_hmp1,path=/tmp/monitor-hmp1-20131115-124049-pY0ctkmd,server,nowait \
    -mon chardev=hmp_id_hmp1,mode=readline  \
    -chardev socket,id=serial_id_serial1,path=/tmp/serial-serial1-20131115-124049-pY0ctkmd,server,nowait \
    -device isa-serial,chardev=serial_id_serial1  \
    -chardev socket,id=seabioslog_id_20131115-124049-pY0ctkmd,path=/tmp/seabios-20131115-124049-pY0ctkmd,server,nowait \
    -device isa-debugcon,chardev=seabioslog_id_20131115-124049-pY0ctkmd,iobase=0x402 \
    -device ich9-usb-uhci1,id=usb1,bus=pcie.0,addr=03 \
    -device xio3130-downstream,bus=pci_switch1,id=pci_switch1.0,addr=00,chassis=1 \
    -device nec-usb-xhci,id=test_xhci1,addr=00,bus=pci_switch1.0 \
    -drive id=drive_image1,if=none,file=/home/medic/Work/Projekty/autotest/autotest-ldoktor/client/tests/virt/shared/data/images/f19-64.qcow2 \
    -device virtio-blk-pci,id=image1,drive=drive_image1,bootindex=0,bus=pcie.0,addr=04 \
    -device virtio-net-pci,mac=9a:16:17:18:19:1a,id=idMBXaWf,netdev=idRlL0LS,bus=pcie.0,addr=05  \
    -netdev user,id=idRlL0LS,hostfwd=tcp::5000-:22  \
    -m 1024  \
    -smp 4,cores=1,threads=1,sockets=4  \
    -cpu 'SandyBridge' \
    -device usb-tablet,id=usb-tablet1,bus=usb1.0,port=1  \
    -vnc :1  \
    -rtc base=utc,clock=host,driftfix=none  \
    -boot order=cdn,once=c,menu=off \
    -enable-kvm


Actual results:
The guest fails to boot; the kernel raises: BUG: unable to handle kernel NULL pointer dereference at 0000000000000088

Expected results:
Guest boots properly with one device inside the switch

Additional info:
I tested many variants (pci.0->switch, pci.0->root->switch, pci.0->bridge->switch, ...) without success. When I use an addr other than 0x0, qemu boots successfully.

Comment 1 Justin M. Forbes 2013-11-18 15:09:49 UTC
Connecting the switch to the PCIe root complex is not possible outside of qemu, so the kernel neither expects it nor checks for it, and ASPM init just chokes. There was a patch floating around to fix this, but upstream rejected it because the code was fairly complex and it would only solve the qemu issue. That fix just ignored the invalid topology instead of hitting a NULL dereference, so you still couldn't *use* it. For now, the answer is: just don't do that. The ASPM code may be reworked in the future, but there is no timeframe for it.
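
For readers following the oops in the description, the NULL dereference can be read off the register dump and the Code: line. The Python sketch below is only an illustration and not part of any kernel tooling: the helper names (faulting_bytes, decode_mov_load) are hypothetical, and the decoder handles just the one instruction form found at the marked byte (REX.W 8B /r with a 32-bit displacement and no SIB byte). It suggests that the faulting instruction loads from RAX+0x88 while RAX is 0, which matches CR2 = 0x88.

```python
import struct

# Tail of the "Code:" line from the oops; <48> marks the faulting instruction.
CODE_TAIL = ("49 8b 45 10 48 8b 40 10 48 8b 40 38 <48> 8b 80 88 00 00 00 "
             "48 85 c0 0f 84 af 00 00 00 49 89 44 24 10")
RAX = 0x0  # value of RAX in the register dump of the oops

REG64 = ["rax", "rcx", "rdx", "rbx", "rsp", "rbp", "rsi", "rdi",
         "r8", "r9", "r10", "r11", "r12", "r13", "r14", "r15"]


def faulting_bytes(code_line):
    """Return the bytes starting at the token wrapped in <...>."""
    tokens = code_line.split()
    start = next(i for i, tok in enumerate(tokens) if tok.startswith("<"))
    return bytes(int(tok.strip("<>"), 16) for tok in tokens[start:])


def decode_mov_load(insn):
    """Decode `mov r64, [r64+disp32]` (REX.W 8B /r, mod=10, no SIB)."""
    rex, opcode, modrm = insn[0], insn[1], insn[2]
    if rex & 0xF8 != 0x48 or opcode != 0x8B:
        raise ValueError("not a REX.W 8B load")
    mod, reg, rm = modrm >> 6, (modrm >> 3) & 7, modrm & 7
    if mod != 2 or rm == 4:
        raise ValueError("only the disp32, no-SIB form is handled")
    reg |= (rex & 0x04) << 1  # REX.R extends the destination register
    rm |= (rex & 0x01) << 3   # REX.B extends the base register
    disp = struct.unpack_from("<i", insn, 3)[0]
    return REG64[reg], REG64[rm], disp


dst, base, disp = decode_mov_load(faulting_bytes(CODE_TAIL))
fault_addr = RAX + disp  # the base register is RAX in this oops
print(f"mov {dst}, [{base}+{disp:#x}] -> fault address {fault_addr:#x}")
```

The three loads before the marked byte (49 8b 45 10, 48 8b 40 10, 48 8b 40 38) are pointer loads of the same kind with 8-bit displacements, so the crash looks like the last step of a pointer chain whose previous link was NULL. That is consistent with the kernel not checking for a missing parent port, although mapping the offsets to specific structure fields is not attempted here.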

Comment 2 Lukáš Doktor 2013-11-19 09:54:28 UTC
OK, thank you for the explanation. I tested bridge->switch and ioh3420->switch, and both work fine.

Comment 3 Lukáš Doktor 2013-11-29 12:43:39 UTC
Hi Justin,

I found another setup that triggers this bug; here the switch is behind a PCI bridge:

->pci_root1->(test_devices)
           ->pci_bridge1->(test_devices)
           ->pci_bridge2->pci_switch1->(test_devices)

MALLOC_PERTURB_=1  /usr/local/bin/qemu-system-x86_64 \
    -S  \
    -name 'virt-tests-vm1'  \
    -sandbox off  \
    -M q35  \
    -nodefaults  \
    -vga std \
    -device ioh3420,id=pci_root1,bus=pcie.0,addr=02 \
    -device i82801b11-bridge,id=pci_bridge1,bus=pci_root1,addr=00 \
    -device i82801b11-bridge,id=pci_bridge2,bus=pci_root1,addr=01 \
    -device x3130-upstream,id=pci_switch1,bus=pci_bridge2,addr=01  \
    -chardev socket,id=hmp_id_hmp1,path=/tmp/monitor-hmp1-20131128-123533-ewpOwFW0,server,nowait \
    -mon chardev=hmp_id_hmp1,mode=readline  \
    -chardev socket,id=serial_id_serial1,path=/tmp/serial-serial1-20131128-123533-ewpOwFW0,server,nowait \
    -device isa-serial,chardev=serial_id_serial1  \
    -chardev socket,id=seabioslog_id_20131128-123533-ewpOwFW0,path=/tmp/seabios-20131128-123533-ewpOwFW0,server,nowait \
    -device isa-debugcon,chardev=seabioslog_id_20131128-123533-ewpOwFW0,iobase=0x402 \
    -device ich9-usb-uhci1,id=usb1,bus=pcie.0,addr=03 \
    -device nec-usb-xhci,id=test_xhci1,addr=02,bus=pci_root1 \
    -device nec-usb-xhci,id=test_xhci2,addr=01,bus=pci_bridge1 \
    -device xio3130-downstream,bus=pci_switch1,id=pci_switch1.0,addr=00,chassis=1 \
    -device nec-usb-xhci,id=test_xhci3,addr=00,bus=pci_switch1.0 \
    -drive id=drive_image1,if=none,file=/home/medic/Work/Projekty/autotest/autotest-ldoktor/client/tests/virt/shared/data/images/f19-64.qcow2 \
    -device virtio-blk-pci,id=image1,drive=drive_image1,bootindex=0,bus=pcie.0,addr=04 \
    -device virtio-net-pci,mac=9a:ea:eb:ec:ed:ee,id=idufbIuT,netdev=idZ9mrkG,bus=pcie.0,addr=05  \
    -netdev tap,id=idZ9mrkG,fd=10  \
    -m 1024  \
    -smp 4,cores=1,threads=1,sockets=4  \
    -cpu 'SandyBridge' \
    -device usb-tablet,id=usb-tablet1,bus=usb1.0,port=1  \
    -vnc :0  \
    -rtc base=utc,clock=host,driftfix=none  \
    -boot order=cdn,once=c,menu=off \
    -enable-kvm



BUG: unable to handle kernel NULL pointer dereference at 0000000000000088
[    0.173000] IP: [<ffffffff81335391>] pcie_aspm_init_link_state+0x701/0x7d0
[    0.173000] PGD 0 
[    0.173000] Oops: 0000 [#1] SMP 
[    0.173000] Modules linked in:
[    0.173000] CPU: 0 PID: 1 Comm: swapper/0 Not tainted 3.11.1-200.fc19.x86_64 #1
[    0.173000] Hardware name: Bochs Bochs, BIOS Bochs 01/01/2011
[    0.173000] task: ffff88003da68000 ti: ffff88003da62000 task.ti: ffff88003da62000
[    0.173000] RIP: 0010:[<ffffffff81335391>]  [<ffffffff81335391>] pcie_aspm_init_link_state+0x701/0x7d0
[    0.173000] RSP: 0000:ffff88003da63b88  EFLAGS: 00010246
[    0.173000] RAX: 0000000000000000 RBX: 0000000000000000 RCX: 0000000000000000
[    0.173000] RDX: 0000000000000000 RSI: 0000000000000000 RDI: ffff88003d4b1ab8
[    0.173000] RBP: ffff88003da63be0 R08: 0000000000016f40 R09: ffff88003e001800
[    0.173000] R10: ffffffff81334dc9 R11: 0000000000000000 R12: ffff88003d4b1a80
[    0.173000] R13: ffff88003d4cb000 R14: ffff88003d4b1a98 R15: ffff88003d4d4400
[    0.173000] FS:  0000000000000000(0000) GS:ffff88003fc00000(0000) knlGS:0000000000000000
[    0.173000] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[    0.173000] CR2: 0000000000000088 CR3: 0000000001c0c000 CR4: 00000000000406f0
[    0.173000] Stack:
[    0.173000]  ffff88003da63bb0 ffffffff81320162 ffff88003d4cc000 ffff88003d4d4800
[    0.173000]  0000000000008000 ffff88003da63be0 0000000000000000 ffff88003d4d4800
[    0.173000]  0000000000000001 0000000000000000 ffff88003d4d4400 ffff88003da63c10
[    0.173000] Call Trace:
[    0.173000]  [<ffffffff81320162>] ? __rdmsr_on_cpu+0x42/0x50
[    0.173000]  [<ffffffff8132384d>] pci_scan_slot+0x10d/0x150
[    0.173000]  [<ffffffff813245dd>] pci_scan_child_bus+0x3d/0x150
[    0.173000]  [<ffffffff813243db>] pci_scan_bridge+0x46b/0x630
[    0.173000]  [<ffffffff81324656>] pci_scan_child_bus+0xb6/0x150
[    0.173000]  [<ffffffff813243db>] pci_scan_bridge+0x46b/0x630
[    0.173000]  [<ffffffff8163fa54>] ? pci_scan_single_device+0x54/0xc0
[    0.173000]  [<ffffffff81324656>] pci_scan_child_bus+0xb6/0x150
[    0.173000]  [<ffffffff813248f0>] pci_scan_root_bus+0xa0/0xb0
[    0.173000]  [<ffffffff8152ffbc>] pci_scan_bus_on_node+0x7c/0xd0
[    0.173000]  [<ffffffff8152e927>] pcibios_scan_specific_bus+0x97/0xa0
[    0.173000]  [<ffffffff81d60d14>] ? pci_legacy_init+0x37/0x37
[    0.173000]  [<ffffffff81d60d4a>] pci_subsys_init+0x36/0x48
[    0.173000]  [<ffffffff810020fa>] do_one_initcall+0xfa/0x1b0
[    0.173000]  [<ffffffff810866f5>] ? parse_args+0x225/0x400
[    0.173000]  [<ffffffff81d0f078>] kernel_init_freeable+0x177/0x1fa
[    0.173000]  [<ffffffff81d0e898>] ? do_early_param+0x88/0x88
[    0.173000]  [<ffffffff8163d970>] ? rest_init+0x80/0x80
[    0.173000]  [<ffffffff8163d97e>] kernel_init+0xe/0x190
[    0.173000]  [<ffffffff816567ac>] ret_from_fork+0x7c/0xb0
[    0.173000]  [<ffffffff8163d970>] ? rest_init+0x80/0x80
[    0.173000] Code: 80 4c 24 49 70 48 8b 45 b0 4c 8b 68 28 4d 39 f5 0f 85 51 ff ff ff 4d 8b 2c 24 e9 eb fa ff ff 49 8b 45 10 48 8b 40 10 48 8b 40 38 <48> 8b 80 88 00 00 00 48 85 c0 0f 84 af 00 00 00 49 89 44 24 10 
[    0.173000] RIP  [<ffffffff81335391>] pcie_aspm_init_link_state+0x701/0x7d0
[    0.173000]  RSP <ffff88003da63b88>
[    0.173000] CR2: 0000000000000088
[    0.173007] ---[ end trace af03b6c981d0a760 ]---


Would you please tell me what is wrong with this setup?

Kind regards,
Lukáš
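
As a rough aid for comparing the command lines in this report, the Python sketch below walks the -device options and lists every x3130-upstream port whose parent is neither a root port (ioh3420) nor a switch downstream port (xio3130-downstream). It rests on the general PCIe rule that a switch upstream port sits below a root port or a downstream port. The function names and the VALID_PARENTS set are hypothetical, the argument lists are trimmed to the relevant -device options, and the third list is an assumed ioh3420->switch layout because the exact command from Comment 2 is not given. A check like this only flags topologies that real hardware would not produce; it does not predict whether a particular kernel crashes on them.

```python
# Device types below which a switch upstream port is expected to sit.
# Assumption: only the port models named in this report are considered.
VALID_PARENTS = {"ioh3420", "xio3130-downstream"}


def parse_devices(argv):
    """Turn each `-device type,key=value,...` pair into a dict."""
    devices = []
    for flag, value in zip(argv, argv[1:]):
        if flag != "-device":
            continue
        dev_type, *props = value.split(",")
        dev = dict(p.split("=", 1) for p in props if "=" in p)
        dev["type"] = dev_type
        devices.append(dev)
    return devices


def misplaced_upstream_ports(argv):
    """Return (switch id, bus, parent type) for each suspicious upstream port."""
    devices = parse_devices(argv)
    by_id = {d["id"]: d for d in devices if "id" in d}
    found = []
    for dev in devices:
        if dev["type"] != "x3130-upstream":
            continue
        bus = dev.get("bus")
        parent = by_id.get(bus)
        # None means the bus is not created by any -device, e.g. pcie.0.
        parent_type = parent["type"] if parent else None
        if parent_type not in VALID_PARENTS:
            found.append((dev.get("id"), bus, parent_type))
    return found


# Trimmed from the command in the description: switch directly on pcie.0.
description = [
    "-M", "q35",
    "-device", "x3130-upstream,id=pci_switch1,bus=pcie.0,addr=02",
    "-device", "xio3130-downstream,bus=pci_switch1,id=pci_switch1.0,addr=00,chassis=1",
]
# Trimmed from the command in Comment 3: switch behind i82801b11-bridge.
comment3 = [
    "-device", "ioh3420,id=pci_root1,bus=pcie.0,addr=02",
    "-device", "i82801b11-bridge,id=pci_bridge2,bus=pci_root1,addr=01",
    "-device", "x3130-upstream,id=pci_switch1,bus=pci_bridge2,addr=01",
    "-device", "xio3130-downstream,bus=pci_switch1,id=pci_switch1.0,addr=00,chassis=1",
]
# Assumed layout for the ioh3420->switch case mentioned in Comment 2.
root_port = [
    "-device", "ioh3420,id=pci_root1,bus=pcie.0,addr=02",
    "-device", "x3130-upstream,id=pci_switch1,bus=pci_root1",
    "-device", "xio3130-downstream,bus=pci_switch1,id=pci_switch1.0,addr=00,chassis=1",
]

for name, argv in [("description", description), ("comment 3", comment3),
                   ("root port", root_port)]:
    print(name, misplaced_upstream_ports(argv))
```

Under these assumptions the first two lists are flagged and the third is not, which lines up with the two oopses reported here and with the working ioh3420->switch case.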

