Created attachment 377199 [details]
Hand transcribed oops

Description of problem:

Using KVM on a new Dell Precision T7500 has resulted in great instability, to the extent that the system has been hanging every 10 minutes or so. At first the NVIDIA driver was loaded, so I switched to Nouveau and the crashes still happened.

I'm using the version of the kernel installed when I used the commands suggested at:

https://fedoraproject.org/wiki/How_to_use_kdump_to_debug_kernel_crashes

Not sure why it chose an unreleased kernel. Anyway, I wasn't able to get kdump to work so have hand-transcribed this morning's crash. The system was idle, other than the KVM guest.

Version-Release number of selected component (if applicable):

2.6.31.6-162.fc12.x86_64

How reproducible:

Steps to Reproduce:
1.
2.
3.

Actual results:

Expected results:

Additional info:

00:00.0 Host bridge: Intel Corporation X58 I/O Hub to ESI Port (rev 13)
00:01.0 PCI bridge: Intel Corporation X58 I/O Hub PCI Express Root Port 1 (rev 13)
00:03.0 PCI bridge: Intel Corporation X58 I/O Hub PCI Express Root Port 3 (rev 13)
00:07.0 PCI bridge: Intel Corporation X58 I/O Hub PCI Express Root Port 7 (rev 13)
00:14.0 PIC: Intel Corporation X58 I/O Hub System Management Registers (rev 13)
00:14.1 PIC: Intel Corporation X58 I/O Hub GPIO and Scratch Pad Registers (rev 13)
00:14.2 PIC: Intel Corporation X58 I/O Hub Control Status and RAS Registers (rev 13)
00:1a.0 USB Controller: Intel Corporation 82801JI (ICH10 Family) USB UHCI Controller #4
00:1a.1 USB Controller: Intel Corporation 82801JI (ICH10 Family) USB UHCI Controller #5
00:1a.2 USB Controller: Intel Corporation 82801JI (ICH10 Family) USB UHCI Controller #6
00:1a.7 USB Controller: Intel Corporation 82801JI (ICH10 Family) USB2 EHCI Controller #2
00:1b.0 Audio device: Intel Corporation 82801JI (ICH10 Family) HD Audio Controller
00:1c.0 PCI bridge: Intel Corporation 82801JI (ICH10 Family) PCI Express Port 1
00:1c.5 PCI bridge: Intel Corporation 82801JI (ICH10 Family) PCI Express Port 6
00:1d.0 USB Controller: Intel Corporation 82801JI (ICH10 Family) USB UHCI Controller #1
00:1d.1 USB Controller: Intel Corporation 82801JI (ICH10 Family) USB UHCI Controller #2
00:1d.2 USB Controller: Intel Corporation 82801JI (ICH10 Family) USB UHCI Controller #3
00:1d.7 USB Controller: Intel Corporation 82801JI (ICH10 Family) USB2 EHCI Controller #1
00:1e.0 PCI bridge: Intel Corporation 82801 PCI Bridge (rev 90)
00:1f.0 ISA bridge: Intel Corporation 82801JIR (ICH10R) LPC Interface Controller
00:1f.2 SATA controller: Intel Corporation 82801JI (ICH10 Family) SATA AHCI Controller
00:1f.3 SMBus: Intel Corporation 82801JI (ICH10 Family) SMBus Controller
01:00.0 PCI bridge: Pericom Semiconductor PCI Express to PCI-XPI7C9X130 PCI-X Bridge (rev 04)
03:00.0 VGA compatible controller: nVidia Corporation Device 0659 (rev a1)
05:00.0 SCSI storage controller: LSI Logic / Symbios Logic SAS1068E PCI-Express Fusion-MPT SAS (rev 08)
06:00.0 Ethernet controller: Broadcom Corporation NetXtreme BCM5761 Gigabit Ethernet PCIe (rev 10)
07:0a.0 FireWire (IEEE 1394): Texas Instruments TSB43AB22/A IEEE-1394a-2000 Controller (PHY/Link)
20:03.0 PCI bridge: Intel Corporation X58 I/O Hub PCI Express Root Port 3 (rev 13)
20:07.0 PCI bridge: Intel Corporation X58 I/O Hub PCI Express Root Port 7 (rev 13)
20:09.0 PCI bridge: Intel Corporation X58 I/O Hub PCI Express Root Port 9 (rev 13)
20:14.0 PIC: Intel Corporation X58 I/O Hub System Management Registers (rev 13)
20:14.1 PIC: Intel Corporation X58 I/O Hub GPIO and Scratch Pad Registers (rev 13)
20:14.2 PIC: Intel Corporation X58 I/O Hub Control Status and RAS Registers (rev 13)
23:00.0 RAID bus controller: LSI Logic / Symbios Logic MegaRAID SAS 1078 (rev 04)
Created attachment 378558 [details]
kernel oops

http://www.redhat.com/archives/fedora-virt/2009-December/msg00041.html

I am seeing a kernel oops and panic on a host running 2.6.31.6-166.fc12.x86_64 (and also at least 2.6.31.6-145) when I autostart a f12-x86_64 qemu-kvm guest. I'm running on a quad core AMD with:

qemu-kvm-0.11.0-12.fc12.x86_64
kernel-2.6.31.6-166.fc12.x86_64
libvirt-0.7.1-15.fc12.x86_64

If I flag a guest as autoboot and reboot the host, then the host starts, the guest starts, and some seconds later (presumably when the guest is fully up or just before) the host oopses and hangs.

The full error log is attached; it begins as:

BUG: unable to handle kernel paging request at 0000000000200200
IP: [<ffffffff8139aad7>] destroy_conntrack+0x82/0x11f
Thanks for the report.
In my case, it seems to be related to IPv6. If I switch off the ip6tables service and ensure the relevant modules aren't loaded, the system seems stable.
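For anyone wanting to confirm that the relevant modules really are unloaded, here is a minimal sketch that scans the text of /proc/modules for IPv6-related entries. The function name and the set of name prefixes it matches are my own assumptions, not a definitive list of the modules involved in this crash.

```python
def loaded_ipv6_modules(proc_modules_text):
    """Return names of loaded modules that look IPv6/ip6tables related.

    Expects text in /proc/modules format, where the module name is the
    first whitespace-separated field on each line.
    """
    # Assumed prefixes; adjust for the modules present on your system.
    prefixes = ("ip6", "nf_conntrack_ipv6", "nf_defrag_ipv6")
    names = []
    for line in proc_modules_text.splitlines():
        fields = line.split()
        if not fields:
            continue
        name = fields[0]
        if name == "ipv6" or name.startswith(prefixes):
            names.append(name)
    return names
```

Pass it the contents of /proc/modules (for example, open("/proc/modules").read()). An empty list suggests none of the matched modules are loaded.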
Same problem for me. In my case it is enough to disable ip6tables for the bridge by adding net.bridge.bridge-nf-call-ip6tables = 0 to /etc/sysctl.conf. There is no need to remove the ip6tables firewall completely.
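To check whether a given sysctl.conf actually carries that setting, a small parser along these lines may help. It is only a sketch: the function name is hypothetical, and it assumes that when a key appears more than once the last assignment is the one that takes effect.

```python
def sysctl_value(conf_text, key):
    """Return the value assigned to `key` in sysctl.conf-style text, or None.

    Blank lines and lines starting with '#' or ';' are skipped as comments.
    If the key is assigned more than once, the last assignment is returned
    (assumption: later lines override earlier ones).
    """
    value = None
    for raw in conf_text.splitlines():
        line = raw.strip()
        if not line or line.startswith(("#", ";")):
            continue
        name, sep, val = line.partition("=")
        if sep and name.strip() == key:
            value = val.strip()
    return value
```

Calling sysctl_value(open("/etc/sysctl.conf").read(), "net.bridge.bridge-nf-call-ip6tables") should return "0" once the line above has been added. This only inspects the file; it does not tell you what value the running kernel is currently using.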
That setting alone does not fix it for me, and these settings appear to be set by default on this box. I can reproduce this 100% reliably with recent F12 kernels and upstream 2.6.32 and 2.6.33-rc5.
Do the crashes happen if you blacklist the ipv6/ip6tables modules as well?
People, you may want to check out https://bugzilla.redhat.com/show_bug.cgi?id=533087 too, as there is similar and potentially relevant information there.
I spent the whole weekend learning the netfilter code and SLUB debugging to find this problem: http://lkml.org/lkml/2010/2/2/272 There should be a patch soon.
*** This bug has been marked as a duplicate of bug 533087 ***