| Summary: | [3.13.0-0.rc4.git0.1.fc21] Rebooting/booting of an L2 guest result in L1 (KVM guest) crash with "soft lockup" | | |
|---|---|---|---|
| Product: | [Fedora] Fedora | Reporter: | Kashyap Chamarthy <kchamart> |
| Component: | kernel | Assignee: | Kernel Maintainer List <kernel-maint> |
| Status: | CLOSED RAWHIDE | QA Contact: | Fedora Extras Quality Assurance <extras-qa> |
| Severity: | unspecified | Docs Contact: | |
| Priority: | unspecified | | |
| Version: | rawhide | CC: | gansalmon, itamar, jonathan, kernel-maint, madhu.chinakonda |
| Target Milestone: | --- | | |
| Target Release: | --- | | |
| Hardware: | Unspecified | | |
| OS: | Unspecified | | |
| Whiteboard: | | | |
| Fixed In Version: | | Doc Type: | Bug Fix |
| Doc Text: | | Story Points: | --- |
| Clone Of: | | Environment: | |
| Last Closed: | 2013-12-18 15:52:23 UTC | Type: | Bug |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | | Category: | --- |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Attachments: | | | |
Created attachment 838374: Full dmesg output from L1 guest.
Created attachment 838375: dmidecode output on L1 guest.
1/ Full stack trace with 3.13.0-0.rc4.git0.1.fc21:
================
[  268.134004] Code: 67 c2 00 89 c2 39 f0 0f 8d 33 fe ff ff 48 98 48 8b 0b 48 03 0c c5 20 77 cf 81 f6 41 20 01 74 cd 0f 1f 44 00 00 f3 90 f6 41 20 01 <75> f8 48 63 35 d1 66 c2 00 eb b7 0f b6 4d b4 48 8b 75 c0 4c 89
[  296.134004] BUG: soft lockup - CPU#3 stuck for 22s! [kworker/3:1:39]
[  296.134004] Modules linked in: vhost_net vhost macvtap macvlan tun nf_conntrack_ipv6 nf_defrag_ipv6 xt_mac xt_physdev veth ipt_MASQUERADE iptable_nat nf_nat_ipv4 nf_nat xt_CHECKSUM iptable_mangle bridge stp llc ip6table_filter ip6_tables ebtable_nat ebtables openvswitch vxlan ip_tunnel gre libcrc32c nf_conntrack_ipv4 nf_defrag_ipv4 xt_conntrack nf_conntrack kvm_intel kvm virtio_net ppdev crc32c_intel i2c_piix4 virtio_balloon parport_pc i2c_core parport serio_raw microcode nfsd auth_rpcgss nfs_acl lockd sunrpc virtio_blk virtio_pci virtio_ring virtio ata_generic pata_acpi
[  296.134004] CPU: 3 PID: 39 Comm: kworker/3:1 Not tainted 3.13.0-0.rc4.git0.1.fc21.x86_64 #1
[  296.134004] Hardware name: Bochs Bochs, BIOS Bochs 01/01/2011
[  296.134004] Workqueue: events jump_label_update_timeout
[  296.134004] task: ffff8801fce41140 ti: ffff8801fce68000 task.ti: ffff8801fce68000
[  296.134004] RIP: 0010:[<ffffffff810d560a>]  [<ffffffff810d560a>] smp_call_function_many+0x25a/0x2c0
[  296.134004] RSP: 0018:ffff8801fce69c98  EFLAGS: 00000202
[  296.134004] RAX: 0000000000000001 RBX: 0000000000000282 RCX: ffff880208098178
[  296.134004] RDX: 0000000000000001 RSI: 0000000000000004 RDI: 0000000000000000
[  296.134004] RBP: ffff8801fce69ce8 R08: ffff8801fd388c00 R09: ffff880208196ee0
[  296.134004] R10: ffffea0007e9f0c0 R11: ffffffff8130d829 R12: ffff8801fce69c50
[  296.134004] R13: 0000000000000fca R14: 0000000000000000 R15: ffff8801fcc6e240
[  296.134004] FS:  0000000000000000(0000) GS:ffff880208180000(0000) knlGS:0000000000000000
[  296.134004] CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
[  296.134004] CR2: 00000000020d4830 CR3: 00000001fc3d2000 CR4: 00000000000027e0
[  268.134004] Call Trace:
[  268.134004]  [<ffffffff81019bb0>] ? setup_data_read+0x90/0x90
[  268.134004]  [<ffffffffa01a9e7a>] ? apic_has_pending_timer+0x1a/0x90 [kvm]
[  268.134004]  [<ffffffff81019bb0>] ? setup_data_read+0x90/0x90
[  268.134004]  [<ffffffffa01a9e7b>] ? apic_has_pending_timer+0x1b/0x90 [kvm]
[  268.134004]  [<ffffffff810d56cd>] on_each_cpu+0x2d/0x60
[  268.134004]  [<ffffffffa01a9e7a>] ? apic_has_pending_timer+0x1a/0x90 [kvm]
[  268.134004]  [<ffffffff8101a38b>] text_poke_bp+0x5b/0xc0
[  268.134004]  [<ffffffffa01a9e7a>] ? apic_has_pending_timer+0x1a/0x90 [kvm]
[  268.134004]  [<ffffffff8101789c>] arch_jump_label_transform+0x8c/0x100
[  268.134004]  [<ffffffff811438ff>] __jump_label_update+0x5f/0x80
[  268.134004]  [<ffffffff811439bd>] jump_label_update+0x9d/0xb0
[  268.134004]  [<ffffffff81143c1d>] __static_key_slow_dec+0x4d/0xb0
[  268.134004]  [<ffffffff81143d36>] jump_label_update_timeout+0x16/0x20
[  268.134004]  [<ffffffff810829a6>] process_one_work+0x176/0x430
[  268.134004]  [<ffffffff810835db>] worker_thread+0x11b/0x3a0
[  268.134004]  [<ffffffff810834c0>] ? rescuer_thread+0x350/0x350
[  268.134004]  [<ffffffff8108a062>] kthread+0xd2/0xf0
[  268.134004]  [<ffffffff81089f90>] ? insert_kthread_work+0x40/0x40
[  268.134004]  [<ffffffff816849bc>] ret_from_fork+0x7c/0xb0
[  268.134004]  [<ffffffff81089f90>] ? insert_kthread_work+0x40/0x40
[  268.134004] Code: 67 c2 00 89 c2 39 f0 0f 8d 33 fe ff ff 48 98 48 8b 0b 48 03 0c c5 20 77 cf 81 f6 41 20 01 74 cd 0f 1f 44 00 00 f3 90 f6 41 20 01 <75> f8 48 63 35 d1 66 c2 00 eb b7 0f b6 4d b4 48 8b 75 c0 4c 89
================

2/ Stack trace with 3.11.8-300.fc20 (noticed on L1 guest's serial console):
================
[ 1140.060007] Code: f0 ff 4b 24 0f 94 c1 84 c9 44 89 e2 74 12 8b 43 20 8b 4b 10 8d 50 01 89 4b 24 89 53 20 44 89 e2 83 fa 04 74 20 f3 90 44 8b 63 20 <41> 39 d4 74 f0 41 83 fc 02 75 bf fa 66 66 90 66 66 90 eb c1 66
[ 1152.076005] BUG: soft lockup - CPU#1 stuck for 22s! [migration/1:12]
[ 1152.076005] Modules linked in: vhost_net vhost macvtap macvlan tun nf_conntrack_ipv6 nf_defrag_ipv6 xt_mac xt_physdev veth ipt_MASQUERADE iptable_nat nf_nat_ipv4 nf_nat xt_CHECKSUM iptable_mangle bridge stp llc ip6table_filter ip6_tables ebtable_nat ebtables openvswitch gre nf_conntrack_ipv4 nf_defrag_ipv4 xt_conntrack nf_conntrack joydev virtio_balloon kvm_intel i2c_piix4 i2c_core ppdev kvm virtio_net serio_raw crc32c_intel parport_pc parport mperf microcode nfsd auth_rpcgss nfs_acl lockd sunrpc virtio_blk virtio_pci virtio_ring ata_generic virtio pata_acpi
[ 1152.076005] CPU: 1 PID: 12 Comm: migration/1 Not tainted 3.11.8-300.fc20.x86_64 #1
[ 1152.076005] Hardware name: Bochs Bochs, BIOS Bochs 01/01/2011
[ 1152.076005] task: ffff8801fd396ac0 ti: ffff8801fcc24000 task.ti: ffff8801fcc24000
[ 1152.076005] RIP: 0010:[<ffffffff810dec76>]  [<ffffffff810dec76>] stop_machine_cpu_stop+0x86/0xe0
[ 1152.076005] RSP: 0018:ffff8801fcc25d90  EFLAGS: 00000293
[ 1152.076005] RAX: ffffffff81cec880 RBX: ffff8801fcc60540 RCX: dead000000200200
[ 1152.076005] RDX: 0000000000000001 RSI: 0000000000000000 RDI: ffff8801f9f21cd8
[ 1168.099005] RBP: ffff8801fcca9db0 R08: ffff88020818e900 R09: 0000000000000000
[ 1168.099005] R10: 0000000000000001 R11: 0000000000000001 R12: ffff8801fcca9ddc
[ 1168.099005] R13: 0000000000014180 R14: 0000000200000003 R15: ffffffff810a2a33
[ 1168.099005] FS:  0000000000000000(0000) GS:ffff880208180000(0000) knlGS:0000000000000000
[ 1168.099005] CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
[ 1168.099005] CR2: 000000000047ca40 CR3: 00000001f9d77000 CR4: 00000000000027e0
[ 1168.099005] Stack:
[ 1168.099005]  ffff8801f9f21c58 ffff88020818e930 ffff8801f9f21cd8 ffffffff810debf0
[ 1168.099005]  ffff8801fcca9e78 ffffffff810deb42 ffff88020818e938 0000000000000000
[ 1168.099005]  0000000000000000 ffff880208194180 ffff8801fbf32680 0000000000000000
[ 1168.099005] Call Trace:
[ 1168.099005]  [<ffffffff810debf0>] ? cpu_stopper_thread+0x130/0x130
[ 1168.099005]  [<ffffffff810deb42>] cpu_stopper_thread+0x82/0x130
[ 1168.099005]  [<ffffffff81647f3a>] ? __schedule+0x2ba/0x750
[ 1168.099005]  [<ffffffff8109085f>] smpboot_thread_fn+0xff/0x1a0
[ 1168.099005]  [<ffffffff81090760>] ? lg_local_lock_cpu+0x40/0x40
[ 1168.099005]  [<ffffffff81088660>] kthread+0xc0/0xd0
[ 1168.099005]  [<ffffffff810885a0>] ? insert_kthread_work+0x40/0x40
[ 1168.099005]  [<ffffffff816528ac>] ret_from_fork+0x7c/0xb0
[ 1168.099005]  [<ffffffff810885a0>] ? insert_kthread_work+0x40/0x40
[ 1168.099005] Code: 84 ed 75 5d f0 ff 4b 24 0f 94 c1 84 c9 44 89 e2 74 12 8b 43 20 8b 4b 10 8d 50 01 89 4b 24 89 53 20 44 89 e2 83 fa 04 74 20 f3 90 <44> 8b 63 20 41 39 d4 74 f0 41 83 fc 02 75 bf fa 66 66 90 66 66
================

The soft lockups are basically the result of the host not scheduling the guest to run within a sufficient period of time, at least as far as we can tell. We have a patch to prevent them from triggering on KVM guests, but it wasn't working.

It was trying to do a DMI match against "Bochs Bochs" for the Vendor, but the Vendor is just "Bochs". I've fixed the patch in F19-rawhide now, and I see the message I would expect on an F20 KVM guest:

[    0.030000] watchdog: disabled (inside virtual machine)

Should be in tomorrow's rawhide.

(In reply to Josh Boyer from comment #4)
> The soft lockups are basically the result of the host not scheduling the
> guest to run in a sufficient period of time. At least as far as we can
> tell. We have a patch to prevent them from triggering on KVM guests, but it
> wasn't working.
>
> It was trying to do a DMI match against "Bochs Bochs" for the Vendor, but
> the Vendor is just "Bochs". I've fixed the patch in F19-rawhide now and I
> see the message I would expect on an F20 KVM guest:
>
> [    0.030000] watchdog: disabled (inside virtual machine)
>
> Should be in tomorrow's rawhide.

Thanks for the patch, Josh.

It'd be nicer if the bug were closed *after* the successful build lands in Rawhide, noting the NVR in "Fixed In Version".

Ok, that's the version Josh is referring to in his previous comment: kernel-3.13.0-0.rc4.git2.1.fc21
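To make Josh's fix concrete: a minimal sketch (not the kernel patch itself) of the vendor comparison the broken patch got wrong. On a live guest the vendor string would come from /sys/class/dmi/id/sys_vendor or `dmidecode -s system-manufacturer`; here it is hard-coded to "Bochs", the value this report implies a QEMU/SeaBIOS guest actually exposes, while "Bochs Bochs" only appears in the dmesg "Hardware name" line.

```shell
# Simulated DMI vendor string; on a real guest use:
#   vendor=$(cat /sys/class/dmi/id/sys_vendor)
vendor='Bochs'

if [ "$vendor" = "Bochs Bochs" ]; then
    # The broken patch matched this string, which never occurs as the vendor.
    echo "broken pattern matched"
elif [ "$vendor" = "Bochs" ]; then
    # The fixed patch matches here, so the watchdog gets disabled in the VM.
    echo "fixed pattern matched"
else
    echo "no match"
fi
```

With the simulated value, only the fixed pattern matches, which is why the original patch silently never fired.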
Description of problem:
-----------------------
I don't really have a reliable reproducer for this soft lockup. Let me describe the two occurrences where I noticed it.

On the first occurrence -- with kernel-3.11.8-300.fc20 -- I noticed (on the serial console) the soft lockup crash on L1 when I tried to reboot an L2 KVM guest (i.e. a nested guest).

On the second occurrence -- with kernel-3.13.0-0.rc4.git0.1.fc21 -- L1 crashed with the soft lockup when I tried to just *boot* (not reboot) an L2 KVM guest.

$ rpm -q kernel --changelog | grep "Disable soft lockup detector on virtual machines"
- Disable soft lockup detector on virtual machines. (rhbz 971139)
- Disable soft lockup detector on virtual machines. (rhbz 971139)
- Disable soft lockup detector on virtual machines. (rhbz 971139)

However, in the guest (that crashed with the soft lockup):

$ dmesg | grep watchdog
[    0.061707] NMI watchdog: enabled on all CPUs, permanently consumes one hw-PMU counter.

And this is indeed running inside a virtual machine:

$ virt-what
kvm

$ file /dev/kvm
/dev/kvm: character special

$ cat /proc/cpuinfo | grep vmx
flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ss syscall nx rdtscp lm constant_tsc arch_perfmon rep_good nopl pni vmx ssse3 cx16 sse4_1 sse4_2 x2apic popcnt hypervisor lahf_lm vnmi ept tsc_adjust

Just to note: the Fedora kernel patch which tries to disable the watchdog in a VM describes the symptom as "for various reasons":
http://pkgs.fedoraproject.org/cgit/kernel.git/tree/nowatchdog-on-virt.patch

Version:
--------
On L0 (physical host):

$ uname -r; rpm -q libvirt-daemon-kvm qemu-system-x86
3.13.0-0.rc4.git0.1.fc21.x86_64
libvirt-daemon-kvm-1.2.0-1.fc20.x86_64
qemu-system-x86-1.7.0-1.fc20.x86_64

On L1 (KVM guest):

$ uname -r; rpm -q libvirt-daemon-kvm qemu-system-x86
3.13.0-0.rc4.git0.1.fc21.x86_64
libvirt-daemon-kvm-1.1.3.2-1.fc20.x86_64
qemu-system-x86-1.6.1-2.fc20.x86_64

How reproducible:
-----------------
At least twice.

Steps to Reproduce:
-------------------
In a nested KVM environment, reboot or boot an L2 KVM guest.

Details on setting up a nested KVM environment:
https://github.com/kashyapc/nvmx-haswell/blob/master/SETUP-nVMX.rst
(Note: the above outlines the setup for Haswell, but it should apply to any other recent CPU model.)

Actual results:
---------------
The KVM guest (L1) crashes with a stack trace (from the serial console):

===============
. . .
[  296.134004] BUG: soft lockup - CPU#3 stuck for 22s! [kworker/3:1:39]
[  296.134004] Modules linked in: vhost_net vhost macvtap macvlan tun nf_conntrack_ipv6 nf_defrag_ipv6 xt_mac xt_physdev veth . . . ipt_MASQUERADE iptable_nat nf_nat_ipv4 nf_nat xt_CHECKSUM iptable_mangle bridge stp llc ip6table_filter ip6_tables ebtable_nat ebtables openvswitch vxlan ip_tunnel gre libcrc32c nf_conntrack_ipv4 nf_defrag_ipv4 xt_conntrack nf_conntrack kvm_intel kvm virtio_net ppdev crc32c_intel i2c_piix4 virtio_balloon parport_pc i2c_core parport serio_raw microcode nfsd auth_rpcgss nfs_acl lockd sunrpc virtio_blk virtio_pci virtio_ring virtio ata_generic pata_acpi
[  296.134004] CPU: 3 PID: 39 Comm: kworker/3:1 Not tainted 3.13.0-0.rc4.git0.1.fc21.x86_64 #1
[  296.134004] Hardware name: Bochs Bochs, BIOS Bochs 01/01/2011
[  296.134004] Workqueue: events jump_label_update_timeout
[  296.134004] task: ffff8801fce41140 ti: ffff8801fce68000 task.ti: ffff8801fce68000
[  296.134004] RIP: 0010:[<ffffffff810d560a>]  [<ffffffff810d560a>] smp_call_function_many+0x25a/0x2c0
[  296.134004] RSP: 0018:ffff8801fce69c98  EFLAGS: 00000202
[  296.134004] RAX: 0000000000000001 RBX: 0000000000000282 RCX: ffff880208098178
[  296.134004] RDX: 0000000000000001 RSI: 0000000000000004 RDI: 0000000000000000
[  296.134004] RBP: ffff8801fce69ce8 R08: ffff8801fd388c00 R09: ffff880208196ee0
[  296.134004] R10: ffffea0007e9f0c0 R11: ffffffff8130d829 R12: ffff8801fce69c50
[  296.134004] R13: 0000000000000fca R14: 0000000000000000 R15: ffff8801fcc6e240
[  296.134004] FS:  0000000000000000(0000) GS:ffff880208180000(0000) knlGS:0000000000000000
[  296.134004] CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
[  296.134004] CR2: 00000000020d4830 CR3: 00000001fc3d2000 CR4: 00000000000027e0
[  268.134004] Call Trace:
[  268.134004]  [<ffffffff81019bb0>] ? setup_data_read+0x90/0x90
[  268.134004]  [<ffffffffa01a9e7a>] ? apic_has_pending_timer+0x1a/0x90 [kvm]
[  268.134004]  [<ffffffff81019bb0>] ? setup_data_read+0x90/0x90
[  268.134004]  [<ffffffffa01a9e7b>] ? apic_has_pending_timer+0x1b/0x90 [kvm]
[  268.134004]  [<ffffffff810d56cd>] on_each_cpu+0x2d/0x60
[  268.134004]  [<ffffffffa01a9e7a>] ? apic_has_pending_timer+0x1a/0x90 [kvm]
[  268.134004]  [<ffffffff8101a38b>] text_poke_bp+0x5b/0xc0
[  268.134004]  [<ffffffffa01a9e7a>] ? apic_has_pending_timer+0x1a/0x90 [kvm]
[  268.134004]  [<ffffffff8101789c>] arch_jump_label_transform+0x8c/0x100
[  268.134004]  [<ffffffff811438ff>] __jump_label_update+0x5f/0x80
[  268.134004]  [<ffffffff811439bd>] jump_label_update+0x9d/0xb0
[  268.134004]  [<ffffffff81143c1d>] __static_key_slow_dec+0x4d/0xb0
[  268.134004]  [<ffffffff81143d36>] jump_label_update_timeout+0x16/0x20
[  268.134004]  [<ffffffff810829a6>] process_one_work+0x176/0x430
[  268.134004]  [<ffffffff810835db>] worker_thread+0x11b/0x3a0
[  268.134004]  [<ffffffff810834c0>] ? rescuer_thread+0x350/0x350
[  268.134004]  [<ffffffff8108a062>] kthread+0xd2/0xf0
[  268.134004]  [<ffffffff81089f90>] ? insert_kthread_work+0x40/0x40
[  268.134004]  [<ffffffff816849bc>] ret_from_fork+0x7c/0xb0
[  268.134004]  [<ffffffff81089f90>] ? insert_kthread_work+0x40/0x40
===============

Expected results:
-----------------
L1 (the KVM guest) does not crash with a soft lockup.

Additional info:
----------------
(0) Related bug: https://bugzilla.redhat.com/show_bug.cgi?id=971139 -- BUG: soft lockup - CPU#3 stuck for 27s! [mc:39870]

(1) dmesg output from the guest attached.

(2) dmidecode output from the guest attached.

(3) QEMU command line of the (L1) guest:

$ ps -ef | grep qemu
qemu 3958 1 99 15:36 ? 00:54:41 /usr/bin/qemu-system-x86_64 -machine accel=kvm -name ostack-compute -S -machine pc-i440fx-1.6,accel=kvm,usb=off -cpu host -m 7813 -realtime mlock=off -smp 4,sockets=4,cores=1,threads=1 -uuid 3db96f52-11b0-41f5-bdb0-3a4605efaee5 -nographic -no-user-config -nodefaults -chardev socket,id=charmonitor,path=/var/lib/libvirt/qemu/ostack-compute.monitor,server,nowait -mon chardev=charmonitor,id=monitor,mode=control -rtc base=utc -no-shutdown -device piix3-usb-uhci,id=usb,bus=pci.0,addr=0x1.0x2 -drive file=/export/ostack-images/ostack-compute-sparse.qcow2,if=none,id=drive-virtio-disk0,format=qcow2,cache=none -device virtio-blk-pci,scsi=off,bus=pci.0,addr=0x3,drive=drive-virtio-disk0,id=virtio-disk0,bootindex=1 -netdev tap,fd=24,id=hostnet0,vhost=on,vhostfd=26 -device virtio-net-pci,netdev=hostnet0,id=net0,mac=52:54:00:58:76:13,bus=pci.0,addr=0x2 -chardev pty,id=charserial0 -device isa-serial,chardev=charserial0,id=serial0 -device usb-tablet,id=input0 -device virtio-balloon-pci,id=balloon0,bus=pci.0,addr=0x4

(4) The host was never under heavy load. It's just running another Fedora guest with a few OpenStack daemons (no heavy network or disk I/O).

(5) I checked the Fedora kernel sources from the master branch; nowatchdog-on-virt.patch is indeed applied, but is it not taking effect?

$ fedpkg clone -B kernel
$ grep nowatchdog kernel.spec
Patch15000: nowatchdog-on-virt.patch
ApplyPatch nowatchdog-on-virt.patch
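The `dmesg | grep watchdog` check above can be turned into a small test for whether the nowatchdog-on-virt patch took effect. This is a hedged sketch: the sample strings are copied from the dmesg lines quoted in this report, and a real check would pipe `dmesg` itself rather than hard-coded lines.

```shell
# Classify a dmesg line by watchdog state, based on the two messages
# observed in this report: the "NMI watchdog: enabled" line means the
# patch did not fire; the "disabled (inside virtual machine)" line means it did.
check_watchdog() {
    case "$1" in
        *'watchdog: disabled (inside virtual machine)'*)
            echo "patch effective" ;;
        *'NMI watchdog: enabled'*)
            echo "patch NOT effective" ;;
        *)
            echo "no watchdog message" ;;
    esac
}

# Sample lines copied from this report:
check_watchdog '[    0.061707] NMI watchdog: enabled on all CPUs, permanently consumes one hw-PMU counter.'
check_watchdog '[    0.030000] watchdog: disabled (inside virtual machine)'
```

On the broken kernels in this report the first branch fires; with Josh's fixed kernel-3.13.0-0.rc4.git2.1.fc21 the second should.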