Description of problem:
Host crash when suspending to memory while kvm guest running.

Version-Release number of selected component (if applicable):
Kernel-2.6.18-233.el5
kvm-83-219.el5

How reproducible:
98%

Steps to Reproduce:
1. Start a guest with command:
/usr/libexec/qemu-kvm -rtc-td-hack -usbdevice tablet -no-hpet -drive file=/root/win2003_x64/win2k3_64_virtio.raw,if=virtio,boot=on,werror=stop,cache=none,format=raw,media=disk -cpu qemu64,+sse2 -smp 4 -m 8G -net nic,macaddr=00:32:34:5f:d6:2e,model=virtio,vlan=0 -net tap,script=/etc/qemu-ifup,vlan=0 -fda /root/virtio-drivers-1.0.0-45801.vfd -uuid `uuidgen` -vnc :1 -boot c -balloon none -monitor stdio
2. Suspend host to memory with command:
echo mem >/sys/power/state

Actual results:
Host gets panic :

Expected results:

Additional info:
Detail log is as following:

Disabling non-boot CPUs ...
Breaking affinity for irq 4
CPU 1 is now offline
CPU1 is down
BUG: soft lockup - CPU#3 stuck for 60s! [qemu-kvm:3628]
CPU 3:
Modules linked in: tun radeon drm autofs4 hidp rfcomm l2cap bluetooth lockd sund
Pid: 3628, comm: qemu-kvm Tainted: G 2.6.18-233.el5 #1
RIP: 0010:[<ffffffff80077371>] [<ffffffff80077371>] __smp_call_function_many+0c
RSP: 0018:ffff810202ea5b78 EFLAGS: 00000297
RAX: 0000000000000002 RBX: 0000000000000003 RCX: 0000000000000282
RDX: 00000000000008fc RSI: ffff810202ea5c18 RDI: 00000000000000fc
RBP: 0000000100000000 R08: 0000000000000001 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000002 R12: 00000002000280d2
R13: ffff81000001dc10 R14: 0000004400000000 R15: 0000000000000000
FS: 0000000043dd3940(0063) GS:ffff81021fc1c640(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b
CR2: 00002aae573aaaf0 CR3: 000000021cb0f000 CR4: 00000000000006e0

Call Trace:
 [<ffffffff884170f8>] :kvm:ack_flush+0x0/0x1
 [<ffffffff884170f8>] :kvm:ack_flush+0x0/0x1
 [<ffffffff80077473>] smp_call_function_many+0x38/0x4c
 [<ffffffff884187fe>] :kvm:make_all_cpus_request+0x8f/0xa4
 [<ffffffff88418828>] :kvm:kvm_flush_remote_tlbs+0xb/0x17
 [<ffffffff884221be>] :kvm:kvm_mmu_zap_page+0x202/0x3ca
 [<ffffffff88423373>] :kvm:mmu_set_spte+0x255/0x3ca
 [<ffffffff88423a5e>] :kvm:direct_map_entry+0x5a/0xf6
 [<ffffffff8842134b>] :kvm:walk_shadow+0x96/0xc3
 [<ffffffff884213af>] :kvm:__direct_map+0x36/0x43
 [<ffffffff88423a04>] :kvm:direct_map_entry+0x0/0xf6
 [<ffffffff8842520a>] :kvm:tdp_page_fault+0xdf/0x11f
 [<ffffffff88422449>] :kvm:mmu_free_roots+0x8a/0x152
 [<ffffffff8841e6f3>] :kvm:kvm_arch_vcpu_ioctl_run+0x1d3/0x61e
 [<ffffffff88419e57>] :kvm:kvm_vcpu_ioctl+0xf2/0x448
 [<ffffffff80022242>] __up_read+0x19/0x7f
 [<ffffffff8006723e>] do_page_fault+0x4fe/0x874
 [<ffffffff80042268>] do_ioctl+0x21/0x6b
 [<ffffffff80030262>] vfs_ioctl+0x457/0x4b9
 [<ffffffff8004c737>] sys_ioctl+0x59/0x78
 [<ffffffff8005d28d>] tracesys+0xd5/0xe0

BUG: soft lockup - CPU#0 stuck for 60s! [events/0:14]
CPU 0:
Modules linked in: tun radeon drm autofs4 hidp rfcomm l2cap bluetooth lockd sund
Pid: 14, comm: events/0 Tainted: G 2.6.18-233.el5 #1
RIP: 0010:[<ffffffff80064bbc>] [<ffffffff80064bbc>] .text.lock.spinlock+0x2/0x0
RSP: 0018:ffff81021fa0fd88 EFLAGS: 00000286
RAX: 0000000000000000 RBX: ffffffff80313a08 RCX: 0000000000000001
RDX: 0000000000000000 RSI: ffffffff8007382d RDI: ffffffff80314728
RBP: 0000000000000000 R08: 0000000000000001 R09: ffff81021fa0fdc0
R10: ffff81021b5ffa00 R11: 0000000000000206 R12: ffff81021fce4000
R13: 0000000000000000 R14: ffff8100090058a0 R15: 00000000072651d8
FS: 0000000041222940(0000) GS:ffffffff80424000(0000) knlGS:0000000000000000
CS: 0010 DS: 0018 ES: 0018 CR0: 000000008005003b
CR2: 00002aaaaacd5000 CR3: 000000020dc26000 CR4: 00000000000006e0

Call Trace:
 [<ffffffff8007745f>] smp_call_function_many+0x24/0x4c
 [<ffffffff8007382d>] mcheck_check_cpu+0x0/0x30
 [<ffffffff80077564>] smp_call_function+0x4e/0x5e
 [<ffffffff8007382d>] mcheck_check_cpu+0x0/0x30
 [<ffffffff80072af2>] mcheck_timer+0x0/0x6c
 [<ffffffff80095b37>] on_each_cpu+0x10/0x22
 [<ffffffff80072b0e>] mcheck_timer+0x1c/0x6c
 [<ffffffff8004d7aa>] run_workqueue+0x99/0xf6
 [<ffffffff80049ff2>] worker_thread+0x0/0x122
 [<ffffffff8004a0e2>] worker_thread+0xf0/0x122
 [<ffffffff8008e41e>] default_wake_function+0x0/0xe
 [<ffffffff80032968>] kthread+0xfe/0x132
 [<ffffffff8005dfb1>] child_rip+0xa/0x11
 [<ffffffff8003286a>] kthread+0x0/0x132
 [<ffffffff8005dfa7>] child_rip+0x0/0x11
This seems like a serious bug, but in a scenario that probably never happens in real life. Since we have not encountered it again, it is probably not hurting anybody (or has already been fixed), so I am closing it for RHEL5.8.

I suspend my laptop (FC14) with a running VM, so it has probably been fixed since.