Description of problem:
"BUG: soft lockup - CPU#1 stuck for 13s" appears after saving an internal snapshot.

Version-Release number of selected component (if applicable):
kvm-83-207.el5

How reproducible:
100%

Steps to Reproduce:
1. Run a RHEL5.5.z guest:
/usr/libexec/qemu-kvm -M rhel5.6.0 -m 4G -smp 4 -name RHEL5.5-64 -uuid 123465d2-2032-848d-bda0-de7adb141234 -boot cdn -drive file=/dev/vgtest/lvtest1,if=virtio,boot=on,bus=0,unit=0,format=qcow2,cache=off,werror=stop -net nic,macaddr=54:52:00:27:12:15,vlan=0,model=virtio -net tap,vlan=0,script=/etc/qemu-ifup -serial pty -parallel none -usb -usbdevice tablet -monitor stdio -spice host=0,ic=on,port=5937,disable-ticketing -qxl 1
2. Once the guest has booted, save an internal snapshot from the monitor:
(qemu) savevm s1
3. After savevm finishes, check dmesg; the message "BUG: soft lockup - CPU#1 stuck for 13s" shows up. Then load the snapshot:
(qemu) loadvm s1
4. Capture the dmesg output.

Actual results:
"BUG: soft lockup - CPU#1 stuck for 13s" messages show up; see the attached file for the full dmesg output.

BUG: soft lockup - CPU#2 stuck for 25s! [swapper:0]
CPU 2:
Modules linked in: autofs4 hidp rfcomm l2cap bluetooth lockd sunrpc ip_conntrack_netbios_ns ipt_REJECT xt_state ip_conntrack nfnetlink iptable_filter ip_tables ip6t_REJECT xt_tcpudp ip6table_filter ip6_tables x_tables loop dm_multipath scsi_dh video backlight sbs power_meter hwmon i2c_ec dell_wmi wmi button battery asus_acpi acpi_memhotplug ac ipv6 xfrm_nalgo crypto_api parport_pc lp parport floppy joydev serio_raw ide_cd virtio_net virtio_balloon i2c_piix4 cdrom i2c_core pcspkr dm_raid45 dm_message dm_region_hash dm_mem_cache dm_snapshot dm_zero dm_mirror dm_log dm_mod ata_piix libata sd_mod scsi_mod virtio_blk virtio_pci virtio_ring virtio ext3 jbd uhci_hcd ohci_hcd ehci_hcd
Pid: 0, comm: swapper Not tainted 2.6.18-194.26.1.el5 #1
RIP: 0010:[<ffffffff80064b50>]  [<ffffffff80064b50>] _spin_unlock_irqrestore+0x8/0x9
RSP: 0018:ffff81010476be00  EFLAGS: 00000292
RAX: 0000000000000236 RBX: ffff81013f2bb5c0 RCX: 000000000000000c
RDX: 0000000000000060 RSI: 0000000000000292 RDI: ffffffff80348e58
RBP: ffff81010476bd80 R08: 0000000000000003 R09: ffff810104767e48
R10: 0000000000000001 R11: 0000000000000080 R12: ffffffff8005dc8e
R13: 000000000000001d R14: ffffffff80078225 R15: ffff81010476bd80
FS:  00002b4790fcd1f0(0000) GS:ffff81010471cec0(0000) knlGS:0000000000000000
CS:  0010 DS: 0018 ES: 0018 CR0: 000000008005003b
CR2: 00002b4a0cf43090 CR3: 0000000128e37000 CR4: 00000000000006e0

Call Trace:
 <IRQ>  [<ffffffff8020a34f>] i8042_interrupt+0x92/0x1e9
 [<ffffffff80010c3a>] handle_IRQ_event+0x51/0xa6
 [<ffffffff800bafae>] __do_IRQ+0xa4/0x103
 [<ffffffff8006ca0d>] do_IRQ+0xe7/0xf5
 [<ffffffff8005d615>] ret_from_intr+0x0/0xa
 [<ffffffff8001240b>] __do_softirq+0x51/0x133
 [<ffffffff8005e2fc>] call_softirq+0x1c/0x28
 [<ffffffff8006cb8a>] do_softirq+0x2c/0x85
 [<ffffffff8006b342>] default_idle+0x0/0x50
 [<ffffffff8005dc8e>] apic_timer_interrupt+0x66/0x6c
 <EOI>  [<ffffffff8006b36b>] default_idle+0x29/0x50
 [<ffffffff8004923a>] cpu_idle+0x95/0xb8
 [<ffffffff80077991>] start_secondary+0x498/0x4a7

Expected results:
No such call trace appears.

Additional info:
Created attachment 459713 [details] dmesg info
This issue also exists when the guest becomes paused due to running out of disk space or an input/output error. Attaching dmesg info for reference.
Created attachment 460470 [details] dmesg info when guest become paused because of no space/input/output error
I don't think this is a bug. Yes, the CPU stops when you pause the guest, and it doesn't receive interrupts during that time. It looks like we just need Glauber's patches to avoid softlockup warnings in the 5.5.z guest. I don't think I have bug privs to do this, but the patches should already be in 5.7.
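To illustrate why pausing a guest triggers this warning: the soft-lockup detector in 2.6.18-era kernels compares the current time against a per-CPU timestamp that the watchdog thread refreshes whenever it gets scheduled. While the VM is stopped (during savevm or a pause), the watchdog thread cannot run, but the guest's clock still advances across the pause, so the first timer tick after resume sees a gap larger than the threshold and warns. Below is a minimal, hypothetical sketch of that check; the function name, the 10-second threshold, and the timestamps are illustrative, not the actual kernel code.

```python
# Hypothetical sketch of a soft-lockup check, modeled loosely on
# kernel/softlockup.c from 2.6.18-era kernels. All names and values
# here are illustrative assumptions, not the real implementation.

SOFTLOCKUP_THRESHOLD = 10  # seconds the watchdog thread may go unscheduled


def check_cpu(cpu_id, touch_timestamp, now):
    """Return the warning string a timer tick would print, or None.

    touch_timestamp: last time the per-CPU watchdog thread ran (seconds)
    now:             current kernel time on this CPU (seconds)
    """
    stuck_for = now - touch_timestamp
    if stuck_for > SOFTLOCKUP_THRESHOLD:
        return "BUG: soft lockup - CPU#%d stuck for %ds!" % (cpu_id, stuck_for)
    return None


# Normal operation: watchdog ran 5 seconds ago, no warning.
print(check_cpu(1, touch_timestamp=100, now=105))  # -> None

# After a 13-second savevm pause: the clock jumped, the watchdog did not run.
print(check_cpu(1, touch_timestamp=100, now=113))
# -> BUG: soft lockup - CPU#1 stuck for 13s!
```

Nothing in the guest was actually stuck; the gap is an artifact of the vCPUs being stopped while guest time kept moving, which is why patches that touch the watchdog timestamp around such pauses make the warning go away.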
Zach,

Unless I am misunderstanding something, the softlockup happens after the savevm but before the loadvm. I'd agree it is not a bug if we were stopped for a while and then resumed, but just issuing a savevm does not sound like a reason for a softlockup, so I am assuming it is a bug.

Could the reporter clarify?
(In reply to comment #6)
> Zach,
>
> Unless I am understanding something wrong, the softlockup happens after the
> savevm, but before loadvm. I'd agree it is not a bug if we were stopped for a
> while, then resumed.
>
> Just issuing a savevm does not sound like a reason for a softlockup, so I am
> assuming it is a bug.
>
> Could the reporter clarify ?

The softlockup happens after savevm (before loadvm), and more softlockups appear after loadvm.
This has been reported on the same kernel version previously and is now verified. *** This bug has been marked as a duplicate of bug 583059 ***