Bug 652187 - BUG: soft lockup - CPU#1 stuck for 13s after saving internal snapshot
Status: CLOSED DUPLICATE of bug 583059
Product: Red Hat Enterprise Linux 5
Classification: Red Hat
Component: kvm
Version: 5.5.z
Hardware: Unspecified
OS: Unspecified
Priority: medium
Severity: medium
Target Milestone: rc
Target Release: ---
Assigned To: Zachary Amsden
QA Contact: Virtualization Bugs
Depends On:
Blocks: Rhel5KvmTier3
Reported: 2010-11-11 05:12 EST by Shirley Zhou
Modified: 2015-03-04 19:52 EST (History)
CC: 6 users

See Also:
Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2010-11-24 11:57:56 EST
Type: ---
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:


Attachments
dmesg info (22.33 KB, text/plain)
2010-11-11 05:13 EST, Shirley Zhou
no flags
dmesg info when guest become paused because of no space/input/output error (12.99 KB, text/plain)
2010-11-15 00:00 EST, Shirley Zhou
no flags

Description Shirley Zhou 2010-11-11 05:12:57 EST
Description of problem:
"BUG: soft lockup - CPU#1 stuck for 13s" appears after saving an internal snapshot.

Version-Release number of selected component (if applicable):
kvm-83-207.el5

How reproducible:
100%

Steps to Reproduce:
1. Run a RHEL 5.5.z guest:
/usr/libexec/qemu-kvm  -M rhel5.6.0 -m 4G -smp 4 -name RHEL5.5-64 -uuid 123465d2-2032-848d-bda0-de7adb141234 -boot cdn -drive file=/dev/vgtest/lvtest1,if=virtio,boot=on,bus=0,unit=0,format=qcow2,cache=off,werror=stop -net nic,macaddr=54:52:00:27:12:15,vlan=0,model=virtio -net tap,vlan=0,script=/etc/qemu-ifup -serial pty -parallel none -usb -usbdevice tablet   -monitor stdio  -spice host=0,ic=on,port=5937,disable-ticketing -qxl 1

2. When the guest has booted, save an internal snapshot from the monitor:
(qemu) savevm s1

3. After savevm finishes, check dmesg; the message "BUG: soft lockup - CPU#1 stuck for 13s" shows up. Then load the snapshot:
(qemu) loadvm s1

4. Capture the dmesg output.
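Since the guest in step 1 is started with -monitor stdio, the monitor steps can be driven non-interactively. A rough wrapper sketch (the delays are guesses, and the commented qemu-kvm invocation is the one from step 1, which requires the host's LV and network setup):

```shell
#!/bin/sh
# Emit the monitor commands with delays, for piping into the qemu-kvm
# process, which reads its monitor from stdin (-monitor stdio).
emit_monitor_cmds() {
    sleep "${BOOT_WAIT:-120}"   # allow the guest to finish booting
    echo "savevm s1"            # step 2: save the internal snapshot
    sleep "${SAVE_WAIT:-60}"    # wait for savevm; check dmesg here (step 3)
    echo "loadvm s1"            # reload the snapshot; capture dmesg (step 4)
}
# emit_monitor_cmds | /usr/libexec/qemu-kvm -M rhel5.6.0 -m 4G -smp 4 ... -monitor stdio
```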

  
Actual results:
"BUG: soft lockup" messages show up; see the attached file for the full dmesg output. For example:
BUG: soft lockup - CPU#2 stuck for 25s! [swapper:0]
CPU 2:
Modules linked in: autofs4 hidp rfcomm l2cap bluetooth lockd sunrpc ip_conntrack_netbios_ns ipt_REJECT xt_state ip_conntrack nfnetlink iptable_filter ip_tables ip6t_REJECT xt_tcpudp ip6table_filter ip6_tables x_tables loop dm_multipath scsi_dh video backlight sbs power_meter hwmon i2c_ec dell_wmi wmi button battery asus_acpi acpi_memhotplug ac ipv6 xfrm_nalgo crypto_api parport_pc lp parport floppy joydev serio_raw ide_cd virtio_net virtio_balloon i2c_piix4 cdrom i2c_core pcspkr dm_raid45 dm_message dm_region_hash dm_mem_cache dm_snapshot dm_zero dm_mirror dm_log dm_mod ata_piix libata sd_mod scsi_mod virtio_blk virtio_pci virtio_ring virtio ext3 jbd uhci_hcd ohci_hcd ehci_hcd
Pid: 0, comm: swapper Not tainted 2.6.18-194.26.1.el5 #1
RIP: 0010:[<ffffffff80064b50>]  [<ffffffff80064b50>] _spin_unlock_irqrestore+0x8/0x9
RSP: 0018:ffff81010476be00  EFLAGS: 00000292
RAX: 0000000000000236 RBX: ffff81013f2bb5c0 RCX: 000000000000000c
RDX: 0000000000000060 RSI: 0000000000000292 RDI: ffffffff80348e58
RBP: ffff81010476bd80 R08: 0000000000000003 R09: ffff810104767e48
R10: 0000000000000001 R11: 0000000000000080 R12: ffffffff8005dc8e
R13: 000000000000001d R14: ffffffff80078225 R15: ffff81010476bd80
FS:  00002b4790fcd1f0(0000) GS:ffff81010471cec0(0000) knlGS:0000000000000000
CS:  0010 DS: 0018 ES: 0018 CR0: 000000008005003b
CR2: 00002b4a0cf43090 CR3: 0000000128e37000 CR4: 00000000000006e0

Call Trace:
 <IRQ>  [<ffffffff8020a34f>] i8042_interrupt+0x92/0x1e9
 [<ffffffff80010c3a>] handle_IRQ_event+0x51/0xa6
 [<ffffffff800bafae>] __do_IRQ+0xa4/0x103
 [<ffffffff8006ca0d>] do_IRQ+0xe7/0xf5
 [<ffffffff8005d615>] ret_from_intr+0x0/0xa
 [<ffffffff8001240b>] __do_softirq+0x51/0x133
 [<ffffffff8005e2fc>] call_softirq+0x1c/0x28
 [<ffffffff8006cb8a>] do_softirq+0x2c/0x85
 [<ffffffff8006b342>] default_idle+0x0/0x50
 [<ffffffff8005dc8e>] apic_timer_interrupt+0x66/0x6c
 <EOI>  [<ffffffff8006b36b>] default_idle+0x29/0x50
 [<ffffffff8004923a>] cpu_idle+0x95/0xb8
 [<ffffffff80077991>] start_secondary+0x498/0x4a7


Expected results:
No such call trace appears.

Additional info:
Comment 1 Shirley Zhou 2010-11-11 05:13:30 EST
Created attachment 459713 [details]
dmesg info
Comment 2 Shirley Zhou 2010-11-14 23:59:53 EST
This issue also exists when the guest becomes paused because of a no-space or input/output error. Attaching dmesg info for reference.
Comment 3 Shirley Zhou 2010-11-15 00:00:36 EST
Created attachment 460470 [details]
dmesg info when guest become paused because of no space/input/output error
Comment 5 Zachary Amsden 2010-11-18 19:40:12 EST
I don't think this is a bug.

Yes, the CPU stops when you pause the guest, and doesn't get interrupts.

Looks like we just need Glauber's patches to avoid softlockup warnings on the 5.5z guest.  I don't think I have bug privs to do this, but the patches should already be in 5.7.
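The mechanism described above can be illustrated with a toy model of the watchdog check (this is not the actual 2.6.18 kernel code; the threshold and names are illustrative):

```python
THRESHOLD = 10  # seconds a CPU may appear unresponsive before the guest warns

def check_softlockup(now, last_touched):
    """Return a warning string if the CPU looks stuck, else None.

    In a paused guest the vCPU receives no timer interrupts, so its
    watchdog timestamp (last_touched) is never refreshed. When the host
    resumes the guest, 'now' jumps forward and the delta exceeds the
    threshold even though the CPU was never actually stuck.
    """
    stuck_for = now - last_touched
    if stuck_for > THRESHOLD:
        return f"BUG: soft lockup - CPU stuck for {stuck_for}s"
    return None

# Guest running normally: watchdog refreshed one second ago, no warning.
assert check_softlockup(100, 99) is None
# Guest paused for 13 s during savevm, then resumed: spurious warning.
assert check_softlockup(112, 99) == "BUG: soft lockup - CPU stuck for 13s"
```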
Comment 6 Glauber Costa 2010-11-19 06:00:36 EST
Zach,

Unless I am understanding something wrong, the softlockup happens after the savevm, but before loadvm. I'd agree it is not a bug if we were stopped for a while, then resumed.

Just issuing a savevm does not sound like a reason for a softlockup, so I am assuming it is a bug.

Could the reporter clarify ?
Comment 7 Shirley Zhou 2010-11-19 06:48:02 EST
(In reply to comment #6)
> Zach,
> 
> Unless I am understanding something wrong, the softlockup happens after the
> savevm, but before loadvm. I'd agree it is not a bug if we were stopped for a
> while, then resumed.
> 
> Just issuing a savevm does not sound like a reason for a softlockup, so I am
> assuming it is a bug.
> 
> Could the reporter clarify ?

The softlockup happens after savevm (before loadvm), and more softlockups happen after loadvm.
Comment 8 Zachary Amsden 2010-11-24 11:57:56 EST
This has been reported on the same kernel version previously and is now verified.

*** This bug has been marked as a duplicate of bug 583059 ***
