Bug 524642 - Using the KVM monitor, savevm produces a soft lockup in the guest.
Summary: Using the KVM monitor, savevm produces a soft lockup in the guest.
Keywords:
Status: CLOSED WONTFIX
Alias: None
Product: Red Hat Enterprise Linux 5
Classification: Red Hat
Component: kvm
Version: 5.4
Hardware: x86_64
OS: Linux
Priority: low
Severity: high
Target Milestone: rc
Assignee: Eduardo Habkost
QA Contact: Lawrence Lim
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2009-09-21 15:33 UTC by Benjamin Cleyet-Marrel
Modified: 2014-03-26 01:02 UTC (History)

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2009-10-29 22:46:53 UTC
Target Upstream Version:
Embargoed:



Description Benjamin Cleyet-Marrel 2009-09-21 15:33:17 UTC
Description of problem:


RHEL 5u4 
kmod-kvm-83-105.el5
kvm-83-105.el5



How reproducible:

always





Steps to Reproduce:
1. Create a VM with a qcow2 image file format.
2. Install a guest (RHEL 5.3 in my test).
3. Connect to the KVM monitor.
4. Issue savevm1.
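
For anyone retracing the steps above, here is a minimal sketch of the commands involved, written as Python helpers that only build the command strings and do not run anything. The image path, size, and snapshot tag are hypothetical placeholders, not values from this report; `qemu-img create -f qcow2` and the monitor commands `savevm` and `info snapshots` are the standard QEMU ones, but their availability in this exact kvm-83 build is an assumption.

```python
def qcow2_create_cmd(image_path, size):
    """Build the argv that creates a qcow2 image with qemu-img."""
    return ["qemu-img", "create", "-f", "qcow2", image_path, size]


def snapshot_monitor_cmds(tag):
    """Build the monitor commands that take a snapshot and then list snapshots."""
    return ["savevm " + tag, "info snapshots"]


# Hypothetical values, for illustration only.
print(" ".join(qcow2_create_cmd("/var/lib/libvirt/images/guest.qcow2", "10G")))
for cmd in snapshot_monitor_cmds("snap1"):
    print("(qemu) " + cmd)
```

The monitor lines are meant to be typed at the `(qemu)` prompt once the guest is installed and running.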
  
Actual results:
The guest gets the infamous CPU soft lockup and is stuck for a few seconds.


Expected results:

The guest does not get a CPU soft lockup and continues running as if nothing happened.

Additional info:

Here is the dmesg output:


Sep 21 17:19:36 dhcp155 kernel: BUG: soft lockup - CPU#0 stuck for 10s! [swapper:0]
Sep 21 17:19:36 dhcp155 kernel: 
Sep 21 17:19:36 dhcp155 kernel: Pid: 0, comm:              swapper
Sep 21 17:19:36 dhcp155 kernel: EIP: 0060:[<c044d219>] CPU: 0
Sep 21 17:19:36 dhcp155 kernel: EIP is at handle_IRQ_event+0x39/0x8c
Sep 21 17:19:36 dhcp155 kernel:  EFLAGS: 00000246    Not tainted  (2.6.18-128.el5 #1)
Sep 21 17:19:36 dhcp155 kernel: EAX: 00000001 EBX: c06e6f00 ECX: f7d7fc00 EDX: c0754f9c
Sep 21 17:19:36 dhcp155 kernel: ESI: f7d7fc00 EDI: 00000001 EBP: 00000000 DS: 007b ES: 007b
Sep 21 17:19:36 dhcp155 kernel: CR0: 8005003b CR2: b7fdb000 CR3: 3779d000 CR4: 000006d0
Sep 21 17:19:36 dhcp155 kernel:  [<c044d2f0>] __do_IRQ+0x84/0xd6
Sep 21 17:19:36 dhcp155 kernel:  [<c04074e5>] do_IRQ+0xb0/0xc3
Sep 21 17:19:36 dhcp155 kernel:  [<c0405946>] common_interrupt+0x1a/0x20
Sep 21 17:19:36 dhcp155 kernel:  [<c044d219>] handle_IRQ_event+0x39/0x8c
Sep 21 17:19:36 dhcp155 kernel:  [<c044d2f0>] __do_IRQ+0x84/0xd6
Sep 21 17:19:36 dhcp155 kernel:  [<c044d26c>] __do_IRQ+0x0/0xd6
Sep 21 17:19:36 dhcp155 kernel:  [<c04074ce>] do_IRQ+0x99/0xc3
Sep 21 17:19:36 dhcp155 kernel:  [<c0405946>] common_interrupt+0x1a/0x20
Sep 21 17:19:36 dhcp155 kernel:  [<c0428ba7>] __do_softirq+0x57/0x114
Sep 21 17:19:36 dhcp155 kernel:  [<c04073eb>] do_softirq+0x52/0x9c
Sep 21 17:19:36 dhcp155 kernel:  [<c04059d7>] apic_timer_interrupt+0x1f/0x24
Sep 21 17:19:36 dhcp155 kernel:  [<c0403bb0>] default_idle+0x0/0x59
Sep 21 17:19:36 dhcp155 kernel:  [<c0403be1>] default_idle+0x31/0x59
Sep 21 17:19:36 dhcp155 kernel:  [<c0403ca8>] cpu_idle+0x9f/0xb9
Sep 21 17:19:36 dhcp155 kernel:  [<c06f59ee>] start_kernel+0x379/0x380
Sep 21 17:19:36 dhcp155 kernel:  =======================
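
When comparing several runs (for example, to check whether the lockup gets longer with each snapshot), it can help to pull the lockup banners out of the guest log automatically. The following is a small sketch that matches the banner format seen in the trace above; the function name and the returned dictionary keys are my own choices for illustration.

```python
import re

# Matches the soft lockup banner as it appears in the log above, e.g.
# "BUG: soft lockup - CPU#0 stuck for 10s! [swapper:0]"
SOFT_LOCKUP = re.compile(
    r"BUG: soft lockup - CPU#(?P<cpu>\d+) stuck for (?P<secs>\d+)s! "
    r"\[(?P<comm>[^:\]]+):(?P<pid>\d+)\]"
)


def find_soft_lockups(lines):
    """Return one dict per soft lockup banner found in the given log lines."""
    hits = []
    for line in lines:
        m = SOFT_LOCKUP.search(line)
        if m:
            hits.append({
                "cpu": int(m.group("cpu")),
                "seconds": int(m.group("secs")),
                "comm": m.group("comm"),
                "pid": int(m.group("pid")),
            })
    return hits


sample = [
    "Sep 21 17:19:36 dhcp155 kernel: BUG: soft lockup - CPU#0 stuck for 10s! [swapper:0]",
    "Sep 21 17:19:36 dhcp155 kernel: EIP is at handle_IRQ_event+0x39/0x8c",
]
print(find_soft_lockups(sample))
```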

Comment 1 Benjamin Cleyet-Marrel 2009-09-22 14:41:56 UTC
Hi,
I have reproduced the problem using a 32-bit guest and using RHEL 5.4 as a guest.

savevm is not working.
I thought it was related to the shared storage, but the same thing happens with a local file.
The more snapshots you take, the worse it gets.

Please let me know if I can provide more input.

Comment 2 Dor Laor 2009-10-29 22:43:36 UTC
Kevin, is that a bug? Does savevm require a 'stop' command to be consistent?
Anyway, I guess the I/O slowed down guest time and that caused the oops.

Comment 3 RHEL Program Management 2009-10-29 22:46:53 UTC
Development Management has reviewed and declined this request.  You may appeal
this decision by reopening this request.

Comment 4 Kevin Wolf 2009-10-30 08:47:29 UTC
(In reply to comment #2)
> Kevin, is that a bug? Does savevm require a 'stop' command to be consistent?
> Anyway, I guess the I/O slowed down guest time and that caused the oops.

No, a 'stop' shouldn't be needed, 'savevm' stops the VM by itself. However, we are losing about 30 seconds, so probably this is confusing the guest? Though in this case you would likely observe the oops also with a simple stop/cont.
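
To illustrate the hypothesis above, here is a toy model of a soft lockup detector. It is not the kernel's implementation, only a sketch under stated assumptions: the guest touches a watchdog timestamp once per tick, the threshold is 10 seconds (taken from "stuck for 10s" in the report), and the guest sees time jump forward by the length of the pause when it resumes. All names are hypothetical.

```python
class SoftLockupWatchdog:
    """Toy model: report a stall if the watchdog was not touched for `threshold` seconds."""

    def __init__(self, threshold=10.0):
        self.threshold = threshold
        self.last_touch = 0.0

    def touch(self, now):
        self.last_touch = now

    def check(self, now):
        """Return the stall length in seconds if it reaches the threshold, else None."""
        stalled = now - self.last_touch
        return stalled if stalled >= self.threshold else None


def run(pause_at, pause_len, end=60.0, tick=1.0, threshold=10.0):
    """Simulate a guest ticking every `tick` seconds, paused once at `pause_at`.

    While paused the guest executes nothing, but the time it sees on resume
    has advanced by `pause_len` seconds (an assumption of this model).
    """
    wd = SoftLockupWatchdog(threshold)
    now = 0.0
    reports = []
    paused = False
    while now < end:
        stalled = wd.check(now)      # the tick checks the timestamp first...
        if stalled is not None:
            reports.append((now, stalled))
        wd.touch(now)                # ...then the watchdog is touched
        now += tick
        if not paused and now >= pause_at:
            now += pause_len         # VM stopped: time jumps, no ticks run
            paused = True
    return reports


print(run(pause_at=20.0, pause_len=30.0))  # long pause, like the ~30 s lost in savevm
print(run(pause_at=20.0, pause_len=5.0))   # short pause, below the threshold
```

In this model the report depends only on how long the guest was paused, not on what paused it, which is consistent with the expectation that a plain stop/cont of similar length would trigger the same message.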

