Bug 542954 - Guest suffers kernel panic when save snapshot then restart guest
Summary: Guest suffers kernel panic when save snapshot then restart guest
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Enterprise Linux 5
Classification: Red Hat
Component: kvm
Version: 5.5
Hardware: All
OS: Linux
Priority: high
Severity: high
Target Milestone: rc
Target Release: ---
Assignee: Kevin Wolf
QA Contact: Virtualization Bugs
URL:
Whiteboard:
Duplicates: 578869
Depends On:
Blocks: 556823 Rhel5KvmTier2 604188
 
Reported: 2009-12-01 10:10 UTC by Qunfang Zhang
Modified: 2013-01-09 22:05 UTC (History)

Fixed In Version: kvm-83-179.el5
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2011-01-13 23:11:57 UTC
Target Upstream Version:
Embargoed:


Attachments
Guest kernel panic when reboot (21.77 KB, image/png)
2009-12-01 10:10 UTC, Qunfang Zhang


Links
System ID Private Priority Status Summary Last Updated
Red Hat Product Errata RHSA-2011:0028 0 normal SHIPPED_LIVE Low: kvm security and bug fix update 2011-01-13 11:03:39 UTC

Description Qunfang Zhang 2009-12-01 10:10:52 UTC
Created attachment 375015
Guest kernel panic when reboot

Description of problem:
After saving a snapshot of the guest and then restarting it, the guest fails to boot and suffers a kernel panic (see the attachment).
This issue only happens when the guest uses a virtio block device; an IDE block device works fine.
This issue happens on both RHEL and Windows guests.

CLI:
 /usr/libexec/qemu-kvm -no-hpet -usbdevice tablet -rtc-td-hack -smp 2 -m 2G -drive file=/root/RHEL5.4-64-64K.qcow2,media=disk,if=virtio,index=0,boot=on,snapshot=on -net nic,vlan=0,macaddr=10:1a:4a:10:20:4d,model=virtio -net tap,vlan=0,script=/etc/qemu-ifup -uuid `uuidgen` -cpu qemu64,+sse2 -boot c -balloon none -monitor stdio -vnc :10

Version-Release number of selected component (if applicable):
# uname -r
2.6.18-175.el5
# rpm -qa | grep kvm
kvm-debuginfo-83-136.el5
kmod-kvm-83-136.el5
kvm-qemu-img-83-136.el5
kvm-83-136.el5
kvm-tools-83-136.el5

How reproducible:
100%

Steps to Reproduce:
1. Boot a VM with a virtio block device using the command above.
2. (qemu) savevm test
3. Restart the guest. Both "system_reset" in the monitor and a reboot from inside the guest reproduce it.
  
Actual results:
Guest suffers kernel panic.

Expected results:
Guest should restart successfully.

Additional info:
Host info:
processor	: 3
vendor_id	: GenuineIntel
cpu family	: 6
model		: 23
model name	: Intel(R) Core(TM)2 Quad CPU    Q9550  @ 2.83GHz
stepping	: 10
cpu MHz		: 2826.231
cache size	: 6144 KB
physical id	: 0
siblings	: 4
core id		: 3
cpu cores	: 4
apicid		: 3
fpu		: yes
fpu_exception	: yes
cpuid level	: 13
wp		: yes
flags		: fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm syscall nx lm constant_tsc pni monitor ds_cpl vmx smx est tm2 ssse3 cx16 xtpr sse4_1 lahf_lm
bogomips	: 5652.48
clflush size	: 64
cache_alignment	: 64
address sizes	: 36 bits physical, 48 bits virtual
power management:

Comment 1 Kevin Wolf 2010-04-22 14:23:15 UTC
I think I can reproduce this one. However, I don't even need the savevm; just starting the VM with a virtio-blk disk with snapshot=on is enough. This doesn't seem to happen with a manually created external snapshot. The difference is that writeback caching is used with snapshot=on, and indeed, when I patch this part out, it seems to work with snapshot=on as well.

"Seems to work" obviously doesn't mean that no bug exists there, it may just be hidden, for example by timing differences, so that it's much less likely to trigger it.

I can confirm that the image metadata is correct, but data gets corrupted (possibly by writes to the wrong virtual disk offset?). I haven't identified yet when exactly the fatal write is happening, but it seems to hit the start of the partition relatively often, sometimes corrupting my superblock.
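
To make the two configurations compared above concrete, here is a minimal dry-run sketch. It only prints commands and runs nothing. The overlay path and the helper function names are hypothetical, and the exact commands used in the comment above are not recorded in this report. The -drive options are taken from the reporter's command line. The overlay is assumed to be created with "qemu-img create -f qcow2 -b"; newer qemu-img versions may additionally require the backing format to be given with -F.

```shell
#!/bin/sh
# Dry-run sketch: prints commands and options, does not execute qemu.
# BASE comes from the reporter's command line; OVERLAY is a hypothetical path.
BASE=/root/RHEL5.4-64-64K.qcow2
OVERLAY=/root/overlay.qcow2

# Print the command that would create a qcow2 overlay backed by the base image
# (a manually created external snapshot).
overlay_create_cmd() {
    printf 'qemu-img create -f qcow2 -b %s %s\n' "$1" "$2"
}

# Print the -drive option for the failing case: snapshot=on on the base image.
drive_opt_snapshot_on() {
    printf 'file=%s,media=disk,if=virtio,index=0,boot=on,snapshot=on\n' "$1"
}

# Print the -drive option for the manual overlay: same options without snapshot=on.
drive_opt_overlay() {
    printf 'file=%s,media=disk,if=virtio,index=0,boot=on\n' "$1"
}

overlay_create_cmd "$BASE" "$OVERLAY"
drive_opt_snapshot_on "$BASE"
drive_opt_overlay "$OVERLAY"
```

In both cases guest writes go to a separate file and the base image stays untouched, so comparing the two runs helps isolate the effect of the caching mode described above.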

Comment 5 Qunfang Zhang 2010-05-14 06:12:19 UTC
Verified on kvm-83-180.el5 with the same steps and command line: pass.

Comment 6 Qunfang Zhang 2010-05-17 07:50:07 UTC
Sorry, I changed the status by mistake; I have now changed it back.

Comment 7 Kevin Wolf 2010-06-15 12:40:56 UTC
*** Bug 578869 has been marked as a duplicate of this bug. ***

Comment 15 Ying Cui 2010-06-18 11:07:43 UTC
  I can reproduce this bug on a RHEL 5.5 64-bit server with the old KVM (83-161.el5) and kernel 2.6.18-194.el5.
  I added the comments above just to record the test process.

Comment 17 Miya Chen 2010-11-02 04:34:34 UTC
Based on comment 5, changing the status to VERIFIED.

Comment 19 errata-xmlrpc 2011-01-13 23:11:57 UTC
An advisory has been issued which should help the problem
described in this bug report. This report is therefore being
closed with a resolution of ERRATA. For more information
on the solution and/or where to find the updated files,
please follow the link below. You may reopen this bug report
if the solution does not work for you.

http://rhn.redhat.com/errata/RHSA-2011-0028.html

