Bug 542954

Summary: Guest suffers kernel panic when save snapshot then restart guest
Product: Red Hat Enterprise Linux 5 Reporter: Qunfang Zhang <qzhang>
Component: kvm Assignee: Kevin Wolf <kwolf>
Status: CLOSED ERRATA QA Contact: Virtualization Bugs <virt-bugs>
Severity: high Docs Contact:
Priority: high    
Version: 5.5 CC: apevec, cpelland, llim, mgoldboi, michen, moli, ovirt-maint, tburke, virt-maint, ycui, ykaul
Target Milestone: rc Keywords: Reopened, ZStream
Target Release: ---   
Hardware: All   
OS: Linux   
Whiteboard:
Fixed In Version: kvm-83-179.el5 Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2011-01-13 23:11:57 UTC Type: ---
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Bug Depends On:    
Bug Blocks: 556823, 580948, 604188    
Attachments:
Description: Guest kernel panic when reboot; Flags: none

Description Qunfang Zhang 2009-12-01 10:10:52 UTC
Created attachment 375015
Guest kernel panic when reboot

Description of problem:
After saving a snapshot of the guest and then restarting it, the guest fails to boot and suffers a kernel panic. (An attachment will be uploaded.)
This issue only happens when the guest uses a virtio block device; an IDE block device is OK.
This issue happens on both RHEL and Windows guests.

CLI:
 /usr/libexec/qemu-kvm -no-hpet -usbdevice tablet -rtc-td-hack -smp 2 -m 2G -drive file=/root/RHEL5.4-64-64K.qcow2,media=disk,if=virtio,index=0,boot=on,snapshot=on -net nic,vlan=0,macaddr=10:1a:4a:10:20:4d,model=virtio -net tap,vlan=0,script=/etc/qemu-ifup -uuid `uuidgen` -cpu qemu64,+sse2 -boot c -balloon none -monitor stdio -vnc :10

Version-Release number of selected component (if applicable):
# uname -r
2.6.18-175.el5
# rpm -qa | grep kvm
kvm-debuginfo-83-136.el5
kmod-kvm-83-136.el5
kvm-qemu-img-83-136.el5
kvm-83-136.el5
kvm-tools-83-136.el5
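
The header of this report lists "Fixed In Version: kvm-83-179.el5". A rough way to check whether an installed build predates that is sketched below. This is a simplified comparison, not RPM's real version-comparison algorithm: the helper names are hypothetical, and it only understands names of the exact form kvm-<version>-<release>.el5.

```python
import re

FIXED_IN = "kvm-83-179.el5"  # taken from the "Fixed In Version" field of this report


def kvm_build(package):
    """Extract (version, release) integers from a name like 'kvm-83-136.el5'.

    Hypothetical helper: names with any other suffix are rejected.
    """
    match = re.fullmatch(r"kvm-(\d+)-(\d+)\.el5", package)
    if match is None:
        raise ValueError(f"unrecognized package name: {package!r}")
    return int(match.group(1)), int(match.group(2))


def predates_fix(package, fixed_in=FIXED_IN):
    """Return True if the given build is older than the fixed build."""
    return kvm_build(package) < kvm_build(fixed_in)


if __name__ == "__main__":
    # Builds mentioned in this report.
    for name in ("kvm-83-136.el5", "kvm-83-161.el5", "kvm-83-180.el5"):
        print(name, "predates fix:", predates_fix(name))
```

On a real host, rpm's own comparison should be preferred; the sketch only covers the plain build names quoted in this report.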

How reproducible:
100%

Steps to Reproduce:
1. Boot a VM with a virtio block device using the command above.
2. (qemu) savevm test
3. Restart the guest. Both "system_reset" in the monitor and a reboot from inside the guest reproduce it.
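
Since the report says the panic appears with virtio block but not with IDE, it can help to generate both command lines from one place so that only the drive interface differs. The sketch below rebuilds the command line from the CLI section; the function name and its parameters are hypothetical, and the UUID is generated in Python instead of calling uuidgen.

```python
import shlex
import uuid


def build_repro_command(image, interface="virtio", snapshot=True):
    """Return the qemu-kvm argv from this report with a selectable drive interface.

    Hypothetical helper: only the -drive option varies between calls.
    """
    drive = f"file={image},media=disk,if={interface},index=0,boot=on"
    if snapshot:
        drive += ",snapshot=on"
    return [
        "/usr/libexec/qemu-kvm",
        "-no-hpet", "-usbdevice", "tablet", "-rtc-td-hack",
        "-smp", "2", "-m", "2G",
        "-drive", drive,
        "-net", "nic,vlan=0,macaddr=10:1a:4a:10:20:4d,model=virtio",
        "-net", "tap,vlan=0,script=/etc/qemu-ifup",
        "-uuid", str(uuid.uuid4()),  # stands in for `uuidgen` in the original CLI
        "-cpu", "qemu64,+sse2", "-boot", "c", "-balloon", "none",
        "-monitor", "stdio", "-vnc", ":10",
    ]


if __name__ == "__main__":
    # Print one command line per interface so the two runs can be compared.
    for interface in ("virtio", "ide"):
        argv = build_repro_command("/root/RHEL5.4-64-64K.qcow2", interface)
        print(shlex.join(argv))
```

The printed lines can be pasted into a shell; steps 2 and 3 are then carried out by hand in the monitor on stdio.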
  
Actual results:
Guest suffers kernel panic.

Expected results:
Guest should restart successfully.

Additional info:
Host info:
processor	: 3
vendor_id	: GenuineIntel
cpu family	: 6
model		: 23
model name	: Intel(R) Core(TM)2 Quad CPU    Q9550  @ 2.83GHz
stepping	: 10
cpu MHz		: 2826.231
cache size	: 6144 KB
physical id	: 0
siblings	: 4
core id		: 3
cpu cores	: 4
apicid		: 3
fpu		: yes
fpu_exception	: yes
cpuid level	: 13
wp		: yes
flags		: fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm syscall nx lm constant_tsc pni monitor ds_cpl vmx smx est tm2 ssse3 cx16 xtpr sse4_1 lahf_lm
bogomips	: 5652.48
clflush size	: 64
cache_alignment	: 64
address sizes	: 36 bits physical, 48 bits virtual
power management:

Comment 1 Kevin Wolf 2010-04-22 14:23:15 UTC
I think I can reproduce this one. However, I don't even need the savevm; just starting the VM with a virtio-blk disk with snapshot=on is enough. This doesn't seem to happen on a manually created external snapshot. The difference here is that with snapshot=on writeback caching is used, and indeed when patching this part out, it seems to work with snapshot=on as well.

"Seems to work" obviously doesn't mean that no bug exists there, it may just be hidden, for example by timing differences, so that it's much less likely to trigger it.

I can confirm that the image metadata is correct, but data gets corrupted (possibly by writes to the wrong virtual disk offset?). I haven't identified yet when exactly the fatal write is happening, but it seems to hit the start of the partition relatively often, sometimes corrupting my superblock.
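
One way to notice a clobbered superblock of the kind described above is to look for the ext2/ext3 magic number at its fixed location. The sketch below assumes an ext2/ext3 filesystem sitting directly on the partition and works on raw bytes of the virtual disk as the guest sees it; how those bytes are obtained is outside the sketch (with snapshot=on, writes go to a temporary overlay, not to the base image). The function name and the partition offset used in the example are assumptions, not details from this report.

```python
import struct

EXT_SUPERBLOCK_OFFSET = 1024  # the superblock starts 1024 bytes into the filesystem
EXT_MAGIC_OFFSET = 56         # offset of the s_magic field inside the superblock
EXT_MAGIC = 0xEF53            # magic number shared by ext2, ext3 and ext4


def has_ext_superblock(raw, partition_offset):
    """Return True if the ext magic is present where the superblock should be.

    raw: bytes of the virtual disk (raw layout, not qcow2).
    partition_offset: byte offset of the partition start (an assumption here).
    """
    pos = partition_offset + EXT_SUPERBLOCK_OFFSET + EXT_MAGIC_OFFSET
    field = raw[pos:pos + 2]
    if len(field) < 2:
        return False  # disk data too short to contain the field
    (magic,) = struct.unpack("<H", field)  # stored little-endian
    return magic == EXT_MAGIC


if __name__ == "__main__":
    # Synthetic example: a partition assumed to start at sector 63.
    start = 63 * 512
    disk = bytearray(start + 4096)
    print("blank disk:", has_ext_superblock(bytes(disk), start))
    struct.pack_into("<H", disk, start + 1024 + 56, EXT_MAGIC)
    print("with magic:", has_ext_superblock(bytes(disk), start))
```

A missing magic only shows that the superblock area was overwritten; it says nothing about which write caused it.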

Comment 5 Qunfang Zhang 2010-05-14 06:12:19 UTC
Verified on kvm-83-180.el5 with the same steps and command line: pass.

Comment 6 Qunfang Zhang 2010-05-17 07:50:07 UTC
Sorry, I changed the status by mistake; I have now changed it back.

Comment 7 Kevin Wolf 2010-06-15 12:40:56 UTC
*** Bug 578869 has been marked as a duplicate of this bug. ***

Comment 15 Ying Cui 2010-06-18 11:07:43 UTC
  I can reproduce this bug on a RHEL 5.5 64-bit server with the old KVM (83-161.el5) and kernel 2.6.18-194.el5.
  I am adding the above comments just to record the test process.

Comment 17 Miya Chen 2010-11-02 04:34:34 UTC
Based on comment #5, changing status to VERIFIED.

Comment 19 errata-xmlrpc 2011-01-13 23:11:57 UTC
An advisory has been issued which should help the problem
described in this bug report. This report is therefore being
closed with a resolution of ERRATA. For more information
on the solution and/or where to find the updated files,
please follow the link below. You may reopen this bug report
if the solution does not work for you.

http://rhn.redhat.com/errata/RHSA-2011-0028.html