Bug 698926

Summary: Segmentation fault when booting several snapshots at the same time
Product: Red Hat Enterprise Linux 6 Reporter: Suqin Huang <shuang>
Component: qemu-kvm Assignee: Kevin Wolf <kwolf>
Status: CLOSED CANTFIX QA Contact: Virtualization Bugs <virt-bugs>
Severity: medium Docs Contact:
Priority: medium    
Version: 6.1 CC: ddutile, gcosta, juzhang, kwolf, michen, mkenneth, tburke, virt-maint
Target Milestone: rc   
Target Release: ---   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version: Doc Type: Bug Fix
Last Closed: 2011-12-11 13:19:33 UTC Type: ---

Description Suqin Huang 2011-04-22 10:09:51 UTC
Description of problem:
Create several snapshots with the same base image and run them at the same time. Kill the qemu-kvm processes while the guests are running, then boot one snapshot again: a segmentation fault occurs while the guest is running.

Version-Release number of selected component (if applicable):
qemu-kvm-0.12.1.2-2.159.el6.x86_64

How reproducible:
100%

Steps to Reproduce:
1. Create 8 snapshots
qemu-img create -f qcow2 -b rhel6.32-virtio.qcow2 sn1.qcow2 .... sn8.qcow2

2. Run these 8 snapshots one by one
/usr/libexec/qemu-kvm -drive file=/home/images/sn1.qcow2,index=0,if=none,id=drive-virtio-disk1,media=disk,cache=none,format=qcow2,aio=native -device virtio-blk-pci,bus=pci.0,addr=0x4,drive=drive-virtio-disk1,id=virtio-disk1 -device virtio-net-pci,netdev=idbX4B4n,mac=9a:48:bd:9e:94:5c,netdev=idbX4B4n,id=ndev00idbX4B4n,bus=pci.0,addr=0x3 -netdev tap,id=idbX4B4n,script='/home/scripts/qemu-ifup',downscript='no' -m 1024 -smp 2,cores=1,threads=1,sockets=2 -rtc base=utc,clock=host,driftfix=none  -boot order=cdn,once=c,menu=off -usbdevice tablet -no-kvm-pit-reinjection -enable-kvm -vnc :1 -monitor stdio

3. Kill the qemu-kvm processes (repeat steps 2 and 3 several times if the problem does not reproduce)

4. Boot one snapshot again.
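
Step 1 abbreviates the eight qemu-img create invocations with "....". One way to spell them out is a small shell loop. This is only a sketch: it assumes the overlays are named sn1.qcow2 through sn8.qcow2 and that the base image sits in the current directory. The print_create_cmds helper is hypothetical and only prints the commands, so they can be reviewed before being piped to sh.

```shell
#!/bin/sh
# Hypothetical helper: print one "qemu-img create" command per overlay.
# Assumes overlays sn1.qcow2..sn8.qcow2 on top of the base image from step 1.
base=rhel6.32-virtio.qcow2

print_create_cmds() {
    i=1
    while [ "$i" -le 8 ]; do
        echo "qemu-img create -f qcow2 -b $base sn$i.qcow2"
        i=$((i + 1))
    done
}

print_create_cmds          # review the generated commands first
# print_create_cmds | sh   # uncomment to actually create the overlays
```

Printing instead of executing keeps the loop harmless on a host where the base image is missing.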
  
Actual results:
The snapshot is broken; a segmentation fault occurs while the guest is running.

Expected results:


Additional info:

1. host
kernel: 2.6.32-131.0.1.el6.x86_64 

cpuinfo:
processor	: 3
vendor_id	: AuthenticAMD
cpu family	: 16
model		: 2
model name	: AMD Phenom(tm) 9600B Quad-Core Processor
stepping	: 3
cpu MHz		: 1150.000
cache size	: 512 KB
physical id	: 0
siblings	: 4
core id		: 3
cpu cores	: 4
apicid		: 3
initial apicid	: 3
fpu		: yes
fpu_exception	: yes
cpuid level	: 5
wp		: yes
flags		: fpu vme de pse tsc msr pae mce cx8 apic mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ht syscall nx mmxext fxsr_opt pdpe1gb rdtscp lm 3dnowext 3dnow constant_tsc rep_good nonstop_tsc extd_apicid pni monitor cx16 popcnt lahf_lm cmp_legacy svm extapic cr8_legacy abm sse4a misalignsse 3dnowprefetch osvw ibs npt lbrv svm_lock
bogomips	: 4587.41
TLB size	: 1024 4K pages
clflush size	: 64
cache_alignment	: 64
address sizes	: 48 bits physical, 48 bits virtual
power management: ts ttp tm stc 100mhzsteps hwpstate


2. guest:
rhel6.0-32

3. qemu-kvm backtrace

(gdb) bt
#0  0x000000000049526a in alloc_refcount_block (bs=0x2847010, offset=72057596790714880, length=<value optimized out>, addend=-1) at block/qcow2-refcount.c:334
#1  update_refcount (bs=0x2847010, offset=72057596790714880, length=<value optimized out>, addend=-1) at block/qcow2-refcount.c:459
#2  0x0000000000495ae0 in qcow2_free_clusters (bs=0x2847010, offset=72057596790714880, size=65536) at block/qcow2-refcount.c:639
#3  0x00000000004971ee in qcow2_alloc_cluster_link_l2 (bs=0x2847010, m=<value optimized out>) at block/qcow2-cluster.c:672
#4  0x00000000004923a8 in qcow2_aio_write_cb (opaque=0x2b7dc20, ret=0) at block/qcow2.c:642
#5  0x000000000048421a in qemu_laio_process_completion (s=<value optimized out>, laiocb=0x2d5d9a0) at linux-aio.c:68
#6  0x000000000048442f in qemu_laio_enqueue_completed (opaque=0x2844e60) at linux-aio.c:107
#7  qemu_laio_completion_cb (opaque=0x2844e60) at linux-aio.c:144
#8  0x000000000040ba2f in main_loop_wait (timeout=1000) at /usr/src/debug/qemu-kvm-0.12.1.2/vl.c:4430
#9  0x000000000042b52a in kvm_main_loop () at /usr/src/debug/qemu-kvm-0.12.1.2/qemu-kvm.c:2164
#10 0x000000000040ef55 in main_loop (argc=<value optimized out>, argv=<value optimized out>, envp=<value optimized out>) at /usr/src/debug/qemu-kvm-0.12.1.2/vl.c:4640
#11 main (argc=<value optimized out>, argv=<value optimized out>, envp=<value optimized out>) at /usr/src/debug/qemu-kvm-0.12.1.2/vl.c:6845
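
The offset value passed through frames #0 to #3 can be sanity-checked with plain shell arithmetic. The sketch below is not part of the original report. It assumes that size=65536 in frame #2 is the cluster size (64 KiB is the qcow2 default) and shows only that the offset lies above 2^56 bytes and is not a multiple of that size, which would be unusual for a valid cluster offset.

```shell
#!/bin/sh
# Values copied from the backtrace; treating size= as the cluster size
# is an assumption, not something the report states.
offset=72057596790714880   # offset= in frames #0 to #3
cluster=65536              # size= in frame #2

printf '0x%x\n' "$offset"              # prints 0x1000000a4143200
printf '%d\n' $(( offset % cluster ))  # prints 12800, so not cluster-aligned
printf '%d\n' $(( offset >> 56 ))      # prints 1, so bit 56 is set
```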

4. snapshot serial info

Checking all file systems.
[/sbin/fsck.ext4 (1) -- /] fsck.ext4 -a /dev/mapper/VolGroup-lv_root
/dev/mapper/VolGroup-lv_root contains a file system with errors, check forced.
end_request: I/O error, dev vda, sector 30388480
Buffer I/O error on device dm-0, logical block 3670048
Buffer I/O error on device dm-0, logical block 3670049
Buffer I/O error on device dm-0, logical block 3670050
Buffer I/O error on device dm-0, logical block 3670051
Buffer I/O error on device dm-0, logical block 3670052
Buffer I/O error on device dm-0, logical block 3670053
Buffer I/O error on device dm-0, logical block 3670054
Buffer I/O error on device dm-0, logical block 3670055
end_request: I/O error, dev vda, sector 30388480
Buffer I/O error on device dm-0, logical block 3670048
end_request: I/O error, dev vda, sector 30388480
Buffer I/O error on device dm-0, logical block 3670048
Error reading block 3670048 (Attempt to read block from filesystem resulted in
short read) while getting next inode from scan.


5. qemu output

block I/O error in device 'drive-virtio-disk1': Invalid argument (22)
block I/O error in device 'drive-virtio-disk1': Invalid argument (22)
block I/O error in device 'drive-virtio-disk1': Invalid argument (22)
block I/O error in device 'drive-virtio-disk1': Invalid argument (22)
block I/O error in device 'drive-virtio-disk1': Invalid argument (22)
block I/O error in device 'drive-virtio-disk1': Invalid argument (22)
block I/O error in device 'drive-virtio-disk1': Invalid argument (22)
block I/O error in device 'drive-virtio-disk1': Invalid argument (22)
block I/O error in device 'drive-virtio-disk1': Invalid argument (22)
block I/O error in device 'drive-virtio-disk1': Invalid argument (22)
qcow2_free_clusters failed: Invalid argument

Comment 2 Kevin Wolf 2011-06-08 08:23:14 UTC
(In reply to comment #0)
> 3. kill qemu-kvm processes (repeat step2 & step3 several time if you don't
> reproduce)

Does this mean kill -9?

Can you provide the qemu-img check output after this step?
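
A possible way to collect that output for all eight overlays is sketched below. It assumes the overlays live in /home/images, as in the -drive option of step 2, and are named sn1.qcow2 through sn8.qcow2. The print_check_cmds helper and the log file name are hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: print a "qemu-img check" command for each overlay.
# Assumes the overlay directory used by the -drive option in step 2.
dir=/home/images

print_check_cmds() {
    i=1
    while [ "$i" -le 8 ]; do
        echo "qemu-img check -f qcow2 $dir/sn$i.qcow2"
        i=$((i + 1))
    done
}

print_check_cmds   # review the generated commands first
# Uncomment to run the checks and keep the output for the bug report:
# print_check_cmds | sh 2>&1 | tee qemu-img-check.log
```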

Comment 3 Suqin Huang 2011-08-01 10:26:57 UTC
Tried several times and could not reproduce it; will continue testing.