Note: This bug is displayed in read-only format because the product is no longer active in Red Hat Bugzilla.

Bug 698926

Summary: segment fault when boot several snapshots at the same time
Product: Red Hat Enterprise Linux 6
Reporter: Suqin Huang <shuang>
Component: qemu-kvm
Assignee: Kevin Wolf <kwolf>
Status: CLOSED CANTFIX
QA Contact: Virtualization Bugs <virt-bugs>
Severity: medium
Docs Contact:
Priority: medium
Version: 6.1
CC: ddutile, gcosta, juzhang, kwolf, michen, mkenneth, tburke, virt-maint
Target Milestone: rc
Target Release: ---
Hardware: Unspecified
OS: Unspecified
Whiteboard:
Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2011-12-11 13:19:33 UTC
Type: ---
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:

Description Suqin Huang 2011-04-22 10:09:51 UTC
Description of problem:
Create several snapshots with the same base image and run them at the same time. Kill the qemu-kvm processes while the guests are running, then boot one snapshot again; the guest hits a segmentation fault.

Version-Release number of selected component (if applicable):
qemu-kvm-0.12.1.2-2.159.el6.x86_64

How reproducible:
100%

Steps to Reproduce:
1. Create 8 snapshots
qemu-img create -f qcow2 -b rhel6.32-virtio.qcow2 sn1.qcow2 .... sn8.qcow2

2. Run these 8 snapshots one by one
/usr/libexec/qemu-kvm -drive file=/home/images/sn1.qcow2,index=0,if=none,id=drive-virtio-disk1,media=disk,cache=none,format=qcow2,aio=native -device virtio-blk-pci,bus=pci.0,addr=0x4,drive=drive-virtio-disk1,id=virtio-disk1 -device virtio-net-pci,netdev=idbX4B4n,mac=9a:48:bd:9e:94:5c,netdev=idbX4B4n,id=ndev00idbX4B4n,bus=pci.0,addr=0x3 -netdev tap,id=idbX4B4n,script='/home/scripts/qemu-ifup',downscript='no' -m 1024 -smp 2,cores=1,threads=1,sockets=2 -rtc base=utc,clock=host,driftfix=none  -boot order=cdn,once=c,menu=off -usbdevice tablet -no-kvm-pit-reinjection -enable-kvm -vnc :1 -monitor stdio

3. Kill the qemu-kvm processes (repeat steps 2 and 3 several times if the problem does not reproduce)

4. Boot one snapshot again.
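The steps above can be sketched as a script. The base image and snapshot names come from the report; using SIGKILL in step 3 is an assumption (comment 2 below asks exactly this). The sketch only prints the commands, since actually running them needs a RHEL 6 host with the base image in place:

```shell
#!/bin/sh
# Reproduction sketch: prints each command rather than executing it,
# because the real run needs qemu-kvm and the rhel6.32 base image.
BASE=rhel6.32-virtio.qcow2

# Step 1: eight qcow2 snapshots sharing one backing file.
for i in $(seq 1 8); do
    echo "qemu-img create -f qcow2 -b $BASE sn$i.qcow2"
done

# Step 2: boot each snapshot (device/netdev flags elided; see the full
# command line in the report).
for i in $(seq 1 8); do
    echo "/usr/libexec/qemu-kvm -drive file=/home/images/sn$i.qcow2,if=none,format=qcow2,cache=none,aio=native ... -m 1024 -enable-kvm"
done

# Step 3: kill the guests uncleanly while they are running
# (assumption: SIGKILL; whether it was kill -9 is an open question).
echo "pkill -9 qemu-kvm"

# Step 4: boot one snapshot again and watch the guest console for I/O errors.
```

On a real reproduction host, the echo lines would be replaced by the commands themselves.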
  
Actual results:
The snapshot image is corrupted; the guest hits a segmentation fault while running.

Expected results:


Additional info:

1. host
kernel: 2.6.32-131.0.1.el6.x86_64 

cpuinfo:
processor	: 3
vendor_id	: AuthenticAMD
cpu family	: 16
model		: 2
model name	: AMD Phenom(tm) 9600B Quad-Core Processor
stepping	: 3
cpu MHz		: 1150.000
cache size	: 512 KB
physical id	: 0
siblings	: 4
core id		: 3
cpu cores	: 4
apicid		: 3
initial apicid	: 3
fpu		: yes
fpu_exception	: yes
cpuid level	: 5
wp		: yes
flags		: fpu vme de pse tsc msr pae mce cx8 apic mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ht syscall nx mmxext fxsr_opt pdpe1gb rdtscp lm 3dnowext 3dnow constant_tsc rep_good nonstop_tsc extd_apicid pni monitor cx16 popcnt lahf_lm cmp_legacy svm extapic cr8_legacy abm sse4a misalignsse 3dnowprefetch osvw ibs npt lbrv svm_lock
bogomips	: 4587.41
TLB size	: 1024 4K pages
clflush size	: 64
cache_alignment	: 64
address sizes	: 48 bits physical, 48 bits virtual
power management: ts ttp tm stc 100mhzsteps hwpstate


2. guest:
rhel6.0-32

3. qemu-kvm backtrace

(gdb) bt
#0  0x000000000049526a in alloc_refcount_block (bs=0x2847010, offset=72057596790714880, length=<value optimized out>, addend=-1) at block/qcow2-refcount.c:334
#1  update_refcount (bs=0x2847010, offset=72057596790714880, length=<value optimized out>, addend=-1) at block/qcow2-refcount.c:459
#2  0x0000000000495ae0 in qcow2_free_clusters (bs=0x2847010, offset=72057596790714880, size=65536) at block/qcow2-refcount.c:639
#3  0x00000000004971ee in qcow2_alloc_cluster_link_l2 (bs=0x2847010, m=<value optimized out>) at block/qcow2-cluster.c:672
#4  0x00000000004923a8 in qcow2_aio_write_cb (opaque=0x2b7dc20, ret=0) at block/qcow2.c:642
#5  0x000000000048421a in qemu_laio_process_completion (s=<value optimized out>, laiocb=0x2d5d9a0) at linux-aio.c:68
#6  0x000000000048442f in qemu_laio_enqueue_completed (opaque=0x2844e60) at linux-aio.c:107
#7  qemu_laio_completion_cb (opaque=0x2844e60) at linux-aio.c:144
#8  0x000000000040ba2f in main_loop_wait (timeout=1000) at /usr/src/debug/qemu-kvm-0.12.1.2/vl.c:4430
#9  0x000000000042b52a in kvm_main_loop () at /usr/src/debug/qemu-kvm-0.12.1.2/qemu-kvm.c:2164
#10 0x000000000040ef55 in main_loop (argc=<value optimized out>, argv=<value optimized out>, envp=<value optimized out>) at /usr/src/debug/qemu-kvm-0.12.1.2/vl.c:4640
#11 main (argc=<value optimized out>, argv=<value optimized out>, envp=<value optimized out>) at /usr/src/debug/qemu-kvm-0.12.1.2/vl.c:6845

4. snapshot serial info

Checking all file systems.
[/sbin/fsck.ext4 (1) -- /] fsck.ext4 -a /dev/mapper/VolGroup-lv_root
/dev/mapper/VolGroup-lv_root contains a file system with errors, check forced.
end_request: I/O error, dev vda, sector 30388480
Buffer I/O error on device dm-0, logical block 3670048
Buffer I/O error on device dm-0, logical block 3670049
Buffer I/O error on device dm-0, logical block 3670050
Buffer I/O error on device dm-0, logical block 3670051
Buffer I/O error on device dm-0, logical block 3670052
Buffer I/O error on device dm-0, logical block 3670053
Buffer I/O error on device dm-0, logical block 3670054
Buffer I/O error on device dm-0, logical block 3670055
end_request: I/O error, dev vda, sector 30388480
Buffer I/O error on device dm-0, logical block 3670048
end_request: I/O error, dev vda, sector 30388480
Buffer I/O error on device dm-0, logical block 3670048
Error reading block 3670048 (Attempt to read block from filesystem resulted in
short read) while getting next inode from scan.


5. qemu output

block I/O error in device 'drive-virtio-disk1': Invalid argument (22)
block I/O error in device 'drive-virtio-disk1': Invalid argument (22)
block I/O error in device 'drive-virtio-disk1': Invalid argument (22)
block I/O error in device 'drive-virtio-disk1': Invalid argument (22)
block I/O error in device 'drive-virtio-disk1': Invalid argument (22)
block I/O error in device 'drive-virtio-disk1': Invalid argument (22)
block I/O error in device 'drive-virtio-disk1': Invalid argument (22)
block I/O error in device 'drive-virtio-disk1': Invalid argument (22)
block I/O error in device 'drive-virtio-disk1': Invalid argument (22)
block I/O error in device 'drive-virtio-disk1': Invalid argument (22)
qcow2_free_clusters failed: Invalid argument

Comment 2 Kevin Wolf 2011-06-08 08:23:14 UTC
(In reply to comment #0)
> 3. kill qemu-kvm processes (repeat step2 & step3 several time if you don't
> reproduce)

Does this mean kill -9?

Can you provide the qemu-img check output after this step?

Comment 3 Suqin Huang 2011-08-01 10:26:57 UTC
Tried several times and didn't reproduce it; will continue to test it.