Description of problem:
qemu-kvm core dumps when being killed

Version-Release number of selected component (if applicable):
qemu-kvm-0.12.1.2-2.231.el6.x86_64

How reproducible:
Hit it twice out of 20 attempts

Steps to Reproduce:
1. Install a win7.32 guest with:
qemu-kvm -monitor stdio -chardev socket,id=serial_id_20120206-132300-0aWu,path=/tmp/serial-20120206,server,nowait -device isa-serial,chardev=serial_id_20120206-132300-0aWu -drive file='win7-32.qcow2',index=0,if=none,id=drive-ide0-0-0,media=disk,cache=none,format=qcow2,aio=native -device ide-drive,bus=ide.0,unit=0,drive=drive-ide0-0-0,id=ide0-0-0 -device rtl8139,netdev=idf3e0Fy,mac=9a:38:bf:b1:ae:f9,id=ndev00idf3e0Fy,bus=pci.0,addr=0x3 -netdev tap,id=idf3e0Fy,fd=21 -m 4G -smp 4,cores=1,threads=2,sockets=2 -drive file='619077.iso',index=1,if=none,id=drive-ide0-0-1,media=cdrom,readonly=on,format=raw -device ide-drive,bus=ide.0,unit=1,drive=drive-ide0-0-1,id=ide0-0-1 -drive file='winutils.iso',index=2,if=none,id=drive-ide0-1-0,media=cdrom,readonly=on,format=raw -device ide-drive,bus=ide.1,unit=0,drive=drive-ide0-1-0,id=ide0-1-0 -drive file='virtio-win.iso',index=3,if=none,id=drive-ide0-1-1,media=cdrom,readonly=on,format=raw -device ide-drive,bus=ide.1,unit=1,drive=drive-ide0-1-1,id=ide0-1-1 -cpu host -fda 'answer.vfd' -spice port=8000,disable-ticketing -vga qxl -rtc base=localtime,clock=host,driftfix=slew -M rhel6.3.0 -boot order=cdn,once=d,menu=off -enable-kvm
2. Kill the qemu-kvm process
3. qemu-img check win7-32.qcow2
...
Leaked cluster 40039 refcount=1 reference=0
Leaked cluster 40040 refcount=1 reference=0
ERROR cluster 40041 refcount=1 reference=6

Actual results:
qemu-kvm core dumps and the disk image is corrupted

Expected results:
qemu-kvm is killed without a core dump.

Additional info:
kernel-2.6.32-220.el6.x86_64
qemu-kvm-0.12.1.2-2.231.el6.x86_64
qemu-img-0.12.1.2-2.231.el6.x86_64
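The difference between the "Leaked cluster" and "ERROR cluster" lines matters for triage: a leak only wastes space, while an ERROR line points at a refcount inconsistency. The following is a minimal sketch for sorting the two apart, assuming the line shapes are exactly those quoted in step 3. The helper name `summarize_check_output` is hypothetical and not part of qemu-img, and other qemu-img versions may print differently.

```python
import re

# Line shapes copied from the qemu-img check output quoted in this report.
# Assumption: other qemu-img versions may format these lines differently.
LEAK_RE = re.compile(r"^Leaked cluster (\d+) refcount=(\d+) reference=(\d+)$")
ERROR_RE = re.compile(r"^ERROR cluster (\d+) refcount=(\d+) reference=(\d+)$")


def summarize_check_output(text):
    """Return (leaked, errors): lists of cluster numbers found in the output.

    Hypothetical helper for illustration only; lines that match neither
    pattern (such as the elided "..." part of the output) are ignored.
    """
    leaked, errors = [], []
    for raw in text.splitlines():
        line = raw.strip()
        match = LEAK_RE.match(line)
        if match:
            leaked.append(int(match.group(1)))
            continue
        match = ERROR_RE.match(line)
        if match:
            errors.append(int(match.group(1)))
    return leaked, errors


# The three lines reported in step 3 above.
sample = """\
Leaked cluster 40039 refcount=1 reference=0
Leaked cluster 40040 refcount=1 reference=0
ERROR cluster 40041 refcount=1 reference=6
"""

leaked, errors = summarize_check_output(sample)
print("leaked clusters:", leaked)
print("error clusters:", errors)
```

A non-empty `errors` list is the case this report is about; an output with only leaked clusters would not by itself indicate data corruption.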
Created attachment 563879 [details] gdb thread apply all bt full
"killing" qemu means sending SIGTERM here? At which point during the installation do you kill it? How did you create the image file (size, options)?
(In reply to comment #3)
> "killing" qemu means sending SIGTERM here? At which point during the

kill -9 $qemu-pid

> installation do you kill it? How did you create the image file (size, options)?

I hit it once or twice manually, just as the copy process was starting.
The images were created with: qemu-img create -f qcow2 xxx.qcow2 20G
How can qemu-kvm abort and generate a core dump when you used SIGKILL?
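The reasoning behind this question is that SIGKILL cannot be caught or ignored, so a process that receives it never gets a chance to run shutdown code or call abort(). A small sketch of that behaviour using a throwaway Python child process rather than qemu-kvm (the child script and the function name `kill_child_with_sigkill` are illustrative assumptions, and the sketch assumes a POSIX system):

```python
import signal
import subprocess
import sys

# Child script: try to ignore SIGKILL, report whether the kernel allowed it,
# then idle until it is killed.
CHILD = r"""
import signal, time
try:
    signal.signal(signal.SIGKILL, signal.SIG_IGN)
    print("installed", flush=True)
except OSError:
    print("refused", flush=True)
while True:
    time.sleep(1)
"""


def kill_child_with_sigkill():
    """Start the child, wait for its report, send SIGKILL, return the results."""
    proc = subprocess.Popen(
        [sys.executable, "-c", CHILD], stdout=subprocess.PIPE, text=True
    )
    answer = proc.stdout.readline().strip()  # blocks until the child has reported
    proc.send_signal(signal.SIGKILL)
    returncode = proc.wait()
    proc.stdout.close()
    return answer, returncode


answer, returncode = kill_child_with_sigkill()
print("attempt to ignore SIGKILL:", answer)
print("child return code:", returncode)
```

`subprocess` reports death by signal N as return code -N, so a child that really died from SIGKILL shows the negated SIGKILL number, not the SIGABRT number seen in the attached backtrace.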
(In reply to comment #5)
> How can qemu-kvm abort and generate a core dump when you used SIGKILL?

Oops, maybe it was SIGTERM (kill -15). I found that it can sometimes be reproduced when installing a guest through autotest and then pressing Ctrl+C to end the autotest process; that might pass -15 to qemu-kvm instead of -9.
This may be related to bug 798857. Can you please try if the following scratch build fixes the problem? https://brewweb.devel.redhat.com/taskinfo?taskID=4281327
(In reply to comment #7)
> This may be related to bug 798857.
>
> Can you please try if the following scratch build fixes the problem?
> https://brewweb.devel.redhat.com/taskinfo?taskID=4281327

Sure, will update then.
(In reply to comment #7)
> This may be related to bug 798857.
>
> Can you please try if the following scratch build fixes the problem?
> https://brewweb.devel.redhat.com/taskinfo?taskID=4281327

Still able to reproduce with qemu-kvm-0.12.1.2-2.272.el6.kwolf_drain_on_close_3.x86_64, but:
1) this time I am using kill -6;
2) I tried 10+ installations killed with Ctrl-C and could not reproduce it.

By the way, I re-checked the gdb output attached to this bug, and it seems the process was killed by signal 6 when I reported it. Sorry for the incorrect info in comment 6.
Signal 6 is SIGABRT, a core dump is expected there. What shouldn't happen is corrupted images. Do you still get messages like "ERROR cluster 40041 refcount=1 reference=6" in qemu-img check?
Oh, and in the original case the SIGABRT is not what you did. I believe you really did a SIGTERM and qemu tried to shut down in response. It's just that during the shutdown something went wrong (an assertion failed) and qemu called abort(), which uses SIGABRT internally.
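The sequence described here (SIGTERM arrives, the shutdown path hits a failure, abort() raises SIGABRT) can be sketched with a small stand-in process. This is not qemu code: the child script below is a hypothetical stand-in whose SIGTERM handler simply calls `os.abort()` in place of a failed assertion, and it assumes a POSIX system.

```python
import signal
import subprocess
import sys

# Child script: on SIGTERM, "shutdown" goes wrong and the process aborts.
# RLIMIT_CORE is set to 0 only so that the sketch leaves no core file behind.
CHILD = r"""
import os, resource, signal, time
resource.setrlimit(resource.RLIMIT_CORE, (0, 0))

def on_sigterm(signum, frame):
    os.abort()  # stands in for a failed assertion during shutdown

signal.signal(signal.SIGTERM, on_sigterm)
print("ready", flush=True)
while True:
    time.sleep(1)
"""


def terminate_child():
    """Send SIGTERM to the child and return how it actually exited."""
    proc = subprocess.Popen(
        [sys.executable, "-c", CHILD], stdout=subprocess.PIPE, text=True
    )
    proc.stdout.readline()  # wait until the SIGTERM handler is installed
    proc.send_signal(signal.SIGTERM)
    returncode = proc.wait()
    proc.stdout.close()
    return returncode


returncode = terminate_child()
print("signal sent:", int(signal.SIGTERM))
print("child return code:", returncode)
```

Although the parent only ever sends SIGTERM, the child's return code is the negated SIGABRT number. That matches the situation in this report, where the backtrace shows signal 6 even though the reporter did not send it in the original case.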
(In reply to comment #10)
> Signal 6 is SIGABRT, a core dump is expected there.
>
> What shouldn't happen is corrupted images. Do you still get messages like
> "ERROR cluster 40041 refcount=1 reference=6" in qemu-img check?

This time it shows "Leaked clusters were noticed during image check. No data integrity problem was found though."
Thanks for testing. It seems to be fixed by this patch then.

*** This bug has been marked as a duplicate of bug 798857 ***