Red Hat Bugzilla – Bug 1299875
system_reset should clear pending request for error (IDE)
Last modified: 2017-08-01 13:46:48 EDT
+++ This bug was initially created as a clone of Bug #1281713 +++

Description of problem:
qemu-kvm quits with a segmentation fault after executing system_reset when no space is left on the host.

Version-Release number of selected component (if applicable):
qemu-img-0.12.1.2-2.481.el6.x86_64
qemu-kvm-tools-0.12.1.2-2.481.el6.x86_64
qemu-kvm-0.12.1.2-2.481.el6.x86_64
qemu-guest-agent-0.12.1.2-2.481.el6.x86_64
qemu-kvm-debuginfo-0.12.1.2-2.481.el6.x86_64
2.6.32-583.el6.x86_64

How reproducible:
70%

Steps to Reproduce:
1. Create a 25G win2012.qcow2 image and install a Windows 2012 R2 guest.
2. Fill the filesystem where the guest image is located by copying the guest image several times until "no space left on device" is reported. Launch the guest with this qemu-kvm command:
/usr/libexec/qemu-kvm -name win2012 -m 2048 \
    -cpu Opteron_G4 \
    -smp 1,cores=1,threads=2,sockets=2,maxcpus=4 \
    -vga qxl \
    -serial unix:/tmp/m,server,nowait \
    -drive file=win2012-64r2-virtio-scsi.qcow2,if=none,id=drive-scsi-disk0,format=qcow2,cache=none,werror=stop,rerror=stop \
    -device virtio-scsi-pci,id=scsi0 \
    -device scsi-hd,drive=drive-scsi-disk0,bus=scsi0.0,scsi-id=0,lun=0,id=scsi-disk0,bootindex=1 \
    -monitor stdio \
    -usb -device usb-kbd,id=input0 \
    -vnc :1
3. Interact with the guest (browse the internet, for example) until the qemu-kvm monitor prints "block I/O error in device 'ide0-hd0': No space left on device (28)" (this usually happens within 5 minutes), then enter system_reset in the qemu monitor. The segmentation fault follows.

Actual results:
qemu-kvm quits with a segmentation fault after executing system_reset.

Expected results:
The qemu-kvm process should stay alive and the guest system should reset without error.

Additional info:
Stack info:
Core was generated by `/usr/libexec/qemu-kvm -name win2012 -m 2048 -cpu SandyBridge -smp 2,cores=1,thr'.
Program terminated with signal 11, Segmentation fault.
#0 0x00007f85d1b04a90 in ?? ()
(gdb) bt
#0 0x00007f85d1b04a90 in ?? ()
#1 0x00007f85d03f5aee in bdrv_aio_cancel (acb=<value optimized out>) at /usr/src/debug/qemu-kvm-0.12.1.2/block.c:3842
#2 0x00007f85d052d46a in ide_dma_cancel (bm=0x7f85d26e1160) at /usr/src/debug/qemu-kvm-0.12.1.2/hw/ide/core.c:2395
#3 0x00007f85d052d499 in ide_dma_reset (bm=0x7f85d26e1160) at /usr/src/debug/qemu-kvm-0.12.1.2/hw/ide/core.c:2408
#4 0x00007f85d05335ad in piix3_reset (opaque=0x7f85d26e0010) at /usr/src/debug/qemu-kvm-0.12.1.2/hw/ide/piix.c:124
#5 0x00007f85d03b71d2 in qemu_system_reset (report=true) at /usr/src/debug/qemu-kvm-0.12.1.2/vl.c:3417
#6 0x00007f85d03dd050 in qemu_kvm_system_reset (report=true) at /usr/src/debug/qemu-kvm-0.12.1.2/qemu-kvm.c:1992
#7 0x00007f85d03dd253 in kvm_main_loop () at /usr/src/debug/qemu-kvm-0.12.1.2/qemu-kvm.c:2272
#8 0x00007f85d03be317 in main_loop (argc=<value optimized out>, argv=<value optimized out>, envp=<value optimized out>) at /usr/src/debug/qemu-kvm-0.12.1.2/vl.c:4273
#9 main (argc=<value optimized out>, argv=<value optimized out>, envp=<value optimized out>) at /usr/src/debug/qemu-kvm-0.12.1.2/vl.c:6731

On an Opteron_G5 host qemu-kvm does not quit with a segmentation fault, but the Windows guest cannot be reset after system_reset.

--- Additional comment from Markus Armbruster on 2015-11-24 14:37:15 BRST ---

Can you reproduce this with a qemu-kvm built with --enable-debug?

--- Additional comment from Guo, Zhiyi on 2015-11-27 00:00:55 BRST ---

Hi,
I guess you want to see the function call shown as ?? and the argument values that were optimized out. The stack trace is still the same as reported in the description. I enabled the --enable-debug option and rebuilt qemu-kvm. The -g option was added to the compile procedure from the configure file:

+ ../configure --target-list=x86_64-softmmu '--extra-ldflags=-Wl,--build-id -pie -Wl,-z,relro -Wl,-z,now' '--extra-cflags=-O2 -g -pipe -Wall -Wp,-D_FORTIFY_SOURCE=2 -fexceptions -fstack-protector --param=ssp-buffer-size=4 -m64 -mtune=generic -fPIE -DPIE' --with-pkgversion=qemu-kvm-0.12.1.2-2.479.el6.2 --prefix=/usr --localstatedir=/var --sysconfdir=/etc --disable-strip --disable-xen --block-drv-rw-whitelist=qcow2,raw,file,host_device,host_cdrom,qed,gluster,rbd --block-drv-ro-whitelist=vmdk,vpc --disable-debug-tcg --disable-sparse --disable-sdl --disable-curses --disable-curl --disable-check-utests --disable-bluez --enable-docs --disable-vde --disable-spice --trace-backend=nop --enable-smartcard --disable-smartcard-nss --enable-mixemu
Install prefix /usr
BIOS directory /usr/share/qemu
binary directory /usr/bin
local state directory /var
Manual directory /usr/share/man
ELF interp prefix /usr/gnemul/qemu-%M
Source path /root/rpmbuild/BUILD/qemu-kvm-0.12.1.2
C compiler gcc
Host C compiler gcc
CFLAGS -O2 -g

BR/
Zhiyi

--- Additional comment from Markus Armbruster on 2015-11-27 05:17:53 BRST ---

I can't see --enable-debug in your configure line. I can see -O2. You need to get one roughly like this:

../configure --target-list=x86_64-softmmu '--extra-ldflags=-Wl,--build-id -pie -Wl,-z,relro -Wl,-z,now' '--extra-cflags=-g -pipe -Wall -fexceptions -fstack-protector --param=ssp-buffer-size=4 -m64 -mtune=generic -fPIE -DPIE' --with-pkgversion=qemu-kvm-0.12.1.2-2.479.el6.2 --prefix=/usr --localstatedir=/var --sysconfdir=/etc --disable-strip --disable-xen --block-drv-rw-whitelist=qcow2,raw,file,host_device,host_cdrom,qed,gluster,rbd --block-drv-ro-whitelist=vmdk,vpc --disable-debug-tcg --disable-sparse --disable-sdl --disable-curses --disable-curl --disable-check-utests --disable-bluez --enable-docs --disable-vde --disable-spice --trace-backend=nop --enable-smartcard --disable-smartcard-nss --enable-mixemu

Please try again :)

--- Additional comment from Guo, Zhiyi on 2015-11-27 09:40:59 BRST ---

(In reply to Markus Armbruster from comment #4)
> I can't see --enable-debug in your configure line. I can see -O2. You need
> to get one roughly like this:
>
> ../configure --target-list=x86_64-softmmu '--extra-ldflags=-Wl,--build-id
> -pie -Wl,-z,relro -Wl,-z,now' '--extra-cflags=-g -pipe -Wall -fexceptions
> -fstack-protector --param=ssp-buffer-size=4 -m64 -mtune=generic -fPIE -DPIE'
> --with-pkgversion=qemu-kvm-0.12.1.2-2.479.el6.2 --prefix=/usr
> --localstatedir=/var --sysconfdir=/etc --disable-strip --disable-xen
> --block-drv-rw-whitelist=qcow2,raw,file,host_device,host_cdrom,qed,gluster,
> rbd --block-drv-ro-whitelist=vmdk,vpc --disable-debug-tcg --disable-sparse
> --disable-sdl --disable-curses --disable-curl --disable-check-utests
> --disable-bluez --enable-docs --disable-vde --disable-spice
> --trace-backend=nop --enable-smartcard --disable-smartcard-nss
> --enable-mixemu
>
> Please try again :)

Stack trace with non-optimized code:
(gdb) bt
#0 0x00007f1e571cfb10 in ?? ()
#1 0x00007f1e55ebd5ed in bdrv_aio_cancel_async (acb=0x7f1e571cfc10) at /usr/src/debug/qemu-kvm-0.12.1.2/block.c:3876
#2 0x00007f1e55ebd499 in bdrv_aio_cancel (acb=0x7f1e571cfc10) at /usr/src/debug/qemu-kvm-0.12.1.2/block.c:3842
#3 0x00007f1e56008f37 in ide_dma_cancel (bm=0x7f1e57dac160) at /usr/src/debug/qemu-kvm-0.12.1.2/hw/ide/core.c:2395
#4 0x00007f1e56008f5d in ide_dma_reset (bm=0x7f1e57dac160) at /usr/src/debug/qemu-kvm-0.12.1.2/hw/ide/core.c:2408
#5 0x00007f1e5600c755 in piix3_reset (opaque=0x7f1e57dab010) at /usr/src/debug/qemu-kvm-0.12.1.2/hw/ide/piix.c:124
#6 0x00007f1e55e6765b in qemu_system_reset (report=true) at /usr/src/debug/qemu-kvm-0.12.1.2/vl.c:3417
#7 0x00007f1e55e990ad in qemu_kvm_system_reset (report=true) at /usr/src/debug/qemu-kvm-0.12.1.2/qemu-kvm.c:1992
#8 0x00007f1e55e9997d in kvm_main_loop () at /usr/src/debug/qemu-kvm-0.12.1.2/qemu-kvm.c:2272
#9 0x00007f1e55e683ba in main_loop () at /usr/src/debug/qemu-kvm-0.12.1.2/vl.c:4273
#10 0x00007f1e55e6d451 in main (argc=24, argv=0x7fff7f33ac18, envp=0x7fff7f33ace0) at /usr/src/debug/qemu-kvm-0.12.1.2/vl.c:6731

--- Additional comment from Markus Armbruster on 2015-11-27 11:17:42 BRST ---

Aha! acb->aiocb_info->cancel_async seems to be garbage. Hunch: use after free? Chimes with your report that Opteron_G5 fails differently...

Please reproduce with your debug build of qemu-kvm under valgrind, and capture valgrind's report.

--- Additional comment from Guo, Zhiyi on 2015-12-01 06:32 BRST ---

Log generated with Valgrind 3.11.0; Valgrind 3.8.1 core dumps under the same steps.

--- Additional comment from Guo, Zhiyi on 2015-12-01 07:18 BRST ---

--- Additional comment from Markus Armbruster on 2015-12-01 08:01:26 BRST ---

valgrind is reporting a huge number of unrelated issues, probably in part because we lack upstream patches to suppress false positives. It hits a cutoff and stops reporting some time before the crash. Please try again with --error-limit=no.

Additional question: is qemu-kvm-rhev affected as well?

--- Additional comment from Guo, Zhiyi on 2015-12-02 06:09 BRST ---

The issue can also be reproduced on a rhel7.2 Intel Skylake host with rhev:
kernel:3.10.0-334.el7.x86_64
qemu-kvm-rhev-2.3.0-31.el7.x86_64
qemu-kvm-rhev-debuginfo-2.3.0-31.el7.x86_64
qemu-img-rhev-2.3.0-31.el7.x86_64
qemu-kvm-tools-rhev-2.3.0-31.el7.x86_64
qemu-kvm-common-rhev-2.3.0-31.el7.x86_64

The attachment includes valgrind logs reproduced on rhel6.7 and rhel7.2. The rhev packages were compiled with -g and without -O2 optimization. The valgrind logs were generated with the option --error-limit=no.

--- Additional comment from Guo, Zhiyi on 2015-12-02 06:12:38 BRST ---

Command used to reproduce the issue and capture the valgrind log:
valgrind --log-file=valgrind.txt --error-limit=no /usr/libexec/qemu-kvm -name win2012 -m 2048 -smp 4 -cpu host -vga qxl -vnc :1 -monitor stdio -hda win2012.qcow2

--- Additional comment from Guo, Zhiyi on 2015-12-02 06:27 BRST ---

The valgrind log for rhel6.7 was wrong; please ignore the attachment in comment 10 and use the log in this comment.
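The use-after-free hunch raised in the thread above can be illustrated with a toy model. This is only a sketch of the general pattern, not QEMU code and not the actual fix: `FakeRequest`, `FakeBusMaster`, and the `clear_on_error` switch are hypothetical names, and because Python has no dangling pointers, a `freed` flag plus a `RuntimeError` stands in for calling through freed memory.

```python
class FakeRequest:
    """Hypothetical stand-in for an in-flight AIO request (the acb)."""

    def __init__(self):
        self.freed = False

    def complete(self):
        # Completion, successful or failed, releases the request.
        self.freed = True

    def cancel(self):
        if self.freed:
            # Stands in for the segfault: cancel called on freed memory.
            raise RuntimeError("cancel on freed request")
        self.freed = True


class FakeBusMaster:
    """Hypothetical stand-in for the IDE DMA state (bm in the backtrace)."""

    def __init__(self, clear_on_error):
        self.clear_on_error = clear_on_error
        self.aiocb = None

    def start_dma(self):
        self.aiocb = FakeRequest()

    def on_io_error(self):
        # The request finished with an error (for example ENOSPC) and the
        # VM is about to pause; the request object is gone after this point.
        self.aiocb.complete()
        if self.clear_on_error:
            self.aiocb = None  # drop the stale reference

    def reset(self):
        # Mirrors the reset -> cancel path seen in the backtrace: cancel
        # whatever request the device still believes is pending.
        if self.aiocb is not None:
            self.aiocb.cancel()
            self.aiocb = None


def reset_after_error(clear_on_error):
    """Return True if reset survives an earlier I/O error."""
    bm = FakeBusMaster(clear_on_error)
    bm.start_dma()
    bm.on_io_error()
    try:
        bm.reset()
    except RuntimeError:
        return False
    return True


if __name__ == "__main__":
    print("stale reference kept:   ", reset_after_error(False))
    print("stale reference cleared:", reset_after_error(True))
```

In this model, reset only fails when the device keeps a reference to a request that has already completed, which is the situation the bug title describes as a pending request left over from an error.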
Moving back to ASSIGNED as we decided to delay this to 7.4, at least for now. See comment 5 on #1299876 --js
For reference, this is the cluster of BZ related to this issue: bug 1281713, bug 1299876, bug 1299875, bug 1361487, bug 1361490, bug 1361488, bug 1375520
The issue still exists in RHEL7.4-3.10.0-618+qemu-kvm-1.5.3-134.
I'm having trouble with our build root at the moment, so I cannot re-post the patch currently. Moving back to ASSIGNED so I can re-post the patch once the build root problem is addressed. Thanks.
There.
Fix included in qemu-kvm-1.5.3-137.el7
Verified. The problem has been resolved, so I am changing the status to "Verified".

Test Version:
kernel version: 3.10.0-657.el7.x86_64
qemu-kvm version: qemu-kvm-1.5.3-137.el7.x86_64

Test Steps:
1. Fill the host disk completely.
2. Start the guest with the qemu command below:
/usr/libexec/qemu-kvm \
    -name 'avocado-vt-vm1' \
    -sandbox off \
    -machine pc \
    -nodefaults \
    -vga std \
    -chardev socket,id=qmp_id_qmpmonitor1,path=/var/tmp/monitor-qmpmonitor1-20161219-042734-6fVMWCMz,server,nowait \
    -mon chardev=qmp_id_qmpmonitor1,mode=control \
    -chardev socket,id=qmp_id_catch_monitor,path=/var/tmp/monitor-catch_monitor-20161219-042734-6fVMWCMz,server,nowait \
    -mon chardev=qmp_id_catch_monitor,mode=control \
    -drive id=drive_image1,if=none,snapshot=off,aio=native,cache=none,format=qcow2,file=/home/rhel74-64-virtio.qcow2 \
    -device ide-hd,id=image1,drive=drive_image1,bootindex=0,bus=ide.0 \
    -device virtio-net-pci,mac=9a:f2:f3:f4:f5:f6,id=id30uvBS,vectors=4,netdev=idADyVP5,bus=pci.0,addr=04 \
    -netdev tap,id=idADyVP5,vhost=on \
    -m 2048 \
    -smp 16,maxcpus=16,cores=8,threads=1,sockets=2 \
    -cpu host \
    -vnc :0 \
    -rtc base=localtime,clock=host,driftfix=slew \
    -boot order=cdn,once=d,menu=off,strict=off \
    -enable-kvm \
    -spice port=3000,ipv4,disable-ticketing \
    -monitor stdio
3. Copy files to the guest until qemu reports an error:
(qemu) block I/O error in device 'drive_image1': No space left on device (28)
4. Check the VM status:
(qemu) info status
--> VM status: paused (io-error)
5. Reset the VM and continue it:
(qemu) system_reset
(qemu) c
--> VM restarts successfully.
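Steps 4 and 5 could also be driven over one of the QMP sockets that the command line above creates. The sketch below is an illustration under assumptions, not part of the original test: the helper names are hypothetical, the socket path must be supplied by the caller, and it relies on the QMP commands `qmp_capabilities`, `query-status`, `system_reset`, and `cont`.

```python
import json
import socket


def qmp_message(command):
    """Encode one argument-less QMP command as a newline-terminated JSON line."""
    return (json.dumps({"execute": command}) + "\n").encode("ascii")


def read_reply(stream):
    """Read JSON lines, skipping the greeting and async events, until a reply."""
    for line in stream:
        msg = json.loads(line)
        if "return" in msg or "error" in msg:
            return msg
    raise ConnectionError("QMP socket closed before a reply arrived")


def run_command(sock, stream, command):
    """Send one command and return its result, raising on a QMP error."""
    sock.sendall(qmp_message(command))
    reply = read_reply(stream)
    if "error" in reply:
        raise RuntimeError(reply["error"])
    return reply["return"]


def reset_after_io_error(socket_path):
    """Reset and resume a guest paused on an I/O error; return the final status."""
    with socket.socket(socket.AF_UNIX, socket.SOCK_STREAM) as sock:
        sock.connect(socket_path)
        with sock.makefile("r", encoding="utf-8") as stream:
            # Capability negotiation is required before any other command.
            run_command(sock, stream, "qmp_capabilities")
            status = run_command(sock, stream, "query-status")["status"]
            if status != "io-error":
                raise RuntimeError("guest is not paused on an I/O error: " + status)
            run_command(sock, stream, "system_reset")
            run_command(sock, stream, "cont")
            return run_command(sock, stream, "query-status")["status"]
```

Called with the path of one of the `-chardev socket` monitors from step 2, `reset_after_io_error` would be expected to report a running guest on a fixed build; on an affected build the connection would presumably drop when qemu-kvm crashes.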
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHSA-2017:1856