Red Hat Bugzilla – Bug 1461561
virtio-blk: drain block before cleanup missing
Last modified: 2017-08-02 00:43:29 EDT
Description of problem:

The following problem was reported against QEMU 2.7:
https://lists.gnu.org/archive/html/qemu-devel/2017-06/msg02996.html

1. Customer requests a disk hot-unplug via a Web-based application.
2. The application sends "device_del", which sometimes takes an unpredictable amount of time.
3. There are some in-flight IOs.
4. Customer reboots or shuts down the guest OS.
5. The QEMU process segfaults.

If a disk is unplugged with both the "device_del" and "drive_del" commands, QEMU does not crash. But if only the "device_del" command has finished and the guest OS reboots, QEMU segfaults.

I can reproduce that case easily with the following commands:

1. Generate heavy IO on the disk in the guest OS.
2. Run "device_del" on the QEMU monitor.
3. Run "system_reset" on the QEMU monitor.

According to AddressSanitizer, the VirtQueue is freed by virtio_cleanup but then dereferenced by an asynchronous request. I think asynchronous requests should be finished before virtio_cleanup, so I added blk_drain() before virtio_cleanup.

These are the configure options I used to build QEMU with AddressSanitizer:

    ./configure --target-list=x86_64-softmmu --extra-cflags="-fsanitize=address -fno-omit-frame-pointer" --enable-debug

Looking at the code, it seems to potentially affect all of 7.0-7.3.
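To illustrate why the ordering matters, here is a minimal toy model of the race in Python. It is not QEMU code: the classes VirtQueue and BlockBackend and the functions unrealize() and simulate() are hypothetical stand-ins, and the only things taken from the report are the idea that virtio_cleanup frees the VirtQueue and that a drain step should run before it.

```python
class VirtQueue:
    """Toy stand-in for the queue that virtio_cleanup frees."""

    def __init__(self):
        self.freed = False
        self.completed = 0

    def push(self):
        # A completion callback touches the queue; after cleanup this is
        # the use-after-free that AddressSanitizer reported.
        if self.freed:
            raise RuntimeError("use after free: VirtQueue accessed after cleanup")
        self.completed += 1


class BlockBackend:
    """Toy stand-in for a block backend holding in-flight completions."""

    def __init__(self):
        self.inflight = []

    def submit(self, on_complete):
        self.inflight.append(on_complete)

    def drain(self):
        # Run every pending completion, analogous in spirit to blk_drain().
        while self.inflight:
            self.inflight.pop(0)()


def unrealize(blk, vq, drain_first):
    """Model the device teardown, with or without the proposed drain."""
    if drain_first:
        blk.drain()
    vq.freed = True  # models virtio_cleanup freeing the VirtQueue


def simulate(drain_first, requests=3):
    blk, vq = BlockBackend(), VirtQueue()
    for _ in range(requests):
        blk.submit(vq.push)
    unrealize(blk, vq, drain_first)
    try:
        blk.drain()  # late completions, e.g. while the guest is resetting
    except RuntimeError as exc:
        return f"crash: {exc}"
    return f"ok: {vq.completed} requests completed before cleanup"


if __name__ == "__main__":
    print(simulate(drain_first=False))
    print(simulate(drain_first=True))
```

Without the drain, the late completion hits the freed queue. With the drain, all three requests finish first and nothing is left to run after cleanup.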
I can reproduce it on kernel 3.10.0-514.el7.x86_64 with qemu-kvm-rhev-2.6.0-28.el7_3.9.x86_64.

Test Steps:

1. Start guest with qemu cmd:

    # cat blk_test.txt
    /usr/libexec/qemu-kvm \
    -sandbox off \
    -machine pc \
    -nodefaults \
    -vga cirrus \
    -chardev socket,id=qmp_id_qmpmonitor1,path=/var/tmp/monitor-qmpmonitor1-20170614-233639-etu9X2zc,server,nowait \
    -mon chardev=qmp_id_qmpmonitor1,mode=control \
    -chardev socket,id=qmp_id_catch_monitor,path=/var/tmp/monitor-catch_monitor-20170614-233639-etu9X2zc,server,nowait \
    -mon chardev=qmp_id_catch_monitor,mode=control \
    -device pvpanic,ioport=0x505,id=idhq2DAN \
    -chardev socket,id=serial_id_serial0,path=/var/tmp/serial-serial0-20170614-233639-etu9X2zc,server,nowait \
    -device isa-serial,chardev=serial_id_serial0 \
    -chardev socket,id=seabioslog_id_20170614-233639-etu9X2zc,path=/var/tmp/seabios-20170614-233639-etu9X2zc,server,nowait \
    -device isa-debugcon,chardev=seabioslog_id_20170614-233639-etu9X2zc,iobase=0x402 \
    -device ich9-usb-ehci1,id=usb1,addr=0x1d.7,multifunction=on,bus=pci.0 \
    -device ich9-usb-uhci1,id=usb1.0,multifunction=on,masterbus=usb1.0,addr=0x1d.0,firstport=0,bus=pci.0 \
    -device ich9-usb-uhci2,id=usb1.1,multifunction=on,masterbus=usb1.0,addr=0x1d.2,firstport=2,bus=pci.0 \
    -device ich9-usb-uhci3,id=usb1.2,multifunction=on,masterbus=usb1.0,addr=0x1d.4,firstport=4,bus=pci.0 \
    -drive id=drive_image1,if=none,snapshot=off,aio=native,cache=none,format=qcow2,file=/home/kvm_autotest_root/images/rhel73-64-virtio.qcow2 \
    -device virtio-blk-pci,id=image1,drive=drive_image1,bootindex=1,bus=pci.0,addr=0x3 \
    -drive id=data,if=none,snapshot=off,aio=native,cache=none,format=qcow2,file=/home/data_disk.img \
    -device virtio-blk-pci,id=data1,drive=data,bus=pci.0 \
    -device virtio-net-pci,mac=9a:43:44:45:46:47,id=idvMp6XX,vectors=4,netdev=id9qJxPT,bus=pci.0 \
    -netdev tap,id=id9qJxPT,vhost=on \
    -m 4096 \
    -smp 6,cores=2,threads=1,sockets=3 \
    -cpu host \
    -device usb-tablet,id=usb-tablet1,bus=usb1.0,port=1 \
    -vnc :0 \
    -rtc base=utc,clock=host,driftfix=slew \
    -boot order=cdn,once=d,menu=off,strict=off \
    -no-shutdown \
    -enable-kvm \
    -monitor stdio \
    -spice ipv4,port=5000,disable-ticketing \

2. Run IO test in guest:

    (guest)# dd if=/dev/zero of=/dev/vdb bs=1M count=40000000

3. Delete vdb before the IO test in step 2 finishes:

    (qemu) device_del data1

4. Reboot guest before the IO test finishes:

    (qemu) system_reset

Test Result:

    blk_test.txt: line 36: 16879 Floating point exception(core dumped)
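The two monitor steps can also be scripted over the QMP socket that the command line above already exposes. The following Python sketch is an assumption about how one might automate it, not what was run for this test (the steps above used the HMP monitor on stdio). The helper names qmp_commands() and run_reproducer() are hypothetical; the QMP commands qmp_capabilities, device_del and system_reset are standard, and the dd workload in the guest still has to be started separately.

```python
import json
import socket


def qmp_commands(device_id):
    """Build the QMP messages matching the device_del + system_reset steps."""
    return [
        {"execute": "qmp_capabilities"},  # required handshake before commands
        {"execute": "device_del", "arguments": {"id": device_id}},
        {"execute": "system_reset"},
    ]


def run_reproducer(sock_path, device_id):
    """Send the commands to a QMP UNIX socket and collect the replies."""
    replies = []
    with socket.socket(socket.AF_UNIX, socket.SOCK_STREAM) as sock:
        sock.connect(sock_path)
        chan = sock.makefile("rw", encoding="utf-8")
        json.loads(chan.readline())  # consume the QMP greeting banner
        for cmd in qmp_commands(device_id):
            chan.write(json.dumps(cmd) + "\n")
            chan.flush()
            # Skip asynchronous events until the command's reply arrives.
            while True:
                line = chan.readline()
                if not line:
                    raise ConnectionError("QMP socket closed (QEMU may have crashed)")
                msg = json.loads(line)
                if "return" in msg or "error" in msg:
                    replies.append(msg)
                    break
    return replies


if __name__ == "__main__":
    # Socket path and device id copied from the command line in this comment.
    path = "/var/tmp/monitor-qmpmonitor1-20170614-233639-etu9X2zc"
    print(run_reproducer(path, "data1"))
```

On an affected build, one would expect the connection to drop when QEMU crashes after the reset; on a fixed build, all three commands should return normally.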
Verified: the problem has been resolved, so I am setting the bug's status to "Verified".

Test Version:
kernel version: 3.10.0-682.el7.x86_64
qemu-kvm-rhev: qemu-kvm-rhev-2.9.0-10.el7.x86_64

Test Steps:

1. Start guest with qemu cmd:

    /usr/libexec/qemu-kvm \
    -sandbox off \
    -machine pc \
    -nodefaults \
    -vga cirrus \
    -chardev socket,id=qmp_id_qmpmonitor1,path=/var/tmp/monitor-qmpmonitor1-20170614-233639-etu9X2zc,server,nowait \
    -mon chardev=qmp_id_qmpmonitor1,mode=control \
    -chardev socket,id=qmp_id_catch_monitor,path=/var/tmp/monitor-catch_monitor-20170614-233639-etu9X2zc,server,nowait \
    -mon chardev=qmp_id_catch_monitor,mode=control \
    -device pvpanic,ioport=0x505,id=idhq2DAN \
    -chardev socket,id=serial_id_serial0,path=/var/tmp/serial-serial0-20170614-233639-etu9X2zc,server,nowait \
    -device isa-serial,chardev=serial_id_serial0 \
    -chardev socket,id=seabioslog_id_20170614-233639-etu9X2zc,path=/var/tmp/seabios-20170614-233639-etu9X2zc,server,nowait \
    -device isa-debugcon,chardev=seabioslog_id_20170614-233639-etu9X2zc,iobase=0x402 \
    -device ich9-usb-ehci1,id=usb1,addr=0x1d.7,multifunction=on,bus=pci.0 \
    -device ich9-usb-uhci1,id=usb1.0,multifunction=on,masterbus=usb1.0,addr=0x1d.0,firstport=0,bus=pci.0 \
    -device ich9-usb-uhci2,id=usb1.1,multifunction=on,masterbus=usb1.0,addr=0x1d.2,firstport=2,bus=pci.0 \
    -device ich9-usb-uhci3,id=usb1.2,multifunction=on,masterbus=usb1.0,addr=0x1d.4,firstport=4,bus=pci.0 \
    -drive id=drive_image1,if=none,snapshot=off,aio=native,cache=none,format=qcow2,file=/home/kvm_autotest_root/images/rhel74-64-virtio.qcow2 \
    -device virtio-blk-pci,id=image1,drive=drive_image1,bootindex=1,bus=pci.0,addr=0x3 \
    -drive id=data,if=none,snapshot=off,aio=native,cache=none,format=qcow2,file=/home/data_disk.img \
    -device virtio-blk-pci,id=data1,drive=data,bus=pci.0 \
    -device virtio-net-pci,mac=9a:43:44:45:46:47,id=idvMp6XX,vectors=4,netdev=id9qJxPT,bus=pci.0 \
    -netdev tap,id=id9qJxPT,vhost=on \
    -m 4096 \
    -smp 6,cores=2,threads=1,sockets=3 \
    -cpu host \
    -device usb-tablet,id=usb-tablet1,bus=usb1.0,port=1 \
    -vnc :0 \
    -rtc base=utc,clock=host,driftfix=slew \
    -boot order=cdn,once=d,menu=off,strict=off \
    -no-shutdown \
    -enable-kvm \
    -monitor stdio \
    -spice ipv4,port=5000,disable-ticketing \

2. Run dd in guest:

    (guest)# dd if=/dev/zero of=/dev/vdb bs=1M count=40000000

3. Delete vdb before the IO test in step 2 finishes:

    (qemu) device_del data1

4. Reboot guest before the IO test finishes:

    (qemu) system_reset

Test Result: Guest restarts successfully.
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHSA-2017:2392