Bug 1677105 - No "DEVICE_DELETED" event in qmp after "device_del"
Summary: No "DEVICE_DELETED" event in qmp after "device_del"
Keywords:
Status: CLOSED DEFERRED
Alias: None
Product: Red Hat Enterprise Linux 7
Classification: Red Hat
Component: qemu-kvm-rhev
Version: 7.7
Hardware: Unspecified
OS: Unspecified
Priority: high
Severity: unspecified
Target Milestone: rc
Target Release: ---
Assignee: Gal Hammer
QA Contact: Xueqiang Wei
URL:
Whiteboard:
Depends On:
Blocks: 1678290 1678311 1744438 1754756
 
Reported: 2019-02-14 03:50 UTC by aihua liang
Modified: 2020-04-29 21:38 UTC
CC List: 10 users

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Cloned As: 1678290
Environment:
Last Closed: 2019-07-22 20:31:02 UTC
Target Upstream Version:
Embargoed:



Description aihua liang 2019-02-14 03:50:20 UTC
Description of problem:
 No "DEVICE_DELETED" event in qmp after "device_del"  when test with q35+multi-disks

Version-Release number of selected component (if applicable):
 kernel version:3.10.0-993.el7.x86_64
 qemu-kvm-rhev version:qemu-kvm-rhev-2.12.0-23.el7.x86_64

How reproducible:
 25%

Steps to Reproduce:
1.Start guest with qemu cmds:
  /usr/libexec/qemu-kvm \
    -name 'avocado-vt-vm1' \
    -machine q35  \
    -nodefaults \
    -device VGA,bus=pcie.0,addr=0x1  \
    -chardev socket,id=qmp_id_qmpmonitor1,path=/var/tmp/monitor-qmpmonitor1-20190213-022516-JgRgI5FT,server,nowait \
    -mon chardev=qmp_id_qmpmonitor1,mode=control  \
    -chardev socket,id=qmp_id_catch_monitor,path=/var/tmp/monitor-catch_monitor-20190213-022516-JgRgI5FT,server,nowait \
    -mon chardev=qmp_id_catch_monitor,mode=control \
    -device pvpanic,ioport=0x505,id=id9Pmk38  \
    -chardev socket,id=serial_id_serial0,path=/var/tmp/serial-serial0-20190213-022516-JgRgI5FT,server,nowait \
    -device isa-serial,chardev=serial_id_serial0  \
    -chardev socket,id=seabioslog_id_20190213-022516-JgRgI5FT,path=/var/tmp/seabios-20190213-022516-JgRgI5FT,server,nowait \
    -device isa-debugcon,chardev=seabioslog_id_20190213-022516-JgRgI5FT,iobase=0x402 \
    -device nec-usb-xhci,id=usb1,bus=pcie.0,addr=0x2 \
    -drive id=drive_image1,if=none,snapshot=off,aio=threads,cache=none,format=qcow2,file=/home/kvm_autotest_root/images/win10-32-virtio.qcow2 \
    -device pcie-root-port,id=pcie.0-root-port-3,slot=3,chassis=3,addr=0x3,bus=pcie.0 \
    -device virtio-blk-pci,id=image1,drive=drive_image1,bootindex=0,bus=pcie.0-root-port-3,addr=0x0 \
    -device pcie-root-port,id=pcie.0-root-port-4,slot=4,chassis=4,addr=0x4,bus=pcie.0 \
    -device virtio-net-pci,mac=9a:e9:ea:eb:ec:ed,id=idqu708y,vectors=4,netdev=idBbh4yv,bus=pcie.0-root-port-4,addr=0x0  \
    -netdev tap,id=idBbh4yv,vhost=on \
    -m 11264  \
    -smp 2,maxcpus=2,cores=1,threads=1,sockets=2  \
    -cpu 'Penryn',+kvm_pv_unhalt,hv_relaxed,hv_spinlocks=0x1fff,hv_vapic,hv_time \
    -drive id=drive_cd1,if=none,snapshot=off,aio=threads,cache=none,media=cdrom,file=/home/kvm_autotest_root/iso/windows/winutils.iso \
    -device ide-cd,id=cd1,drive=drive_cd1,bootindex=1,bus=ide.0,unit=0 \
    -device usb-tablet,id=usb-tablet1,bus=usb1.0,port=1  \
    -vnc :0  \
    -rtc base=localtime,clock=host,driftfix=slew  \
    -boot menu=off,strict=off,order=cdn,once=c \
    -enable-kvm \
    -device pcie-root-port,id=pcie_extra_root_port_0,slot=5,chassis=5,addr=0x5,bus=pcie.0 \
    -device pcie-root-port,id=pcie_extra_root_port_1,slot=6,chassis=6,addr=0x6,bus=pcie.0 \
    -device pcie-root-port,id=pcie_extra_root_port_2,slot=7,chassis=7,addr=0x7,bus=pcie.0 \
    -monitor stdio \

2.After the guest boots up, stop the VM:
 (qemu)stop

3.Create two images for hotplug/unplug.
 #qemu-img create -f qcow2 storage0.qcow2 1G
 #qemu-img create -f qcow2 storage1.qcow2 1G

4.Hotplug two disks and check them in qtree.
  4.1 hotplug stg0
    {"execute": "__com.redhat_drive_add", "arguments": {"id": "drive_stg0", "snapshot": "off", "aio": "threads", "cache": "none", "format": "qcow2", "file": "/home/kvm_autotest_root/images/storage0.qcow2"}, "id": "cF34oU0K"}
{"return": {}, "id": "cF34oU0K"}
{"execute": "device_add", "arguments": {"driver": "virtio-blk-pci", "id": "stg0", "drive": "drive_stg0", "bus": "pcie_extra_root_port_0"}, "id": "P5lnRhh9"}
{"return": {}, "id": "P5lnRhh9"}
  4.2 check stg0 in qtree
    (qemu)info qtree
      bus: pcie_extra_root_port_0
          type PCIE
          dev: virtio-blk-pci, id "stg0"
  4.3 hotplug stg1
     {"execute": "__com.redhat_drive_add", "arguments": {"id": "drive_stg1", "snapshot": "off", "aio": "threads", "cache": "none", "format": "qcow2", "file": "/home/kvm_autotest_root/images/storage1.qcow2"}, "id": "HzoPqeaP"}
{"return": {}, "id": "HzoPqeaP"}
{"execute": "device_add", "arguments": {"driver": "virtio-blk-pci", "id": "stg1", "drive": "drive_stg1", "bus": "pcie_extra_root_port_1"}, "id": "F59m9x9F"}
  4.4 check stg1 in qtree
     (qemu)info qtree
       bus: pcie_extra_root_port_1
          type PCIE
          dev: virtio-blk-pci, id "stg1"

5.Resume the VM
 (qemu)cont

6.In guest, format disks and run iozone on them.
  (guest)#diskpart
   (diskpart)#select disk 2
             #create partition primary size=1014
             #select partition 1
             #assign
             #detail disk
             #select volume E  
             #format fs=ntfs quick
             #select disk 1
             #create partition primary size=1014
             #select partition 1
             #assign
             #detail disk
             #select volume F  
             #format fs=ntfs quick
             #exit
   (guest)#D:\Iozone\iozone.exe -azR -r 64k -n 125M -g 512M -M -i 0 -i 1 -b E:\iozone_test -f E:\testfile
          #D:\Iozone\iozone.exe -azR -r 64k -n 125M -g 512M -M -i 0 -i 1 -b F:\iozone_test -f F:\testfile

7.After iozone finishes, unplug the disks:
  {"execute": "device_del", "arguments": {"id": "stg1"}, "id": "Bpg1ll6I"}
{"return": {}, "id": "Bpg1ll6I"}
{"timestamp": {"seconds": 1550047646, "microseconds": 162609}, "event": "DEVICE_DELETED", "data": {"path": "/machine/peripheral/stg1/virtio-backend"}}
{"timestamp": {"seconds": 1550047646, "microseconds": 169155}, "event": "DEVICE_DELETED", "data": {"device": "stg1", "path": "/machine/peripheral/stg1"}}
{"execute": "device_del", "arguments": {"id": "stg0"}, "id": "3YclNbSW"}
{"return": {}, "id": "3YclNbSW"}

Actual results:
 After step 7, no "DEVICE_DELETED" event is sent out for stg0.

Expected results:
 After step 7, stg0 should be unplugged successfully.

Additional info:
 To reproduce, run the test at least four times (reproducibility is about 25%).
 Pstack info:
  # pstack 2803
Thread 6 (Thread 0x7fcb9ddcd700 (LWP 2804)):
#0  0x00007fcba55981c9 in syscall () at /lib64/libc.so.6
#1  0x000055650766d4d0 in qemu_event_wait ()
#2  0x000055650767da6e in call_rcu_thread ()
#3  0x00007fcba5874dd5 in start_thread () at /lib64/libpthread.so.0
#4  0x00007fcba559dead in clone () at /lib64/libc.so.6
Thread 5 (Thread 0x7fcb9ccca700 (LWP 2837)):
#0  0x00007fcba55932cf in ppoll () at /lib64/libc.so.6
#1  0x00005565076694cb in qemu_poll_ns ()
#2  0x000055650766b247 in aio_poll ()
#3  0x000055650743a8ae in iothread_run ()
#4  0x00007fcba5874dd5 in start_thread () at /lib64/libpthread.so.0
#5  0x00007fcba559dead in clone () at /lib64/libc.so.6
Thread 4 (Thread 0x7fcb9c4c9700 (LWP 2838)):
#0  0x00007fcba55948d7 in ioctl () at /lib64/libc.so.6
#1  0x00005565073724c5 in kvm_vcpu_ioctl ()
#2  0x0000556507372593 in kvm_cpu_exec ()
#3  0x000055650734fb06 in qemu_kvm_cpu_thread_fn ()
#4  0x00007fcba5874dd5 in start_thread () at /lib64/libpthread.so.0
#5  0x00007fcba559dead in clone () at /lib64/libc.so.6
Thread 3 (Thread 0x7fcb9bcc8700 (LWP 2840)):
#0  0x00007fcba55948d7 in ioctl () at /lib64/libc.so.6
#1  0x00005565073724c5 in kvm_vcpu_ioctl ()
#2  0x0000556507372593 in kvm_cpu_exec ()
#3  0x000055650734fb06 in qemu_kvm_cpu_thread_fn ()
#4  0x00007fcba5874dd5 in start_thread () at /lib64/libpthread.so.0
#5  0x00007fcba559dead in clone () at /lib64/libc.so.6
Thread 2 (Thread 0x7fc8d99ff700 (LWP 2903)):
#0  0x00007fcba5878965 in pthread_cond_wait@@GLIBC_2.3.2 () at /lib64/libpthread.so.0
#1  0x000055650766d0a9 in qemu_cond_wait_impl ()
#2  0x000055650758786f in vnc_worker_thread_loop ()
#3  0x0000556507587e38 in vnc_worker_thread ()
#4  0x00007fcba5874dd5 in start_thread () at /lib64/libpthread.so.0
#5  0x00007fcba559dead in clone () at /lib64/libc.so.6
Thread 1 (Thread 0x7fcbbec84dc0 (LWP 2803)):
#0  0x00007fcba55932cf in ppoll () at /lib64/libc.so.6
#1  0x00005565076694a9 in qemu_poll_ns ()
#2  0x000055650766a38e in main_loop_wait ()
#3  0x000055650730c147 in main ()

Comment 1 aihua liang 2019-02-14 06:10:57 UTC
With machine type pc, this issue does not occur.

Comment 4 Xueqiang Wei 2019-02-25 05:57:24 UTC
Hit this issue via automation. Did not hit it on a Linux guest.


Details:

Host:
kernel-3.10.0-1009.el7.x86_64
qemu-kvm-rhev-2.12.0-23.el7
spice-server-0.14.0-6.el7.x86_64
seabios-bin-1.11.0-2.el7.noarch
seavgabios-bin-1.11.0-2.el7.noarch

Guest:
windows10 x86_64
virtio-win-prewhql-0.1-163.iso


auto cmd lines:
python ConfigTest.py --testcase=block_hotplug..block_virtio..fmt_qcow2..with_plug..with_repetition..multi_pci,block_hotplug..block_virtio..fmt_qcow2..with_plug..with_reboot..multi_pci,block_hotplug..block_virtio..fmt_raw..with_plug..with_reboot..one_pci --imageformat=qcow2 --guestname=Win10 --driveformat=virtio_scsi --nicmodel=virtio_net --platform=x86_64 --clone=yes --nrepeat=50


qemu cmd lines:

/usr/libexec/qemu-kvm \
    -S  \
    -name 'avocado-vt-vm1' \
    -machine pc  \
    -nodefaults \
    -device VGA,bus=pci.0,addr=0x2  \
    -chardev socket,id=qmp_id_qmpmonitor1,path=/var/tmp/avocado_YO9hUp/monitor-qmpmonitor1-20190222-023950-hEeRk4hO,server,nowait \
    -mon chardev=qmp_id_qmpmonitor1,mode=control  \
    -chardev socket,id=qmp_id_catch_monitor,path=/var/tmp/avocado_YO9hUp/monitor-catch_monitor-20190222-023950-hEeRk4hO,server,nowait \
    -mon chardev=qmp_id_catch_monitor,mode=control \
    -device pvpanic,ioport=0x505,id=idGIrlXB  \
    -chardev socket,id=serial_id_serial0,path=/var/tmp/avocado_YO9hUp/serial-serial0-20190222-023950-hEeRk4hO,server,nowait \
    -device isa-serial,chardev=serial_id_serial0  \
    -chardev socket,id=seabioslog_id_20190222-023950-hEeRk4hO,path=/var/tmp/avocado_YO9hUp/seabios-20190222-023950-hEeRk4hO,server,nowait \
    -device isa-debugcon,chardev=seabioslog_id_20190222-023950-hEeRk4hO,iobase=0x402 \
    -device nec-usb-xhci,id=usb1,bus=pci.0,addr=0x3 \
    -device virtio-scsi-pci,id=virtio_scsi_pci0,bus=pci.0,addr=0x4 \
    -drive id=drive_image1,if=none,snapshot=off,aio=threads,cache=none,format=qcow2,file=/home/kvm_autotest_root/images/win10-64-virtio-scsi.qcow2 \
    -device scsi-hd,id=image1,drive=drive_image1,bootindex=0 \
    -device virtio-net-pci,mac=9a:30:31:32:33:34,id=idXbpfQY,vectors=4,netdev=id73430R,bus=pci.0,addr=0x5  \
    -netdev tap,id=id73430R,vhost=on,vhostfd=20,fd=19 \
    -m 15360  \
    -smp 12,maxcpus=12,cores=6,threads=1,sockets=2  \
    -cpu 'Westmere',+kvm_pv_unhalt,hv_relaxed,hv_spinlocks=0x1fff,hv_vapic,hv_time \
    -drive id=drive_cd1,if=none,snapshot=off,aio=threads,cache=none,media=cdrom,file=/home/kvm_autotest_root/iso/windows/winutils.iso \
    -device scsi-cd,id=cd1,drive=drive_cd1 \
    -device usb-tablet,id=usb-tablet1,bus=usb1.0,port=1  \
    -vnc :0  \
    -rtc base=localtime,clock=host,driftfix=slew  \
    -boot menu=off,strict=off,order=cdn,once=c \
    -enable-kvm \
    -monitor stdio \



Not receive "DEVICE_DELETED" event, after execute "device_del".

e.g. (should revice below message)
2019-02-13 06:49:57: {"timestamp": {"seconds": 1550058592, "microseconds": 888245}, "event": "DEVICE_DELETED", "data": {"path": "/machine/peripheral/stg1/virtio-backend"}}
2019-02-13 06:49:57: {"timestamp": {"seconds": 1550058592, "microseconds": 893544}, "event": "DEVICE_DELETED", "data": {"device": "stg1", "path": "/machine/peripheral/stg1"}}



logs:

{"execute": "device_del", "arguments": {"id": "stg1"}, "id": "66pzh8b6"}
{"return": {}, "id": "66pzh8b6"}
{"execute": "human-monitor-command", "arguments": {"command-line": "info qtree"}, "id": "DibEIDiZ"}

stg1 is still in qtree.

30 seconds later, stg1 still exists in qtree and no "DEVICE_DELETED" event has been received.

120 seconds later, stg1 still exists in qtree and no "DEVICE_DELETED" event has been received.


Both q35+blockdev and pc+drive hit this issue.
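
For context, the qtree check used here can be scripted as well. A hedged sketch (it assumes the cmd() QMP helper from the sketch in the description; the device id is from this report) that polls "info qtree" through the human-monitor-command passthrough until the device disappears or a deadline passes:

    import time

    def device_gone(dev_id, timeout=120, interval=5):
        # Poll "info qtree" via the HMP passthrough; True once the device
        # id no longer appears, False if the deadline passes first.
        deadline = time.time() + timeout
        while time.time() < deadline:
            reply = cmd("human-monitor-command",
                        **{"command-line": "info qtree"})
            if 'id "%s"' % dev_id not in reply.get("return", ""):
                return True
            time.sleep(interval)
        return False

    print("stg1 removed from qtree:", device_gone("stg1"))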

Comment 5 Gal Hammer 2019-03-05 09:27:27 UTC
(In reply to Xueqiang Wei from comment #4)
> Hit this issue by automation. Not hit it on linux guest.

Do the automation scripts unmount the disks before trying to remove them with the "device_del" command? The device_del command should fail if the device is in use.

Comment 6 Xueqiang Wei 2019-03-05 11:07:46 UTC
(In reply to Gal Hammer from comment #5)

No, the guest does not unmount the disk before unplugging it. I think that's OK: unplugging a disk may fail while I/O is running, so in this case the disk is unplugged only after the I/O operation has completed. I also confirmed with the virtio-win team that they don't unmount disks before unplugging them. Is it necessary to unmount a disk before unplugging it with "device_del"?


The difference between linux guest and windows in this case:

linux: unplug disk after dd test

dd if=/dev/vda of=/dev/null bs=1k count=1000 iflag=direct && dd if=/dev/zero of=/dev/vda bs=1k count=1000 oflag=direct
dd if=/dev/vdb of=/dev/null bs=1k count=1000 iflag=direct && dd if=/dev/zero of=/dev/vdb bs=1k count=1000 oflag=direct


windows: unplug disk after iozone test 

D:\Iozone\iozone.exe -azR -r 64k -n 125M -g 512M -M -i 0 -i 1 -b I:\iozone_test -f I:\testfile
D:\Iozone\iozone.exe -azR -r 64k -n 125M -g 512M -M -i 0 -i 1 -b J:\iozone_test -f J:\testfile

Do not umount I and J, then unplug disk by "device_del"

Comment 7 Gal Hammer 2019-03-05 13:08:25 UTC
(In reply to Xueqiang Wei from comment #6)

So what you describe here are two different tests. On Linux you don't mount the disk and work directly with the block device (e.g. /dev/vda), while on Windows it appears that the disk is mounted (i.e. it has an assigned drive letter) and the test is running on a file.

A PCI device can't be unplugged without the guest operating system's approval. So Windows might ignore the device removal request because the disk is still in use. To test that, you can try to unplug the device in Linux while the dd test is still running, or make sure the disk is unmounted (with the mountvol command, I think) on Windows after the iozone test is completed.

Comment 8 Xueqiang Wei 2019-03-06 07:55:12 UTC
(In reply to Gal Hammer from comment #7)

I will try unplugging the disk after unmounting it on Windows. I am confused about why unplugging a virtio-scsi disk without unmounting works well, while unplugging a virtio-blk disk without unmounting fails randomly (it succeeds some of the time). They should have the same behavior.

https://bugzilla.redhat.com/show_bug.cgi?id=1678290#c3
https://bugzilla.redhat.com/show_bug.cgi?id=1678290#c5

Comment 9 Gal Hammer 2019-03-06 08:52:57 UTC
(In reply to Xueqiang Wei from comment #8)

Thanks, I'm waiting for updates.

Bug 1523017 claims that both virtio-scsi and virtio-blk might fail when trying to unplug the device. The unplug process depends on the guest operating system, and Windows doesn't allow unplugging a device while it is in use. When the test fails randomly, it looks like a case where the VM wasn't fast enough to complete all pending I/O requests before it received the unplug request.
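
To illustrate the race described above: a harness can at least wait for the drive's cumulative I/O counters to settle before requesting the unplug. This is a sketch only (it reuses the cmd() QMP helper from the sketch in the description, and the drive id from the reproducer); it narrows the window but is not a guaranteed fix, since the guest may start new I/O at any time:

    import time

    def wait_for_idle(drive_id, settle=2, timeout=60):
        # Sample cumulative counters from query-blockstats and treat an
        # unchanged sample over `settle` seconds as "guest went idle".
        def ops():
            for entry in cmd("query-blockstats")["return"]:
                if entry.get("device") == drive_id:
                    stats = entry["stats"]
                    return stats["rd_operations"] + stats["wr_operations"]
            return None

        deadline = time.time() + timeout
        prev = ops()
        while time.time() < deadline:
            time.sleep(settle)
            cur = ops()
            if cur is not None and cur == prev:
                return True       # no new requests during the settle window
            prev = cur
        return False

    if wait_for_idle("drive_stg0"):
        cmd("device_del", id="stg0")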

Comment 10 Xueqiang Wei 2019-03-06 13:23:42 UTC
(In reply to Gal Hammer from comment #9)

Unplugging the disks after unmounting them on Windows 10 also hits this issue.

1. umount disk I, J
 mountvol I: /D
 mountvol J: /D

2. unplug disk stg1, stg0
{'execute': 'device_del', 'arguments': {'id': 'stg1'}
{'execute': 'device_del', 'arguments': {'id': 'stg0'}

After step 2:
unplug of stg1 succeeds and the "DEVICE_DELETED" event is received;
unplug of stg0 fails and no "DEVICE_DELETED" event is received.


logs:

07:58:52 INFO | --------umount disk I
07:58:52 DEBUG| Sending command: mountvol I: /D
07:58:52 DEBUG| Sending command: echo %errorlevel%
07:58:53 INFO | --------umount disk J
07:58:53 DEBUG| Sending command: mountvol J: /D
07:58:53 DEBUG| Sending command: echo %errorlevel%
07:58:53 DEBUG| (monitor avocado-vt-vm1.qmpmonitor1) Sending command 'device_del'
07:58:53 DEBUG| Send command: {'execute': 'device_del', 'arguments': {'id': 'stg1'}, 'id': 'DJyY0G91'}
07:58:55 DEBUG| STREAM b'IHDR' 16 13
07:58:55 DEBUG| STREAM b'IDAT' 41 1216
07:59:00 DEBUG| STREAM b'IHDR' 16 13
07:59:00 DEBUG| STREAM b'IDAT' 41 1216
07:59:04 DEBUG| (monitor avocado-vt-vm1.qmpmonitor1) Sending command 'device_del'
07:59:04 DEBUG| Send command: {'execute': 'device_del', 'arguments': {'id': 'stg0'}, 'id': 'TZNsUHyi'}
08:00:42 ERROR| avocado.core.exceptions.TestFail: Failed to unplug device 'stg0'.Output:
08:00:42 ERROR| {}

Comment 11 Gal Hammer 2019-03-06 14:22:24 UTC
(In reply to Xueqiang Wei from comment #10)
> 1. umount disk I, J
>  mountvol I: /D
>  mountvol J: /D

Can you please try with "mountvol I: /P"?

> 07:58:52 INFO | --------umount disk I
> 07:58:52 DEBUG| Sending command: mountvol I: /D
> 07:58:52 DEBUG| Sending command: echo %errorlevel%

Is it possible to see the results of the commands? How can you tell from this log if the mountvol was even executed?

Comment 12 Xueqiang Wei 2019-03-07 09:01:49 UTC
 
> Can you please try with "mountvol I: /P"?

Tried with "mountvol I: /P"; also hit this issue.

> Is it possible to see the results of the commands? How can you tell from
> this log if the mountvol was even executed?

Drive letters are listed before and after unmounting each disk. If the "mountvol" command had failed, the program would have exited with an error message.


logs:

03:29:52 DEBUG| Sending command: wmic logicaldisk where "drivetype=3" get name
03:29:53 INFO | before unmount disk: 
Name
C:
I:
J:
S:

03:29:53 INFO | --------umount disk I
03:29:53 DEBUG| Sending command: mountvol I: /P
03:29:53 DEBUG| Sending command: echo %errorlevel%
03:29:53 DEBUG| Sending command: wmic logicaldisk where "drivetype=3" get name
03:29:53 INFO | after unmount disk: 
Name
C:
J:
S:

03:29:53 DEBUG| Sending command: wmic logicaldisk where "drivetype=3" get name
03:29:54 INFO | before unmount disk: 
Name
C:
J:
S:

03:29:54 INFO | --------umount disk J
03:29:54 DEBUG| Sending command: mountvol J: /P
03:29:54 DEBUG| Sending command: echo %errorlevel%
03:29:54 DEBUG| Sending command: wmic logicaldisk where "drivetype=3" get name
03:29:55 INFO | after unmount disk: 
Name
C:
S:

03:29:55 DEBUG| (monitor avocado-vt-vm1.qmpmonitor1) Sending command 'device_del'
03:29:55 DEBUG| Send command: {'execute': 'device_del', 'arguments': {'id': 'stg1'}, 'id': '6o6Nsy7C'}
03:29:58 DEBUG| STREAM b'IHDR' 16 13
03:29:58 DEBUG| STREAM b'IDAT' 41 1216
03:30:04 DEBUG| STREAM b'IHDR' 16 13
03:30:04 DEBUG| STREAM b'IDAT' 41 1216
03:30:06 DEBUG| (monitor avocado-vt-vm1.qmpmonitor1) Sending command 'device_del'
03:30:06 DEBUG| Send command: {'execute': 'device_del', 'arguments': {'id': 'stg0'}, 'id': 'r3UHrxkZ'}
03:30:37 ERROR| avocado.core.exceptions.TestFail: Failed to unplug device 'stg0'.Output:
03:30:37 ERROR| {}

Comment 13 John Ferlan 2019-04-02 13:17:17 UTC
NB: Since Gal got assigned bz 1678290, I've reassigned this and bz 1678311 to him in order to keep the 3 bugs "together" (similarly altered the priority as well).

Comment 15 Gal Hammer 2019-06-11 09:13:25 UTC
It seems that the two (seemingly unrelated?) commits 3d2fc923ecab and aff39be0ed97, both of which fix object reference counting, solve this issue.

Xueqiang, can you please verify it as well? Thanks.

Comment 16 Xueqiang Wei 2019-06-25 02:40:53 UTC
(In reply to Gal Hammer from comment #15)

According to https://bugzilla.redhat.com/show_bug.cgi?id=1678290#c1, I retested on upstream qemu with commits 3d2fc923ecab and aff39be0ed97 included, and still hit this issue.


Details:

Host:
kernel-4.18.0-85.el8.x86_64
QEMU emulator version 4.0.50 (upstream-474f3938d79ab36b9231c9ad3b5a9314c2aeacde)

It contains the two commits below:

commit 3d2fc923ecab576db1b398c565816e54b73f4a2d
Author: Philippe Mathieu-Daudé <philmd>
Date:   Tue May 7 18:34:03 2019 +0200

    hw/virtio: Use object_initialize_child for correct reference counting
    
    As explained in commit aff39be0ed97:
   ......

commit aff39be0ed9753c9c323f64a14f5533dd5c43525
Author: Thomas Huth <thuth>
Date:   Tue Apr 30 21:15:52 2019 +0200

    hw/pci-host: Use object_initialize_child for correct reference counting
    
    Both functions, object_initialize() and object_property_add_child() increase
    the reference counter of the new object, so one of the references has to be
    dropped afterwards to get the reference counting right. Otherwise the child
    object might not be properly cleaned up when the parent gets destroyed.
    Some functions of the pci-host devices miss to drop one of the references.
    Fix it by using object_initialize_child() instead, which takes care of
    calling object_initialize(), object_property_add_child() and object_unref()
    in the right order.
    
    Suggested-by: Eduardo Habkost <ehabkost>
    Message-Id: <20190430191552.4027-1-thuth>
    Reviewed-by: Philippe Mathieu-Daudé <philmd>
    Tested-by: Philippe Mathieu-Daudé <philmd>
    Signed-off-by: Thomas Huth <thuth>
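
As an aside, whether a given build's source already contains these fixes can be checked by commit ancestry against the hash in the version string above. A small illustrative sketch (the repository path is hypothetical):

    import subprocess

    def contains_commit(repo, fix_commit, build_commit):
        # `git merge-base --is-ancestor A B` exits 0 when A is an ancestor of B.
        result = subprocess.run(["git", "-C", repo, "merge-base",
                                 "--is-ancestor", fix_commit, build_commit])
        return result.returncode == 0

    build = "474f3938d79ab36b9231c9ad3b5a9314c2aeacde"  # from the version string above
    for fix in ("3d2fc923ecab", "aff39be0ed97"):
        print(fix, contains_commit("/path/to/qemu", fix, build))  # hypothetical path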


Guest:
win2019 with virtio-win-prewhql-0.1-171.iso



No "DEVICE_DELETED" event is received after executing "device_del".


logs:

2019-06-24 19:08:37: {"execute": "device_del", "arguments": {"id": "stg1"}, "id": "qXNhVE25"}
2019-06-24 19:08:37: {"return": {}, "id": "qXNhVE25"}
2019-06-24 19:08:38: {"execute": "human-monitor-command", "arguments": {"command-line": "info qtree"}, "id": "uEOnpZrQ"}


stg1 is still in qtree.

30 seconds later, stg1 still exists in qtree and no "DEVICE_DELETED" event has been received.

Comment 17 Gal Hammer 2019-06-25 08:54:57 UTC
Thanks for the testing. I'm not sure, but I think my setup doesn't include a scsi-hd device and I only tested the virtio-scsi PCI device unplug. I'll check again and update when I have more information.

Comment 18 Amnon Ilan 2019-06-27 08:16:04 UTC
Possibly related comment from Vadim here: https://bugzilla.redhat.com/show_bug.cgi?id=1706759#c13

Comment 19 Xueqiang Wei 2019-07-03 06:01:38 UTC
Changing the priority to high since this is a normal use case and is easy to reproduce.

Related comment from Markus here: https://bugzilla.redhat.com/show_bug.cgi?id=1678290#c9

