Bug 1607891 - Hotplug events are sometimes lost with virtio-scsi + iothread
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Enterprise Linux 7
Classification: Red Hat
Component: qemu-kvm-rhev
Version: 7.6
Hardware: Unspecified
OS: Unspecified
Priority: high
Severity: high
Target Milestone: rc
Target Release: ---
Assignee: Stefan Hajnoczi
QA Contact: Xueqiang Wei
URL:
Whiteboard:
Depends On:
Blocks:
 
Reported: 2018-07-24 13:49 UTC by Stefan Hajnoczi
Modified: 2018-11-01 11:15 UTC (History)

Fixed In Version: qemu-kvm-rhev-2.12.0-9.el7
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2018-11-01 11:13:00 UTC
Target Upstream Version:




Links
System                  ID              Priority  Status  Summary  Last Updated
Red Hat Product Errata  RHBA-2018:3443  None      None    None     2018-11-01 11:15:19 UTC

Description Stefan Hajnoczi 2018-07-24 13:49:31 UTC
Description of problem:

There is a race condition in SCSI device hotplug with virtio-scsi when an iothread is used.  The guest may receive the hotplug event and send a command to the new LUN before hotplug has completely finished.  Towards the end of hotplug, the SCSI device is reset and pending commands are cancelled.  An INQUIRY command sent in that window can therefore be lost, in which case the guest fails to detect the new SCSI device.
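
The ordering problem can be illustrated with a toy model.  This is only a sketch of the sequence described above, not QEMU code: the ToyLun class and the hotplug() function are hypothetical names invented for illustration, and the real race depends on thread timing rather than on a flag.

```python
from dataclasses import dataclass, field


@dataclass
class ToyLun:
    """Toy stand-in for a hotplugged SCSI LUN (hypothetical, not QEMU code)."""
    pending: list = field(default_factory=list)
    completed: list = field(default_factory=list)
    cancelled: list = field(default_factory=list)

    def submit(self, cmd):
        # The guest queues a command on the LUN.
        self.pending.append(cmd)

    def reset(self):
        # A device reset cancels every command that is still pending.
        self.cancelled.extend(self.pending)
        self.pending.clear()

    def process(self):
        # Remaining pending commands complete normally.
        self.completed.extend(self.pending)
        self.pending.clear()


def hotplug(guest_is_fast):
    """Model one hotplug; guest_is_fast means INQUIRY arrives before the reset."""
    lun = ToyLun()
    # The hotplug event is already visible to the guest at this point.
    if guest_is_fast:
        lun.submit("INQUIRY")
    # End of hotplug: the device is reset and pending commands are cancelled.
    lun.reset()
    if not guest_is_fast:
        lun.submit("INQUIRY")
    lun.process()
    return lun


racy = hotplug(guest_is_fast=True)
safe = hotplug(guest_is_fast=False)
print("racy:", racy.completed, racy.cancelled)
print("safe:", safe.completed, safe.cancelled)
```

In the racy ordering the INQUIRY ends up in the cancelled list and never completes, which corresponds to the guest missing the new device.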

Version-Release number of selected component (if applicable):
qemu-kvm-rhev-2.12.0-7.el7

How reproducible:
Non-deterministic.  This bug may take multiple attempts to reproduce.

Steps to Reproduce:
I debugged this using printfs when the bug was reported on the qemu-devel mailing list and have not tried to reproduce it myself.  I hope these instructions work for you, otherwise it may be necessary to verify via code inspection.

1. Start a Linux guest with -object iothread -device virtio-scsi-pci,iothread=iothread0
2. Hotplug a disk using drive_add + device_add scsi-hd (or equivalent QMP commands)
3. If the disk appears inside the guest (check /dev/sdX), go to Step 2 again
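
The guest-side check in Step 3 could be scripted roughly as follows.  This is a minimal sketch to run inside the guest, assuming the new disk shows up as an sd* entry under /sys/block; the function names, the 10 second timeout and the polling interval are assumptions, not values from this report.

```python
import os
import time


def list_scsi_disks(sys_block="/sys/block"):
    """Return the set of sd* block device names currently visible."""
    try:
        names = os.listdir(sys_block)
    except FileNotFoundError:
        return set()
    return {name for name in names if name.startswith("sd")}


def wait_for_new_disk(before, timeout=10.0, interval=0.5, lister=list_scsi_disks):
    """Poll until a disk that is not in `before` appears.

    Returns the set of new disk names, or an empty set on timeout.
    The timeout and interval defaults are arbitrary assumptions.
    """
    deadline = time.monotonic() + timeout
    while True:
        added = lister() - before
        if added:
            return added
        if time.monotonic() >= deadline:
            return set()
        time.sleep(interval)


# Usage inside the guest: take a snapshot, hotplug on the host, then wait.
# before = list_scsi_disks()
# ... hotplug the disk from the host monitor ...
# if not wait_for_new_disk(before):
#     print("hotplug event appears to have been lost")
```

An empty result after the timeout only suggests a lost hotplug event; a slow guest could also take longer than the chosen timeout to scan the new LUN.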

Actual results:
Sometimes the guest does not detect the new SCSI device.

Expected results:
The guest should always detect new SCSI devices.

Comment 4 Miroslav Rezanina 2018-08-01 17:18:30 UTC
Fix included in qemu-kvm-rhev-2.12.0-9.el7

Comment 6 Xueqiang Wei 2018-08-02 05:43:10 UTC
Tried 2000 times and did not hit this issue.

Details below:

Host:
kernel-3.10.0-918.el7.x86_64
qemu-kvm-rhev-2.12.0-7.el7
Guest:
kernel-3.10.0-918.el7.x86_64


1. Boot the guest with "-object iothread,id=iothread0"
/usr/libexec/qemu-kvm \
    -S  \
    -name 'avocado-vt-vm1'  \
    -sandbox off  \
    -machine pc  \
    -nodefaults \
    -device VGA,bus=pci.0,addr=0x2  \
    -chardev socket,id=qmp_id_qmpmonitor1,path=/var/tmp/avocado_auaALh/monitor-qmpmonitor1-20180801-085320-iplJpwMp,server,nowait \
    -mon chardev=qmp_id_qmpmonitor1,mode=control  \
    -chardev socket,id=qmp_id_catch_monitor,path=/var/tmp/avocado_auaALh/monitor-catch_monitor-20180801-085320-iplJpwMp,server,nowait \
    -mon chardev=qmp_id_catch_monitor,mode=control \
    -device pvpanic,ioport=0x505,id=idgWxBuj  \
    -chardev socket,id=serial_id_serial0,path=/var/tmp/avocado_auaALh/serial-serial0-20180801-085320-iplJpwMp,server,nowait \
    -device isa-serial,chardev=serial_id_serial0  \
    -chardev socket,id=seabioslog_id_20180801-085320-iplJpwMp,path=/var/tmp/avocado_auaALh/seabios-20180801-085320-iplJpwMp,server,nowait \
    -device isa-debugcon,chardev=seabioslog_id_20180801-085320-iplJpwMp,iobase=0x402 \
    -device nec-usb-xhci,id=usb1,bus=pci.0,addr=0x3 \
    -object iothread,id=iothread0 \
    -drive id=drive_image1,if=none,snapshot=off,aio=threads,cache=none,format=qcow2,file=/home/kvm_autotest_root/images/rhel76-64-virtio.qcow2 \
    -device virtio-blk-pci,id=image1,drive=drive_image1,bootindex=0,bus=pci.0,addr=0x4 \
    -device virtio-net-pci,mac=9a:34:35:36:37:38,id=ideUbf8k,vectors=4,netdev=idcvx3yp,bus=pci.0,addr=0x5  \
    -netdev tap,id=idcvx3yp,vhost=on,vhostfd=19,fd=10 \
    -m 4096  \
    -smp 2,maxcpus=2,cores=1,threads=1,sockets=2  \
    -cpu 'Nehalem',+kvm_pv_unhalt \
    -device usb-tablet,id=usb-tablet1,bus=usb1.0,port=1  \
    -vnc :0  \
    -rtc base=utc,clock=host,driftfix=slew  \
    -boot menu=off,strict=off,order=cdn,once=c \
    -enable-kvm

2. Hotplug a disk

{"execute":"qmp_capabilities"}

{"execute":"device_add","arguments":{"driver":"virtio-scsi-pci","id":"virtio_scsi_pci0","iothread":"iothread0"}}

{"execute":"__com.redhat_drive_add", "arguments":
{"file":"/home/kvm_autotest_root/images/storage0.raw","format":"raw","id":"drive_stg0","snapshot":"off","aio":"threads","cache":"none"}}

{"execute":"device_add","arguments":{"driver":"scsi-hd","drive":"drive_stg0","id":"stg0"}}

3. Run a dd test on the newly added disk

# dd if=/dev/sda of=/dev/null bs=1k count=1000 iflag=direct && dd if=/dev/zero of=/dev/sda bs=1k count=1000 oflag=direct

4. Unplug the disk

{"execute": "device_del", "arguments": {"id": "stg0"}}
{"execute": "device_del", "arguments": {"id": "virtio_scsi_pci0"}}

5. Repeat steps 2-4 2000 times

After step 5, the issue was not hit.
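
The loop over steps 2-4 could be driven from the host along these lines.  This is a hedged sketch, not the test harness actually used: the QMP commands are copied from steps 2 and 4, while QmpClient, run_cycles and the guest_check callback (standing in for the dd test in step 3) are hypothetical names.  It ignores asynchronous QMP events; since device_del completes asynchronously, a real run would likely need to wait for the DEVICE_DELETED event before starting the next cycle.

```python
import json
import socket

# Commands copied from steps 2 and 4 of this comment.
PLUG = [
    {"execute": "device_add", "arguments": {
        "driver": "virtio-scsi-pci", "id": "virtio_scsi_pci0",
        "iothread": "iothread0"}},
    {"execute": "__com.redhat_drive_add", "arguments": {
        "file": "/home/kvm_autotest_root/images/storage0.raw",
        "format": "raw", "id": "drive_stg0", "snapshot": "off",
        "aio": "threads", "cache": "none"}},
    {"execute": "device_add", "arguments": {
        "driver": "scsi-hd", "drive": "drive_stg0", "id": "stg0"}},
]
UNPLUG = [
    {"execute": "device_del", "arguments": {"id": "stg0"}},
    {"execute": "device_del", "arguments": {"id": "virtio_scsi_pci0"}},
]


class QmpClient:
    """Minimal QMP client over a UNIX socket (hypothetical helper)."""

    def __init__(self, path):
        self.sock = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
        self.sock.connect(path)
        self.reader = self.sock.makefile("r")
        json.loads(self.reader.readline())  # consume the QMP greeting
        self.execute({"execute": "qmp_capabilities"})

    def execute(self, cmd):
        """Send one command and return its reply, skipping async events."""
        self.sock.sendall(json.dumps(cmd).encode() + b"\n")
        while True:
            line = self.reader.readline()
            if not line:
                raise ConnectionError("QMP socket closed")
            msg = json.loads(line)
            if "return" in msg or "error" in msg:
                return msg


def run_cycles(execute, guest_check, cycles):
    """Repeat plug, guest check, unplug; return the first failing cycle or None."""
    for i in range(cycles):
        for cmd in PLUG:
            if "error" in execute(cmd):
                return i
        if not guest_check():  # e.g. the dd test from step 3
            return i
        for cmd in UNPLUG:
            if "error" in execute(cmd):
                return i
    return None


# Usage (the socket path is whatever was passed to -chardev for the monitor):
# client = QmpClient("/path/to/qmp-monitor.sock")
# failed = run_cycles(client.execute, guest_check=lambda: True, cycles=2000)
```

Passing the execute function in as a parameter keeps the loop independent of the transport, so it can be exercised with a stub before pointing it at a real monitor socket.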


Are these test steps OK for reproducing the bug?
Is it necessary to use virtio-scsi for the system disk? In my steps I used virtio-blk for the system disk.

Comment 7 Xueqiang Wei 2018-08-06 02:33:37 UTC
Using the same environment as in Comment 6, I tried 5000 times with the steps below and did not hit this issue.


1. Start a Linux guest with -object iothread -device virtio-scsi-pci,iothread=iothread0
2. Hotplug a disk using drive_add + device_add scsi-hd (or equivalent QMP commands)
3. If the disk appears inside the guest (check /dev/sdX), go to Step 2 again

Comment 10 Xueqiang Wei 2018-08-20 01:53:43 UTC
Tried 10000 times with the steps below and did not hit this issue, so the bug is verified.

Version:
kernel-3.10.0-933.el7.x86_64
qemu-kvm-rhev-2.12.0-10.el7

Steps:
1. Start a Linux guest with -object iothread -device virtio-scsi-pci,iothread=iothread0
2. Hotplug a disk using drive_add + device_add scsi-hd (or equivalent QMP commands)
3. If the disk appears inside the guest (check /dev/sdX), go to Step 2 again

Comment 11 errata-xmlrpc 2018-11-01 11:13:00 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2018:3443

