Bug 1607891

Summary: Hotplug events are sometimes lost with virtio-scsi + iothread
Product: Red Hat Enterprise Linux 7
Reporter: Stefan Hajnoczi <stefanha>
Component: qemu-kvm-rhev
Assignee: Stefan Hajnoczi <stefanha>
Status: CLOSED ERRATA
QA Contact: Xueqiang Wei <xuwei>
Severity: high
Docs Contact:
Priority: high
Version: 7.6
CC: aliang, areis, chayang, coli, juzhang, michen, mrezanin, ngu, pbonzini, phou, qzhang, stefanha, virt-maint, xuwei, yhong
Target Milestone: rc
Target Release: ---
Hardware: Unspecified
OS: Unspecified
Whiteboard:
Fixed In Version: qemu-kvm-rhev-2.12.0-9.el7
Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2018-11-01 11:13:00 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:

Description Stefan Hajnoczi 2018-07-24 13:49:31 UTC
Description of problem:

There is a race condition in SCSI device hotplug with virtio-scsi when an iothread is used.  The guest may receive the hotplug event and send a command to the new LUN before hotplug has completely finished.  Towards the end of hotplug the SCSI device is reset and pending commands are cancelled.  An INQUIRY command sent in that window can therefore be lost, and the guest then fails to detect the new SCSI device.
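
The ordering problem described above can be illustrated with a small toy model. This is only a sketch: `ToyLun` and `hotplug` are hypothetical names, nothing here is QEMU code, and no real threads are involved. The second ordering is shown purely for contrast and is not a description of how the bug was actually fixed.

```python
from dataclasses import dataclass, field


@dataclass
class ToyLun:
    """Toy stand-in for a hotplugged SCSI LUN (hypothetical, not QEMU code)."""
    pending: list = field(default_factory=list)
    completed: list = field(default_factory=list)
    cancelled: list = field(default_factory=list)

    def submit(self, command):
        # The guest queues a command on the LUN.
        self.pending.append(command)

    def reset(self):
        # A device reset cancels everything that is still in flight.
        self.cancelled.extend(self.pending)
        self.pending.clear()

    def process(self):
        # Whatever survived until now is completed normally.
        self.completed.extend(self.pending)
        self.pending.clear()


def hotplug(notify_before_reset):
    """Run one toy hotplug and return the LUN for inspection."""
    lun = ToyLun()

    def guest_sees_event():
        # The guest reacts to the hotplug event by probing the new LUN.
        lun.submit("INQUIRY")

    if notify_before_reset:
        guest_sees_event()  # guest wins the race: INQUIRY is queued early
        lun.reset()         # end of hotplug: pending commands are cancelled
    else:
        lun.reset()         # hotplug finishes first
        guest_sees_event()  # INQUIRY arrives on a fully plugged device
    lun.process()
    return lun


racy = hotplug(notify_before_reset=True)
safe = hotplug(notify_before_reset=False)
print("racy:", racy.cancelled, racy.completed)
print("safe:", safe.cancelled, safe.completed)
```

In the racy ordering the INQUIRY ends up in `cancelled` and never completes, which corresponds to the guest not detecting the disk.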

Version-Release number of selected component (if applicable):
qemu-kvm-rhev-2.12.0-7.el7

How reproducible:
Non-deterministic.  This bug may take multiple attempts to reproduce.

Steps to Reproduce:
I debugged this using printfs when the bug was reported on the qemu-devel mailing list and have not tried to reproduce it myself.  I hope these instructions work for you, otherwise it may be necessary to verify via code inspection.

1. Start a Linux guest with -object iothread,id=iothread0 -device virtio-scsi-pci,iothread=iothread0
2. Hotplug a disk using drive_add + device_add scsi-hd (or equivalent QMP commands)
3. If the disk appears inside the guest (check /dev/sdX), go to Step 2 again
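
The check in Step 3 could be automated inside the guest along the following lines. This is a minimal sketch under stated assumptions: the helper names `scsi_disks` and `wait_for_new_disk` are made up for illustration, and the 10 second timeout is an arbitrary choice, not a value taken from this report.

```python
import re
import time

# Whole-disk SCSI names such as sda, sdb, ..., sdaa (no partition suffix).
SCSI_DISK = re.compile(r"^sd[a-z]+$")


def scsi_disks(names):
    """Return the set of whole-disk SCSI names found in `names`."""
    return {n for n in names if SCSI_DISK.match(n)}


def wait_for_new_disk(list_block_devices, before, timeout=10.0, interval=0.5):
    """Poll until a SCSI disk that is not in `before` appears.

    `list_block_devices` is any callable returning block device names.
    An empty result means no new disk showed up within `timeout` seconds,
    which is the symptom described in this bug.
    """
    deadline = time.monotonic() + timeout
    while True:
        new = scsi_disks(list_block_devices()) - before
        if new or time.monotonic() >= deadline:
            return new
        time.sleep(interval)
```

In a Linux guest, `lambda: os.listdir("/sys/block")` is one possible `list_block_devices` callable: take a baseline with `scsi_disks(os.listdir("/sys/block"))` before the hotplug, then call `wait_for_new_disk` after it.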

Actual results:
Sometimes the guest does not detect the new SCSI device.

Expected results:
The guest should always detect new SCSI devices.

Comment 4 Miroslav Rezanina 2018-08-01 17:18:30 UTC
Fix included in qemu-kvm-rhev-2.12.0-9.el7

Comment 6 Xueqiang Wei 2018-08-02 05:43:10 UTC
Tried 2000 times and did not hit this issue.

Details below:

Host:
kernel-3.10.0-918.el7.x86_64
qemu-kvm-rhev-2.12.0-7.el7
Guest:
kernel-3.10.0-918.el7.x86_64


1. Boot the guest with "-object iothread,id=iothread0"
/usr/libexec/qemu-kvm \
    -S  \
    -name 'avocado-vt-vm1'  \
    -sandbox off  \
    -machine pc  \
    -nodefaults \
    -device VGA,bus=pci.0,addr=0x2  \
    -chardev socket,id=qmp_id_qmpmonitor1,path=/var/tmp/avocado_auaALh/monitor-qmpmonitor1-20180801-085320-iplJpwMp,server,nowait \
    -mon chardev=qmp_id_qmpmonitor1,mode=control  \
    -chardev socket,id=qmp_id_catch_monitor,path=/var/tmp/avocado_auaALh/monitor-catch_monitor-20180801-085320-iplJpwMp,server,nowait \
    -mon chardev=qmp_id_catch_monitor,mode=control \
    -device pvpanic,ioport=0x505,id=idgWxBuj  \
    -chardev socket,id=serial_id_serial0,path=/var/tmp/avocado_auaALh/serial-serial0-20180801-085320-iplJpwMp,server,nowait \
    -device isa-serial,chardev=serial_id_serial0  \
    -chardev socket,id=seabioslog_id_20180801-085320-iplJpwMp,path=/var/tmp/avocado_auaALh/seabios-20180801-085320-iplJpwMp,server,nowait \
    -device isa-debugcon,chardev=seabioslog_id_20180801-085320-iplJpwMp,iobase=0x402 \
    -device nec-usb-xhci,id=usb1,bus=pci.0,addr=0x3 \
    -object iothread,id=iothread0 \
    -drive id=drive_image1,if=none,snapshot=off,aio=threads,cache=none,format=qcow2,file=/home/kvm_autotest_root/images/rhel76-64-virtio.qcow2 \
    -device virtio-blk-pci,id=image1,drive=drive_image1,bootindex=0,bus=pci.0,addr=0x4 \
    -device virtio-net-pci,mac=9a:34:35:36:37:38,id=ideUbf8k,vectors=4,netdev=idcvx3yp,bus=pci.0,addr=0x5  \
    -netdev tap,id=idcvx3yp,vhost=on,vhostfd=19,fd=10 \
    -m 4096  \
    -smp 2,maxcpus=2,cores=1,threads=1,sockets=2  \
    -cpu 'Nehalem',+kvm_pv_unhalt \
    -device usb-tablet,id=usb-tablet1,bus=usb1.0,port=1  \
    -vnc :0  \
    -rtc base=utc,clock=host,driftfix=slew  \
    -boot menu=off,strict=off,order=cdn,once=c \
    -enable-kvm

2. Hotplug a disk

{"execute":"qmp_capabilities"}

{"execute":"device_add","arguments":{"driver":"virtio-scsi-pci","id":"virtio_scsi_pci0","iothread":"iothread0"}}

{"execute":"__com.redhat_drive_add", "arguments":
{"file":"/home/kvm_autotest_root/images/storage0.raw","format":"raw","id":"drive_stg0","snapshot":"off","aio":"threads","cache":"none"}}

{"execute":"device_add","arguments":{"driver":"scsi-hd","drive":"drive_stg0","id":"stg0"}}

3. Run a dd test on the newly added disk

# dd if=/dev/sda of=/dev/null bs=1k count=1000 iflag=direct && dd if=/dev/zero of=/dev/sda bs=1k count=1000 oflag=direct

4. Unplug the disk

{"execute": "device_del", "arguments": {"id": "stg0"}}
{"execute": "device_del", "arguments": {"id": "virtio_scsi_pci0"}}

5. Repeat steps 2-4 2000 times

After step 5, the issue was not hit.
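
Steps 2 and 4 above could be driven from a script instead of by hand. The sketch below assumes QMP messages arrive as one JSON object per line (the non-pretty monitor format) and works on any pair of text file objects; `qmp_execute` and `wait_for_event` are hypothetical helper names, not part of QEMU or any QMP library. Because device_del only requests removal and QEMU signals completion with a DEVICE_DELETED event, the sketch collects events rather than discarding them.

```python
import json


def qmp_execute(rfile, wfile, command, arguments=None):
    """Send one QMP command and return (reply, events_seen_meanwhile).

    Asynchronous events carry an "event" key and may arrive before the
    reply, so they are collected instead of being mistaken for it.
    """
    message = {"execute": command}
    if arguments is not None:
        message["arguments"] = arguments
    wfile.write(json.dumps(message) + "\n")
    wfile.flush()

    events = []
    while True:
        line = rfile.readline()
        if not line:
            raise ConnectionError("QMP connection closed")
        reply = json.loads(line)
        if "event" in reply:
            events.append(reply)
        elif "return" in reply or "error" in reply:
            return reply, events
        # Anything else (for example the initial greeting) is skipped.


def wait_for_event(rfile, name, already_seen=()):
    """Return the first event called `name`, checking `already_seen` first."""
    for event in already_seen:
        if event.get("event") == name:
            return event
    while True:
        line = rfile.readline()
        if not line:
            raise ConnectionError("QMP connection closed")
        message = json.loads(line)
        if message.get("event") == name:
            return message
```

With a real monitor, one option is to connect a UNIX socket to the QMP chardev path and pass `sock.makefile("rw")` as both `rfile` and `wfile`, sending qmp_capabilities first. Waiting for DEVICE_DELETED for stg0 before deleting virtio_scsi_pci0 keeps the unplug sequence in step 4 ordered.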


Are these test steps OK for reproducing the bug?
Is it necessary to use virtio-scsi for the system disk? In my steps I used virtio-blk for the system disk.

Comment 7 Xueqiang Wei 2018-08-06 02:33:37 UTC
Using the same environment as in Comment 6, I tried 5000 times with the steps below and did not hit this issue.


1. Start a Linux guest with -object iothread,id=iothread0 -device virtio-scsi-pci,iothread=iothread0
2. Hotplug a disk using drive_add + device_add scsi-hd (or equivalent QMP commands)
3. If the disk appears inside the guest (check /dev/sdX), go to Step 2 again

Comment 10 Xueqiang Wei 2018-08-20 01:53:43 UTC
Tried 10000 times with the steps below and did not hit this issue, so marking the bug as verified.

Version:
kernel-3.10.0-933.el7.x86_64
qemu-kvm-rhev-2.12.0-10.el7

Steps:
1. Start a Linux guest with -object iothread,id=iothread0 -device virtio-scsi-pci,iothread=iothread0
2. Hotplug a disk using drive_add + device_add scsi-hd (or equivalent QMP commands)
3. If the disk appears inside the guest (check /dev/sdX), go to Step 2 again

Comment 11 errata-xmlrpc 2018-11-01 11:13:00 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2018:3443