Bug 1436616
Summary: | usb-storage device under nec-usb-xhci is unusable after migration
---|---
Product: | Red Hat Enterprise Linux 7
Component: | qemu-kvm-rhev
Version: | 7.4
Hardware: | x86_64
OS: | Linux
Status: | CLOSED ERRATA
Severity: | high
Priority: | high
Reporter: | yduan
Assignee: | Dr. David Alan Gilbert <dgilbert>
QA Contact: | hachen <hachen>
CC: | chayang, coli, hachen, hhuang, jinzhao, jsnow, juzhang, kraxel, michen, mrezanin, peterx, quintela, qzhang, virt-maint, xfu, xianwang, xuma, yduan
Target Milestone: | rc
Keywords: | Regression
Fixed In Version: | qemu-kvm-rhev-2.9.0-1.el7
Doc Type: | If docs needed, set a value
Last Closed: | 2017-08-02 04:35:59 UTC
Type: | Bug
Bug Blocks: | 1376765
Description
yduan
2017-03-28 10:18:27 UTC
I tested with usb-ehci (pci-bridge -> usb-ehci -> usb-storage + migration); the usb-storage device works well after migration.

# rpm -q qemu-kvm-rhev
qemu-kvm-rhev-2.9.0-0.el7.mrezanin201703210848.x86_64

/usr/libexec/qemu-kvm \
-machine pc \
...
-device pci-bridge,id=bridge1,chassis_nr=1 \
-device usb-ehci,id=ehci1,bus=bridge1,addr=0x3 \
-drive file=/mnt/stg2.qcow2,format=qcow2,id=drive_usb,if=none,cache=none,aio=native,werror=stop,rerror=stop \
-device usb-storage,drive=drive_usb,id=device_usb,bus=ehci1.0 \
...

With usb-ehci (pci.0 -> usb-ehci -> usb-storage + migration), the usb-storage device works well after migration.

# rpm -q qemu-kvm-rhev
qemu-kvm-rhev-2.9.0-0.el7.mrezanin201703210848.x86_64

/usr/libexec/qemu-kvm \
-machine pc \
...
-device usb-ehci,id=ehci1 \
-drive file=/mnt/stg2.qcow2,format=qcow2,id=drive_usb,if=none,cache=none,aio=native,werror=stop,rerror=stop \
-device usb-storage,drive=drive_usb,id=device_usb,bus=ehci1.0 \
...

With nec-usb-xhci (pci.0 -> nec-usb-xhci -> usb-storage + migration), I hit the same problem as in Comment 0.

# rpm -q qemu-kvm-rhev
qemu-kvm-rhev-2.9.0-0.el7.mrezanin201703210848.x86_64

/usr/libexec/qemu-kvm \
-machine pc \
...
-device nec-usb-xhci,id=xhci1 \
-drive file=/mnt/stg2.qcow2,format=qcow2,id=drive_usb,if=none,cache=none,aio=native,werror=stop,rerror=stop \
-device usb-storage,drive=drive_usb,id=device_usb,bus=xhci1.0 \
...

With nec-usb-xhci (pci.0 -> nec-usb-xhci -> usb-storage + migration) on the older build, the usb-storage device works well after migration.

# rpm -q qemu-kvm-rhev
qemu-kvm-rhev-2.8.0-6.el7.x86_64

/usr/libexec/qemu-kvm \
-machine pc \
...
-device nec-usb-xhci,id=xhci1 \
-drive file=/mnt/stg2.qcow2,format=qcow2,id=drive_usb,if=none,cache=none,aio=native,werror=stop,rerror=stop \
-device usb-storage,drive=drive_usb,id=device_usb,bus=xhci1.0 \
...

According to Comment 6, Comment 7 and Comment 8, I think this is a regression in the nec-usb-xhci device.

Please confirm the full command line you were using to repeat it.
Full CMD (pci.0 -> nec-usb-xhci):

/usr/libexec/qemu-kvm \
-S \
-name 'RHEL7.4' \
-machine pc \
-m 8192 \
-smp 4,maxcpus=4,sockets=1,cores=4,threads=1 \
-cpu SandyBridge,enforce \
-rtc base=localtime,clock=host,driftfix=slew \
-nodefaults \
-device AC97 \
-vga qxl \
-device ich9-usb-ehci1,id=usb1,addr=1d.7,multifunction=on,bus=pci.0 \
-device ich9-usb-uhci1,id=usb1.0,multifunction=on,masterbus=usb1.0,addr=1d.0,firstport=0,bus=pci.0 \
-device ich9-usb-uhci2,id=usb1.1,multifunction=on,masterbus=usb1.0,addr=1d.2,firstport=2,bus=pci.0 \
-device ich9-usb-uhci3,id=usb1.2,multifunction=on,masterbus=usb1.0,addr=1d.4,firstport=4,bus=pci.0 \
-device usb-tablet,id=usb-tablet1 \
-boot menu=on \
-enable-kvm \
-monitor stdio \
-spice port=5900,disable-ticketing \
-qmp tcp:0:9999,server,nowait \
-netdev tap,id=netdev0,vhost=on \
-device virtio-net-pci,mac=BA:BC:13:83:4F:BD,id=net0,netdev=netdev0,status=on \
-drive file=/mnt/rhel74-64-virtio.qcow2,format=qcow2,id=drive_sysdisk,if=none,cache=none,aio=native,werror=stop,rerror=stop \
-device virtio-blk-pci,drive=drive_sysdisk,id=device_sysdisk,bootindex=0 \
-device nec-usb-xhci,id=xhci0 \
-drive file=/mnt/stg0.qcow2,format=qcow2,id=drive_usb,if=none,cache=none,aio=native,werror=stop,rerror=stop \
-device usb-storage,drive=drive_usb,id=device_usb,bus=xhci0.0 \

(In reply to Dr. David Alan Gilbert from comment #10)
> Please confirm the full command line you were using to repeat it.
Before migration:

# fdisk -l
Disk /dev/vda: 21.5 GB, 21474836480 bytes, 41943040 sectors
Units = sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disk label type: dos
Disk identifier: 0x000cdb05

Device Boot Start End Blocks Id System
/dev/vda1 * 2048 2099199 1048576 83 Linux
/dev/vda2 2099200 41943039 19921920 8e Linux LVM

Disk /dev/mapper/rhel_bootp--73--199--233-root: 18.2 GB, 18249416704 bytes, 35643392 sectors
Units = sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes

Disk /dev/mapper/rhel_bootp--73--199--233-swap: 2147 MB, 2147483648 bytes, 4194304 sectors
Units = sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes

Disk /dev/sda: 209 MB, 209715200 bytes, 409600 sectors
Units = sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes

After migration:

# fdisk -l
Disk /dev/vda: 21.5 GB, 21474836480 bytes, 41943040 sectors
Units = sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disk label type: dos
Disk identifier: 0x000cdb05

Device Boot Start End Blocks Id System
/dev/vda1 * 2048 2099199 1048576 83 Linux
/dev/vda2 2099200 41943039 19921920 8e Linux LVM

Disk /dev/mapper/rhel_bootp--73--199--233-root: 18.2 GB, 18249416704 bytes, 35643392 sectors
Units = sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes

Disk /dev/mapper/rhel_bootp--73--199--233-swap: 2147 MB, 2147483648 bytes, 4194304 sectors
Units = sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes

fdisk: cannot open /dev/sda: No such device or address

# dmesg | grep -i error
[ 349.634675] sd 2:0:0:0: Device offlined - not ready after error recovery
[ 349.635934] blk_update_request: I/O error, dev sda, sector 0
[ 349.636224] Buffer I/O error on dev sda, logical block 0, async page read

Confirmed. Using the command line above, it does fail for me.

Some other observations:

The destination prints:
host qemu-kvm: usb-msd: Bad signature 53425300

Inside the guest, the first warnings are:
xhci_hcd 0000:00:06.0: WARN Event TRB for slot 1 ep 0 with no TDs queued?
xhci_hcd 0000:00:06.0: WARN Event TRB for slot 1 ep 2 with no TDs queued?
xhci_hcd 0000:00:06.0: WARN Event TRB for slot 1 ep 3 with no TDs queued?

It also fails on upstream 2.9.0-rc2 and works on 2.8.0.

Git bisect says:

2.8.0 good
77620ba65ef32121de20848f9635c4afe233a1ce good
6fe791b5e3aca8a6de8a322e85e76d2f13338a7e good
5459ef3bff961bc462ac89460ab6b08a14624c8d good
a951316b8a5c3c63254f20a826afeed940dd4cba good
811ad5d8f1d3f35240043fe880d34dce6f2097a3 good
13e8ff7abbf1dde46280536ab4fae5012661b8b0 good, good
ddb603ab6c981c1d67cb42266fc700c33e5b2d8f bad, bad
96d87bdda3919bb16f754b3d3fd1227e1f38f13c bad
7d2c6c95511e42dffe2b263275e09957723d0ff4 bad
f073cd3a2bf1054135271b837c58a7da650dd84b bad
0b4384d0bb98f0016ba671b1c9cc75c2f31cd057 bad
2.9.0-rc2 bad

ddb603ab6c981c1d67cb42266fc700c33e5b2d8f is the first bad commit
commit ddb603ab6c981c1d67cb42266fc700c33e5b2d8f
Author: Gerd Hoffmann <kraxel>
Date: Mon Jan 30 16:36:46 2017 +0100

xhci: don't kick in xhci_submit and xhci_fire_ctl_transfer

xhci_submit and xhci_fire_ctl_transfer are is called from xhci_kick_epctx processing loop only, so there is no need to call xhci_kick_epctx make sure processing continues. Also recursive calls into xhci_kick_epctx can cause trouble.

Drop the xhci_kick_epctx calls.
Cc: 1653384.net
Fixes: 94b037f2a451b3dc855f9f2c346e5049a361bd55
Reported-by: Fabian Lesniak <fabian>
Signed-off-by: Gerd Hoffmann <kraxel>
Message-id: 1485790607-31399-4-git-send-email-kraxel

:040000 040000 50c47761469f82783864229a3d646a06ff02fae0 fe7e3cc6c0722962ad7013e1b656b2feb10ccab7 M hw

What's odd to me is that this doesn't seem to have anything to do with PCI bridges; indeed, the command line in c11 doesn't have one, and it fails for me. So I think this is just a plain old xhci bug.

Gerd posted the following fix upstream; we should get it during the rebase:
[PATCH] xhci: flush dequeue pointer to endpoint context

Gerd's patch is now merged upstream as 243afe858b95765b98d1 and will be in the -rc3 tag.

This bug is also hit on qemu 2.9-rc2 with "usb-bot".

version:
Host:
3.10.0-635.el7.x86_64
qemu-kvm-rhev-2.9.0-0.el7.patchwork201703291116.x86_64
Guest:
3.10.0-635.el7.x86_64

steps:
1) Boot a guest on the src host with a qemu cli that includes "usb-bot":

/usr/libexec/qemu-kvm \
-name 'vm1' \
-sandbox off \
-machine pc-i440fx-rhel7.4.0 \
-nodefaults \
-device virtio-serial-pci,id=virtio_serial_pci0,bus=pci.0,addr=04 \
-device nec-usb-xhci,id=usb1,bus=pci.0,addr=06 \
-device virtio-scsi-pci,id=virtio_scsi_pci0,bus=pci.0,addr=09 \
-drive id=drive_image1,if=none,snapshot=off,aio=threads,cache=unsafe,format=qcow2,file=/root/rhel74-64-virtio.qcow2 \
-device virtio-blk-pci,id=image1,drive=drive_image1,bus=pci.0,addr=05,bootindex=0 \
-device usb-bot,id=bot,bus=usb1.0,port=4 \
-drive file=/root/test.iso,if=none,id=usb-cdrom,format=raw \
-device scsi-cd,bus=bot.0,scsi-id=0,lun=1,drive=usb-cdrom,id=usb-cdrom \
-drive file=/root/r2.qcow2,id=usb-disk,if=none,format=qcow2 \
-device scsi-hd,bus=bot.0,scsi-id=0,lun=0,drive=usb-disk,id=usb-disk \
-device virtio-net-pci,mac=9a:4f:50:51:52:53,id=id9HRc5V,vectors=4,netdev=idjlQN53,bus=pci.0,addr=07 \
-netdev tap,id=idjlQN53,vhost=off,script=/etc/qemu-ifup,downscript=/etc/qemu-ifdown \
-m 4G \
-smp 4 \
-device usb-tablet,id=usb-tablet1,bus=usb1.0,port=1 \
-device usb-mouse,id=input1,bus=usb1.0,port=2 \
-device usb-kbd,id=input2,bus=usb1.0,port=3 \
-vnc :1 \
-qmp tcp:0:8881,server,nowait \
-vga std \
-monitor stdio \
-rtc base=localtime \
-cpu SandyBridge \
-boot order=cdn,once=c,menu=on,strict=off \
-enable-kvm \

2) Boot a guest on the dst host with the same qemu cli as src, appending "-incoming tcp:0:5801".

3) In the guest, check the scsi-hd and mount the cdrom to /mnt:

[root@dhcp187-168 ~]# fdisk -l
Disk /dev/vda: 21.5 GB, 21474836480 bytes, 41943040 sectors
Units = sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disk label type: dos
Disk identifier: 0x0002242a

Device Boot Start End Blocks Id System
/dev/vda1 * 2048 2099199 1048576 83 Linux
/dev/vda2 2099200 41943039 19921920 8e Linux LVM

Disk /dev/mapper/rhel_dhcp--9--141-root: 18.2 GB, 18249416704 bytes, 35643392 sectors
Units = sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes

Disk /dev/mapper/rhel_dhcp--9--141-swap: 2147 MB, 2147483648 bytes, 4194304 sectors
Units = sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes

Disk /dev/sda: 2147 MB, 2147483648 bytes, 4194304 sectors *********scsi-hd in cli
Units = sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes

[root@dhcp187-168 ~]# mount /dev/sr0 /mnt
mount: /dev/sr0 is write-protected, mounting read-only
[root@dhcp187-168 ~]# ls /mnt
addons EULA images LiveOS Packages repodata RPM-GPG-KEY-redhat-release
EFI GPL isolinux media.repo release-notes RPM-GPG-KEY-redhat-beta TRANS.TBL

4) Do the migration on the src host:
(qemu) migrate -d tcp:10.16.184.234:5801

5) Check the status of the migration and check the usb devices in the guest.
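The guest-side check in step 5 can be made mechanical by comparing the disks that `fdisk -l` lists before and after migration. The following is a minimal sketch, not part of the original test procedure: it assumes the two outputs have been captured as text, the function names are hypothetical, and the sample strings are abbreviated from the usb-bot guest output in this report (only the `Disk` lines are kept).

```python
import re

# Matches lines such as "Disk /dev/sda: 2147 MB, ..." and captures the device path.
# Lines like "Disk label type: dos" are skipped because they lack a /dev/ path.
DISK_LINE = re.compile(r"^Disk (/dev/[^:]+):", re.MULTILINE)


def disks_listed(fdisk_output):
    """Return the set of device paths that fdisk -l reported."""
    return set(DISK_LINE.findall(fdisk_output))


def missing_after_migration(before, after):
    """Return devices present before migration but absent afterwards."""
    return sorted(disks_listed(before) - disks_listed(after))


# Abbreviated sample output taken from the guest in this report.
BEFORE = """\
Disk /dev/vda: 21.5 GB, 21474836480 bytes, 41943040 sectors
Disk label type: dos
Disk /dev/mapper/rhel_dhcp--9--141-root: 18.2 GB, 18249416704 bytes, 35643392 sectors
Disk /dev/mapper/rhel_dhcp--9--141-swap: 2147 MB, 2147483648 bytes, 4194304 sectors
Disk /dev/sda: 2147 MB, 2147483648 bytes, 4194304 sectors
"""

AFTER = """\
Disk /dev/vda: 21.5 GB, 21474836480 bytes, 41943040 sectors
Disk label type: dos
Disk /dev/mapper/rhel_dhcp--9--141-root: 18.2 GB, 18249416704 bytes, 35643392 sectors
Disk /dev/mapper/rhel_dhcp--9--141-swap: 2147 MB, 2147483648 bytes, 4194304 sectors
"""

if __name__ == "__main__":
    print(missing_after_migration(BEFORE, AFTER))
```

An empty result would indicate that every disk survived the migration; with the abbreviated samples above, the scsi-hd behind usb-bot is the only device reported as missing.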
Actual result:

in src:
(qemu) info migrate
Migration status: completed
(qemu) info status
VM status: paused (postmigrate)

in dst:
(qemu) qemu-kvm: usb-msd: Bad signature 53425301
(qemu) info status
VM status: running

in guest:
[root@dhcp187-168 ~]# fdisk -l
Disk /dev/vda: 21.5 GB, 21474836480 bytes, 41943040 sectors
Units = sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disk label type: dos
Disk identifier: 0x0002242a

Device Boot Start End Blocks Id System
/dev/vda1 * 2048 2099199 1048576 83 Linux
/dev/vda2 2099200 41943039 19921920 8e Linux LVM

Disk /dev/mapper/rhel_dhcp--9--141-root: 18.2 GB, 18249416704 bytes, 35643392 sectors
Units = sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes

Disk /dev/mapper/rhel_dhcp--9--141-swap: 2147 MB, 2147483648 bytes, 4194304 sectors
Units = sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
*******************no scsi-hd in cli

[root@dhcp187-168 ~]# ls /mnt *******************scsi-cd works well
addons EULA images LiveOS Packages repodata RPM-GPG-KEY-redhat-release
EFI GPL isolinux media.repo release-notes RPM-GPG-KEY-redhat-beta TRANS.TBL

I.e., after migration, the scsi-hd connected to usb-bot can't be found in the guest, while the scsi-cd works well. I have uploaded the guest dmesg output from before and after migration as attachments.

Created attachment 1271062 [details]
guest_dmesg_aftermigrate
Created attachment 1271063 [details]
guest_dmesg_beforemigrate
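The two "usb-msd: Bad signature" values printed by the destination in the comments above (53425300 and 53425301) can be compared against the signatures defined by the USB Mass Storage Bulk-Only Transport specification: 0x43425355 ("USBC") for a Command Block Wrapper and 0x53425355 ("USBS") for a Command Status Wrapper. The sketch below is only an illustration. It assumes the value QEMU prints is the little-endian 32-bit signature field of the packet the device received, and the helper names are hypothetical.

```python
import struct

# Signatures from the USB Mass Storage Bulk-Only Transport specification.
CBW_SIGNATURE = 0x43425355  # dCBWSignature, bytes "USBC" on the wire
CSW_SIGNATURE = 0x53425355  # dCSWSignature, bytes "USBS" on the wire


def wire_bytes(value):
    """Return the four bytes of a little-endian 32-bit field as sent on the bus."""
    return struct.pack("<I", value)


def describe(value):
    """Classify a signature value (assumption: it is the printed le32 field)."""
    if value == CBW_SIGNATURE:
        kind = "valid CBW signature"
    elif value == CSW_SIGNATURE:
        kind = "CSW signature (a status wrapper, not a command)"
    else:
        kind = "neither CBW nor CSW signature"
    return "0x%08x -> %r: %s" % (value, wire_bytes(value), kind)


if __name__ == "__main__":
    # Values reported by the destination QEMU in this bug.
    for reported in (0x53425300, 0x53425301):
        print(describe(reported))
    print(describe(CBW_SIGNATURE))
```

Under that assumption, neither reported value is a valid command signature, which fits the picture of the device being handed data that is not a fresh command after migration. Both values share their upper three bytes with the CSW signature, but this report does not establish what that data actually was.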
# fdisk -l
Disk /dev/vda: 21.5 GB, 21474836480 bytes, 41943040 sectors
Units = sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disk label type: dos
Disk identifier: 0x000365fb

Device Boot Start End Blocks Id System
/dev/vda1 * 2048 2099199 1048576 83 Linux
/dev/vda2 2099200 41943039 19921920 8e Linux LVM

Disk /dev/sda: 5368 MB, 5368709120 bytes, 10485760 sectors
Units = sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes

Disk /dev/mapper/rhel_bootp--73--199--6-root: 18.2 GB, 18249416704 bytes, 35643392 sectors
Units = sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes

Disk /dev/mapper/rhel_bootp--73--199--6-swap: 2147 MB, 2147483648 bytes, 4194304 sectors
Units = sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes

Disk /dev/sda appears in migrated guest. verified.

Test using https://brewweb.engineering.redhat.com/brew/taskinfo?taskID=13377285 with nec-usb-xhci and usb stg, usb stg is available after migration. It doesn't reintroduce this bz.

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2017:2392