Bug 1653594 - fix ehci fetchqtd race
Summary: fix ehci fetchqtd race
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Enterprise Linux Advanced Virtualization
Classification: Red Hat
Component: qemu-kvm
Version: 8.0
Hardware: All
OS: Linux
unspecified
medium
Target Milestone: rc
: 8.0
Assignee: Gerd Hoffmann
QA Contact: Virtualization Bugs
URL:
Whiteboard:
Depends On:
Blocks: 1624641
TreeView+ depends on / blocked
 
Reported: 2018-11-27 08:19 UTC by Gerd Hoffmann
Modified: 2019-11-06 07:12 UTC (History)
10 users (show)

Fixed In Version: qemu-kvm-4.1.0-1.module+el8.1.0+3966+4a23dca1
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2019-11-06 07:12:13 UTC
Type: Bug
Target Upstream Version:
Embargoed:


Attachments (Terms of Use)


Links
System ID Private Priority Status Summary Last Updated
IBM Linux Technology Center 175462 0 None None None 2019-06-13 02:05:04 UTC
Red Hat Product Errata RHBA-2019:3723 0 None None None 2019-11-06 07:12:37 UTC

Description Gerd Hoffmann 2018-11-27 08:19:01 UTC
Description of problem:
https://patchwork.ozlabs.org/patch/1003105/

Comment 2 Ademar Reis 2018-12-21 22:16:51 UTC
(In reply to Gerd Hoffmann from comment #0)
> Description of problem:
> https://patchwork.ozlabs.org/patch/1003105/

Merged upstream, but not in 3.1. Doesn't seem to be a high priority for RHEL7, so I'm deferring it to RHEL8-fast. If we want this in RHEL8.0, then it needs to be backported.

Gerd, how to test the fix and what's the impact?


--------

commit b7d3a7e1a8830af78e71952e82f186b12b70ff1f
Author: Gerd Hoffmann <kraxel>
Date:   Mon Nov 26 11:08:36 2018 +0100

    ehci: fix fetch qtd race
    
    The token field contains the (guest-filled) state of the qtd, which
    indicates whenever the other fields are valid or not.  So make sure
    we read the token first, otherwise we may end up with an stale next
    pointer:
    
      (1) ehci reads next
      (2) guest writes next
      (3) guest writes token
      (4) ehci reads token
      (5) ehci operates with stale next.
    
    Typical effect is that qemu doesn't notice that the guest appends new
    qtds to the end of the queue.  Looks like the usb device stopped
    responding.  Linux can recover from that, but leaves a message in the
    kernel log that it did reset the usb device in question.
    
    Signed-off-by: Gerd Hoffmann <kraxel>
    Message-id: 20181126100836.8805-1-kraxel

Comment 3 Gerd Hoffmann 2019-01-07 11:07:36 UTC
> Merged upstream, but not in 3.1. Doesn't seem to be a high priority for
> RHEL7, so I'm deferring it to RHEL8-fast. If we want this in RHEL8.0, then
> it needs to be backported.
> 
> Gerd, how to test the fix and what's the impact?

usb-storage stress testing can trigger this.

>     responding.  Linux can recover from that, but leaves a message in the
>     kernel log that it did reset the usb device in question.

On newer (guest) kernels the only noticable effect is the kernel message and rather slow block i/o (the reset is triggered by a timeout).

On older (guest) kernels the error handling doesn't work that well and you might see filesystems throwing errors.

Comment 4 IBM Bug Proxy 2019-02-13 14:50:27 UTC
------- Comment From hannsj_uhl.com 2019-02-13 09:46 EDT-------
.

Comment 5 Gerd Hoffmann 2019-08-14 09:21:09 UTC
Fixed by 4.0 rebase.

Comment 7 Michael 2019-08-28 09:24:50 UTC
Hi Gerd:

How does QE to verify this Bug? Sanity Test using ehci? Or we need more specific operation? 


Thanks for your help. again.

Comment 8 Gerd Hoffmann 2019-08-28 11:00:50 UTC
(In reply to Michael from comment #7)
> Hi Gerd:
> 
> How does QE to verify this Bug?

See comment 3.

Isn't there a usb-storage stress test in autotest?
If so please run than, with usb-storage connected to ehci.

> Sanity Test using ehci?

Yes.

Comment 9 Michael 2019-09-04 02:48:34 UTC
Hi Gerd:

Now I try to verify this Bug. First of all, I try to reproduce this Bug. According comment 8, I used the previous version and I did the stress test on the usb storage. However, every thing was fine and I did not find error or failure. Please see my step as follow:

[1] Boot a Linux guest with usb-storage. Used ehci controller. 

/usr/libexec/qemu-kvm -M q35 -cpu EPYC -enable-kvm -m 4G -smp 4 -nodefaults \
-device pcie-root-port,id=pcie.0-root-port-2,slot=2,chassis=2,addr=0x2,bus=pcie.0 \
-device pcie-root-port,id=pcie.0-root-port-3,slot=3,chassis=3,addr=0x3,bus=pcie.0 \
-device pcie-root-port,id=pcie.0-root-port-4,slot=4,chassis=4,addr=0x4,bus=pcie.0 \
-device pcie-root-port,id=pcie.0-root-port-5,slot=5,chassis=5,addr=0x5,bus=pcie.0 \
... ...
-qmp tcp::8888,server,nowait \
-device virtio-serial-pci,id=virtio-serial0,bus=pcie.0-root-port-5,addr=0x0 \
-chardev spicevmc,id=charchannel1,name=vdagent -device virtserialport,bus=virtio-serial0.0,nr=2,chardev=charchannel1,id=channel1,name=com.redhat.spice.0 \
-spice port=5901,addr=0.0.0.0,disable-ticketing,image-compression=off,seamless-migration=on \
-device ich9-usb-ehci1,id=ehci,bus=pcie.0-root-port-4,addr=0x0 \                   <--- USB Controller
-drive file=data.qcow2,if=none,id=usb-ehci,format=qcow2,werror=stop,rerror=stop \  <--- USB Storage 
-device usb-storage,bus=ehci.0,drive=usb-ehci \                                    <--- USB Storage


#lsblk (inside the guest)
NAME                              MAJ:MIN RM  SIZE RO TYPE MOUNTPOINT
sda                                 8:0    0   20G  0 disk 
├─sda1                              8:1    0    1G  0 part /boot
└─sda2                              8:2    0   19G  0 part 
  ├─rhel_bootp--73--131--246-root 253:0    0   17G  0 lvm  /
  └─rhel_bootp--73--131--246-swap 253:1    0    2G  0 lvm  [SWAP]
sdb                                 8:16   0  150G  0 disk 


#lsusb    (inside the guest)
Bus 001 Device 002: ID 46f4:0001  
Bus 001 Device 001: ID 1d6b:0002 Linux Foundation 2.0 root hub


#lsusb -vvv      (inside the guest)
Bus 001 Device 002: ID 46f4:0001  
Device Descriptor:
  bLength                18
  bDescriptorType         1
  bcdUSB               2.00
  bDeviceClass            0 
  bDeviceSubClass         0 
  bDeviceProtocol         0 
  bMaxPacketSize0        64
  idVendor           0x46f4 
  idProduct          0x0001 
  bcdDevice            0.00
  iManufacturer           1 QEMU
  iProduct                2 QEMU USB HARDDRIVE
  iSerial                 3 1-0000:00:04.0:00.0-1
  bNumConfigurations      1
  Configuration Descriptor:
    bLength                 9
    bDescriptorType         2
    wTotalLength       0x0020
    bNumInterfaces          1
    bConfigurationValue     1
    iConfiguration          5 High speed config (usb 2.0)
    bmAttributes         0xc0
      Self Powered
    MaxPower                0mA
    Interface Descriptor:
      bLength                 9
      bDescriptorType         4
      bInterfaceNumber        0
      bAlternateSetting       0
      bNumEndpoints           2
      bInterfaceClass         8 Mass Storage
      bInterfaceSubClass      6 SCSI
      bInterfaceProtocol     80 Bulk-Only
      iInterface              0 
      Endpoint Descriptor:
        bLength                 7
        bDescriptorType         5
        bEndpointAddress     0x81  EP 1 IN
        bmAttributes            2
          Transfer Type            Bulk
          Synch Type               None
          Usage Type               Data
        wMaxPacketSize     0x0200  1x 512 bytes
        bInterval               0
      Endpoint Descriptor:
        bLength                 7
        bDescriptorType         5
        bEndpointAddress     0x02  EP 2 OUT
        bmAttributes            2
          Transfer Type            Bulk
          Synch Type               None
          Usage Type               Data
        wMaxPacketSize     0x0200  1x 512 bytes
        bInterval               0
Device Qualifier (for other device speed):
  bLength                10
  bDescriptorType         6
  bcdUSB               2.00
  bDeviceClass            0 
  bDeviceSubClass         0 
  bDeviceProtocol         0 
  bMaxPacketSize0         8
  bNumConfigurations      1
can't get debug descriptor: Resource temporarily unavailable
Device Status:     0x0001
  Self Powered

Bus 001 Device 001: ID 1d6b:0002 Linux Foundation 2.0 root hub
Device Descriptor:
  bLength                18
  bDescriptorType         1
  bcdUSB               2.00
  bDeviceClass            9 Hub
  bDeviceSubClass         0 
  bDeviceProtocol         0 Full speed (or root) hub
  bMaxPacketSize0        64
  idVendor           0x1d6b Linux Foundation
  idProduct          0x0002 2.0 root hub
  bcdDevice            4.18
  iManufacturer           3 Linux 4.18.0-141.el8.x86_64 ehci_hcd
  iProduct                2 EHCI Host Controller
  iSerial                 1 0000:03:00.0
  bNumConfigurations      1
  Configuration Descriptor:
    bLength                 9
    bDescriptorType         2
    wTotalLength       0x0019
    bNumInterfaces          1
    bConfigurationValue     1
    iConfiguration          0 
    bmAttributes         0xe0
      Self Powered
      Remote Wakeup
    MaxPower                0mA
    Interface Descriptor:
      bLength                 9
      bDescriptorType         4
      bInterfaceNumber        0
      bAlternateSetting       0
      bNumEndpoints           1
      bInterfaceClass         9 Hub
      bInterfaceSubClass      0 
      bInterfaceProtocol      0 Full speed (or root) hub
      iInterface              0 
      Endpoint Descriptor:
        bLength                 7
        bDescriptorType         5
        bEndpointAddress     0x81  EP 1 IN
        bmAttributes            3
          Transfer Type            Interrupt
          Synch Type               None
          Usage Type               Data
        wMaxPacketSize     0x0004  1x 4 bytes
        bInterval              12
Hub Descriptor:
  bLength               9
  bDescriptorType      41
  nNbrPorts             6
  wHubCharacteristic 0x000a
    No power switching (usb 1.0)
    Per-port overcurrent protection
  bPwrOn2PwrGood       10 * 2 milli seconds
  bHubContrCurrent      0 milli Ampere
  DeviceRemovable    0x00
  PortPwrCtrlMask    0xff
 Hub Port Status:
   Port 1: 0000.0503 highspeed power enable connect
   Port 2: 0000.0100 power
   Port 3: 0000.0100 power
   Port 4: 0000.0100 power
   Port 5: 0000.0100 power
   Port 6: 0000.0100 power
can't get device qualifier: Resource temporarily unavailable
can't get debug descriptor: Resource temporarily unavailable
Device Status:     0x0001
  Self Powered



[2] Add stress by fio tool. 
#fio -filename=/dev/sdb -direct=1 -rw=randrw -bs=1G -size=145G -numjob=10 -name=mytest

... ... After 3 hours. I got the result. 

|| || || || || || || || || || || || || || || || || || || || || || || || || || || || || || || || || || ||
\/ \/ \/ \/ \/ \/ \/ \/ \/ \/ \/ \/ \/ \/ \/ \/ \/ \/ \/ \/ \/ \/ \/ \/ \/ \/ \/ \/ \/ \/ \/ \/ \/ \/ \/

mytest: (g=0): rw=randrw, bs=(R) 1024MiB-1024MiB, (W) 1024MiB-1024MiB, (T) 1024MiB-1024MiB, ioengine=psync, iodepth=1
...
fio-3.7
Starting 10 processes
fio: pid=6724, got signal=9%][r=0KiB/s,w=0KiB/s][r=0,w=0 IOPS][eta 00m:00s]
fio: pid=6720, got signal=9(3)][-.-%][r=0KiB/s,w=0KiB/s][r=0,w=0 IOPS][eta 00m:00s]
fio: pid=6726, got signal=9(3),K(1),m(3)][-.-%][r=0KiB/s,w=0KiB/s][r=0,w=0 IOPS][eta 00m:00s]
fio: pid=6719, got signal=9(3),K(1),m(1),K(1),m(1)][-.-%][r=0KiB/s,w=0KiB/s][r=0,w=0 IOPS][eta 00m:00s]
fio: pid=6722, got signal=9(3),K(1),m(1),K(1),m(1)][-.-%][r=0KiB/s,w=0KiB/s][r=0,w=0 IOPS][eta 00m:00s]
fio: pid=6727, got signal=9(1),K(1),m(1),K(1),m(1),K(1),m(1)][-.-%][r=0KiB/s,w=0KiB/s][r=0,w=0 IOPS][eta 00m:00s]
Jobs: 4 (f=4): [m(1),K(2),m(1),K(1),m(1),K(1),m(1),K(2)][22.2%][r=0KiB/s,w=0KiB/s][r=0,w=0 IOPS][eta 03h:20m:06s]   
Jobs: 4 (f=4): [m(1),K(2),m(1),K(1),m(1),K(1),m(1),K(2)][22.2%][r=0KiB/s,w=0KiB/s][r=0,w=0 IOPS][eta 03h:20m:13s]
Jobs: 1 (f=1): [_(1),K(2),_(1),K(1),m(1),K(1),_(1),K(2)][99.3%][r=0KiB/s,w=0KiB/s][r=0,w=0 IOPS][eta 01m:37s]          
mytest: (groupid=0, jobs=1): err= 0: pid=6718: Tue Sep  3 21:36:40 2019
   read: IOPS=0, BW=5397KiB/s (5527kB/s)(72.0GiB/13988503msec)
    clat (msec): min=33267, max=160619, avg=97674.10, stdev=22357.93
     lat (msec): min=33267, max=160619, avg=97674.10, stdev=22357.93
    clat percentiles (msec):
     |  1.00th=[17113],  5.00th=[17113], 10.00th=[17113], 20.00th=[17113],
     | 30.00th=[17113], 40.00th=[17113], 50.00th=[17113], 60.00th=[17113],
     | 70.00th=[17113], 80.00th=[17113], 90.00th=[17113], 95.00th=[17113],
     | 99.00th=[17113], 99.50th=[17113], 99.90th=[17113], 99.95th=[17113],
     | 99.99th=[17113]
   bw (  KiB/s): min=43268, max=155968, per=100.00%, avg=57980.52, stdev=28008.05, samples=71
   iops        : min=    0, max=    0, avg= 0.00, stdev= 0.00, samples=71
  write: IOPS=0, BW=5472KiB/s (5603kB/s)(73.0GiB/13988503msec)
    clat (msec): min=27753, max=105830, avg=69588.10, stdev=15696.68
     lat (msec): min=44233, max=150707, avg=95282.74, stdev=22175.16
    clat percentiles (msec):
     |  1.00th=[17113],  5.00th=[17113], 10.00th=[17113], 20.00th=[17113],
     | 30.00th=[17113], 40.00th=[17113], 50.00th=[17113], 60.00th=[17113],
     | 70.00th=[17113], 80.00th=[17113], 90.00th=[17113], 95.00th=[17113],
     | 99.00th=[17113], 99.50th=[17113], 99.90th=[17113], 99.95th=[17113],
     | 99.99th=[17113]
   bw (  KiB/s): min=43803, max=152254, per=100.00%, avg=53884.01, stdev=15722.69, samples=73
   iops        : min=    0, max=    0, avg= 0.00, stdev= 0.00, samples=73
  cpu          : usr=0.07%, sys=2.43%, ctx=28071013, majf=26703921, minf=6270661
  IO depths    : 1=100.0%, 2=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=0.0%, >=64=0.0%
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     issued rwts: total=72,73,0,0 short=0,0,0,0 dropped=0,0,0,0
     latency   : target=0, window=0, percentile=100.00%, depth=1
mytest: (groupid=0, jobs=1): err= 0: pid=6721: Tue Sep  3 21:36:40 2019
   read: IOPS=0, BW=5227KiB/s (5352kB/s)(70.0GiB/14042564msec)
    clat (msec): min=22991, max=157408, avg=93797.62, stdev=26651.63
     lat (msec): min=22991, max=157408, avg=93797.62, stdev=26651.63
    clat percentiles (msec):
     |  1.00th=[17113],  5.00th=[17113], 10.00th=[17113], 20.00th=[17113],
     | 30.00th=[17113], 40.00th=[17113], 50.00th=[17113], 60.00th=[17113],
     | 70.00th=[17113], 80.00th=[17113], 90.00th=[17113], 95.00th=[17113],
     | 99.00th=[17113], 99.50th=[17113], 99.90th=[17113], 99.95th=[17113],
     | 99.99th=[17113]
   bw (  KiB/s): min=43467, max=155045, per=100.00%, avg=54933.39, stdev=20242.45, samples=69
   iops        : min=    0, max=    0, avg= 0.00, stdev= 0.00, samples=69
  write: IOPS=0, BW=5600KiB/s (5735kB/s)(75.0GiB/14042564msec)
    clat (msec): min=42662, max=118444, avg=71090.63, stdev=17326.69
     lat (msec): min=53623, max=150831, avg=99686.24, stdev=21858.75
    clat percentiles (msec):
     |  1.00th=[17113],  5.00th=[17113], 10.00th=[17113], 20.00th=[17113],
     | 30.00th=[17113], 40.00th=[17113], 50.00th=[17113], 60.00th=[17113],
     | 70.00th=[17113], 80.00th=[17113], 90.00th=[17113], 95.00th=[17113],
     | 99.00th=[17113], 99.50th=[17113], 99.90th=[17113], 99.95th=[17113],
     | 99.99th=[17113]
   bw (  KiB/s): min=43509, max=159285, per=100.00%, avg=58071.55, stdev=25975.26, samples=75
   iops        : min=    0, max=    0, avg= 0.00, stdev= 0.00, samples=75
  cpu          : usr=0.07%, sys=2.44%, ctx=28044141, majf=26673339, minf=7002939
  IO depths    : 1=100.0%, 2=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=0.0%, >=64=0.0%
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     issued rwts: total=70,75,0,0 short=0,0,0,0 dropped=0,0,0,0
     latency   : target=0, window=0, percentile=100.00%, depth=1
mytest: (groupid=0, jobs=1): err= 0: pid=6723: Tue Sep  3 21:36:40 2019
   read: IOPS=0, BW=4993KiB/s (5113kB/s)(67.0GiB/14070128msec)
    clat (msec): min=15358, max=163635, avg=99327.75, stdev=26308.66
     lat (msec): min=15358, max=163635, avg=99327.75, stdev=26308.66
    clat percentiles (msec):
     |  1.00th=[15368],  5.00th=[17113], 10.00th=[17113], 20.00th=[17113],
     | 30.00th=[17113], 40.00th=[17113], 50.00th=[17113], 60.00th=[17113],
     | 70.00th=[17113], 80.00th=[17113], 90.00th=[17113], 95.00th=[17113],
     | 99.00th=[17113], 99.50th=[17113], 99.90th=[17113], 99.95th=[17113],
     | 99.99th=[17113]
   bw (  KiB/s): min=43234, max=151659, per=100.00%, avg=53669.71, stdev=19856.39, samples=66
   iops        : min=    0, max=    0, avg= 0.00, stdev= 0.00, samples=66
  write: IOPS=0, BW=5813KiB/s (5952kB/s)(78.0GiB/14070128msec)
    clat (msec): min=9107, max=124204, avg=64955.87, stdev=16545.55
     lat (msec): min=9173, max=153352, avg=95065.24, stdev=27570.44
    clat percentiles (msec):
     |  1.00th=[ 9060],  5.00th=[17113], 10.00th=[17113], 20.00th=[17113],
     | 30.00th=[17113], 40.00th=[17113], 50.00th=[17113], 60.00th=[17113],
     | 70.00th=[17113], 80.00th=[17113], 90.00th=[17113], 95.00th=[17113],
     | 99.00th=[17113], 99.50th=[17113], 99.90th=[17113], 99.95th=[17113],
     | 99.99th=[17113]
   bw (  KiB/s): min=43222, max=156107, per=100.00%, avg=57908.79, stdev=24890.89, samples=78
   iops        : min=    0, max=    0, avg= 0.00, stdev= 0.00, samples=78
  cpu          : usr=0.07%, sys=2.47%, ctx=28306040, majf=26936214, minf=5941292
  IO depths    : 1=100.0%, 2=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=0.0%, >=64=0.0%
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     issued rwts: total=67,78,0,0 short=0,0,0,0 dropped=0,0,0,0
     latency   : target=0, window=0, percentile=100.00%, depth=1
mytest: (groupid=0, jobs=1): err= 0: pid=6725: Tue Sep  3 21:36:40 2019
   read: IOPS=0, BW=5026KiB/s (5146kB/s)(67.0GiB/13979147msec)
    clat (msec): min=47935, max=169535, avg=100251.61, stdev=22783.28
     lat (msec): min=47935, max=169535, avg=100251.61, stdev=22783.28
    clat percentiles (msec):
     |  1.00th=[17113],  5.00th=[17113], 10.00th=[17113], 20.00th=[17113],
     | 30.00th=[17113], 40.00th=[17113], 50.00th=[17113], 60.00th=[17113],
     | 70.00th=[17113], 80.00th=[17113], 90.00th=[17113], 95.00th=[17113],
     | 99.00th=[17113], 99.50th=[17113], 99.90th=[17113], 99.95th=[17113],
     | 99.99th=[17113]
   bw (  KiB/s): min=43188, max=150657, per=100.00%, avg=54533.91, stdev=19608.68, samples=67
   iops        : min=    0, max=    0, avg= 0.00, stdev= 0.00, samples=67
  write: IOPS=0, BW=5851KiB/s (5991kB/s)(78.0GiB/13979147msec)
    clat (msec): min=24808, max=136691, avg=66422.56, stdev=16909.83
     lat (msec): min=41354, max=151409, avg=93104.51, stdev=23796.64
    clat percentiles (msec):
     |  1.00th=[17113],  5.00th=[17113], 10.00th=[17113], 20.00th=[17113],
     | 30.00th=[17113], 40.00th=[17113], 50.00th=[17113], 60.00th=[17113],
     | 70.00th=[17113], 80.00th=[17113], 90.00th=[17113], 95.00th=[17113],
     | 99.00th=[17113], 99.50th=[17113], 99.90th=[17113], 99.95th=[17113],
     | 99.99th=[17113]
   bw (  KiB/s): min=43236, max=155114, per=100.00%, avg=56874.78, stdev=24743.68, samples=77
   iops        : min=    0, max=    0, avg= 0.00, stdev= 0.00, samples=77
  cpu          : usr=0.07%, sys=2.43%, ctx=27833160, majf=26467843, minf=5786089
  IO depths    : 1=100.0%, 2=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=0.0%, >=64=0.0%
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     issued rwts: total=67,78,0,0 short=0,0,0,0 dropped=0,0,0,0
     latency   : target=0, window=0, percentile=100.00%, depth=1

Run status group 0 (all jobs):
   READ: bw=20.1MiB/s (21.1MB/s), 4993KiB/s-5397KiB/s (5113kB/s-5527kB/s), io=276GiB (296GB), run=13979147-14070128msec
  WRITE: bw=22.1MiB/s (23.2MB/s), 5472KiB/s-5851KiB/s (5603kB/s-5991kB/s), io=304GiB (326GB), run=13979147-14070128msec

Disk stats (read/write):
  sdb: ios=2543789/2801665, merge=42/0, ticks=10973715/13410323, in_queue=20727408, util=35.72%

--------------------------------------------------------------------------------------------------------------------

Test finished and every thing looks fine.



Does it have any specific error or failure during the stress test? If not, can I use the same steps to verify the fix version? 


Thank you for your help again.

Comment 10 Gerd Hoffmann 2019-09-09 06:00:06 UTC
> >     responding.  Linux can recover from that, but leaves a message in the
> >     kernel log that it did reset the usb device in question.
> 
> On newer (guest) kernels the only noticable effect is the kernel message and
> rather slow block i/o (the reset is triggered by a timeout).

The kernel message looks like this:

  usb 1-1: reset high-speed USB device number 6 using xhci_hcd

(numbers can be different, also it would be ehci instead of xhci due to the different host adapter).

Comment 11 Michael 2019-09-10 00:48:42 UTC
Hi all:

Thanks Gerd's information. Now I can verify this Bug. The steps is pretty similar with comment 9. 

[1] Boot guest with USB storage using echi. 

/usr/libexec/qemu-kvm -M q35 -cpu EPYC -enable-kvm -m 4G -smp 4 -nodefaults \
-device pcie-root-port,id=pcie.0-root-port-2,slot=2,chassis=2,addr=0x2,bus=pcie.0 \
-device pcie-root-port,id=pcie.0-root-port-3,slot=3,chassis=3,addr=0x3,bus=pcie.0 \
-device pcie-root-port,id=pcie.0-root-port-4,slot=4,chassis=4,addr=0x4,bus=pcie.0 \
-device pcie-root-port,id=pcie.0-root-port-5,slot=5,chassis=5,addr=0x5,bus=pcie.0 \
... ...
-device ich9-usb-ehci1,id=ehci,bus=pcie.0-root-port-4,addr=0x0 \                             <--- USB Controller  
-drive file=usb-storage-14.qcow2,if=none,id=usb-ehci,format=qcow2,werror=stop,rerror=stop \  <--- USB Storage 
-device usb-storage,bus=ehci.0,drive=usb-ehci \                                              <--- USB Storage

[2] Check kernel message inside the guest. 
# dmesg | grep usb

usb 1-1: new high-speed USB device number 2 using ehci-pci
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

[3] Stress test for the usb storage. 
#fio -filename=/dev/sdb -direct=1 -rw=randrw -bs=1G -size=145G -numjob=10 -name=mytest

-------------------------------------------------------------------------------------------------------
mytest: (groupid=0, jobs=1): err= 0: pid=3129: Mon Sep  9 22:29:49 2019
   read: IOPS=0, BW=3427KiB/s (3509kB/s)(65.0GiB/19888283msec)
    clat (msec): min=40752, max=424675, avg=163002.09, stdev=82481.20
     lat (msec): min=40752, max=424675, avg=163002.10, stdev=82481.20
    clat percentiles (msec):
     |  1.00th=[17113],  5.00th=[17113], 10.00th=[17113], 20.00th=[17113],
     | 30.00th=[17113], 40.00th=[17113], 50.00th=[17113], 60.00th=[17113],
     | 70.00th=[17113], 80.00th=[17113], 90.00th=[17113], 95.00th=[17113],
     | 99.00th=[17113], 99.50th=[17113], 99.90th=[17113], 99.95th=[17113],
     | 99.99th=[17113]
   bw (  KiB/s): min=94084, max=901613, per=100.00%, avg=284828.80, stdev=265318.62, samples=64
   iops        : min=    0, max=    0, avg= 0.00, stdev= 0.00, samples=64
  write: IOPS=0, BW=4218KiB/s (4319kB/s)(80.0GiB/19888283msec)
    clat (msec): min=38491, max=256014, avg=96270.45, stdev=44094.27
     lat (msec): min=53280, max=271300, avg=116156.72, stdev=45592.00
    clat percentiles (msec):
     |  1.00th=[17113],  5.00th=[17113], 10.00th=[17113], 20.00th=[17113],
     | 30.00th=[17113], 40.00th=[17113], 50.00th=[17113], 60.00th=[17113],
     | 70.00th=[17113], 80.00th=[17113], 90.00th=[17113], 95.00th=[17113],
     | 99.00th=[17113], 99.50th=[17113], 99.90th=[17113], 99.95th=[17113],
     | 99.99th=[17113]
   bw (  KiB/s): min=94372, max=1024000, per=100.00%, avg=291438.29, stdev=268460.63, samples=80
   iops        : min=    0, max=    0, avg= 0.00, stdev= 0.00, samples=80
  cpu          : usr=0.05%, sys=1.47%, ctx=24764802, majf=23413615, minf=8299803
  IO depths    : 1=100.0%, 2=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=0.0%, >=64=0.0%
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     issued rwts: total=65,80,0,0 short=0,0,0,0 dropped=0,0,0,0
     latency   : target=0, window=0, percentile=100.00%, depth=1
mytest: (groupid=0, jobs=1): err= 0: pid=3132: Mon Sep  9 22:29:50 2019
   read: IOPS=0, BW=3526KiB/s (3610kB/s)(67.0GiB/19926498msec)
    clat (msec): min=7729, max=371277, avg=161183.22, stdev=79138.56
     lat (msec): min=7729, max=371277, avg=161183.22, stdev=79138.56
    clat percentiles (msec):
     |  1.00th=[ 7752],  5.00th=[17113], 10.00th=[17113], 20.00th=[17113],
     | 30.00th=[17113], 40.00th=[17113], 50.00th=[17113], 60.00th=[17113],
     | 70.00th=[17113], 80.00th=[17113], 90.00th=[17113], 95.00th=[17113],
     | 99.00th=[17113], 99.50th=[17113], 99.90th=[17113], 99.95th=[17113],
     | 99.99th=[17113]
   bw (  KiB/s): min=94936, max=966429, per=100.00%, avg=263343.35, stdev=267517.06, samples=66
   iops        : min=    0, max=    0, avg= 0.00, stdev= 0.00, samples=66
  write: IOPS=0, BW=4105KiB/s (4203kB/s)(78.0GiB/19926498msec)
    clat (msec): min=14068, max=225552, avg=95548.02, stdev=44455.24
     lat (msec): min=14144, max=249099, avg=117010.56, stdev=45355.94
    clat percentiles (msec):
     |  1.00th=[14026],  5.00th=[17113], 10.00th=[17113], 20.00th=[17113],
     | 30.00th=[17113], 40.00th=[17113], 50.00th=[17113], 60.00th=[17113],
     | 70.00th=[17113], 80.00th=[17113], 90.00th=[17113], 95.00th=[17113],
     | 99.00th=[17113], 99.50th=[17113], 99.90th=[17113], 99.95th=[17113],
     | 99.99th=[17113]
   bw (  KiB/s): min=94799, max=1136052, per=100.00%, avg=325873.47, stdev=303631.14, samples=78
   iops        : min=    0, max=    1, avg= 0.03, stdev= 0.16, samples=78
  cpu          : usr=0.05%, sys=1.47%, ctx=24921887, majf=23567807, minf=7665443
  IO depths    : 1=100.0%, 2=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=0.0%, >=64=0.0%
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     issued rwts: total=67,78,0,0 short=0,0,0,0 dropped=0,0,0,0
     latency   : target=0, window=0, percentile=100.00%, depth=1
mytest: (groupid=0, jobs=1): err= 0: pid=3133: Mon Sep  9 22:29:50 2019
   read: IOPS=0, BW=3493KiB/s (3577kB/s)(66.0GiB/19814139msec)
    clat (msec): min=43115, max=371331, avg=156476.14, stdev=73886.78
     lat (msec): min=43115, max=371331, avg=156476.15, stdev=73886.77
    clat percentiles (msec):
     |  1.00th=[17113],  5.00th=[17113], 10.00th=[17113], 20.00th=[17113],
     | 30.00th=[17113], 40.00th=[17113], 50.00th=[17113], 60.00th=[17113],
     | 70.00th=[17113], 80.00th=[17113], 90.00th=[17113], 95.00th=[17113],
     | 99.00th=[17113], 99.50th=[17113], 99.90th=[17113], 99.95th=[17113],
     | 99.99th=[17113]
   bw (  KiB/s): min=95918, max=1119077, per=100.00%, avg=305143.06, stdev=306621.14, samples=66
   iops        : min=    0, max=    1, avg= 0.02, stdev= 0.12, samples=66
  write: IOPS=0, BW=4181KiB/s (4281kB/s)(79.0GiB/19814139msec)
    clat (msec): min=35387, max=260584, avg=99047.88, stdev=47076.38
     lat (msec): min=41954, max=262339, avg=120058.20, stdev=48010.16
    clat percentiles (msec):
     |  1.00th=[17113],  5.00th=[17113], 10.00th=[17113], 20.00th=[17113],
     | 30.00th=[17113], 40.00th=[17113], 50.00th=[17113], 60.00th=[17113],
     | 70.00th=[17113], 80.00th=[17113], 90.00th=[17113], 95.00th=[17113],
     | 99.00th=[17113], 99.50th=[17113], 99.90th=[17113], 99.95th=[17113],
     | 99.99th=[17113]
   bw (  KiB/s): min=96482, max=1239451, per=100.00%, avg=331660.78, stdev=327120.95, samples=78
   iops        : min=    0, max=    1, avg= 0.05, stdev= 0.22, samples=78
  cpu          : usr=0.05%, sys=1.48%, ctx=24784026, majf=23432412, minf=7928688
  IO depths    : 1=100.0%, 2=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=0.0%, >=64=0.0%
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     issued rwts: total=66,79,0,0 short=0,0,0,0 dropped=0,0,0,0
     latency   : target=0, window=0, percentile=100.00%, depth=1
mytest: (groupid=0, jobs=1): err= 0: pid=3134: Mon Sep  9 22:29:50 2019
   read: IOPS=0, BW=3528KiB/s (3612kB/s)(67.0GiB/19915277msec)
    clat (msec): min=34816, max=449696, avg=164907.84, stdev=86393.47
     lat (msec): min=34816, max=449696, avg=164907.84, stdev=86393.47
    clat percentiles (msec):
     |  1.00th=[17113],  5.00th=[17113], 10.00th=[17113], 20.00th=[17113],
     | 30.00th=[17113], 40.00th=[17113], 50.00th=[17113], 60.00th=[17113],
     | 70.00th=[17113], 80.00th=[17113], 90.00th=[17113], 95.00th=[17113],
     | 99.00th=[17113], 99.50th=[17113], 99.90th=[17113], 99.95th=[17113],
     | 99.99th=[17113]
   bw (  KiB/s): min=94220, max=908644, per=100.00%, avg=283419.37, stdev=266222.55, samples=67
   iops        : min=    0, max=    0, avg= 0.00, stdev= 0.00, samples=67
  write: IOPS=0, BW=4107KiB/s (4205kB/s)(78.0GiB/19915277msec)
    clat (msec): min=17604, max=223925, avg=90243.53, stdev=34540.20
     lat (msec): min=17658, max=252715, avg=113662.38, stdev=38243.97
    clat percentiles (msec):
     |  1.00th=[17113],  5.00th=[17113], 10.00th=[17113], 20.00th=[17113],
     | 30.00th=[17113], 40.00th=[17113], 50.00th=[17113], 60.00th=[17113],
     | 70.00th=[17113], 80.00th=[17113], 90.00th=[17113], 95.00th=[17113],
     | 99.00th=[17113], 99.50th=[17113], 99.90th=[17113], 99.95th=[17113],
     | 99.99th=[17113]
   bw (  KiB/s): min=94304, max=1053845, per=100.00%, avg=287473.32, stdev=276143.14, samples=77
   iops        : min=    0, max=    1, avg= 0.01, stdev= 0.11, samples=77
  cpu          : usr=0.05%, sys=1.48%, ctx=24847504, majf=23492778, minf=7500316
  IO depths    : 1=100.0%, 2=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=0.0%, >=64=0.0%
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     issued rwts: total=67,78,0,0 short=0,0,0,0 dropped=0,0,0,0
     latency   : target=0, window=0, percentile=100.00%, depth=1

Run status group 0 (all jobs):
   READ: bw=13.6MiB/s (14.3MB/s), 3427KiB/s-3528KiB/s (3509kB/s-3612kB/s), io=265GiB (285GB), run=19814139-19926498msec
  WRITE: bw=16.2MiB/s (16.0MB/s), 4105KiB/s-4218KiB/s (4203kB/s-4319kB/s), io=315GiB (338GB), run=19814139-19926498msec

Disk stats (read/write):
  sdb: ios=2441920/2903040, merge=0/0, ticks=20391031/17285593, in_queue=33898128, util=25.81%


Test Version:
kernel:4.18.0-141.el8.x86_64
qemu-kvm-4.1.0-6.module+el8.1.0+4164+854d66f5.x86_64


There is no any error or failure in the message log. Guest working properly. Thus, mark this Bug as verified. If any one have question, just free for contact me.

Comment 14 errata-xmlrpc 2019-11-06 07:12:13 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2019:3723


Note You need to log in before you can comment on or make changes to this bug.