Bug 1935019

Summary: QEMU guest fails to boot when a vhost-user-blk-pci device is attached with option iommu_platform=on

Product: Red Hat Enterprise Linux Advanced Virtualization
Component: qemu-kvm
Sub component: virtio-blk,scsi
Version: 8.4
Status: CLOSED ERRATA
Severity: high
Priority: high
Keywords: Triaged
Reporter: qing.wang <qinwang>
Assignee: Kevin Wolf <kwolf>
QA Contact: qing.wang <qinwang>
CC: coli, ddepaula, jinzhao, juzhang, kkiwi, kwolf, lijin, qzhang, virt-maint, xuwei
Target Milestone: rc
Target Release: 8.4
Hardware: x86_64
OS: Linux
Fixed In Version: qemu-kvm-6.0.0-24.module+el8.5.0+11844+1e3017bd
Last Closed: 2021-11-16 07:51:47 UTC
Bug Blocks: 1957194

Description qing.wang 2021-03-04 09:24:06 UTC
Description of problem:

When testing the iommu_platform option, which the vhost-user-blk-pci device claims to support, the guest/QEMU fails to boot with unfriendly error messages:
#(qemu) qemu-storage-daemon: vu_panic: Invalid vring_addr message
#qemu-kvm: Failed to read msg header. Read -1 instead of 12. Original request 0.
#qemu-kvm: Fail to update device iotlb
#qemu-kvm: Failed to write msg. Wrote -1 instead of 20.
#qemu-kvm: vhost_set_vring_call failed: Invalid argument (22)
#qemu-kvm: Failed to set msg fds.
#qemu-kvm: vhost VQ 0 ring restore failed: -1: Input/output error (5)


vhost-user-blk-pci options:
  addr=<int32>           - Slot and optional function number, example: 06.0 or 06 (default: -1)
  any_layout=<bool>      - on/off (default: true)
  ats=<bool>             - on/off (default: false)
  bootindex=<int32>
  chardev=<str>          - ID of a chardev to use as a backend
  class=<uint32>         -  (default: 0)
  config-wce=<bool>      - on/off (default: true)
  disable-legacy=<OnOffAuto> - on/off/auto (default: "auto")
  disable-modern=<bool>  -  (default: false)
  event_idx=<bool>       - on/off (default: true)
  failover_pair_id=<str>
  indirect_desc=<bool>   - on/off (default: true)
  iommu_platform=<bool>  - on/off (default: false)
  migrate-extra=<bool>   - on/off (default: true)
  modern-pio-notify=<bool> - on/off (default: false)
  multifunction=<bool>   - on/off (default: false)
  notify_on_empty=<bool> - on/off (default: true)
  num-queues=<uint16>    -  (default: 65535)
  packed=<bool>          - on/off (default: false)
  page-per-vq=<bool>     - on/off (default: false)
  queue-size=<uint32>    -  (default: 128)


Version-Release number of selected component (if applicable):
Red Hat Enterprise Linux release 8.4 Beta (Ootpa)
4.18.0-291.el8.x86_64
qemu-kvm-common-5.2.0-9.module+el8.4.0+10182+4161bd91.x86_64

How reproducible:
100%

Steps to Reproduce:
1. Create the images if they do not already exist
qemu-img create -f qcow2 /home/kvm_autotest_root/images/disk1.qcow2 1G
qemu-img create -f qcow2 /home/kvm_autotest_root/images/disk2.qcow2 1G
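
Optionally, sanity-check the images afterwards (same paths as above):
qemu-img info /home/kvm_autotest_root/images/disk1.qcow2
qemu-img info /home/kvm_autotest_root/images/disk2.qcow2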


2. Export them with qemu-storage-daemon (QSD)

qemu-storage-daemon \
  --chardev socket,path=/tmp/qmp.sock,server,nowait,id=char1 \
  --monitor chardev=char1 \
  --object iothread,id=iothread0 \
  --blockdev driver=file,node-name=file1,filename=/home/kvm_autotest_root/images/disk1.qcow2 \
  --blockdev driver=qcow2,node-name=fmt1,file=file1 \
  --export type=vhost-user-blk,id=export1,addr.type=unix,addr.path=/tmp/vhost-user-blk1.sock,node-name=fmt1,writable=on,iothread=iothread0 \
  --blockdev driver=file,node-name=file2,filename=/home/kvm_autotest_root/images/disk2.qcow2 \
  --blockdev driver=qcow2,node-name=fmt2,file=file2 \
  --export type=vhost-user-blk,id=export2,addr.type=unix,addr.path=/tmp/vhost-user-blk2.sock,node-name=fmt2,writable=on,iothread=iothread0 
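
Optionally, confirm that both exports registered by querying the QSD monitor
socket (a sketch; assumes nmap-ncat's nc with UNIX-socket support, where -i 1
ends the session after one second of idle time):
printf '%s\n' '{"execute":"qmp_capabilities"}' '{"execute":"query-block-exports"}' \
  | nc -U -i 1 /tmp/qmp.sock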

3. Boot QEMU with the vhost-user-blk-pci device
/usr/libexec/qemu-kvm -enable-kvm \
  -m 4G -M q35,accel=kvm,memory-backend=mem,kernel-irqchip=split \
  -nodefaults \
  -vga qxl \
  -cpu host,+kvm_pv_unhalt \
  -smp 4 \
  -device intel-iommu,device-iotlb=on,intremap \
  -object memory-backend-memfd,id=mem,size=4G,share=on \
  -device pcie-root-port,id=pcie-root-port-0,multifunction=on,bus=pcie.0,addr=0x3,chassis=1 \
  -device pcie-root-port,id=pcie-root-port-1,port=0x1,addr=0x3.0x1,bus=pcie.0,chassis=2 \
  -device pcie-root-port,id=pcie-root-port-2,port=0x2,addr=0x3.0x2,bus=pcie.0,chassis=3 \
  -device pcie-root-port,id=pcie-root-port-3,port=0x3,addr=0x3.0x3,bus=pcie.0,chassis=4 \
  -device pcie-root-port,id=pcie-root-port-4,port=0x4,addr=0x3.0x4,bus=pcie.0,chassis=5 \
  -device pcie-root-port,id=pcie-root-port-5,port=0x5,addr=0x3.0x5,bus=pcie.0,chassis=6 \
  -device pcie-root-port,id=pcie-root-port-6,port=0x6,addr=0x3.0x6,bus=pcie.0,chassis=7 \
  -device pcie-root-port,id=pcie-root-port-7,port=0x7,addr=0x3.0x7,bus=pcie.0,chassis=8 \
  -device qemu-xhci,id=usb1,bus=pcie-root-port-1,addr=0x0 \
  -device usb-tablet,id=usb-tablet1,bus=usb1.0,port=1 \
  -device virtio-scsi-pci,id=scsi0,bus=pcie-root-port-5 \
  -device virtio-scsi-pci,id=scsi1,bus=pcie-root-port-6 \
  -blockdev driver=qcow2,file.driver=file,file.filename=/home/kvm_autotest_root/images/rhel840-64-virtio-scsi.qcow2,node-name=os_image1 \
  -device virtio-blk-pci,id=blk0,drive=os_image1,bootindex=0 \
  \
  -chardev socket,path=/tmp/vhost-user-blk1.sock,id=vhost1 \
  -device vhost-user-blk-pci,chardev=vhost1,id=blk1,bus=pcie-root-port-3,addr=0x0,num-queues=1,bootindex=1,iommu_platform=on \
  \
  -vnc :5 \
  -monitor stdio \
  -qmp tcp:0:5955,server,nowait

Actual results:
The guest fails to boot.
Expected results:
If this feature is not supported, it should be removed from the options, or a user-friendly error message should be given.

Additional info:

Comment 1 Klaus Heinrich Kiwi 2021-06-15 15:31:25 UTC
Is anyone in cc: able to tell if this feature is/should be supported?

(In reply to qing.wang from comment #0)

> 
> Actual results:
> The guest fails to boot.
> Expected results:
> If this feature is not supported, it should be removed from the options, or
> a user-friendly error message should be given.

If this is just a documentation/usage info issue, it shouldn't be set to high severity/high priority imho.

Kevin, anything from your side?

Comment 2 Kevin Wolf 2021-06-18 09:46:26 UTC
The bug on the QEMU side is that it didn't check whether the backend (in the external process) actually supports the flag. This is fixed as of upstream commit 04ceb61a. I'd suggest backporting this along with other similar missing checks for a nicer failure mode, though as vhost-user-blk is still tech preview, in theory waiting for the next rebase wouldn't be a big problem either.

I don't know whether we want to support the flag in the vhost-user-blk export eventually, but it's not a priority and would be an RFE.
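
For reference, a quick way to check which failure mode a given build has (a
sketch, not from the original report; it assumes the QSD export from the
reproducer is still listening on /tmp/vhost-user-blk1.sock). An unfixed QEMU
triggers the vu_panic cascade quoted in the description; a fixed one exits
with a clean "iommu_platform=true is not supported by the device" error:

/usr/libexec/qemu-kvm -M q35,accel=kvm,memory-backend=mem -m 1G -nodefaults \
  -object memory-backend-memfd,id=mem,size=1G,share=on \
  -chardev socket,path=/tmp/vhost-user-blk1.sock,id=vhost1 \
  -device vhost-user-blk-pci,chardev=vhost1,iommu_platform=on \
  -display none 2>&1 | head -n 5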

Comment 4 Danilo de Paula 2021-07-02 21:17:59 UTC
QA_ACK, please?

Comment 10 Yanan Fu 2021-07-19 07:09:52 UTC
QE bot(pre verify): Set 'Verified:Tested,SanityOnly' as gating/tier1 test pass.

Comment 11 qing.wang 2021-07-19 10:52:36 UTC
Tested on
Red Hat Enterprise Linux release 8.5 Beta (Ootpa)
4.18.0-315.el8.x86_64
qemu-kvm-common-6.0.0-24.module+el8.5.0+11844+1e3017bd.x86_64

The issue is fixed; no crash happens.

1. Boot QSD
qemu-storage-daemon \
  --chardev socket,path=/tmp/qmp.sock,server=on,wait=off,id=char1 \
  --monitor chardev=char1 \
  --object iothread,id=iothread0 \
  --blockdev driver=file,node-name=file1,filename=/home/kvm_autotest_root/images/disk1.qcow2 \
  --blockdev driver=qcow2,node-name=fmt1,file=file1 \
  --export type=vhost-user-blk,id=export1,addr.type=unix,addr.path=/tmp/vhost-user-blk1.sock,node-name=fmt1,writable=on,iothread=iothread0  \
  --blockdev driver=file,node-name=file2,filename=/home/kvm_autotest_root/images/disk2.qcow2 \
  --blockdev driver=qcow2,node-name=fmt2,file=file2 \
  --export type=vhost-user-blk,id=export2,addr.type=unix,addr.path=/tmp/vhost-user-blk2.sock,node-name=fmt2,writable=on,iothread=iothread0
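
Optionally, confirm that both export sockets exist before starting the guest:
ls -l /tmp/vhost-user-blk1.sock /tmp/vhost-user-blk2.sock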



2. Boot the VM with a vhost-user-blk-pci device

/usr/libexec/qemu-kvm -enable-kvm \
  -m 4G -M q35,accel=kvm,memory-backend=mem \
  -nodefaults \
  -vga qxl \
  -smp 4 \
  -object memory-backend-memfd,id=mem,size=4G,share=on \
  -device pcie-root-port,id=pcie-root-port-0,multifunction=on,bus=pcie.0,addr=0x3,chassis=1 \
  -device pcie-root-port,id=pcie-root-port-1,port=0x1,addr=0x3.0x1,bus=pcie.0,chassis=2 \
  -device pcie-root-port,id=pcie-root-port-2,port=0x2,addr=0x3.0x2,bus=pcie.0,chassis=3 \
  -device pcie-root-port,id=pcie-root-port-3,port=0x3,addr=0x3.0x3,bus=pcie.0,chassis=4 \
  -device pcie-root-port,id=pcie-root-port-4,port=0x4,addr=0x3.0x4,bus=pcie.0,chassis=5 \
  -device pcie-root-port,id=pcie-root-port-5,port=0x5,addr=0x3.0x5,bus=pcie.0,chassis=6 \
  -device pcie-root-port,id=pcie-root-port-6,port=0x6,addr=0x3.0x6,bus=pcie.0,chassis=7 \
  -device pcie-root-port,id=pcie-root-port-7,port=0x7,addr=0x3.0x7,bus=pcie.0,chassis=8 \
  -device qemu-xhci,id=usb1,bus=pcie-root-port-1,addr=0x0 \
  -device usb-tablet,id=usb-tablet1,bus=usb1.0,port=1 \
  -device virtio-scsi-pci,id=scsi0,bus=pcie-root-port-5 \
  -device virtio-scsi-pci,id=scsi1,bus=pcie-root-port-6 \
  -blockdev driver=qcow2,file.driver=file,file.filename=/home/kvm_autotest_root/images/rhel840-64-virtio-scsi.qcow2,node-name=os_image1 \
  -device virtio-blk-pci,id=blk0,drive=os_image1,bootindex=0 \
  \
  -chardev socket,path=/tmp/vhost-user-blk1.sock,id=vhost1 \
  -device vhost-user-blk-pci,chardev=vhost1,id=blk1,bootindex=1,num-queues=1,iommu_platform=off \
  \
  -vnc :5 \
  -monitor stdio \
  -qmp tcp:0:5955,server,nowait

3. Log in to the guest, check that the disk exists, and test I/O:
dd if=/dev/zero of=/dev/vdb count=100 bs=1M oflag=direct
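
As an optional read-back counterpart to the write above (same /dev/vdb device):
dd if=/dev/vdb of=/dev/null count=100 bs=1M iflag=direct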

If we set iommu_platform=on, the boot fails with:
iommu_platform=true is not supported by the device.


But the help message still lists the option:
/usr/libexec/qemu-kvm -device vhost-user-blk-pci,?
vhost-user-blk-pci options:
  acpi-index=<uint32>    -  (default: 0)
  addr=<int32>           - Slot and optional function number, example: 06.0 or 06 (default: -1)
  aer=<bool>             - on/off (default: false)
  any_layout=<bool>      - on/off (default: true)
  ats=<bool>             - on/off (default: false)
  bootindex=<int32>
  chardev=<str>          - ID of a chardev to use as a backend
  class=<uint32>         -  (default: 0)
  config-wce=<bool>      - on/off (default: true)
  disable-legacy=<OnOffAuto> - on/off/auto (default: "auto")
  disable-modern=<bool>  -  (default: false)
  event_idx=<bool>       - on/off (default: true)
  failover_pair_id=<str>
  indirect_desc=<bool>   - on/off (default: true)
  iommu_platform=<bool>  - on/off (default: false)

What do you think about filing a new bug to request hiding "iommu_platform=<bool>  - on/off (default: false)", since it is not supported?

Comment 12 Kevin Wolf 2021-07-20 11:09:10 UTC
(In reply to qing.wang from comment #11)
> What do you think about filing a new bug to request hiding
> "iommu_platform=<bool>  - on/off (default: false)", since it is not
> supported?

You are right that there is currently no way to make iommu_platform=on work with qemu-storage-daemon. Though in theory, you could connect to other vhost-user-blk backends that do support the option, so I think keeping it visible on the QEMU side is more correct.

Comment 13 qing.wang 2021-08-31 08:00:55 UTC
Also passed on
Red Hat Enterprise Linux release 9.0 Beta (Plow)
5.14.0-0.rc6.46.el9.x86_64
qemu-kvm-common-6.0.0-12.el9.x86_64

Comment 15 errata-xmlrpc 2021-11-16 07:51:47 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (virt:av bug fix and enhancement update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2021:4684