Bug 2180076 - [qemu-kvm] support fd passing for libblkio QEMU BlockDrivers
Summary: [qemu-kvm] support fd passing for libblkio QEMU BlockDrivers
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Enterprise Linux 9
Classification: Red Hat
Component: qemu-kvm
Version: 9.3
Hardware: Unspecified
OS: Unspecified
Priority: medium
Severity: medium
Target Milestone: rc
Target Release: ---
Assignee: Stefano Garzarella
QA Contact: qing.wang
URL:
Whiteboard:
Depends On: 2166106 2213317
Blocks: 1900770
 
Reported: 2023-03-20 17:02 UTC by Stefano Garzarella
Modified: 2023-11-07 09:19 UTC
CC List: 7 users

Fixed In Version: qemu-kvm-8.0.0-6.el9
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2023-11-07 08:27:12 UTC
Type: Feature Request
Target Upstream Version:
Embargoed:


Attachments: none


Links
Gitlab redhat/centos-stream/src/qemu-kvm merge request 169 (opened): block/blkio: support fd passing for virtio-blk-vhost-vdpa driver (last updated 2023-06-05 10:31:48 UTC)
Red Hat Issue Tracker RHELPLAN-152447 (last updated 2023-03-20 17:04:44 UTC)
Red Hat Product Errata RHSA-2023:6368 (last updated 2023-11-07 08:28:45 UTC)

Description Stefano Garzarella 2023-03-20 17:02:23 UTC
libvirt can pass `/dev/fdset/N` if it opens the device itself. This is especially useful when the device can only be accessed with certain privileges that QEMU doesn't have, but the libvirt daemon does.
This is the case for vDPA devices accessed via `/dev/vhost-vdpa-X` as we discussed here:
https://bugzilla.redhat.com/show_bug.cgi?id=1900770#c22

For now, this behavior is not supported, because libblkio-based BlockDrivers (such as virtio-blk-vhost-vdpa) pass the path to libblkio, which opens the device itself.

To support `/dev/fdset/N`, we could always open the device via qemu_open() and pass the fd to libblkio. Unfortunately, some libblkio drivers (such as virtio-blk) still do not support the `fd` property: https://gitlab.com/libblkio/libblkio/-/issues/65

So we need to first support `fd` in the virtio-blk driver in libblkio and then update the BlockDrivers in QEMU.
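
For illustration, a minimal sketch of this flow from a privileged parent process, assuming Python's standard library (the device path, fdset number, and node name are placeholders, not the merged implementation; Comment 6 below shows the real shell-based test):

import os
import subprocess

# A privileged parent opens the vDPA device; QEMU itself never needs
# permission to open the path, only the inherited fd.
fd = os.open("/dev/vhost-vdpa-0", os.O_RDWR)

subprocess.run([
    "qemu-system-x86_64",
    # place the inherited fd into fdset 3 ...
    "-add-fd", f"fd={fd},set=3,opaque=rdwr:/dev/vhost-vdpa-0",
    # ... and let the libblkio-based BlockDriver resolve it via /dev/fdset/3
    "-blockdev", "node-name=drive_src1,driver=virtio-blk-vhost-vdpa,"
                 "path=/dev/fdset/3,cache.direct=on",
], pass_fds=[fd])  # keep the fd open across exec so QEMU inherits it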

Comment 1 Stefano Garzarella 2023-04-19 15:11:29 UTC
libblkio changes posted here: https://gitlab.com/libblkio/libblkio/-/merge_requests/175

Next week (QEMU is still in hard freeze) I'll post the QEMU changes.

Comment 2 Stefano Garzarella 2023-05-02 14:55:13 UTC
QEMU patch posted upstream: https://lore.kernel.org/qemu-devel/20230502145050.224615-1-sgarzare@redhat.com/T/#u

Comment 3 Stefano Garzarella 2023-05-30 08:29:29 UTC
Update of the current status:
- the libblkio changes were merged and released with v1.3.0
- the QEMU changes were harder than expected, because we had to find a way to let libvirt figure out whether `path` supports /dev/fdset/N or not. I just posted the latest version, which should be the final one: https://lore.kernel.org/qemu-devel/20230530071941.8954-1-sgarzare@redhat.com/
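
For reference, a management layer can probe for this over QMP by inspecting the QAPI schema; a minimal sketch in Python, assuming the merged change advertises a "fdset" feature on the driver's blockdev options and a TCP QMP endpoint as used in the test commands below (both are assumptions here):

import json
import socket

# Connect to a QMP monitor started with: -qmp tcp:0:5955,server=on,wait=off
sock = socket.create_connection(("localhost", 5955))
f = sock.makefile("rw")

f.readline()                                          # QMP greeting
f.write(json.dumps({"execute": "qmp_capabilities"}) + "\n")
f.flush()
f.readline()                                          # {"return": {}}

f.write(json.dumps({"execute": "query-qmp-schema"}) + "\n")
f.flush()
schema = json.loads(f.readline())["return"]

# True if any schema entry advertises the assumed "fdset" feature
print(any("fdset" in entry.get("features", []) for entry in schema))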

Comment 5 qing.wang 2023-06-06 01:20:48 UTC
Can the device be opened from a shell and the fd then passed to QEMU? Something like:

exec 3<>/dev/vhost-vdpa-X

or in Python:

open("/dev/vhost-vdpa-X")

Comment 6 Stefano Garzarella 2023-06-06 08:10:56 UTC
(In reply to qing.wang from comment #5)
> Can the device be opened from a shell and the fd then passed to QEMU?

Yep, I tested it this way:

# open fd
exec {fd}<>"/dev/vhost-vdpa-0" && echo $fd

qemu-system-x86_64 ... \
  -add-fd fd=${fd},set=3,opaque="rdwr:/dev/vhost-vdpa-0" \
  -blockdev node-name=drive_src1,driver=virtio-blk-vhost-vdpa,path=/dev/fdset/3,cache.direct=on
  ...

# close fd
exec {fd}>&-

Comment 8 Yanan Fu 2023-06-27 06:04:27 UTC
QE bot (pre verify): Set 'Verified:Tested,SanityOnly' as gating/tier1 test pass.

Comment 11 qing.wang 2023-07-07 07:12:10 UTC
Test failed due to the same reason as https://bugzilla.redhat.com/show_bug.cgi?id=2213317

qemu-kvm: -blockdev node-name=file_stg1,driver=virtio-blk-vhost-vdpa,path=/dev/fdset/3,cache.direct=on,cache.no-flush=off,discard=unmap: Unknown driver 'virtio-blk-vhost-vdpa'


Red Hat Enterprise Linux release 9.3 Beta (Plow)
5.14.0-332.el9.x86_64
qemu-kvm-8.0.0-6.el9.x86_64
seabios-bin-1.16.1-1.el9.noarch
edk2-ovmf-20230301gitf80f052277c8-5.el9.noarch
libvirt-9.3.0-2.el9.x86_64
virtio-win-prewhql-0.1-238.iso

1. open fd
exec {fd}<>"/dev/vhost-vdpa-0" && echo $fd

2. boot vm
/usr/libexec/qemu-kvm \
  -name testvm \
  -machine q35 \
  -m  6G \
  -smp 2 \
  -cpu host,+kvm_pv_unhalt \
  -device ich9-usb-ehci1,id=usb1 \
  -device usb-tablet,id=usb-tablet1,bus=usb1.0,port=1 \
   \
   \
  -device pcie-root-port,id=pcie-root-port-0,multifunction=on,bus=pcie.0,addr=0x3,chassis=1 \
  -device pcie-root-port,id=pcie-root-port-1,port=0x1,addr=0x3.0x1,bus=pcie.0,chassis=2 \
  -device pcie-root-port,id=pcie-root-port-2,port=0x2,addr=0x3.0x2,bus=pcie.0,chassis=3 \
  -device pcie-root-port,id=pcie-root-port-3,port=0x3,addr=0x3.0x3,bus=pcie.0,chassis=4 \
  -device pcie-root-port,id=pcie-root-port-4,port=0x4,addr=0x3.0x4,bus=pcie.0,chassis=5 \
  -device pcie-root-port,id=pcie-root-port-5,port=0x5,addr=0x3.0x5,bus=pcie.0,chassis=6 \
  -device pcie-root-port,id=pcie-root-port-6,port=0x6,addr=0x3.0x6,bus=pcie.0,chassis=7 \
  -device pcie-root-port,id=pcie-root-port-7,port=0x7,addr=0x3.0x7,bus=pcie.0,chassis=8 \
  -device pcie-root-port,id=pcie_extra_root_port_0,bus=pcie.0,addr=0x4  \
  -object iothread,id=iothread0 \
  -device virtio-scsi-pci,id=scsi0,bus=pcie-root-port-0 \
  -blockdev driver=qcow2,file.driver=file,cache.direct=off,cache.no-flush=on,file.filename=/home/kvm_autotest_root/images/rhel930-64-virtio-scsi.qcow2,node-name=drive_image1,file.aio=threads   \
  -device scsi-hd,id=os,drive=drive_image1,bus=scsi0.0,bootindex=0,serial=OS_DISK   \
  \
  -add-fd fd=${fd},set=3,opaque="rdwr:/dev/vhost-vdpa-0" \
  -blockdev node-name=file_stg1,driver=virtio-blk-vhost-vdpa,path=/dev/fdset/3,cache.direct=on,cache.no-flush=off,discard=unmap \
  -blockdev node-name=drive_stg1,driver=raw,cache.direct=on,cache.no-flush=off,file=file_stg1 \
  -device virtio-blk-pci,iothread=iothread0,serial=stg1,bus=pcie-root-port-4,addr=0x0,write-cache=on,id=stg1,drive=drive_stg1,rerror=report,werror=report \
  \
  \
  -vnc :5 \
  -monitor stdio \
  -qmp tcp:0:5955,server=on,wait=off \
  -device virtio-net-pci,mac=9a:b5:b6:b1:b2:b7,id=nic1,netdev=nicpci,bus=pcie-root-port-7 \
  -netdev tap,id=nicpci \
  -boot menu=on,reboot-timeout=1000,strict=off

3. close fd
exec {fd}>&-

Comment 13 Stefano Garzarella 2023-07-07 07:25:09 UTC
As we already discussed on Slack, BZ2213317 is a dependency of this one, so I don't think there is any point in testing this BZ before the other one is solved.

In what state should I put this BZ?
We don't have to make any changes for it, since the changes are already merged.

Comment 24 qing.wang 2023-07-25 04:52:48 UTC
Tested on:

Red Hat Enterprise Linux release 9.3 Beta (Plow)
5.14.0-340.el9.x86_64
qemu-kvm-8.0.0-8.el9.x86_64
seabios-bin-1.16.1-1.el9.noarch
edk2-ovmf-20230524-2.el9.noarch
libvirt-9.3.0-2.el9.x86_64

1. create vdpa device on host
  modprobe vhost-vdpa
  modprobe vdpa-sim-blk
  vdpa dev add mgmtdev vdpasim_blk name blk0
  vdpa dev add mgmtdev vdpasim_blk name blk1
  vdpa dev list -jp
  ls /dev/vhost-vdpa*
  [ $? -ne 0 ] && echo "failed to create vdpa devices"

2. open the vhost-vdpa device and pass the fd to the VM

exec {fd}<>"/dev/vhost-vdpa-0" && echo $fd

/usr/libexec/qemu-kvm \
  -name testvm \
  -machine q35,memory-backend=mem \
  -object memory-backend-memfd,id=mem,size=6G,share=on \
  -m  6G \
  -smp 2 \
  -cpu host,+kvm_pv_unhalt \
  -device ich9-usb-ehci1,id=usb1 \
  -device usb-tablet,id=usb-tablet1,bus=usb1.0,port=1 \
   \
   \
  -device pcie-root-port,id=pcie-root-port-0,multifunction=on,bus=pcie.0,addr=0x3,chassis=1 \
  -device pcie-root-port,id=pcie-root-port-1,port=0x1,addr=0x3.0x1,bus=pcie.0,chassis=2 \
  -device pcie-root-port,id=pcie-root-port-2,port=0x2,addr=0x3.0x2,bus=pcie.0,chassis=3 \
  -device pcie-root-port,id=pcie-root-port-3,port=0x3,addr=0x3.0x3,bus=pcie.0,chassis=4 \
  -device pcie-root-port,id=pcie-root-port-4,port=0x4,addr=0x3.0x4,bus=pcie.0,chassis=5 \
  -device pcie-root-port,id=pcie-root-port-5,port=0x5,addr=0x3.0x5,bus=pcie.0,chassis=6 \
  -device pcie-root-port,id=pcie-root-port-6,port=0x6,addr=0x3.0x6,bus=pcie.0,chassis=7 \
  -device pcie-root-port,id=pcie-root-port-7,port=0x7,addr=0x3.0x7,bus=pcie.0,chassis=8 \
  -device pcie-root-port,id=pcie_extra_root_port_0,bus=pcie.0,addr=0x4  \
  -object iothread,id=iothread0 \
  -device virtio-scsi-pci,id=scsi0,bus=pcie-root-port-0 \
  -blockdev driver=qcow2,file.driver=file,cache.direct=off,cache.no-flush=on,file.filename=/home/kvm_autotest_root/images/rhel930-64-virtio-scsi.qcow2,node-name=drive_image1,file.aio=threads   \
  -device scsi-hd,id=os,drive=drive_image1,bus=scsi0.0,bootindex=0,serial=OS_DISK   \
  \
  -add-fd fd=${fd},set=3,opaque="rdwr:/dev/vhost-vdpa-0" \
  -blockdev node-name=prot_stg0,driver=virtio-blk-vhost-vdpa,path=/dev/fdset/3,cache.direct=on \
  -blockdev node-name=fmt_stg0,driver=raw,file=prot_stg0 \
  -device virtio-blk-pci,iothread=iothread0,bus=pcie-root-port-4,addr=0,id=stg0,drive=fmt_stg0,bootindex=1 \
  \
  \
  -vnc :5 \
  -monitor stdio \
  -qmp tcp:0:5955,server=on,wait=off \
  -device virtio-net-pci,mac=9a:b5:b6:b1:b2:b7,id=nic1,netdev=nicpci,bus=pcie-root-port-7 \
  -netdev tap,id=nicpci \
  -boot menu=on,reboot-timeout=1000,strict=off

3. log in to the guest and check the disks
function tests_failed() {
        exit_code="$?"
        echo "Test failed: $1"
        exit "${exit_code}"
}
[ "$1" == "" ] && size="128M" || size=$1

vdpa_devs=$(lsblk -nd | grep "$size" | awk '{print $1}')
echo ${vdpa_devs}

for dev in ${vdpa_devs};do
  echo "$dev"
  mkfs.xfs -f /dev/${dev} || tests_failed "format"
  mkdir -p /home/${dev}
  mount /dev/${dev} /home/${dev} || tests_failed "mount"
  dd if=/dev/zero of=/home/${dev}/test.img count=100 bs=1M oflag=direct || tests_failed "IO"
  umount -fl /home/${dev}
done

4. unplug disk
{"execute": "device_del", "arguments": {"id": "stg0"}}

{"execute": "blockdev-del","arguments": {"node-name": "fmt_stg0"}}
{"execute": "blockdev-del","arguments": {"node-name":"prot_stg0"}}

5. plug disk
{"execute": "blockdev-add", "arguments": {"node-name": "prot_stg0", "driver": "virtio-blk-vhost-vdpa",  "path": "/dev/fdset/3","cache": {"direct": true, "no-flush": false}}}
{"execute": "blockdev-add", "arguments": {"node-name": "fmt_stg0", "driver": "raw",   "file": "prot_stg0"}}
{"execute": "device_add", "arguments": {"driver": "virtio-blk-pci", "id": "stg0", "drive": "fmt_stg0","bus":"pcie-root-port-4"}}


It gets this error:
{"execute": "blockdev-add", "arguments": {"node-name": "prot_stg0", "driver": "virtio-blk-vhost-vdpa",  "path": "/dev/fdset/3","cache": {"direct": true, "no-flush": false}}}
{"error": {"class": "GenericError", "desc": "blkio_connect failed: Failed to use to vDPA device fd: Input/output error"}}

Hi Stefano Garzarella, could you please help check whether step 5 is a valid test?

Comment 25 Stefano Garzarella 2023-07-25 07:44:49 UTC
(In reply to qing.wang from comment #24)
> 
> Hi Stefano Garzarella, could you please help check whether step 5 is a valid test?

Hi Qing Wang,
I think it's partially not a valid test: I expect that failure, because the fd was closed when you removed the device.
So you should reopen the device (/dev/vhost-vdpa-0) somehow; then you will have a new fd to add to the fdset, at which point you can add the device again.

Unfortunately, though, I have no idea how to do this with QMP.
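
For reference, QMP's add-fd command can accept a file descriptor passed as SCM_RIGHTS ancillary data over a UNIX-socket monitor, which is hard to do from a plain shell; a minimal Python sketch, assuming QEMU was started with -qmp unix:/tmp/qmp.sock,server=on,wait=off (the socket path and device path are placeholders):

import array
import json
import os
import socket

sock = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
sock.connect("/tmp/qmp.sock")                  # assumed UNIX QMP endpoint
f = sock.makefile("rw")
f.readline()                                   # QMP greeting
f.write(json.dumps({"execute": "qmp_capabilities"}) + "\n")
f.flush()
f.readline()

# Reopen the device to get a fresh fd, then attach it to fdset 3.
fd = os.open("/dev/vhost-vdpa-0", os.O_RDWR)
cmd = json.dumps({"execute": "add-fd",
                  "arguments": {"fdset-id": 3,
                                "opaque": "rdwr:/dev/vhost-vdpa-0"}}).encode()
# The fd must accompany the command in the same sendmsg() call.
sock.sendmsg([cmd],
             [(socket.SOL_SOCKET, socket.SCM_RIGHTS,
               array.array("i", [fd]).tobytes())])
print(f.readline())                            # expect {"return": {"fdset-id": 3, "fd": N}}

After that, the blockdev-add with "path": "/dev/fdset/3" from step 5 should find a usable fd again.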

Comment 26 qing.wang 2023-07-25 08:23:25 UTC
(In reply to Stefano Garzarella from comment #25)
> (In reply to qing.wang from comment #24)
> > 
> > Hi Stefano Garzarella, could you please help check whether step 5 is a valid test?
> 
> Hi Qing Wang,
> I think it's partially not a valid test: I expect that failure, because the
> fd was closed when you removed the device.
> So you should reopen the device (/dev/vhost-vdpa-0) somehow; then you will
> have a new fd to add to the fdset, at which point you can add the device again.
> 
> Unfortunately, though, I have no idea how to do this with QMP.

Thanks for the suggestion.

After changing the steps to keep the blockdev nodes across the unplug/plug operation, it works:

4. unplug disk
{"execute": "device_del", "arguments": {"id": "stg0"}}

5. plug disk

{"execute": "device_add", "arguments": {"driver": "virtio-blk-pci", "id": "stg0", "drive": "fmt_stg0","bus":"pcie-root-port-4"}}

Comment 28 errata-xmlrpc 2023-11-07 08:27:12 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Moderate: qemu-kvm security, bug fix, and enhancement update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2023:6368

