Bug 1998027

Summary: There Is 'VFIO_MAP_DMA failed' Info in HMP When Rebooting Guest After Installation
Product: Red Hat Enterprise Linux 9 Reporter: Tingting Mao <timao>
Component: qemu-kvm Assignee: Philippe Mathieu-Daudé <philmd>
qemu-kvm sub component: Storage QA Contact: Tingting Mao <timao>
Status: CLOSED CURRENTRELEASE Docs Contact:
Severity: medium    
Priority: low CC: coli, jferlan, kkiwi, mrezanin, virt-maint
Version: 9.0 Keywords: Triaged
Target Milestone: rc   
Target Release: ---   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version: qemu-kvm-6.2.0-1.el9 Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2021-12-21 10:05:58 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Bug Depends On:    
Bug Blocks: 1996530    

Description Tingting Mao 2021-08-26 10:03:13 UTC
This bug was initially created as a copy of Bug #1996530

I am copying this bug because: The same issue occurs in RHEL 9.



Description of problem:
As in the subject: HMP prints a 'VFIO_MAP_DMA failed' message when the guest reboots after installation, but the guest keeps running normally.


Version-Release number of selected component (if applicable):
qemu-kvm-6.0.0-12.el9
kernel-5.14.0-0.rc6.46.el9.x86_64


How reproducible:
10/10


Steps to Reproduce:
Setup NVMe disk:
1. Unbind the host NVMe controller from its host driver
# echo 0000:bc:00.0 > /sys/bus/pci/devices/0000\:bc\:00.0/driver/unbind

2. Bind the host NVMe controller to the host vfio-pci driver
# echo 144d a822 > /sys/bus/pci/drivers/vfio-pci/new_id
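
For scripting the two setup steps, a minimal dry-run sketch follows. It assumes the vendor and device IDs can be read from the device's sysfs `vendor` and `device` files. The function name `vfio_bind_commands` and the `SYSFS_ROOT` override are hypothetical and exist only for illustration and testing. It prints the commands rather than running them.

```shell
#!/bin/sh
# Sketch only: print the sysfs writes that detach a PCI device from its host
# driver and register its IDs with vfio-pci. Nothing is written to sysfs here;
# run the printed commands as root to apply them.
# SYSFS_ROOT is a hypothetical override so the sketch can be tried on a fake tree.
SYSFS_ROOT="${SYSFS_ROOT:-/sys}"

vfio_bind_commands() {
    bdf="$1"    # PCI address, e.g. 0000:bc:00.0
    dev="$SYSFS_ROOT/bus/pci/devices/$bdf"
    # sysfs reports the IDs as 0x-prefixed hex; strip the prefix to match the
    # "144d a822" form used in step 2 above.
    vendor="$(sed 's/^0x//' "$dev/vendor")" || return 1
    device="$(sed 's/^0x//' "$dev/device")" || return 1
    echo "echo $bdf > $dev/driver/unbind"
    echo "echo $vendor $device > $SYSFS_ROOT/bus/pci/drivers/vfio-pci/new_id"
}

# Usage: vfio_bind_commands 0000:bc:00.0
```

Printing instead of executing keeps the sketch safe to run on a machine where the controller is still in use by the host.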

Installing guest on the NVMe disk:
1. Create a 20G qcow2 image on the NVMe disk.
# qemu-img create -f qcow2 nvme://0000:bc:00.0/1 20G
# qemu-img info nvme://0000:bc:00.0/1
image: nvme://0000:bc:00.0/1
file format: qcow2
virtual size: 20 GiB (21474836480 bytes)
disk size: unavailable
cluster_size: 65536
Format specific information:
    compat: 1.1
    compression type: zlib
    lazy refcounts: false
    refcount bits: 16
    corrupt: false
    extended l2: false

2. Install guest on the disk
/usr/libexec/qemu-kvm \
    -S  \
    -name 'avocado-vt-vm1'  \
    -sandbox on  \
    -machine q35 \
    -device pcie-root-port,id=pcie-root-port-0,multifunction=on,bus=pcie.0,addr=0x1,chassis=1 \
    -device pcie-pci-bridge,id=pcie-pci-bridge-0,addr=0x0,bus=pcie-root-port-0  \
    -nodefaults \
    -device VGA,bus=pcie.0,addr=0x2 \
    -m 15360  \
    -smp 16,maxcpus=16,cores=8,threads=1,dies=1,sockets=2  \
    -cpu 'Haswell-noTSX',+kvm_pv_unhalt \
    -device pcie-root-port,id=pcie-root-port-1,port=0x1,addr=0x1.0x1,bus=pcie.0,chassis=2 \
    -device qemu-xhci,id=usb1,bus=pcie-root-port-1,addr=0x0 \
    -device usb-tablet,id=usb-tablet1,bus=usb1.0,port=1 \
    -object iothread,id=iothread0 \
    -object iothread,id=iothread1 \
    -device pcie-root-port,id=pcie-root-port-3,port=0x3,addr=0x1.0x3,bus=pcie.0,chassis=4 \
    -device virtio-net-pci,mac=9a:1c:0c:0d:e3:4c,id=idjmZXQS,netdev=idEFQ4i1,bus=pcie-root-port-3,addr=0x0  \
    -netdev tap,id=idEFQ4i1,vhost=on  \
    -vnc :0  \
    -rtc base=utc,clock=host,driftfix=slew  \
    -boot menu=off,order=cdn,once=c,strict=off \
    -enable-kvm \
    -monitor stdio \
    -chardev socket,server=on,path=/var/tmp/monitor-qmpmonitor1-20210721-024113-AsZ7KYro,id=qmp_id_qmpmonitor1,wait=off  \
    -mon chardev=qmp_id_qmpmonitor1,mode=control \
    -device pcie-root-port,id=pcie-root-port-5,port=0x5,addr=0x1.0x5,bus=pcie.0,chassis=5 \
    -device virtio-scsi-pci,id=virtio_scsi_pci1,bus=pcie-root-port-5,addr=0x0,iothread=iothread1 \
    -blockdev node-name=nvme_image1,driver=nvme,device=0000:bc:00.0,namespace=1,auto-read-only=on,discard=unmap \
    -blockdev node-name=drive_nvme1,driver=qcow2,file=nvme_image1,read-only=off,discard=unmap \
    -device scsi-hd,id=nvme1,drive=drive_nvme1 \
    -device pcie-root-port,id=pcie-root-port-6,port=0x6,addr=0x1.0x6,bus=pcie.0,chassis=6 \
    -device virtio-scsi-pci,id=virtio_scsi_pci2,bus=pcie-root-port-6,addr=0x0 \
    -blockdev node-name=file_cd1,driver=file,auto-read-only=on,discard=unmap,aio=threads,filename=/home/kvm_autotest_root/iso/linux/RHEL-8.5.0-20210714.n.0-x86_64-dvd1.iso,cache.direct=on,cache.no-flush=off \
    -blockdev node-name=drive_cd1,driver=raw,read-only=on,cache.direct=on,cache.no-flush=off,file=file_cd1 \
    -device scsi-cd,id=cd1,drive=drive_cd1,write-cache=on


Actual results:
HMP prints the following error message when the guest reboots after installation.
# sh qemu-install.sh 
QEMU 6.0.0 monitor - type 'help' for more information
(qemu) c
(qemu) qemu-kvm: VFIO_MAP_DMA failed: No space left on device

Check the status of the guest:
(qemu) info status
VM status: running


Expected results:
No error message is printed in HMP.


Additional info:
The same issue occurs when installing the guest on an NVMe disk created in luks format.

Comment 1 Klaus Heinrich Kiwi 2021-08-27 12:15:40 UTC
Assigning to Phil, who took the RHEL 8.x clone of it. I'm using low priority since any upstream fix will naturally come to RHEL 9 with the 6.2 rebase eventually, and I don't think this is sufficiently important for us to backport, so it will probably just sit around for a while.

Comment 2 Tingting Mao 2021-09-09 01:20:41 UTC
*** Bug 2002458 has been marked as a duplicate of this bug. ***

Comment 3 John Ferlan 2021-09-18 11:08:02 UTC
*** Bug 2002458 has been marked as a duplicate of this bug. ***

Comment 4 John Ferlan 2021-09-18 11:14:14 UTC
Housekeeping - the referenced commit went into qemu-6.2, which is planned to be rebased some time in Nov/Dec when it's "released" upstream. It may be the case that a 6.2-rc release could be used for an early rebase, but we'll need to revisit this then.

Was notified by QE of this existing copy instead of the clone I created in bug 2002458 (see c5 there for some details).

Actually assigned to Phil, set devel_ack+, set DTM=14, and moved to POST under the assumption this would be picked up by the rebase at that time.
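
Since the fix is expected to arrive with the 6.2 rebase, a host can be checked against the build listed under "Fixed In Version". The sketch below is a rough illustration, not an official check: `version_ge` is a hypothetical helper, and it relies on GNU `sort -V`, whose ordering approximates RPM's version comparison but is not identical to it.

```shell
#!/bin/sh
# Sketch only: decide whether an installed build is at or past the fixed build.
# FIXED_BUILD is taken from the "Fixed In Version" field of this bug.
FIXED_BUILD="6.2.0-1.el9"

version_ge() {
    # True when $1 sorts at or after $2 under GNU `sort -V`.
    # Assumption: this ordering is close enough to RPM's for these strings.
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

# Usage (needs rpm on the host):
#   installed="$(rpm -q --qf '%{VERSION}-%{RELEASE}\n' qemu-kvm)"
#   if version_ge "$installed" "$FIXED_BUILD"; then echo "fix expected"; fi
```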

Comment 5 Tingting Mao 2021-09-28 01:18:09 UTC
Also hit the issue in qemu-img.


Tested env:
kernel-5.14.0-0.rc7.54.el9.x86_64
qemu-kvm-6.1.0-2.el9


Steps:
1. Check the source image info
# qemu-img info target.qcow2 
image: target.qcow2
file format: qcow2
virtual size: 20 GiB (21474836480 bytes)
disk size: 3.32 GiB
cluster_size: 65536
Format specific information:
    compat: 1.1
    compression type: zlib
    lazy refcounts: false
    refcount bits: 16
    corrupt: false
    extended l2: false

2. Convert the source image to an image over NVMe
# qemu-img convert -f qcow2 -O raw target.qcow2 nvme://0000:bc:00.0/1 -p
qemu-img: VFIO_MAP_DMA failed: No space left on device
qemu-img: VFIO_MAP_DMA failed: No space left on device
    (100.00/100%)
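
Because the conversion still completes, the message is easy to miss in automated runs. A minimal sketch for counting it in captured output follows. The helper name `count_vfio_map_dma_failures` is hypothetical, and the match string is the one shown in the output above.

```shell
#!/bin/sh
# Sketch only: count lines in captured output that contain the message.
count_vfio_map_dma_failures() {
    # Reads text on stdin and prints the number of matching lines.
    # grep -c exits non-zero when the count is 0, so mask that status.
    grep -c 'VFIO_MAP_DMA failed' || true
}

# Usage (requires the NVMe setup described in this bug):
#   qemu-img convert -f qcow2 -O raw target.qcow2 nvme://0000:bc:00.0/1 -p 2>&1 \
#       | count_vfio_map_dma_failures
```

The same helper can be fed saved HMP monitor output from the qemu-kvm scenario.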

Comment 6 Philippe Mathieu-Daudé 2021-09-28 07:45:11 UTC
(In reply to Tingting Mao from comment #5)
> Also hit the issue in qemu-img.
> 2. Convert the source image to a image over NVMe
> # qemu-img convert -f qcow2 -O raw target.qcow2 nvme://0000:bc:00.0/1 -p
> qemu-img: VFIO_MAP_DMA failed: No space left on device
> qemu-img: VFIO_MAP_DMA failed: No space left on device
>     (100.00/100%)

Thanks for testing qemu-img.

The same fix applies to both qemu-kvm and qemu-img:
https://bugzilla.redhat.com/show_bug.cgi?id=1996530#c6

The POST status is still valid.

Comment 7 Yanan Fu 2021-12-20 12:44:53 UTC
QE bot(pre verify): Set 'Verified:Tested,SanityOnly' as gating/tier1 test pass.

Comment 8 Tingting Mao 2021-12-21 10:05:58 UTC
Tried with the latest qemu; the error message no longer appears.


Tested with:
qemu-kvm-6.2.0-1.el9
kernel-5.14.0-32.el9.x86_64


Scenario1 - Install guest with qcow2/luks image over NVMe
/usr/libexec/qemu-kvm \
    -S  \
    -name 'avocado-vt-vm1'  \
    -sandbox on  \
    -machine q35 \
    -device pcie-root-port,id=pcie-root-port-0,multifunction=on,bus=pcie.0,addr=0x1,chassis=1 \
    -device pcie-pci-bridge,id=pcie-pci-bridge-0,addr=0x0,bus=pcie-root-port-0  \
    -nodefaults \
    -device VGA,bus=pcie.0,addr=0x2 \
    -m 15360  \
    -smp 16,maxcpus=16,cores=8,threads=1,dies=1,sockets=2  \
    -cpu 'Haswell-noTSX',+kvm_pv_unhalt \
    -device pcie-root-port,id=pcie-root-port-1,port=0x1,addr=0x1.0x1,bus=pcie.0,chassis=2 \
    -device qemu-xhci,id=usb1,bus=pcie-root-port-1,addr=0x0 \
    -device usb-tablet,id=usb-tablet1,bus=usb1.0,port=1 \
    -object iothread,id=iothread0 \
    -object iothread,id=iothread1 \
    -device pcie-root-port,id=pcie-root-port-3,port=0x3,addr=0x1.0x3,bus=pcie.0,chassis=4 \
    -device virtio-net-pci,mac=9a:1c:0c:0d:e3:4c,id=idjmZXQS,netdev=idEFQ4i1,bus=pcie-root-port-3,addr=0x0  \
    -netdev tap,id=idEFQ4i1,vhost=on  \
    -vnc :0  \
    -rtc base=utc,clock=host,driftfix=slew  \
    -boot menu=off,order=cdn,once=c,strict=off \
    -enable-kvm \
    -monitor stdio \
    -chardev socket,server=on,path=/var/tmp/monitor-qmpmonitor1-20210721-024113-AsZ7KYro,id=qmp_id_qmpmonitor1,wait=off  \
    -mon chardev=qmp_id_qmpmonitor1,mode=control \
    -object secret,id=sec0,data=redhat \
    -device pcie-root-port,id=pcie-root-port-5,port=0x5,addr=0x1.0x5,bus=pcie.0,chassis=5 \
    -device virtio-scsi-pci,id=virtio_scsi_pci1,bus=pcie-root-port-5,addr=0x0,iothread=iothread1 \
    -blockdev node-name=nvme_image1,driver=nvme,device=0000:bc:00.0,namespace=1,auto-read-only=on,discard=unmap \
    -blockdev node-name=drive_nvme1,driver=luks,key-secret=sec0,file=nvme_image1,read-only=off,discard=unmap \
    -device scsi-hd,id=nvme1,drive=drive_nvme1 \
    -device pcie-root-port,id=pcie-root-port-6,port=0x6,addr=0x1.0x6,bus=pcie.0,chassis=6 \
    -device virtio-scsi-pci,id=virtio_scsi_pci2,bus=pcie-root-port-6,addr=0x0 \
    -blockdev node-name=file_cd1,driver=file,auto-read-only=on,discard=unmap,aio=threads,filename=/home/kvm_autotest_root/iso/linux/RHEL-9.0.0-20211216.2-x86_64-dvd1.iso,cache.direct=on,cache.no-flush=off \
    -blockdev node-name=drive_cd1,driver=raw,read-only=on,cache.direct=on,cache.no-flush=off,file=file_cd1 \
    -device scsi-cd,id=cd1,drive=drive_cd1,write-cache=on


Scenario2 - Convert via qemu-img
# qemu-img convert -f qcow2 -O raw RHEL-9.0.0-20210702.2-x86_64.qcow2 nvme://0000:bc:00.0/1 -p
    (100.00/100%)


Results:
No error message is printed in either scenario.