Bug 1929720

Summary: [aarch64] Handle vsmmuv3 IOTLB invalidation with non power of 2 size
Product: Red Hat Enterprise Linux Advanced Virtualization
Reporter: Eric Auger <eric.auger>
Component: qemu-kvm
Assignee: Eric Auger <eric.auger>
qemu-kvm sub component: Devices
QA Contact: Yihuang Yu <yihyu>
Status: CLOSED ERRATA
Docs Contact:
Severity: medium
Priority: medium
CC: drjones, jinzhao, juzhang, lcapitulino, peterx, qzhang, virt-maint, yihyu, zhenyzha
Version: 8.4
Keywords: Triaged
Target Milestone: rc
Target Release: 8.5
Hardware: aarch64
OS: Linux
Whiteboard:
Fixed In Version: qemu-kvm-6.0.0-18.module+el8.5.0+11243+5269aaa1
Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2021-11-16 07:51:42 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:
Bug Depends On:
Bug Blocks: 1885765, 1957194

Description Eric Auger 2021-02-17 13:50:02 UTC
Sometimes the guest driver invalidates a number of pages that is not a power of 2 (guests using range invalidation, >= 8.4). However, the way we handle IOTLB invalidations only works with power-of-2 sizes. So when this happens, the size needs to be fixed up; otherwise the IOTLB might not be correctly invalidated (VHOST, VFIO, internal SMMUv3 device IOTLB).
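
One way to picture the size fix-up described above is to split the invalidated range into chunks that are each a power of 2 in size and aligned to that size. The following is only an illustrative Python sketch under that assumption, not the QEMU code: the function name split_pow2_ranges and the example addresses are hypothetical.

```python
def split_pow2_ranges(addr, size):
    """Split [addr, addr + size) into (start, length) chunks.

    Each chunk length is a power of two and each chunk start is
    aligned to its own length. Hypothetical helper for illustration.
    """
    chunks = []
    while size > 0:
        # Largest power of two that fits in the remaining size.
        fit = 1 << (size.bit_length() - 1)
        # Largest power of two the current address is aligned to
        # (address 0 is aligned to anything, so only 'fit' limits it).
        align = (addr & -addr) if addr else fit
        length = min(align, fit)
        chunks.append((addr, length))
        addr += length
        size -= length
    return chunks


# Example: 3 pages of 4 KiB starting at 0x1000 (not a power-of-2 count).
for start, length in split_pow2_ranges(0x1000, 0x3000):
    print(hex(start), hex(length))
```

With this splitting, a 3-page request becomes one 1-page and one 2-page invalidation, each of which a power-of-2-only handler can process.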

Comment 1 Yihuang Yu 2021-02-19 02:19:27 UTC
Hello Eric, is the problem like this?

qemu-kvm: ../util/iov.c:59: iov_to_buf_full: Assertion `offset == 0' failed.
Aborted (core dumped)


I launched a guest with iommu enabled on the virtio-gpu-pci device and triggered the problem above.

Comment 2 Eric Auger 2021-02-19 10:10:08 UTC
In my case I launched a guest with virtio-blk-pci and virtio-net-pci, both protected by the virtual SMMU. I hit the problem when installing a package on the guest:
sudo dnf -y install numactl-devel
It segfaults.

It was 100% reproducible in my case. I never tried with virtio-gpu-pci. Maybe yet another one :-(

Eric

Comment 3 Yihuang Yu 2021-02-20 15:22:58 UTC
Thank you, Eric

I can reproduce this problem with virtio-blk + smmuv3, but could not reproduce it with virtio-scsi.


20:15:50 ERROR| aexpect.exceptions.ShellCmdError: Shell command failed: 'dnf -y install numactl-devel'    (status: 132,    output: 'Illegal instruction (core dumped)\n')

or 

2021-02-20 10:16:18: [   23.941220] Process 1690(systemd-coredum) has RLIMIT_CORE set to 1
2021-02-20 10:16:18: [   23.943165] Aborting core

Comment 4 Yihuang Yu 2021-02-20 15:28:58 UTC
Hello Eric,

Another topic, as I mentioned in comment 1, "virtio-gpu + smmuv3" will crash qemu, but libvirt does not add iommu to the gpu device by default, so do you suggest filing a bug for it?

Comment 5 Eric Auger 2021-02-24 07:54:31 UTC
Yes, please do so. I will investigate it.

Comment 6 Yihuang Yu 2021-02-24 12:45:00 UTC
Thank you, I have filed a new bug 1932279 to track it.

Comment 7 Eric Auger 2021-02-25 16:56:43 UTC
"[PATCH v2 0/7] Some vIOMMU fixes" posted upstream

Comment 10 Eric Auger 2021-03-16 10:44:33 UTC
[PATCH v3 0/7] Some vIOMMU fixes has reached the master branch
So this will be part of QEMU 6.0
Moving the BZ to POST

Comment 13 Eric Auger 2021-05-25 09:53:57 UTC
[PATCH v2] hw/arm/smmuv3: Another range invalidation fix
was applied on target-arm.next, on May 20. Waiting for the commit to be on main branch before backporting.

Comment 16 Yihuang Yu 2021-06-03 05:19:29 UTC
Set Verified:Tested,SanityOnly as the gating/tier1 tests pass.

Comment 19 Yihuang Yu 2021-06-10 14:52:52 UTC
Verify with qemu-kvm-6.0.0-18.module+el8.5.0+11243+5269aaa1.aarch64
guest kernel: 4.18.0-310.el8.aarch64

Launch a guest with:
MALLOC_PERTURB_=1  /usr/libexec/qemu-kvm \
    -name 'avocado-vt-vm1'  \
    -sandbox on  \
    -blockdev node-name=file_aavmf_code,driver=file,filename=/usr/share/edk2/aarch64/QEMU_EFI-silent-pflash.raw,auto-read-only=on,discard=unmap \
    -blockdev node-name=drive_aavmf_code,driver=raw,read-only=on,file=file_aavmf_code \
    -blockdev node-name=file_aavmf_vars,driver=file,filename=/home/kvm_autotest_root/images/avocado-vt-vm1_rhel850-aarch64-virtio.qcow2_VARS.fd,auto-read-only=on,discard=unmap \
    -blockdev node-name=drive_aavmf_vars,driver=raw,read-only=off,file=file_aavmf_vars \
    -machine virt,gic-version=host,memory-backend=mem-machine_mem,pflash0=drive_aavmf_code,pflash1=drive_aavmf_vars,iommu=smmuv3 \
    -device pcie-root-port,id=pcie-root-port-0,multifunction=on,bus=pcie.0,addr=0x1,chassis=1 \
    -device pcie-pci-bridge,id=pcie-pci-bridge-0,addr=0x0,bus=pcie-root-port-0  \
    -nodefaults \
    -device pcie-root-port,id=pcie-root-port-1,port=0x1,addr=0x1.0x1,bus=pcie.0,chassis=2 \
    -device virtio-gpu-pci,bus=pcie-root-port-1,addr=0x0 \
    -m 8192 \
    -object memory-backend-ram,size=8192M,id=mem-machine_mem  \
    -smp 8,maxcpus=8,cores=4,threads=1,sockets=2  \
    -cpu 'host' \
    -serial unix:'/tmp/serial-serial0',server=on,wait=off \
    -device pcie-root-port,id=pcie-root-port-2,port=0x2,addr=0x1.0x2,bus=pcie.0,chassis=3 \
    -device qemu-xhci,id=usb1,bus=pcie-root-port-2,addr=0x0 \
    -device usb-tablet,id=usb-tablet1,bus=usb1.0,port=1 \
    -blockdev node-name=file_image1,driver=file,auto-read-only=on,discard=unmap,aio=threads,filename=/home/kvm_autotest_root/images/rhel850-aarch64-virtio.qcow2,cache.direct=on,cache.no-flush=off \
    -blockdev node-name=drive_image1,driver=qcow2,read-only=off,cache.direct=on,cache.no-flush=off,file=file_image1 \
    -device pcie-root-port,id=pcie-root-port-3,port=0x3,addr=0x1.0x3,bus=pcie.0,chassis=4 \
    -device virtio-blk-pci,id=image1,drive=drive_image1,bootindex=0,write-cache=on,bus=pcie-root-port-3,addr=0x0,iommu_platform=on,ats=on \
    -device pcie-root-port,id=pcie-root-port-4,port=0x4,addr=0x1.0x4,bus=pcie.0,chassis=5 \
    -device virtio-net-pci,mac=9a:a1:d2:30:b0:c9,rombar=0,id=idkLlMzA,netdev=idLvm30S,bus=pcie-root-port-4,addr=0x0,iommu_platform=on,ats=on  \
    -netdev tap,id=idLvm30S,vhost=on  \
    -vnc :0  \
    -rtc base=utc,clock=host,driftfix=slew \
    -enable-kvm \
    -monitor stdio

Install an rpm package inside the guest.
# dnf install -y numactl-devel
Updating Subscription Management repositories.
Unable to read consumer identity

This system is not registered with an entitlement server. You can use subscription-manager to register.

Last metadata expiration check: 1:12:53 ago on Thu 10 Jun 2021 09:37:21 PM CST.
Dependencies resolved.
================================================================================
 Package            Architecture Version               Repository          Size
================================================================================
Installing:
 numactl-devel      aarch64      2.0.12-13.el8         beaker-BaseOS       29 k

Transaction Summary
================================================================================
Install  1 Package

Total download size: 29 k
Installed size: 25 k
Downloading Packages:
numactl-devel-2.0.12-13.el8.aarch64.rpm         1.4 MB/s |  29 kB     00:00    
--------------------------------------------------------------------------------
Total                                           1.2 MB/s |  29 kB     00:00     
Running transaction check
Transaction check succeeded.
Running transaction test
Transaction test succeeded.
Running transaction
  Preparing        :                                                        1/1 
  Installing       : numactl-devel-2.0.12-13.el8.aarch64                    1/1 
  Running scriptlet: numactl-devel-2.0.12-13.el8.aarch64                    1/1 
  Verifying        : numactl-devel-2.0.12-13.el8.aarch64                    1/1 
Installed products updated.

Installed:
  numactl-devel-2.0.12-13.el8.aarch64                                           

Complete!

Comment 21 Eric Auger 2021-06-14 07:06:09 UTC
"219729cfbf  hw/arm/smmuv3: Another range invalidation fix" is the last fix upstreamed on this topic and is supposed to fix the issue.
It was introduced downstream 12 days ago with:
 5546404e138  hw/arm/smmuv3: Another range invalidation fix
Your qemu binary should have it. My aarch64 machines are currently out of service due to the outage, so I cannot help at the moment.

Comment 22 Yihuang Yu 2021-06-15 07:05:05 UTC
Verify this bug according to comment 19 and comment 21.

Comment 24 errata-xmlrpc 2021-11-16 07:51:42 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (virt:av bug fix and enhancement update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2021:4684