Bug 1974807 - [aarch64] Launch guest with virtio-gpu-pci and virtual smmu causes "virtio_gpu_dequeue_ctrl_func" ERROR
Summary: [aarch64] Launch guest with virtio-gpu-pci and virtual smmu causes "virtio_gpu_dequeue_ctrl_func" ERROR
Keywords:
Status: CLOSED DUPLICATE of bug 1932279
Alias: None
Product: Red Hat Enterprise Linux 8
Classification: Red Hat
Component: kernel
Version: 8.5
Hardware: aarch64
OS: Linux
Priority: medium
Severity: medium
Target Milestone: beta
Target Release: 8.5
Assignee: Eric Auger
QA Contact: Yihuang Yu
URL:
Whiteboard:
Depends On:
Blocks: 1885765
 
Reported: 2021-06-22 15:14 UTC by Yihuang Yu
Modified: 2021-06-29 14:58 UTC
CC: 9 users

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2021-06-29 14:57:47 UTC
Type: Bug
Target Upstream Version:
Embargoed:



Description Yihuang Yu 2021-06-22 15:14:38 UTC
Description of problem:
This is the issue that remains after bug 1971821 is fixed. When launching a guest with "-device virtio-gpu-pci,iommu_platform=on" and "-machine virt,gic-version=host,iommu=smmuv3", the guest can be launched, but the console output contains error messages and the VNC display shows a black screen.

Version-Release number of selected component (if applicable):
host kernel: 5.13.0-0.rc4.33.el9.aarch64
guest kernel: kernel-4.18.0-316.el8.aarch64
qemu version: qemu-kvm-6.0.0-5.el9.aarch64

How reproducible:
always

Steps to Reproduce:
1. Launch a guest with iommu and smmuv3
MALLOC_PERTURB_=1  /usr/libexec/qemu-kvm \
    -name 'avocado-vt-vm1'  \
    -sandbox on  \
    -blockdev node-name=file_aavmf_code,driver=file,filename=/usr/share/edk2/aarch64/QEMU_EFI-silent-pflash.raw,auto-read-only=on,discard=unmap \
    -blockdev node-name=drive_aavmf_code,driver=raw,read-only=on,file=file_aavmf_code \
    -blockdev node-name=file_aavmf_vars,driver=file,filename=/home/kvm_autotest_root/images/avocado-vt-vm1_rhel850-aarch64-virtio-scsi.qcow2_VARS.fd,auto-read-only=on,discard=unmap \
    -blockdev node-name=drive_aavmf_vars,driver=raw,read-only=off,file=file_aavmf_vars \
    -machine virt,gic-version=host,iommu=smmuv3,memory-backend=mem-machine_mem,pflash0=drive_aavmf_code,pflash1=drive_aavmf_vars \
    -device pcie-root-port,id=pcie-root-port-0,multifunction=on,bus=pcie.0,addr=0x1,chassis=1 \
    -device pcie-pci-bridge,id=pcie-pci-bridge-0,addr=0x0,bus=pcie-root-port-0  \
    -nodefaults \
    -device pcie-root-port,id=pcie-root-port-1,port=0x1,addr=0x1.0x1,bus=pcie.0,chassis=2 \
    -device virtio-gpu-pci,bus=pcie-root-port-1,addr=0x0,iommu_platform=on \
    -m 9216 \
    -object memory-backend-ram,size=9216M,id=mem-machine_mem  \
    -smp 8,maxcpus=8,cores=4,threads=1,sockets=2  \
    -cpu 'host' \
    -serial unix:'/tmp/avocado_i72teslk/serial-serial0-20210622-102506-U1g0aqA9',server=on,wait=off \
    -device pcie-root-port,id=pcie-root-port-2,port=0x2,addr=0x1.0x2,bus=pcie.0,chassis=3 \
    -device qemu-xhci,id=usb1,bus=pcie-root-port-2,addr=0x0 \
    -device usb-tablet,id=usb-tablet1,bus=usb1.0,port=1 \
    -device pcie-root-port,id=pcie-root-port-3,port=0x3,addr=0x1.0x3,bus=pcie.0,chassis=4 \
    -device virtio-scsi-pci,id=virtio_scsi_pci0,bus=pcie-root-port-3,addr=0x0,iommu_platform=on \
    -blockdev node-name=file_image1,driver=file,auto-read-only=on,discard=unmap,aio=threads,filename=/home/kvm_autotest_root/images/rhel850-aarch64-virtio-scsi.qcow2,cache.direct=on,cache.no-flush=off \
    -blockdev node-name=drive_image1,driver=qcow2,read-only=off,cache.direct=on,cache.no-flush=off,file=file_image1 \
    -device scsi-hd,id=image1,drive=drive_image1,write-cache=on \
    -device pcie-root-port,id=pcie-root-port-4,port=0x4,addr=0x1.0x4,bus=pcie.0,chassis=5 \
    -device virtio-net-pci,mac=9a:10:11:62:e0:fa,rombar=0,id=idYy6la2,netdev=idi6Dvfu,bus=pcie-root-port-4,addr=0x0,iommu_platform=on  \
    -netdev tap,id=idi6Dvfu,vhost=on  \
    -vnc :0  \
    -rtc base=utc,clock=host,driftfix=slew \
    -enable-kvm

2. Check the console output and vnc interface

Actual results:
2021-06-22 10:25:34: [    4.470512] [drm] pci: virtio-gpu-pci detected at 0000:03:00.0
2021-06-22 10:25:34: [    4.472270] [drm] features: -virgl +edid
2021-06-22 10:25:34: [    4.474606] [drm] number of scanouts: 1
2021-06-22 10:25:34: [    4.475786] [drm] number of cap sets: 0
2021-06-22 10:25:34: [    4.477680] [drm] Initialized virtio_gpu 0.1.0 0 for virtio0 on minor 0
2021-06-22 10:25:34: [    4.482960] [drm:virtio_gpu_dequeue_ctrl_func [virtio_gpu]] *ERROR* response 0x1200 (command 0x106)
2021-06-22 10:25:34: [    4.486070] Console: switching to colour frame buffer device 128x48
2021-06-22 10:25:34: [    4.486668] [drm:virtio_gpu_dequeue_ctrl_func [virtio_gpu]] *ERROR* response 0x1203 (command 0x105)
2021-06-22 10:25:34: [    4.486982] [drm:virtio_gpu_dequeue_ctrl_func [virtio_gpu]] *ERROR* response 0x1203 (command 0x105)
2021-06-22 10:25:34: [    4.488421] random: fast init done
2021-06-22 10:25:34: [    4.489963] sd 0:0:0:0: Power-on or device reset occurred
2021-06-22 10:25:34: [    4.490531] virtio_gpu virtio0: [drm] fb0: virtio_gpudrmfb frame buffer device
2021-06-22 10:25:34: [    4.494351] sd 0:0:0:0: [sda] 41943040 512-byte logical blocks: (21.5 GB/20.0 GiB)
2021-06-22 10:25:34: [    4.495655] [drm:virtio_gpu_dequeue_ctrl_func [virtio_gpu]] *ERROR* response 0x1203 (command 0x105)
2021-06-22 10:25:34: [    4.496483] sd 0:0:0:0: [sda] Write Protect is off
2021-06-22 10:25:34: [    4.498805] [drm:virtio_gpu_dequeue_ctrl_func [virtio_gpu]] *ERROR* response 0x1203 (command 0x105)
2021-06-22 10:25:34: [    4.501227] [drm:virtio_gpu_dequeue_ctrl_func [virtio_gpu]] *ERROR* response 0x1203 (command 0x105)
2021-06-22 10:25:34: [    4.501783] [drm:virtio_gpu_dequeue_ctrl_func [virtio_gpu]] *ERROR* response 0x1203 (command 0x105)
2021-06-22 10:25:34: [    4.503842] sd 0:0:0:0: [sda] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
2021-06-22 10:25:34: [    4.504724] [drm:virtio_gpu_dequeue_ctrl_func [virtio_gpu]] *ERROR* response 0x1203 (command 0x105)
2021-06-22 10:25:34: [    4.508894]  sda: sda1 sda2 sda3
2021-06-22 10:25:34: [    4.509802] [drm:virtio_gpu_dequeue_ctrl_func [virtio_gpu]] *ERROR* response 0x1203 (command 0x105)
2021-06-22 10:25:34: [    4.514538] sd 0:0:0:0: [sda] Attached SCSI disk
2021-06-22 10:25:34: [    4.514608] [drm:virtio_gpu_dequeue_ctrl_func [virtio_gpu]] *ERROR* response 0x1203 (command 0x105)
2021-06-22 10:25:34: [    4.527798] sd 0:0:0:0: Attached scsi generic sg0 type 0

VNC displays a black screen.

Expected results:
No such error messages in the console output, and VNC displays a graphical interface.

Additional info:

Comment 1 Eric Auger 2021-06-22 16:42:58 UTC
Hmm, you said it happened with a RHEL9 guest, but in the above command I see you launch a RHEL8.5 guest. Could you please clarify?

Comment 2 Yihuang Yu 2021-06-23 00:57:48 UTC
(In reply to Eric Auger from comment #1)
> Hum you said it happened with a RHEL9 guest. In the above command I see you
> launch a rhel8.5. Please could you clarify?

Eric, this problem exists in both RHEL8 and RHEL9 guests. The first time I hit it was with a RHEL9 guest, but after bug 1971821 was fixed, the RHEL8 guest showed the same problem. So this bug is to track the RHEL8 guest.

Comment 3 Eric Auger 2021-06-23 15:49:05 UTC
I am a total beginner at graphics on ARM. I am looking for advice on how to exercise virtio-gpu with RHEL8.5/9.

I installed a RHEL8.5 VM with virt-manager, adding VNC and virtio-gpu. I got the graphical installer and completed the install (note I was only able to do that on RHEL8.5, since on RHEL9.0 I had issues with the mouse not working properly). Then I patched the XML to add the smmuv3 and added <driver iommu='on'/> on the block, net, and virtio-gpu-pci devices. I cannot reproduce the reported issue. Yihuang, is there any way for me to launch the exact same test as you?
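
For reference, a minimal sketch of the domain XML edits described above, assuming the standard libvirt elements for a virtual SMMU and for virtio devices behind the IOMMU (surrounding device definitions omitted; the exact attributes used in the actual test may differ):

  <devices>
    <iommu model='smmuv3'/>                  <!-- adds the virtual SMMU to the machine -->
    <video>
      <model type='virtio'/>
      <driver iommu='on'/>                   <!-- virtio-gpu placed behind the IOMMU -->
    </video>
    <!-- add <driver iommu='on'/> likewise to the virtio disk/controller and interface -->
  </devices>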

Thanks

Eric

Comment 4 Eric Auger 2021-06-24 12:04:35 UTC
Correction: with the above libvirt test case I can reproduce *sometimes*, but it looks like less than 50% of the time. However, my test case is sufficient. This happens with the latest ARK kernel as a guest.

../..
[   67.330862] [drm:virtio_gpu_dequeue_ctrl_func [virtio_gpu]] *ERROR* response 0x1203 (command 0x105)
[   67.540790] [drm:virtio_gpu_dequeue_ctrl_func [virtio_gpu]] *ERROR* response 0x1203 (command 0x105)
[   70.690482] virtio_gpu_dequeue_ctrl_func: 14 callbacks suppressed

Gerd, do you have any clue what could be the cause? I suspect a problem in the virtio-gpu driver.

Comment 5 Gerd Hoffmann 2021-06-24 13:38:04 UTC
> Gerd, do you have any clue of what could be the cause? I suspect a problem
> in the virtio-gpu driver?

Anything in the logs on the host?
Anything in the logs with "-d guest_errors" added to qemu cmd line?
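
For reference, a minimal sketch of how that logging could be captured (the log file path and the grep pattern are just examples, not from this bug):

    # append to the qemu-kvm command line from the reproducer to log guest errors to a file
    -d guest_errors -D /tmp/qemu-guest-errors.log

    # on the host, check the kernel ring buffer for SMMU faults around the same time
    dmesg | grep -iE 'smmu|arm-smmu'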

Comment 6 Eric Auger 2021-06-24 13:52:29 UTC
I am not able to reproduce with upstream qemu, whereas with downstream qemu the issue occurs with 20% reproducibility.

Interestingly we miss the following upstream commit both in 8.5 and 9.0.
9049f8bc44  virtio-gpu: handle partial maps properly (6 weeks ago) <Gerd Hoffmann>

This was the first issue found when investigating BZ1932279; then we found the guest kernel issue, and the kernel issue made us forget the bug in qemu ;-) which now does not produce an assert as it did in the past.

With this fix backported into downstream qemu 9.0 I cannot reproduce anymore, so I will send a backport for both 8.5 and 9.0.
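
As a side note, a quick way to check whether a given qemu tree or build already carries that commit; this assumes a local qemu git checkout in ./qemu and that the downstream changelog entry quotes the upstream subject line (both commands are illustrative, not part of this bug):

    # upstream: list release tags that already contain the fix
    git -C qemu tag --contains 9049f8bc44

    # downstream build: look for the backport in the RPM changelog
    rpm -q --changelog qemu-kvm | grep -i 'partial maps'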

Comment 7 Gerd Hoffmann 2021-06-24 14:25:15 UTC
> Interestingly we miss the following upstream commit both in 8.5 and 9.0.
> 9049f8bc44  virtio-gpu: handle partial maps properly (6 weeks ago) <Gerd
> Hoffmann>

Ah, right, it was after the 6.0 release, so it was not picked up by the rebase.

> So I will send a backport on both 8.5 and 9.0

thanks.

Comment 8 Qunfang Zhang 2021-06-28 02:02:48 UTC
(In reply to Eric Auger from comment #6)
> I am not able to reproduce with upstream qemu whereas with downstream the
> issues occurs with 20% reproducibility. 
> 
> Interestingly we miss the following upstream commit both in 8.5 and 9.0.
> 9049f8bc44  virtio-gpu: handle partial maps properly (6 weeks ago) <Gerd
> Hoffmann>
> 
> This was the first issue found when investigating BZ1932279 and then we
> found the guest kernel issue ... and the kernel issue let us forget the bug
> in qemu ;-) which now does not produce an assert as it did in the past.
> 
> with this fix backported in downstream qemu 9.0 I cannot reproduce anymore.
> So I will send a backport on both 8.5 and 9.0

Hi Eric,

This bug has devel_ack+ and qa_ack+, so which DTM and ITM should we set?

Thanks,
Qunfang

Comment 9 Eric Auger 2021-06-29 14:57:47 UTC
So eventually this turned out to be a qemu bug, tracked by BZ1932279. Let's close this one as a DUP.

*** This bug has been marked as a duplicate of bug 1932279 ***

