Bug 1468260 - vhost-user/iommu: crash when backend disconnects
Summary: vhost-user/iommu: crash when backend disconnects
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Enterprise Linux 7
Classification: Red Hat
Component: qemu-kvm-rhev
Version: 7.4
Hardware: Unspecified
OS: Unspecified
Priority: high
Severity: medium
Target Milestone: rc
Target Release: ---
Assignee: Maxime Coquelin
QA Contact: Pei Zhang
URL:
Whiteboard:
Depends On:
Blocks: 1480446
 
Reported: 2017-07-06 13:35 UTC by Maxime Coquelin
Modified: 2018-04-11 00:28 UTC
CC List: 10 users

Fixed In Version: qemu-kvm-rhev-2.10.0-1.el7
Doc Type: Bug Fix
Doc Text:
Previously, the qemu-kvm service in some cases terminated unexpectedly when starting the Input/Output Memory Management Unit (IOMMU) feature. This update ensures that all active connections are released in the proper order when starting IOMMU. As a result, the back end no longer attempts to handle requests after the connections are released, which prevents the problem from occurring.
Clone Of:
: 1480446 (view as bug list)
Environment:
Last Closed: 2018-04-11 00:26:27 UTC
Target Upstream Version:
Embargoed:


Attachments


Links:
System: Red Hat Product Errata
ID: RHSA-2018:1104
Private: 0
Priority: None
Status: None
Summary: None
Last Updated: 2018-04-11 00:28:56 UTC

Description Maxime Coquelin 2017-07-06 13:35:37 UTC
Description of problem:

When the IOMMU feature is negotiated, QEMU crashes if the backend disconnects first.

Two issues:
1. IOMMU UNMAP notifications are still propagated to the vhost-user backend even though the vhost_dev struct has already been destroyed.
2. If the backend sent an IOTLB miss request just before disconnecting, the request handling on the QEMU side can happen after the vhost_dev has been destroyed, because the slave fd handler is not unregistered at vhost_dev stop time.

Severity is set to low since no vhost-user backend using IOMMU is available upstream yet. DPDK is expected to support it by the v17.11 release.

Version-Release number of selected component (if applicable):


How reproducible:
100% for crash 1, 75% for crash 2 (depends on whether an IOTLB miss is in flight at disconnect time).

Steps to Reproduce:

1. Use DPDK's testpmd on the host with a vhost-user backend that supports IOMMU.
2. Attach an IOMMU to the virtio device in QEMU.
3. Use the virtio-net kernel driver on the guest side, so that many IOTLB miss requests are sent by the backend.
4. Set the host's testpmd to txonly mode to flood the guest.
5. Quit testpmd.

Actual results:

2 types of segmentation faults

Expected results:

No crash

Additional info:

Patches available & merged into upstream's master:

commit b9ec9bd468b2c5b218d16642e8f8ea4df60418bb
Author: Maxime Coquelin <maxime.coquelin>
Date:   Fri Jun 30 18:04:22 2017 +0200

    vhost-user: unregister slave req handler at cleanup time
    
    If the backend sends a request just before closing the socket,
    the aio dispatcher might schedule its reading after the vhost
    device has been cleaned, leading to a NULL pointer dereference
    in slave_read();
    
    vhost_user_cleanup() already closes the socket but it is not
    enough, the handler has to be unregistered.
    
    Signed-off-by: Maxime Coquelin <maxime.coquelin>
    Reviewed-by: Marc-André Lureau <marcandre.lureau>
    Reviewed-by: Michael S. Tsirkin <mst>
    Signed-off-by: Michael S. Tsirkin <mst>
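
For reference, below is a minimal sketch of what this first fix amounts to in hw/virtio/vhost-user.c. The field and helper names used here (struct vhost_user, slave_fd, slave_read, qemu_set_fd_handler) are recalled from that code base and are meant only to illustrate the idea, not to reproduce the merged patch verbatim.

/* Illustrative sketch only, not the verbatim upstream change.
 * Before the fix, vhost_user_cleanup() closed the slave socket but left
 * its read handler registered, so the aio dispatcher could still invoke
 * slave_read() on an already cleaned-up device. */
static int vhost_user_cleanup(struct vhost_dev *dev)
{
    struct vhost_user *u = dev->opaque;

    if (u->slave_fd >= 0) {
        /* Fix: unregister slave_read() before closing the fd, so no
         * request can be dispatched after the device is torn down. */
        qemu_set_fd_handler(u->slave_fd, NULL, NULL, NULL);
        close(u->slave_fd);
        u->slave_fd = -1;
    }

    g_free(u);
    dev->opaque = NULL;

    return 0;
}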

commit 384b557da1a44ce260cd0328c06a250507348f73
Author: Maxime Coquelin <maxime.coquelin>
Date:   Fri Jun 30 18:04:21 2017 +0200

    vhost: ensure vhost_ops are set before calling iotlb callback
    
    This patch fixes a crash that happens when vhost-user iommu
    support is enabled and vhost-user socket is closed.
    
    When it happens, if an IOTLB invalidation notification is sent
    by the IOMMU, vhost_ops's NULL pointer is dereferenced.
    
    Signed-off-by: Maxime Coquelin <maxime.coquelin>
    Reviewed-by: Marc-André Lureau <marcandre.lureau>
    Reviewed-by: Michael S. Tsirkin <mst>
    Signed-off-by: Michael S. Tsirkin <mst>
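
Likewise, a minimal sketch of the second fix, assuming the IOTLB messages are forwarded through a helper in hw/virtio/vhost-backend.c (the names vhost_backend_invalidate_device_iotlb, struct vhost_iotlb_msg and VHOST_IOTLB_INVALIDATE are recalled from that code base and are illustrative here):

/* Illustrative sketch only, not the verbatim upstream change.
 * When the vhost-user socket closes, dev->vhost_ops is reset to NULL,
 * but the vIOMMU can still deliver an invalidation notification. */
int vhost_backend_invalidate_device_iotlb(struct vhost_dev *dev,
                                          uint64_t iova, uint64_t len)
{
    struct vhost_iotlb_msg imsg;

    imsg.iova = iova;
    imsg.size = len;
    imsg.type = VHOST_IOTLB_INVALIDATE;

    /* Fix: only forward the message if the backend is still connected
     * and actually implements the IOTLB callback. */
    if (dev->vhost_ops && dev->vhost_ops->vhost_send_device_iotlb_msg) {
        return dev->vhost_ops->vhost_send_device_iotlb_msg(dev, &imsg);
    }

    return -ENODEV;
}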

Comment 4 Pei Zhang 2017-12-13 07:23:50 UTC
This bug has been fixed.

==Reproduce==
Versions:
qemu-kvm-rhev-2.9.0-8.el7.x86_64
dpdk-17.11-1.el7fdb.x86_64
3.10.0-820.el7.x86_64

Steps:
1. On the host, boot testpmd with iommu-support=1 on the vhost-user vdev:
# testpmd \
-l 1,3,5 --socket-mem=1024,1024 -n 4 \
-d /usr/lib64/librte_pmd_vhost.so \
--vdev 'net_vhost0,iface=/tmp/vhost-user1,iommu-support=1' -- \
--portmask=3 --disable-hw-vlan -i --rxq=1 --txq=1 \
--nb-cores=2 --forward-mode=io

2. Boot QEMU with vIOMMU, and with "iommu_platform=on,ats=on" on the vhost-user device:
/usr/libexec/qemu-kvm -name rhel7.5_nonrt \
-M q35,kernel-irqchip=split \
-cpu host -m 8G \
-device intel-iommu,intremap=true,caching-mode=true \
-object memory-backend-file,id=mem,size=8G,mem-path=/dev/hugepages,share=on \
-numa node,memdev=mem -mem-prealloc \
-smp 4,sockets=1,cores=4,threads=1 \
-device pcie-root-port,id=root.1,slot=1 \
-device pcie-root-port,id=root.2,slot=2 \
-drive file=/home/images_nfv-virt-rt-kvm/rhel7.5_nonrt.qcow2,format=qcow2,if=none,id=drive-virtio-blk0,werror=stop,rerror=stop \
-device virtio-blk-pci,drive=drive-virtio-blk0,id=virtio-blk0,bus=root.1 \
-chardev socket,id=charnet1,path=/tmp/vhost-user1 \
-netdev vhost-user,chardev=charnet1,id=hostnet1 \
-device virtio-net-pci,netdev=hostnet1,id=net1,mac=18:66:da:5f:dd:02,iommu_platform=on,ats=on,bus=root.2 \
-vnc :2 \
-monitor stdio

3. QEMU quits with a segmentation fault after several seconds:
(qemu) Segmentation fault


So this bug has been reproduced.


==Verification==
Versions:
qemu-kvm-rhev-2.10.0-12.el7.x86_64
dpdk-17.11-1.el7fdb.x86_64
3.10.0-820.el7.x86_64

Steps:
1. On the host, boot testpmd with iommu-support=1 on the vhost-user vdev.
Same command line as step 1 of the reproduce steps above.

2. Boot QEMU with vIOMMU, and with "iommu_platform=on,ats=on" on the vhost-user device.
Same command line as step 2 of the reproduce steps above.

3. Check QEMU and guest status: both work well, no errors.

4. Reboot and shut down the guest: works well.


So this bug has been fixed very well. Thanks.


Moving the status of this bug to "VERIFIED".

Comment 6 errata-xmlrpc 2018-04-11 00:26:27 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2018:1104

