Bug 1498817
Summary: | Vhost IOMMU support regression since qemu-kvm-rhev-2.9.0-16.el7_4.5 | ||
---|---|---|---|
Product: | Red Hat Enterprise Linux 7 | Reporter: | Maxime Coquelin <maxime.coquelin> |
Component: | qemu-kvm-rhev | Assignee: | Maxime Coquelin <maxime.coquelin> |
Status: | CLOSED ERRATA | QA Contact: | Pei Zhang <pezhang> |
Severity: | high | Docs Contact: | |
Priority: | high | ||
Version: | 7.4 | CC: | ailan, chayang, jasowang, mst, mtessun, pbonzini, peterx, pezhang, sgordon, virt-maint, wexu |
Target Milestone: | rc | ||
Target Release: | --- | ||
Hardware: | Unspecified | ||
OS: | Unspecified | ||
Whiteboard: | |||
Fixed In Version: | qemu-kvm-rhev-2.10.0-4.el7 | Doc Type: | If docs needed, set a value |
Doc Text: | Story Points: | --- | |
Clone Of: | Environment: | ||
Last Closed: | 2018-04-11 00:38:42 UTC | Type: | Bug |
Regression: | --- | Mount Type: | --- |
Documentation: | --- | CRM: | |
Verified Versions: | Category: | --- | |
oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
Cloudforms Team: | --- | Target Upstream Version: | |
Embargoed: |
Description (Maxime Coquelin, 2017-10-05 10:31:58 UTC)
A patch series fixing this issue had already been posted upstream by Peter Xu, but it had not been applied:
Message-Id: <1496404254-17429-1-git-send-email-peterx>
https://lists.gnu.org/archive/html/qemu-devel/2017-06/msg00571.html

I rebased and re-posted the first two patches of the series:
Message-Id: <20171005171309.1250-1-maxime.coquelin>
https://lists.gnu.org/archive/html/qemu-devel/2017-10/msg01139.html

The following patches were posted upstream:

commit 96fe9842106b9bdcb2d2d4493b062e65ed6db6d6
Author: Maxime Coquelin <maxime.coquelin>
Date:   Tue Oct 10 10:20:15 2017 +0200

    memory: fix off-by-one error in memory_region_notify_one()

    This patch fixes an off-by-one error that could cause the notifiee to
    receive notifications for ranges it is not registered to. The bug was
    spotted by code review.

    Fixes: bd2bfa4c52e5 ("memory: introduce memory_region_notify_one()")
    Cc: qemu-stable
    Cc: Peter Xu <peterx>
    Signed-off-by: Maxime Coquelin <maxime.coquelin>

commit 3a90d32d26caf499787e2b33a92c96d8bb903c6f
Author: Peter Xu <peterx>
Date:   Thu Oct 5 16:30:34 2017 +0200

    exec: simplify address_space_get_iotlb_entry

    This patch lets address_space_get_iotlb_entry() use the newly
    introduced page_mask parameter in flatview_do_translate(). We can
    then be sure the IOTLB is aligned to the page mask, and huge pages
    are now supported properly, as intended when a764040 was introduced.

    Fixes: a764040 ("exec: abstract address_space_do_translate()")
    Signed-off-by: Peter Xu <peterx>
    Signed-off-by: Maxime Coquelin <maxime.coquelin>
    Acked-by: Michael S. Tsirkin <mst>

commit 23ba2a608f236564ac37705c97c5e7b916bd7849
Author: Peter Xu <peterx>
Date:   Thu Oct 5 15:35:26 2017 +0200

    exec: add page_mask for flatview_do_translate

    The function was originally used for flatview_space_translate(),
    where what we care about most is the (xlat, plen) range. For IOTLB
    requests, however, we do not really care about "plen", but about the
    size of the page that "xlat" is located on, and "plen" cannot carry
    this information.
    A simple example shows why "plen" is not good for IOTLB
    translations. For huge pages, the guest may have mapped a 1G huge
    page on the device side using this GPA range:

        0x100000000 - 0x13fffffff

    Say we want to translate an IOVA that is finally mapped to GPA
    0x13ffffe00 (which is located on this 1G huge page). We get:

        (xlat, plen) = (0x13ffffe00, 0x200)

    So the IOTLB entry would cover only a very small range, since from
    "plen" (0x200 bytes) we cannot tell the size of the page. We do in
    fact know that this is a huge page - we just throw the information
    away in flatview_do_translate().

    This patch introduces an optional "page_mask" parameter to capture
    that page mask info, makes "plen" optional as well, and adds some
    comments for the whole function. No functional change yet.

    Signed-off-by: Peter Xu <peterx>
    Signed-off-by: Maxime Coquelin <maxime.coquelin>

The patches were merged upstream and the RHEL 7.5 backport has been posted.

Brew build: https://brewweb.engineering.redhat.com/brew/taskinfo?taskID=14314471

Fix included in qemu-kvm-rhev-2.10.0-4.el7

This bug has been fixed. PVP testing with vIOMMU was done: host, QEMU and guest all work well, and the throughput results look good.

==Verification==

Versions:
3.10.0-824.el7.x86_64
qemu-kvm-rhev-2.10.0-13.el7.x86_64
dpdk-17.11-4.el7.x86_64

Steps:

1. Boot testpmd in the host with two "net_vhost,..,iommu-support=1" vdevs:

    testpmd \
      -l 1,3,5,7,9 --socket-mem=1024,1024 -n 4 \
      -d /usr/lib64/librte_pmd_vhost.so \
      --vdev 'net_vhost0,iface=/tmp/vhost-user1,iommu-support=1' \
      --vdev 'net_vhost1,iface=/tmp/vhost-user2,iommu-support=1' -- \
      --portmask=f --disable-hw-vlan -i --rxq=1 --txq=1 \
      --nb-cores=4 --forward-mode=io
    testpmd> set portlist 0,2,1,3
    testpmd> start
2. Boot the VM with a vIOMMU:

    /usr/libexec/qemu-kvm -name rhel7.5_nonrt \
      -M q35,kernel-irqchip=split \
      -cpu host -m 8G \
      -device intel-iommu,intremap=true,caching-mode=true \
      -object memory-backend-file,id=mem,size=8G,mem-path=/dev/hugepages,share=on \
      -numa node,memdev=mem -mem-prealloc \
      -smp 6,sockets=1,cores=6,threads=1 \
      -device pcie-root-port,id=root.1,chassis=1 \
      -device pcie-root-port,id=root.2,chassis=2 \
      -device pcie-root-port,id=root.3,chassis=3 \
      -drive file=/home/images_nfv-virt-rt-kvm/rhel7.5_nonrt.qcow2,format=qcow2,if=none,id=drive-virtio-blk0,werror=stop,rerror=stop \
      -device virtio-blk-pci,drive=drive-virtio-blk0,id=virtio-blk0,bus=root.1 \
      -chardev socket,id=charnet1,path=/tmp/vhost-user1 \
      -netdev vhost-user,chardev=charnet1,id=hostnet1 \
      -device virtio-net-pci,netdev=hostnet1,id=net1,mac=18:66:da:5f:dd:02,iommu_platform=on,ats=on,bus=root.2 \
      -chardev socket,id=charnet2,path=/tmp/vhost-user2 \
      -netdev vhost-user,chardev=charnet2,id=hostnet2 \
      -device virtio-net-pci,netdev=hostnet2,id=net2,mac=18:66:da:5f:dd:03,iommu_platform=on,ats=on,bus=root.3 \
      -vnc :2 \
      -monitor stdio

3. In the guest, load vfio and start testpmd:

    # modprobe vfio
    # modprobe vfio-pci
    # /usr/bin/testpmd \
      -l 1,2,3 \
      -n 4 \
      -d /usr/lib64/librte_pmd_virtio.so.1 \
      -w 0000:02:00.0 -w 0000:03:00.0 \
      -- \
      --nb-cores=2 \
      --disable-hw-vlan \
      -i \
      --disable-rss \
      --rxq=1 --txq=1

4. On another host, start the TRex server and generate packets to the guest to measure throughput:

    DIRECTORY=~/src/lua-trafficgen
    cd $DIRECTORY
    ./binary-search.py \
      --traffic-generator=trex-txrx \
      --search-runtime=30 \
      --validation-runtime=60 \
      --rate-unit=mpps \
      --rate=0 \
      --run-bidirec=1 \
      --run-revunidirec=0 \
      --frame-size=64 \
      --num-flows=1024 \
      --one-shot=0 \
      --max-loss-pct=0.002

Throughput: 18.6 Mpps

So this bug has been fixed. Moving the status of this bug to "VERIFIED".

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA.
For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2018:1104