Bug 1283257 - [RFE] IOMMU support in Vhost-net
Status: CLOSED ERRATA
Product: Red Hat Enterprise Linux 7
Classification: Red Hat
Component: kernel
Version: 7.3
Hardware: Unspecified
OS: Unspecified
Priority: high
Severity: unspecified
Target Milestone: rc
Target Release: 7.4
Assigned To: Wei
QA Contact: Quan Wenli
Keywords: FutureFeature
Depends On:
Blocks: 1283104 1288337 1395265 1401433
 
Reported: 2015-11-18 09:26 EST by Amnon Ilan
Modified: 2017-08-01 20:31 EDT (History)
14 users

See Also:
Fixed In Version: kernel-3.10.0-658.el7
Doc Type: Enhancement
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2017-08-01 16:02:32 EDT
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---


Attachments: None
Description Amnon Ilan 2015-11-18 09:26:30 EST
Description of problem:

Vhost-net should properly support the IOMMU in order to allow guests to securely access devices from user space (e.g., the DPDK-in-guest case).
Comment 1 jason wang 2016-08-23 03:18:48 EDT
Not 7.3 material. Deferring to 7.4.
Comment 2 Wei 2017-02-20 11:03:03 EST
See also:
https://bugzilla.redhat.com/show_bug.cgi?id=1425127
Comment 3 jason wang 2017-03-23 02:10:12 EDT
Note for QE:

To test this, you need a command line like:

           -M q35 \
           -device intel-iommu,device-iotlb=on,intremap \
           -device ioh3420,id=root.1,chassis=1 \
           -device virtio-net-pci,netdev=hn0,id=v0,bus=root.1,disable-modern=off,disable-legacy=on,iommu_platform=on,ats=on \

This means you need:
[1] q35 chipset
[2] intel IOMMU with device-iotlb and interrupt remapping enabled
[3] pcie switch (ioh3420)
[4] modern virtio-net-pci device with both iommu_platform and ats enabled

In guest:
[1] add intel_iommu=on to kernel command line

For stress testing:
[1] netperf UDP with intel_iommu=on|strict
[2] pktgen to test both rx and tx

For performance testing:
Checking dpdk l2fwd performance should be sufficient
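Putting the host-side requirements above together, a complete invocation could look like the sketch below. The -netdev tap backend, -m/-smp values, and the disk image path are illustrative assumptions, not taken from this bug.

```shell
# Sketch only: one way to assemble the four requirements above into a
# single qemu-kvm command.
# [1] q35 chipset (with split irqchip, which interrupt remapping requires)
# [2] intel IOMMU with device IOTLB and interrupt remapping
# [3] PCIe root port (ioh3420)
# [4] modern virtio-net-pci with iommu_platform and ats enabled
/usr/libexec/qemu-kvm \
    -M q35,kernel-irqchip=split \
    -enable-kvm -cpu host -m 4G -smp 2 \
    -device intel-iommu,device-iotlb=on,intremap=on \
    -device ioh3420,id=root.1,chassis=1 \
    -netdev tap,id=hn0,vhost=on \
    -device virtio-net-pci,netdev=hn0,id=v0,bus=root.1,disable-modern=off,disable-legacy=on,iommu_platform=on,ats=on \
    /path/to/guest.qcow2
```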
Comment 4 Quan Wenli 2017-03-23 02:27:25 EDT
(In reply to jason wang from comment #3)
> Note for QE:
> 
> To test this, need cli like:
> 
>            -M q35 \
>            -device intel-iommu,device-iotlb=on,intremap \
>            -device ioh3420,id=root.1,chassis=1 \
>            -device
> virtio-net-pci,netdev=hn0,id=v0,bus=root.1,disable-modern=off,disable-
> legacy=on,iommu_platform=on,ats=on \
> 
> This means you need:
> [1] q35 chipset
> [2] intel IOMMU with device-iotlb and interrupt remapping enabled
> [3] pcie switch (ioh3420)
> [4] modern virtio-net-pci device with both iommu_platform and ats enabled
> 

So can we simply set "iommu_platform=off,ats=off" to disable the vIOMMU?
 
> In guest:
> [1] add intel_iommu=on to kernel command line
> 
> For stress testing:
> [1] netperf UDP with intel_iommu=on|strict
> [2] pktgen to test both rx and tx
> 
> For performance testing:
> Checking dpdk l2fwd performance should be sufficient

For this, @pezhang, will your NFV team test it?
Comment 5 jason wang 2017-03-23 02:31:38 EDT
(In reply to Quan Wenli from comment #4)
> (In reply to jason wang from comment #3)
> > Note for QE:
> > 
> > To test this, need cli like:
> > 
> >            -M q35 \
> >            -device intel-iommu,device-iotlb=on,intremap \
> >            -device ioh3420,id=root.1,chassis=1 \
> >            -device
> > virtio-net-pci,netdev=hn0,id=v0,bus=root.1,disable-modern=off,disable-
> > legacy=on,iommu_platform=on,ats=on \
> > 
> > This means you need:
> > [1] q35 chipset
> > [2] intel IOMMU with device-iotlb and interrupt remapping enabled
> > [3] pcie switch (ioh3420)
> > [4] modern virtio-net-pci device with both iommu_platform and ats enabled
> > 
> 
> So we can just simply set "iommu_platform=off,ats=off" to disable vIOMMU?
>  

You also need to remove "-device intel-iommu".

Thanks
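Combining comment 3 with the answer above, the two configurations under comparison differ only in these options (a sketch; the elided options and the rest of the command line stay the same):

```shell
# With vIOMMU (per comment 3):
#   -device intel-iommu,device-iotlb=on,intremap \
#   -device virtio-net-pci,...,iommu_platform=on,ats=on \
#
# Without vIOMMU (baseline): drop the intel-iommu device entirely, and
# disable the per-device flags:
#   -device virtio-net-pci,...,iommu_platform=off,ats=off \
```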
Comment 6 Pei Zhang 2017-03-23 02:41:22 EDT
(In reply to Quan Wenli from comment #4)
> (In reply to jason wang from comment #3)
> > Note for QE:
> > 
> > To test this, need cli like:
> > 
> >            -M q35 \
> >            -device intel-iommu,device-iotlb=on,intremap \
> >            -device ioh3420,id=root.1,chassis=1 \
> >            -device
> > virtio-net-pci,netdev=hn0,id=v0,bus=root.1,disable-modern=off,disable-
> > legacy=on,iommu_platform=on,ats=on \
> > 
> > This means you need:
> > [1] q35 chipset
> > [2] intel IOMMU with device-iotlb and interrupt remapping enabled
> > [3] pcie switch (ioh3420)
> > [4] modern virtio-net-pci device with both iommu_platform and ats enabled
> > 
> 
> So we can just simply set "iommu_platform=off,ats=off" to disable vIOMMU?
>  
> > In guest:
> > [1] add intel_iommu=on to kernel command line
> > 
> > For stress testing:
> > [1] netperf UDP with intel_iommu=on|strict
> > [2] pktgen to test both rx and tx
> > 
> > For performance testing:
> > Checking dpdk l2fwd performance should be sufficient
> 
> for it, @pezhang, will your NFV team test it ?

Wenli, NFV testing can cover this.

Best Regards,
Pei
Comment 7 Rafael Aquini 2017-04-26 23:57:32 EDT
Patch(es) committed to the kernel repository; an interim kernel build is undergoing testing.
Comment 9 Rafael Aquini 2017-05-01 09:18:02 EDT
Patch(es) available on kernel-3.10.0-658.el7
Comment 11 Quan Wenli 2017-06-05 05:59:57 EDT
Hi, jason, wei

Could you help check the following results comparing no-iommu mode and vfio mode? We see:
1. An 8% improvement with vfio mode when the guest is receiving packets.
2. No difference on guest tx, and the tx pps performance is poor; it looks similar to https://bugzilla.redhat.com/show_bug.cgi?id=1401433#c18

Packages:

host: 3.10.0-677.el7.x86_64
guest: 3.10.0-677.el7.x86_64
qemu-kvm-rhev-2.9.0-7.el7.x86_64


Steps:
1. Boot the guest with device IOTLB and vIOMMU enabled.

numactl -c 1 -m 1 /usr/libexec/qemu-kvm \
    /home/kvm_autotest_root/images/RHEL-Server-7.2-64.qcow2 \
    -netdev tap,id=hn0,queues=1,vhost=on,script=/etc/qemu-ifup-atbr0 \
    -device ioh3420,id=root.1,chassis=1 \
    -device virtio-net-pci,netdev=hn0,id=v0,mq=off,mac=00:00:05:00:00:07,bus=root.1 \
    -netdev tap,id=hn1,queues=1,vhost=on,script=/etc/qemu-ifup-atbr0 \
    -device ioh3420,id=root.2,chassis=2 \
    -device virtio-net-pci,netdev=hn1,id=v1,mq=off,mac=00:00:05:00:00:08,bus=root.2 \
    -m 6G -enable-kvm -cpu host -vnc :11 -smp 4 \
    -monitor tcp:0:4444,server,nowait \
    -M q35,kernel-irqchip=split -monitor stdio

2. Pin the 4 vCPUs and the 2 vhost threads to individual host cores within one NUMA node.

3. On the guest:

3.1 Install dpdk-17.05-2.el7fdb.x86_64.rpm, dpdk-devel-17.05-2.el7fdb.x86_64.rpm and dpdk-tools-17.05-2.el7fdb.x86_64.rpm
3.2 Add "intel_iommu=on" to the kernel command line, then reboot the guest
3.3 # echo 2048 > /sys/kernel/mm/hugepages/hugepages-2048kB/nr_hugepages
3.4 # ifconfig eth0 down && ifconfig eth1 down
3.5 # modprobe vfio                                   ====> test with vfio mode
 or # modprobe vfio enable_unsafe_noiommu_mode=Y      ====> test with noiommu mode
    # modprobe vfio-pci
3.6 # lspci | grep Eth
    # dpdk-devbind --bind=vfio-pci 0000:01:00.0
    # dpdk-devbind --bind=vfio-pci 0000:02:00.0
3.7 Run testpmd, then start it:
# /usr/bin/testpmd \
-l 1,2,3 \
-n 4 \
-d /usr/lib64/librte_pmd_virtio.so.1 \
-w 0000:01:00.0 -w 0000:02:00.0 \
-- \
--nb-cores=2 \
--disable-hw-vlan \
-i \
--disable-rss \
--rxq=1 --txq=1

testpmd> start

4. Run "pktgen.sh tap0" on the host.
   Meanwhile, run "pps.sh tap0" on the host to gather guest rx pps,
   and "pps.sh tap1" on the host to gather guest tx pps.

5. Results (pps)

         | no-iommu mode | vfio mode
Guest rx |        968645 |   1050282   ------> 8% improvement
Guest tx |        362869 |    364116   ------> no difference
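For reference, the pps.sh helper used in step 4 is not attached to this bug; a minimal sketch that derives packets per second from the kernel's per-interface sysfs counters might look like the following. The interface and direction defaults are assumptions.

```shell
# Sketch of a possible pps.sh: sample a packet counter from
# /sys/class/net/<if>/statistics twice, one second apart, and report
# the delta. Defaults (lo, rx) are illustrative; use tap0/tap1 as in
# the steps above.
IFACE=${1:-lo}   # e.g. tap0 or tap1
DIR=${2:-rx}     # rx_packets or tx_packets counter

f=/sys/class/net/$IFACE/statistics/${DIR}_packets
prev=$(cat "$f")
sleep 1
cur=$(cat "$f")
pps=$((cur - prev))
echo "$IFACE $DIR pps: $pps"
```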
Comment 12 Quan Wenli 2017-06-15 02:59:22 EDT
Hi, jason, wei 

Could you help check the performance results in comment #11? Are they as expected?
Comment 13 jason wang 2017-06-15 03:38:24 EDT
(In reply to Quan Wenli from comment #12)
> Hi, jason, wei 
> 
> Could you help check performance results in comment #11, is it as expected?

Kind of, except for the low pps on tx.

What's the tx number in no-iommu mode before 658? If it's still low, this bug can be verified and you may open a new bug to track the tx issue.

Thanks
Comment 14 Quan Wenli 2017-06-15 06:00:38 EDT
(In reply to jason wang from comment #13)
> (In reply to Quan Wenli from comment #12)
> > Hi, jason, wei 
> > 
> > Could you help check performance results in comment #11, is it as expected?
> 
> Kind of except for the low pps on tx.
> 
> What's the tx number of no-iommu mode before 658? If it's still low, this
> bug can be verified and you may open a new bug for tracking tx issue.
> 
> Thanks

Retested with a kernel-679 guest: the tx pps number is up to 0.52 Mpps for both no-iommu and vfio mode. After downgrading the guest kernel to 657 there is no ethernet device in the guest, so I cannot get any pps numbers.

Do you think 0.52 Mpps tx is still bad?
Comment 15 jason wang 2017-06-15 06:04:03 EDT
(In reply to Quan Wenli from comment #14)
> (In reply to jason wang from comment #13)
> > (In reply to Quan Wenli from comment #12)
> > > Hi, jason, wei 
> > > 
> > > Could you help check performance results in comment #11, is it as expected?
> > 
> > Kind of except for the low pps on tx.
> > 
> > What's the tx number of no-iommu mode before 658? If it's still low, this
> > bug can be verified and you may open a new bug for tracking tx issue.
> > 
> > Thanks
> 
> Retest again with kernel-679 guest, the tx pps number is up to 0.52 for both
> no-iommu and vfio mode, downgrade kernel to 657, there is no ethernet in
> guest, so  can not get any pps.

You need to clear iommu_platform, I think.

> 
> Do you think the 0.52 tx pps is still bad ?

Not good at least.

Thanks
Comment 16 Quan Wenli 2017-06-16 04:50:10 EDT
(In reply to jason wang from comment #15)
> (In reply to Quan Wenli from comment #14)
> > (In reply to jason wang from comment #13)
> > > (In reply to Quan Wenli from comment #12)
> > > > Hi, jason, wei 
> > > > 
> > > > Could you help check performance results in comment #11, is it as expected?
> > > 
> > > Kind of except for the low pps on tx.
> > > 
> > > What's the tx number of no-iommu mode before 658? If it's still low, this
> > > bug can be verified and you may open a new bug for tracking tx issue.
> > > 
> > > Thanks
> > 
> > Retest again with kernel-679 guest, the tx pps number is up to 0.52 for both
> > no-iommu and vfio mode, downgrade kernel to 657, there is no ethernet in
> > guest, so  can not get any pps.
> 
> You need clear iommu_platform I think.

iommu_platform=on,ats=on disabled, 657 kernel          -> 530173 pps
iommu_platform=on,ats=on disabled, 679 kernel          -> 529250 pps
iommu_platform=on,ats=on enabled,  679 kernel          -> 529689 pps
iommu_platform=on,ats=on enabled,  kernel-4.11.0-rc5+  -> 528384 pps

The tx pps numbers are almost identical. Do you think I need to open a new bug to track the low tx performance (0.5 Mpps)?

 
> 
> > 
> > Do you think the 0.52 tx pps is still bad ?
> 
> Not good at least.
> 
> Thanks
Comment 17 jason wang 2017-06-16 05:57:55 EDT
(In reply to Quan Wenli from comment #16)
> (In reply to jason wang from comment #15)
> > (In reply to Quan Wenli from comment #14)
> > > (In reply to jason wang from comment #13)
> > > > (In reply to Quan Wenli from comment #12)
> > > > > Hi, jason, wei 
> > > > > 
> > > > > Could you help check performance results in comment #11, is it as expected?
> > > > 
> > > > Kind of except for the low pps on tx.
> > > > 
> > > > What's the tx number of no-iommu mode before 658? If it's still low, this
> > > > bug can be verified and you may open a new bug for tracking tx issue.
> > > > 
> > > > Thanks
> > > 
> > > Retest again with kernel-679 guest, the tx pps number is up to 0.52 for both
> > > no-iommu and vfio mode, downgrade kernel to 657, there is no ethernet in
> > > guest, so  can not get any pps.
> > 
> > You need clear iommu_platform I think.
> 
> no iommu_platform=on,ats=on 657 kernel - > 530173 pps
> no iommu_platform=on,ats=on 679 kernel - > 529250 pps
> enable iommu_platform=on,ats=on 679 kernel - > 529689 pps
> enable iommu_platform=on,ats=on  kernel-4.11.0-rc5+ - > 528384 pps
> 
> 
> They are almost similar on tx pps. do you think I need to open a new bug for
> tracking low tx performance(0.5mpps). 

According to your test, the low tx was not introduced by IOMMU support. Please open a bug and flag it for 7.5, and we can verify this bug.

Thanks

> 
>  
> > 
> > > 
> > > Do you think the 0.52 tx pps is still bad ?
> > 
> > Not good at least.
> > 
> > Thanks
Comment 18 Quan Wenli 2017-06-19 01:59:44 EDT
(In reply to jason wang from comment #17)
> (In reply to Quan Wenli from comment #16)
> > (In reply to jason wang from comment #15)
> > > (In reply to Quan Wenli from comment #14)
> > > > (In reply to jason wang from comment #13)
> > > > > (In reply to Quan Wenli from comment #12)
> > > > > > Hi, jason, wei 
> > > > > > 
> > > > > > Could you help check performance results in comment #11, is it as expected?
> > > > > 
> > > > > Kind of except for the low pps on tx.
> > > > > 
> > > > > What's the tx number of no-iommu mode before 658? If it's still low, this
> > > > > bug can be verified and you may open a new bug for tracking tx issue.
> > > > > 
> > > > > Thanks
> > > > 
> > > > Retest again with kernel-679 guest, the tx pps number is up to 0.52 for both
> > > > no-iommu and vfio mode, downgrade kernel to 657, there is no ethernet in
> > > > guest, so  can not get any pps.
> > > 
> > > You need clear iommu_platform I think.
> > 
> > no iommu_platform=on,ats=on 657 kernel - > 530173 pps
> > no iommu_platform=on,ats=on 679 kernel - > 529250 pps
> > enable iommu_platform=on,ats=on 679 kernel - > 529689 pps
> > enable iommu_platform=on,ats=on  kernel-4.11.0-rc5+ - > 528384 pps
> > 
> > 
> > They are almost similar on tx pps. do you think I need to open a new bug for
> > tracking low tx performance(0.5mpps). 
> 
> According to your test, it was not introduced by iommu support. Please open
> a bug and flag it to 7.5. And we can verify this bug.
> 

Set it to verified and opened bug 1462633 to track the low tx issue.

> Thanks
> 
> > 
> >  
> > > 
> > > > 
> > > > Do you think the 0.52 tx pps is still bad ?
> > > 
> > > Not good at least.
> > > 
> > > Thanks
Comment 20 errata-xmlrpc 2017-08-01 16:02:32 EDT
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2017:1842
