Bug 1350196
Summary: | Enable IOMMU device with -device intel-iommu | |
---|---|---|---
Product: | Red Hat Enterprise Linux 7 | Reporter: | Marcel Apfelbaum <marcel>
Component: | qemu-kvm-rhev | Assignee: | Marcel Apfelbaum <marcel>
Status: | CLOSED ERRATA | QA Contact: | Virtualization Bugs <virt-bugs>
Severity: | low | Docs Contact: |
Priority: | medium | |
Version: | 7.3 | CC: | ailan, alex.williamson, chayang, huding, jinzhao, jishao, juzhang, knoel, kzhang, marcel, peterx, pezhang, virt-maint, xfu, xiywang, xuzhang, yfu
Target Milestone: | rc | |
Target Release: | --- | |
Hardware: | Unspecified | |
OS: | Unspecified | |
Whiteboard: | | |
Fixed In Version: | qemu-kvm-rhev-2.6.0-18.el7 | Doc Type: | If docs needed, set a value
Doc Text: | | Story Points: | ---
Clone Of: | | Environment: |
Last Closed: | 2016-11-07 21:19:37 UTC | Type: | Bug
Regression: | --- | Mount Type: | ---
Documentation: | --- | CRM: |
Verified Versions: | | Category: | ---
oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: |
Cloudforms Team: | --- | Target Upstream Version: |
Embargoed: | | |
Bug Depends On: | | |
Bug Blocks: | 1235581, 1283104, 1283251, 1288337, 1358653 | |
Description Marcel Apfelbaum 2016-06-26 12:11:34 UTC
Fix included in qemu-kvm-rhev-2.6.0-18.el7

Verification:

Versions:
Host:
3.10.0-483.el7.x86_64
qemu-kvm-rhev-2.6.0-18.el7.x86_64

Guest:
3.10.0-482.el7.x86_64

Steps:
1. Boot guest with 3 virtio-net devices and '-device intel-iommu'.
/usr/libexec/qemu-kvm -name rhel7.3 -M q35 \
-device intel-iommu \
-cpu host -m 4G -numa node \
-smp 4,sockets=2,cores=2,threads=1 \
-netdev tap,id=hostnet0 \
-netdev tap,id=hostnet1 \
-netdev tap,id=hostnet2 \
-device virtio-net-pci,netdev=hostnet0,id=net0,mac=12:54:00:5c:88:61 \
-device virtio-net-pci,netdev=hostnet1,id=net1,mac=12:54:00:5c:88:62 \
-device virtio-net-pci,netdev=hostnet2,id=net2,mac=12:54:00:5c:88:63 \
-device qxl-vga,id=video0,ram_size=67108864,vram_size=67108864,vgamem_mb=16 \
-spice port=5901,addr=0.0.0.0,disable-ticketing,image-compression=off,seamless-migration=on \
-monitor stdio \
-serial unix:/tmp/monitor,server,nowait \
-qmp tcp:0:5555,server,nowait \
-drive file=/home/pezhang/rhel7.3.qcow2,format=qcow2,if=none,id=drive-virtio-blk0,werror=stop,rerror=stop \
-device virtio-blk-pci,drive=drive-virtio-blk0,id=virtio-blk0 \
-vnc :2 \

2. In guest, load the vfio-pci module
# modprobe vfio
# modprobe vfio-pci

3. In guest, check the IOMMU groups
# find /sys/kernel/iommu_groups/ -type l
/sys/kernel/iommu_groups/0/devices/0000:00:00.0
/sys/kernel/iommu_groups/1/devices/0000:00:01.0
/sys/kernel/iommu_groups/2/devices/0000:00:02.0
/sys/kernel/iommu_groups/3/devices/0000:00:03.0
/sys/kernel/iommu_groups/4/devices/0000:00:04.0
/sys/kernel/iommu_groups/5/devices/0000:00:05.0
/sys/kernel/iommu_groups/6/devices/0000:00:1f.0
/sys/kernel/iommu_groups/6/devices/0000:00:1f.2
/sys/kernel/iommu_groups/6/devices/0000:00:1f.3

4. In guest, bind two virtio network devices to vfio. This succeeds.
# lspci | grep Eth
00:01.0 Ethernet controller: Red Hat, Inc Virtio network device
00:02.0 Ethernet controller: Red Hat, Inc Virtio network device
00:03.0 Ethernet controller: Red Hat, Inc Virtio network device

# lspci -n -s 0000:00:02.0
00:02.0 0200: 1af4:1000

# echo "1af4 1000" > /sys/bus/pci/drivers/vfio-pci/new_id

# echo 0000:00:02.0 > /sys/bus/pci/devices/0000\:00\:02.0/driver/unbind
# echo 0000:00:03.0 > /sys/bus/pci/devices/0000\:00\:03.0/driver/unbind

# echo 0000:00:02.0 > /sys/bus/pci/drivers/vfio-pci/bind
# echo 0000:00:03.0 > /sys/bus/pci/drivers/vfio-pci/bind

# ls /sys/bus/pci/drivers/vfio-pci/
0000:00:02.0 0000:00:03.0 bind module new_id remove_id uevent unbind

Hi Marcel,
Is this bug verified? If not, what steps or scenarios should QE do?

Thank you,
Pei
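As an optional cross-check for step 4 above (a sketch only, not something recorded in this report): each PCI device exposes its IOMMU group as a sysfs symlink, and once all devices in a group are bound to vfio-pci the group should also appear as a character device under /dev/vfio/, which is what any VFIO user (for example a nested QEMU) would open.

# readlink /sys/bus/pci/devices/0000:00:02.0/iommu_group
# ls /dev/vfio/

The symlink should point at the same iommu_groups entry seen in step 3 (group 2 for 00:02.0), and /dev/vfio/ should contain one node per bound group in addition to the /dev/vfio/vfio container node.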
Hi Pei,

(In reply to Pei Zhang from comment #3)
> Verification:
>
> Versions:
> Host:
> 3.10.0-483.el7.x86_64
> qemu-kvm-rhev-2.6.0-18.el7.x86_64
>
> Guest:
> 3.10.0-482.el7.x86_64
>
> Steps:
> 1. Boot guest with 3 virtio-net devices and '-device intel-iommu'.
> /usr/libexec/qemu-kvm -name rhel7.3 -M q35 \
> -device intel-iommu \
> -cpu host -m 4G -numa node \
> -smp 4,sockets=2,cores=2,threads=1 \
> -netdev tap,id=hostnet0 \
> -netdev tap,id=hostnet1 \
> -netdev tap,id=hostnet2 \
> -device virtio-net-pci,netdev=hostnet0,id=net0,mac=12:54:00:5c:88:61 \
> -device virtio-net-pci,netdev=hostnet1,id=net1,mac=12:54:00:5c:88:62 \
> -device virtio-net-pci,netdev=hostnet2,id=net2,mac=12:54:00:5c:88:63 \
> -device qxl-vga,id=video0,ram_size=67108864,vram_size=67108864,vgamem_mb=16 \
> -spice port=5901,addr=0.0.0.0,disable-ticketing,image-compression=off,seamless-migration=on \
> -monitor stdio \
> -serial unix:/tmp/monitor,server,nowait \
> -qmp tcp:0:5555,server,nowait \
> -drive file=/home/pezhang/rhel7.3.qcow2,format=qcow2,if=none,id=drive-virtio-blk0,werror=stop,rerror=stop \
> -device virtio-blk-pci,drive=drive-virtio-blk0,id=virtio-blk0 \
> -vnc :2 \
>
> 2. In guest, load the vfio-pci module
> # modprobe vfio
> # modprobe vfio-pci
>
> 3. In guest, check the IOMMU groups
> # find /sys/kernel/iommu_groups/ -type l
> /sys/kernel/iommu_groups/0/devices/0000:00:00.0
> /sys/kernel/iommu_groups/1/devices/0000:00:01.0
> /sys/kernel/iommu_groups/2/devices/0000:00:02.0
> /sys/kernel/iommu_groups/3/devices/0000:00:03.0
> /sys/kernel/iommu_groups/4/devices/0000:00:04.0
> /sys/kernel/iommu_groups/5/devices/0000:00:05.0
> /sys/kernel/iommu_groups/6/devices/0000:00:1f.0
> /sys/kernel/iommu_groups/6/devices/0000:00:1f.2
> /sys/kernel/iommu_groups/6/devices/0000:00:1f.3
>
> 4. In guest, bind two virtio network devices to vfio. This succeeds.
> # lspci | grep Eth
> 00:01.0 Ethernet controller: Red Hat, Inc Virtio network device
> 00:02.0 Ethernet controller: Red Hat, Inc Virtio network device
> 00:03.0 Ethernet controller: Red Hat, Inc Virtio network device
>
> # lspci -n -s 0000:00:02.0
> 00:02.0 0200: 1af4:1000
>
> # echo "1af4 1000" > /sys/bus/pci/drivers/vfio-pci/new_id
>
> # echo 0000:00:02.0 > /sys/bus/pci/devices/0000\:00\:02.0/driver/unbind
> # echo 0000:00:03.0 > /sys/bus/pci/devices/0000\:00\:03.0/driver/unbind
>
> # echo 0000:00:02.0 > /sys/bus/pci/drivers/vfio-pci/bind
> # echo 0000:00:03.0 > /sys/bus/pci/drivers/vfio-pci/bind
>
> # ls /sys/bus/pci/drivers/vfio-pci/
> 0000:00:02.0 0000:00:03.0 bind module new_id remove_id uevent unbind
>
> Hi Marcel,
> Is this bug verified? If not, what steps or scenarios should QE do?
>

You need to be sure the IOMMU device is enabled in the guest.
Please follow the instructions from
http://www.linux-kvm.org/page/How_to_assign_devices_with_VT-d_in_KVM

Please pay attention to point 3.

Let me know if you have further questions,
Marcel

> Thank you,
> Pei
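A minimal sketch of the guest-side check being asked for here, assuming (as is the usual requirement for the Intel IOMMU driver) that the guest kernel itself is booted with intel_iommu=on:

# cat /proc/cmdline
# dmesg | grep -e DMAR -e IOMMU

The kernel command line should contain intel_iommu=on, and the dmesg output should include a 'DMAR: IOMMU enabled' line; the test results later in this bug show exactly that output.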
Does the kernel QE team have low level IOMMU tests? How about Intel?

(In reply to Karen Noel from comment #5)
> Does the kernel QE team have low level IOMMU tests? How about Intel?

Hi Kexin,
Could you add a comment?

Best Regards,
Junyi

*** Bug 1283250 has been marked as a duplicate of this bug. ***

(In reply to Marcel Apfelbaum from comment #4)
>
> You need to be sure the IOMMU device is enabled in the guest.
> Please follow the instructions from
> http://www.linux-kvm.org/page/How_to_assign_devices_with_VT-d_in_KVM
>
> Please pay attention to point 3.
>
> Let me know if you have further questions,
> Marcel

Hi Marcel,
Following your suggestions, I continued with the testing below:

Summary:
(1) Checked point 3 (step 1 below); it seems the IOMMU is enabled. Right?
(2) Tested device assignment with pci-stub in the L1 guest with e1000, virtio-net, and rtl8139; they all fail, with qemu showing 'No IOMMU found.' So should QE test this bug from the nested virtualization layer (do device assignment from the L1 to the L2 guest)?
(3) If testing device assignment with vfio in the L1 guest, then e1000 and rtl8139 work, but virtio-net does not.
https://bugzilla.redhat.com/show_bug.cgi?id=1235580#c14
Should QE verify this bug and the above bug 1235580 using the same scenarios or steps? Are there differences?

Marcel, could you give some updates? Thanks.

Testing steps continued:
1. dmesg info in guest (point 3)
# dmesg | grep -e DMAR -e IOMMU
[ 0.000000] ACPI: DMAR 000000007ffe23fb 00040 (v01 BOCHS BXPCDMAR 00000001 BXPC 00000001)
[ 0.000000] DMAR: IOMMU enabled
[ 0.141195] DMAR: Host address width 39
[ 0.141196] DMAR: DRHD base: 0x000000fed90000 flags: 0x1
[ 0.141228] DMAR: dmar0: reg_base_addr fed90000 ver 1:0 cap 12008c22260206 ecap f02
[ 0.799597] DMAR: No RMRR found
[ 0.799599] DMAR: No ATSR found
[ 0.799867] DMAR: dmar0: Using Queued invalidation
[ 0.800383] DMAR: Setting RMRR:
[ 0.800385] DMAR: Prepare 0-16MiB unity mapping for LPC
[ 0.800409] DMAR: Setting identity map for device 0000:00:1f.0 [0x0 - 0xffffff]
[ 0.800515] DMAR: Intel(R) Virtualization Technology for Directed I/O

2. Unbind from the kernel driver, and bind to the pci-stub driver in the guest
# modprobe pci_stub

# lspci | grep Eth
00:01.0 Ethernet controller: Red Hat, Inc Virtio network device

# lspci -n
00:01.0 0200: 1af4:1000

# echo "1af4 1000" > /sys/bus/pci/drivers/pci-stub/new_id
# echo "0000:00:01.0" > /sys/bus/pci/devices/0000:00:01.0/driver/unbind
# echo "0000:00:01.0" > /sys/bus/pci/drivers/pci-stub/bind

# ls /sys/bus/pci/drivers/pci-stub
0000:00:01.0 bind new_id remove_id uevent unbind

3. Boot the nested guest (L2 guest) with pci-assign; it fails with the error 'No IOMMU found'.
# /usr/libexec/qemu-kvm -device pci-assign,host=00:01.0
qemu-kvm: -device pci-assign,host=00:01.0: No IOMMU found. Unable to assign device "(null)"

4. As virtio-net may cheat on DMA (see https://bugzilla.redhat.com/show_bug.cgi?id=1235580#c15), we also tried e1000 and rtl8139, but they hit the same issue.
# /usr/libexec/qemu-kvm -device pci-assign,host=0000:00:02.0
qemu-kvm: -device pci-assign,host=0000:00:02.0: No IOMMU found. Unable to assign device "(null)"

Thank you,
Pei

I accidentally removed the need-info flag.

(In reply to Pei Zhang from comment #8)
> (In reply to Marcel Apfelbaum from comment #4)
> >
> > You need to be sure the IOMMU device is enabled in the guest.
> > Please follow the instructions from
> > http://www.linux-kvm.org/page/How_to_assign_devices_with_VT-d_in_KVM
> >
> > Please pay attention to point 3.
> >
> > Let me know if you have further questions,
> > Marcel
>
> Hi Marcel,
> Following your suggestions, I continued with the testing below:
>
> Summary:
> (1) Checked point 3 (step 1 below); it seems the IOMMU is enabled. Right?

Right

> (2) Tested device assignment with pci-stub in the L1 guest with e1000,
> virtio-net, and rtl8139; they all fail, with qemu showing 'No IOMMU found.'
> So should QE test this bug from the nested virtualization layer (do device
> assignment from the L1 to the L2 guest)?

I don't think vIOMMU is ready yet for nested virtualization.
Peter Xu is working on enabling Interrupt remapping and others are working on vhost-net/vhost-user support.

> (3) If testing device assignment with vfio in the L1 guest, then e1000 and
> rtl8139 work, but virtio-net does not.
This matches my view on this matter: vIOMMU works with emulated devices (e1000) and AHCI (the Q35 integrated storage controller), but not with virtio devices yet.

> https://bugzilla.redhat.com/show_bug.cgi?id=1235580#c14
> Should QE verify this bug and the above bug 1235580 using the same scenarios
> or steps? Are there differences?
>
> Marcel, could you give some updates? Thanks.
>
> Testing steps continued:
> 1. dmesg info in guest (point 3)
> # dmesg | grep -e DMAR -e IOMMU
> [ 0.000000] ACPI: DMAR 000000007ffe23fb 00040 (v01 BOCHS BXPCDMAR 00000001 BXPC 00000001)
> [ 0.000000] DMAR: IOMMU enabled

That shows the device is present and enabled.
Strictly speaking this is enough *for this BZ*, since the BZ refers to the command line option that enables the device.

> [ 0.141195] DMAR: Host address width 39
> [ 0.141196] DMAR: DRHD base: 0x000000fed90000 flags: 0x1
> [ 0.141228] DMAR: dmar0: reg_base_addr fed90000 ver 1:0 cap 12008c22260206 ecap f02
> [ 0.799597] DMAR: No RMRR found
> [ 0.799599] DMAR: No ATSR found
> [ 0.799867] DMAR: dmar0: Using Queued invalidation
> [ 0.800383] DMAR: Setting RMRR:
> [ 0.800385] DMAR: Prepare 0-16MiB unity mapping for LPC
> [ 0.800409] DMAR: Setting identity map for device 0000:00:1f.0 [0x0 - 0xffffff]
> [ 0.800515] DMAR: Intel(R) Virtualization Technology for Directed I/O
>
> 2. Unbind from the kernel driver, and bind to the pci-stub driver in the guest
> # modprobe pci_stub
>
> # lspci | grep Eth
> 00:01.0 Ethernet controller: Red Hat, Inc Virtio network device
>
> # lspci -n
> 00:01.0 0200: 1af4:1000
>
> # echo "1af4 1000" > /sys/bus/pci/drivers/pci-stub/new_id
> # echo "0000:00:01.0" > /sys/bus/pci/devices/0000:00:01.0/driver/unbind
> # echo "0000:00:01.0" > /sys/bus/pci/drivers/pci-stub/bind
>
> # ls /sys/bus/pci/drivers/pci-stub
> 0000:00:01.0 bind new_id remove_id uevent unbind
>
> 3. Boot the nested guest (L2 guest) with pci-assign; it fails with the error 'No IOMMU found'.
> # /usr/libexec/qemu-kvm -device pci-assign,host=00:01.0
> qemu-kvm: -device pci-assign,host=00:01.0: No IOMMU found. Unable to assign device "(null)"
>

I added Alex to CC, he may be able to give us some insights.

> 4. As virtio-net may cheat on DMA (see https://bugzilla.redhat.com/show_bug.cgi?id=1235580#c15),
> we also tried e1000 and rtl8139, but they hit the same issue.
> # /usr/libexec/qemu-kvm -device pci-assign,host=0000:00:02.0
> qemu-kvm: -device pci-assign,host=0000:00:02.0: No IOMMU found. Unable to assign device "(null)"
>

To summarize:
1. In order to check this BZ it is enough to see that the IOMMU is enabled and the guest is working properly with AHCI and an e1000 device.
2. There is work in progress to make the IOMMU work with virtio devices and vhost.
3. There is a series downstream on enabling IR.
4. I do not know the status of nested device assignment; anyway, it should be a different BZ.

Thanks,
Marcel

> Thank you,
> Pei

(In reply to Marcel Apfelbaum from comment #10)
> (In reply to Pei Zhang from comment #8)
> > 3. Boot the nested guest (L2 guest) with pci-assign; it fails with the
> > error 'No IOMMU found'.
> > # /usr/libexec/qemu-kvm -device pci-assign,host=00:01.0
> > qemu-kvm: -device pci-assign,host=00:01.0: No IOMMU found. Unable to assign device "(null)"
> >
>
> I added Alex to CC, he may be able to give us some insights.

Unless RHEL5/6 is involved as a device assignment host somewhere in this stack, pci-assign is not supported. I can't tell from the QE report whether that's the case or not though.
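For reference, on a RHEL 7 L1 guest the supported assignment mechanism would be vfio-pci rather than the legacy pci-assign. A rough sketch of the equivalent L2 invocation, assuming the device at 0000:00:01.0 has already been bound to vfio-pci inside the L1 guest as in comment 3 (the guest name and disk path here are made up for illustration):

# /usr/libexec/qemu-kvm -name l2-guest -M q35 -cpu host -m 2G \
  -device vfio-pci,host=0000:00:01.0 \
  -drive file=/path/to/l2.qcow2,format=qcow2,if=virtio \
  -vnc :3

Per comments 8 and 10, at the time of this bug this path worked from the L1 guest for e1000 and rtl8139 but not yet for virtio devices, and nested assignment in general was tracked outside this BZ.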
(In reply to Marcel Apfelbaum from comment #10)
> I don't think vIOMMU is ready yet for nested virtualization.
> Peter Xu is working on enabling Interrupt remapping and others are working
> on vhost-net/vhost-user support.

OK, got it. I thought nested virtualization was a checkpoint of vIOMMU functionality before; it seems that was wrong. I will follow the nested issue in other bugs, not this one.

> > Testing steps continued:
> > 1. dmesg info in guest (point 3)
> > # dmesg | grep -e DMAR -e IOMMU
> > [ 0.000000] ACPI: DMAR 000000007ffe23fb 00040 (v01 BOCHS BXPCDMAR 00000001 BXPC 00000001)
> > [ 0.000000] DMAR: IOMMU enabled
>
> That shows the device is present and enabled.
> Strictly speaking this is enough *for this BZ*, since the BZ refers to the
> command line option that enables the device.

OK, got it.

> To summarize:
> 1. In order to check this BZ it is enough to see that the IOMMU is enabled
> and the guest is working properly with AHCI and an e1000 device.

(1) Check that the IOMMU is enabled
As shown above.

(2) The guest is working properly with AHCI and an e1000 device.
Boot the guest with AHCI and e1000; the guest works well.
# /usr/libexec/qemu-kvm -name rhel7.3 -M q35 \
-device intel-iommu \
-cpu host -m 4G \
-smp 4,sockets=2,cores=2,threads=1 \
-netdev tap,id=hostnet0 \
-device e1000,netdev=hostnet0,id=net0,mac=12:54:00:5c:88:61 \
-spice port=5901,addr=0.0.0.0,disable-ticketing,image-compression=off,seamless-migration=on \
-monitor stdio \
-device ahci,id=ahci0 \
-drive file=/home/pezhang/rhel7.3.qcow2,format=qcow2,if=none,id=drive-system-disk,werror=stop,rerror=stop \
-device ide-drive,bus=ahci0.0,drive=drive-system-disk,id=system-disk,bootindex=1 \

# lspci | grep -e AHCI -e Eth
00:02.0 Ethernet controller: Intel Corporation 82540EM Gigabit Ethernet Controller (rev 03)
00:03.0 SATA controller: Intel Corporation 82801IR/IO/IH (ICH9R/DO/DH) 6 port SATA Controller [AHCI mode] (rev 02)
00:1f.2 SATA controller: Intel Corporation 82801IR/IO/IH (ICH9R/DO/DH) 6 port SATA Controller [AHCI mode] (rev 02)

# find /sys/kernel/iommu_groups/ -type l
/sys/kernel/iommu_groups/0/devices/0000:00:00.0
/sys/kernel/iommu_groups/1/devices/0000:00:01.0
/sys/kernel/iommu_groups/2/devices/0000:00:02.0
/sys/kernel/iommu_groups/3/devices/0000:00:03.0
/sys/kernel/iommu_groups/4/devices/0000:00:1f.0
/sys/kernel/iommu_groups/4/devices/0000:00:1f.2
/sys/kernel/iommu_groups/4/devices/0000:00:1f.3

(1) and (2) both work, so this bug has been fixed well. Please let me know if any steps were missed.

> 2. There is work in progress to make the IOMMU work with virtio devices
> and vhost.
>
> 3. There is a series downstream on enabling IR.
>
> 4. I do not know the status of nested device assignment; anyway, it should
> be a different BZ.

OK. Thanks for sharing this info.

Thank you,
Pei
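One extra sanity check that could be layered on top of the verification above (just a sketch, not something this BZ requires): while the e1000 NIC and the AHCI disk are actively doing I/O, the guest log should stay free of DMA remapping faults, since a fault would mean a device tried to DMA to an address the vIOMMU has no mapping for.

# dmesg | grep -i "dmar.*fault"

No output here while the devices are under load is the expected (good) result.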
(In reply to Alex Williamson from comment #11)
>
> Unless RHEL5/6 is involved as a device assignment host somewhere in this
> stack, pci-assign is not supported.

Thanks for pointing this out. I didn't realize pci-assign only works in rhel6.

> I can't tell from the QE report whether
> that's the case or not though.

I confirmed with Chao Yang(chayang@) and Yanan Fu(yfu@) about the device assignment coverage issue:
(1) for rhel5: we don't test it, but it works with -pcidevice host=***.
(2) for rhel6: pci-assign
(3) for rhel7: vfio-pci

Thank you,
Pei

(In reply to Xuesong Zhang from comment #14)
> (In reply to Marcel Apfelbaum from comment #10)
> >
> > To summarize:
> > 1. In order to check this BZ it is enough to see that the IOMMU is enabled
> > and the guest is working properly with AHCI and an e1000 device.
>
> hi, Marcel,
> I'm libvirt QE. I do not understand why the AHCI controller is related to the
> vIOMMU. I can understand that the IOMMU will affect PCI device assignment
> from past experience, but the AHCI disk assignment should not be
> affected by IOMMU enable/disable in the host. Would you please help to give us
> some detailed explanation if any of my understanding is not correct? Thanks.

Sure, the vIOMMU is used by *all* guest PCI devices, not only the assigned ones.
Once we have a vIOMMU in the guest, the AHCI controller, the e1000 device and basically all PCI devices will use it implicitly.
Some corner cases are the virtio devices, which do not use it yet (there is work in progress for it), and the vfio guest platform (not the host one).

Thanks,
Marcel

> >
> > 2. There is work in progress to make the IOMMU work with virtio devices
> > and vhost.
> >
> > 3. There is a series downstream on enabling IR.
> >
> > 4. I do not know the status of nested device assignment; anyway, it should
> > be a different BZ.
> >

Set this bug to 'VERIFIED' as per Comment 12.

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA.

For information on the advisory, and where to find the updated files, follow the link below.

If the solution does not work for you, open a new bug report.

https://rhn.redhat.com/errata/RHBA-2016-2673.html