Bug 1259556
| Summary: | Allow VFIO devices on the same guest PHB as emulated devices | ||
|---|---|---|---|
| Product: | Red Hat Enterprise Linux 7 | Reporter: | David Gibson <dgibson> |
| Component: | qemu-kvm-rhev | Assignee: | David Gibson <dgibson> |
| Status: | CLOSED ERRATA | QA Contact: | Virtualization Bugs <virt-bugs> |
| Severity: | urgent | Docs Contact: | |
| Priority: | unspecified | ||
| Version: | 7.2 | CC: | abologna, gklein, hannsj_uhl, juzhang, lmiksik, michen, mrezanin, qzhang, virt-maint, zhengtli |
| Target Milestone: | rc | ||
| Target Release: | --- | ||
| Hardware: | ppc64le | ||
| OS: | Unspecified | ||
| Whiteboard: | |||
| Fixed In Version: | qemu-kvm-rhev-2.3.0-29.el7 | Doc Type: | Bug Fix |
| Doc Text: | Story Points: | --- | |
| Clone Of: | Environment: | ||
| Last Closed: | 2015-12-04 16:55:20 UTC | Type: | Bug |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | Category: | --- | |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | |||
| Bug Depends On: | |||
| Bug Blocks: | 825045, 1154205, 1172230, 1201513, 1261708, 1264728, 1277183, 1277184 | ||
Description
David Gibson
2015-09-03 03:29:30 UTC
Draft build at http://brewweb.devel.redhat.com/brew/taskinfo?taskID=9803236
Hi, here is an interesting result from my test.
Host Kernel: 3.10.0-315.el7.ppc64le
qemu version: qemu-kvm-rhev-2.3.0-19.el7
1. Bind a USB controller (IOMMU group 2) and a network device (IOMMU group 1) to vfio-pci:
#echo "104c 8241" > /sys/bus/pci/drivers/vfio-pci/new_id
#echo 0003:03:00.0 > /sys/bus/pci/devices/0003\:03\:00.0/driver/unbind
#echo 0003:03:00.0 > /sys/bus/pci/drivers/vfio-pci/bind
#echo "14e4 1657" > /sys/bus/pci/drivers/vfio-pci/new_id
#echo 0003:09:00.0 > /sys/bus/pci/devices/0003\:09\:00.0/driver/unbind
#echo 0003:09:00.1 > /sys/bus/pci/devices/0003\:09\:00.1/driver/unbind
#echo 0003:09:00.2 > /sys/bus/pci/devices/0003\:09\:00.2/driver/unbind
#echo 0003:09:00.3 > /sys/bus/pci/devices/0003\:09\:00.3/driver/unbind
#echo 0003:09:00.0 > /sys/bus/pci/drivers/vfio-pci/bind
#echo 0003:09:00.1 > /sys/bus/pci/drivers/vfio-pci/bind
#echo 0003:09:00.2 > /sys/bus/pci/drivers/vfio-pci/bind
#echo 0003:09:00.3 > /sys/bus/pci/drivers/vfio-pci/bind
2. Boot up guest with cmd:
#qemu-kvm ... \
-device spapr-pci-vfio-host-bridge,id=vfiohost,index=0x1,iommu=2 \
-device vfio-pci,host=0003:09:00.0,bus=vfiohost.0,addr=0x1,id=vfio_dev \
... \
3. In the guest, get an IP address for the VFIO device (enP1p0s1) by DHCP, then run scp.
[root@localhost ~]# ifconfig
enP1p0s1: flags=4163<UP,BROADCAST,RUNNING,MULTICAST> mtu 1500
inet 10.19.106.4 netmask 255.255.255.0 broadcast 10.19.106.255
inet6 fe80::9abe:94ff:fe01:754c prefixlen 64 scopeid 0x20<link>
inet6 2620:52:0:136a:9abe:94ff:fe01:754c prefixlen 64 scopeid 0x0<global>
ether 98:be:94:01:75:4c txqueuelen 1000 (Ethernet)
RX packets 150711 bytes 10900709 (10.3 MiB)
RX errors 0 dropped 0 overruns 0 frame 0
TX packets 765691 bytes 1104467243 (1.0 GiB)
TX errors 0 dropped 0 overruns 0 carrier 0 collisions 0
device interrupt 19
lo: flags=73<UP,LOOPBACK,RUNNING> mtu 65536
inet 127.0.0.1 netmask 255.0.0.0
inet6 ::1 prefixlen 128 scopeid 0x10<host>
loop txqueuelen 0 (Local Loopback)
RX packets 60 bytes 5484 (5.3 KiB)
RX errors 0 dropped 0 overruns 0 frame 0
TX packets 60 bytes 5484 (5.3 KiB)
TX errors 0 dropped 0 overruns 0 carrier 0 collisions 0
4. scp file to an external host.
[root@localhost ~]# scp test root.67.19:/root/test_home/liuzt/
root.67.19's password: redhat
test 100% 1000MB 71.4MB/s 00:14
I am confused about why the NIC works even though it is in the wrong IOMMU group.
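The group membership in question can be checked on the host before the guest is booted. The following is a minimal sketch and not part of the original report: it assumes the usual sysfs layout, in which each PCI device directory carries an `iommu_group` symlink whose target ends in the group number. The function names `iommu_group_of` and `vfio_rebind_cmds` and the `SYSFS_PCI` override are illustrative. The second helper only prints the unbind/bind writes from step 1 and does not cover the `new_id` writes.

```shell
#!/bin/sh
# Illustrative helpers; the names and the SYSFS_PCI override are assumptions,
# not taken from the report.

# Print the IOMMU group number of a PCI function, e.g. 0003:09:00.0.
iommu_group_of() {
    link="${SYSFS_PCI:-/sys/bus/pci}/devices/$1/iommu_group"
    if [ ! -L "$link" ]; then
        echo "no iommu_group link for $1" >&2
        return 1
    fi
    # The symlink target ends in the group number.
    basename "$(readlink "$link")"
}

# Dry run: print (do not execute) the sysfs writes that move the given
# functions to vfio-pci, unbinding all of them first as in step 1.
vfio_rebind_cmds() {
    root="${SYSFS_PCI:-/sys/bus/pci}"
    for bdf in "$@"; do
        printf 'echo %s > %s/devices/%s/driver/unbind\n' "$bdf" "$root" "$bdf"
    done
    for bdf in "$@"; do
        printf 'echo %s > %s/drivers/vfio-pci/bind\n' "$bdf" "$root"
    done
}
```

For the setup in this comment, the result of `iommu_group_of 0003:09:00.0` would be compared with the `iommu=` value passed to the host bridge, and `vfio_rebind_cmds 0003:09:00.0 0003:09:00.1 0003:09:00.2 0003:09:00.3` prints the commands for review before they are run as root.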
I'm raising the priority and setting this one as a blocker because it is blocking the verification of RHEV RFE #825045.
Zhengtong, I am completely mystified as to how the NIC is working in that setup. However, I think this is a distraction from actually verifying the bug. I think the case to test is whether a VFIO device will work on an spapr-pci-host-bridge (as opposed to spapr-pci-vfio-host-bridge). That's the case we really care about in applying this change.
Hi David, as stated in comment #21 of bug 1250326, when I use spapr-pci-host-bridge, the NIC can't get an IP address by DHCP. So I think some functions of the VFIO device may be blocked.
That's right, without the patches from this bug, VFIO devices won't be able to fully work on the spapr-pci-host-bridge.
Unfortunately the patches I posted really aren't ready to go. Moving back to ASSIGNED until I figure out what to do next.
Ok, I've reworked this to address the concerns that Alex W and I had. I have a test build at https://brewweb.devel.redhat.com/taskinfo?taskID=9852214
Rebased my backport on the latest downstream. New test build at http://brewweb.devel.redhat.com/brew/taskinfo?taskID=9858618
Fix included in qemu-kvm-rhev-2.3.0-29.el7
Tested with one guest that has an emulated NIC and a VFIO NIC attached to the same spapr-pci-host-bridge bus. The VFIO NIC doesn't work with qemu-kvm-rhev-2.3.0-28 and works well with qemu-kvm-rhev-2.3.0-29, so this bug can be marked verified.
Details:
Reproduced with qemu-kvm-rhev-2.3.0-28
====================================================================
1. Boot guest with emulated NIC&VFIO_NIC
#/usr/libexec/qemu-kvm \
...
-device spapr-pci-host-bridge,id=vfiohost,index=0x1 \
-device vfio-pci,host=0003:09:00.0,bus=vfiohost.0,addr=0x1,id=vfio_dev \
-netdev tap,id=hostnet0,script=/etc/qemu-ifup \
-device virtio-net-pci,netdev=hostnet0,id=net0,mac=52:54:00:c4:e7:85,bus=vfiohost.0,addr=0x2 \
...
2. In guest:
# dhclient
# ifconfig
enP1p0s1: flags=4099<UP,BROADCAST,MULTICAST> mtu 1500
ether 40:f2:e9:5d:ab:84 txqueuelen 1000 (Ethernet)
RX packets 0 bytes 0 (0.0 B)
RX errors 0 dropped 0 overruns 0 frame 0
TX packets 0 bytes 0 (0.0 B)
TX errors 0 dropped 0 overruns 0 carrier 0 collisions 0
device interrupt 20
enP1p0s2: flags=4163<UP,BROADCAST,RUNNING,MULTICAST> mtu 1500
inet 192.168.122.34 netmask 255.255.255.0 broadcast 192.168.122.255
inet6 fe80::5054:ff:fec4:e785 prefixlen 64 scopeid 0x20<link>
ether 52:54:00:c4:e7:85 txqueuelen 1000 (Ethernet)
RX packets 63 bytes 6418 (6.2 KiB)
RX errors 0 dropped 0 overruns 0 frame 0
TX packets 77 bytes 8062 (7.8 KiB)
TX errors 0 dropped 0 overruns 0 carrier 0 collisions 0
# ethtool enP1p0s1
...
Current message level: 0x000000ff (255)
drv probe link timer ifdown ifup rx_err tx_err
Link detected: no
#### The VFIO device can't get an IP address via dhclient, and ethtool reports the link as down.
====================================================================
Verified with qemu-kvm-rhev-2.3.0-29
1. Boot guest with emulated NIC&VFIO_NIC
#/usr/libexec/qemu-kvm \
...
-device spapr-pci-host-bridge,id=vfiohost,index=0x1 \
-device vfio-pci,host=0003:09:00.0,bus=vfiohost.0,addr=0x1,id=vfio_dev \
-netdev tap,id=hostnet0,script=/etc/qemu-ifup \
-device virtio-net-pci,netdev=hostnet0,id=net0,mac=52:54:00:c4:e7:85,bus=vfiohost.0,addr=0x2 \
...
2. In guest:
#dhclient
#ifconfig
enP1p0s1: flags=4163<UP,BROADCAST,RUNNING,MULTICAST> mtu 1500
inet 10.19.112.8 netmask 255.255.248.0 broadcast 10.19.119.255
inet6 fe80::42f2:e9ff:fe5d:ab84 prefixlen 64 scopeid 0x20<link>
ether 40:f2:e9:5d:ab:84 txqueuelen 1000 (Ethernet)
RX packets 57 bytes 5972 (5.8 KiB)
RX errors 0 dropped 0 overruns 0 frame 0
TX packets 14 bytes 2836 (2.7 KiB)
TX errors 0 dropped 0 overruns 0 carrier 0 collisions 0
device interrupt 20
enP1p0s2: flags=4163<UP,BROADCAST,RUNNING,MULTICAST> mtu 1500
inet 192.168.122.34 netmask 255.255.255.0 broadcast 192.168.122.255
inet6 fe80::5054:ff:fec4:e785 prefixlen 64 scopeid 0x20<link>
ether 52:54:00:c4:e7:85 txqueuelen 1000 (Ethernet)
RX packets 60 bytes 5857 (5.7 KiB)
RX errors 0 dropped 0 overruns 0 frame 0
TX packets 61 bytes 6389 (6.2 KiB)
TX errors 0 dropped 0 overruns 0 carrier 0 collisions 0
#ethtool enP1p0s1
...
Current message level: 0x000000ff (255)
drv probe link timer ifdown ifup rx_err tx_err
Link detected: yes
3. Ping from an external host succeeds:
[liuzt@localhost script]$ ping 10.19.112.8
PING 10.19.112.8 (10.19.112.8) 56(84) bytes of data.
64 bytes from 10.19.112.8: icmp_seq=1 ttl=55 time=392 ms
64 bytes from 10.19.112.8: icmp_seq=2 ttl=55 time=405 ms
64 bytes from 10.19.112.8: icmp_seq=3 ttl=55 time=409 ms
[root@ibm-p8-rhevm-16 ~]# ping 192.168.122.34 -c 5
PING 192.168.122.34 (192.168.122.34) 56(84) bytes of data.
64 bytes from 192.168.122.34: icmp_seq=1 ttl=64 time=0.146 ms
64 bytes from 192.168.122.34: icmp_seq=2 ttl=64 time=0.102 ms
64 bytes from 192.168.122.34: icmp_seq=3 ttl=64 time=0.099 ms
64 bytes from 192.168.122.34: icmp_seq=4 ttl=64 time=0.070 ms
64 bytes from 192.168.122.34: icmp_seq=5 ttl=64 time=0.069 ms
#### Both the VFIO NIC and the emulated NIC work well.
=====================================================================
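The comparison between the two runs above comes down to the `Link detected` line in the `ethtool` output. A small sketch of how that check could be scripted inside the guest follows. The function name `link_detected` is hypothetical, and it only parses text in the format shown in the transcripts above.

```shell
#!/bin/sh
# Hypothetical helper: read ethtool-style output on stdin and print the
# value of the "Link detected" field ("yes" or "no").
# Exits non-zero if the field is not present in the input.
link_detected() {
    awk -F': ' '
        /Link detected:/ { print $2; found = 1 }
        END { exit !found }
    '
}
```

In the guest this would be used as `ethtool enP1p0s1 | link_detected`, which prints the same value that was read manually from the output in both runs.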
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.
https://rhn.redhat.com/errata/RHBA-2015-2546.html