| Summary: | Failed to assign/ hot-plug one PF to a guest whose host with Mellanox card | ||
|---|---|---|---|
| Product: | Red Hat Enterprise Linux 7 | Reporter: | Jingjing Shao <jishao> |
| Component: | libvirt | Assignee: | Laine Stump <laine> |
| Status: | CLOSED DUPLICATE | QA Contact: | Virtualization Bugs <virt-bugs> |
| Severity: | medium | Docs Contact: | |
| Priority: | medium | ||
| Version: | 7.2 | CC: | alex.williamson, dyuan, jishao, mzhan, rbalakri |
| Target Milestone: | rc | ||
| Target Release: | --- | ||
| Hardware: | Unspecified | ||
| OS: | Unspecified | ||
| Whiteboard: | |||
| Fixed In Version: | | Doc Type: | Bug Fix |
| Doc Text: | | Story Points: | --- |
| Clone Of: | | Environment: | |
| Last Closed: | 2016-07-06 07:49:01 UTC | Type: | Bug |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | | Category: | --- |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
|
Comment 2
Laine Stump
2016-02-05 15:02:39 UTC
(In reply to Laine Stump from comment #2)
> Please try separating the unbind of host driver from the device assignment
> to see which step is failing, i.e. first run:
>
> virsh nodedev-detach pci_0000_24_00_0
>
> then see if the netdevs have disappeared from "ip link show" output, as well
> as grabbing the output of "virsh nodedev-dumpxml pci_0000_24_00_0".
>
> If that is successful, try attaching the device with "managed='no'" in the
> XML.

Hi Laine,

I followed the steps you suggested and still get the errors.

1. [root@hp-dl380pg8-16 ~]# virsh nodedev-detach pci_0000_24_00_0
Device pci_0000_24_00_0 detached

2. [root@hp-dl380pg8-16 ~]# ip link show
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN mode DEFAULT
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
2: eno1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP mode DEFAULT qlen 1000
    link/ether 28:80:23:9d:46:9c brd ff:ff:ff:ff:ff:ff
3: ens2f0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP mode DEFAULT qlen 1000
    link/ether 90:e2:ba:29:c0:ac brd ff:ff:ff:ff:ff:ff
4: ens2f1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP mode DEFAULT qlen 1000
    link/ether 90:e2:ba:29:c0:ad brd ff:ff:ff:ff:ff:ff
5: eno2: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 qdisc mq state DOWN mode DEFAULT qlen 1000
    link/ether 28:80:23:9d:46:9d brd ff:ff:ff:ff:ff:ff
6: eno3: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 qdisc mq state DOWN mode DEFAULT qlen 1000
    link/ether 28:80:23:9d:46:9e brd ff:ff:ff:ff:ff:ff
7: eno4: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 qdisc mq state DOWN mode DEFAULT qlen 1000
    link/ether 28:80:23:9d:46:9f brd ff:ff:ff:ff:ff:ff
10: virbr0: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 qdisc noqueue state DOWN mode DEFAULT
    link/ether 52:54:00:9f:ec:02 brd ff:ff:ff:ff:ff:ff
11: virbr0-nic: <BROADCAST,MULTICAST> mtu 1500 qdisc pfifo_fast master virbr0 state DOWN mode DEFAULT qlen 500
    link/ether 52:54:00:9f:ec:02 brd ff:ff:ff:ff:ff:ff

3. [root@hp-dl380pg8-16 ~]# virsh nodedev-list --tree
  +- pci_0000_20_02_0
  |   |
  |   +- pci_0000_24_00_0
  |

4. [root@hp-dl380pg8-16 ~]# virsh nodedev-dumpxml pci_0000_24_00_0
<device>
  <name>pci_0000_24_00_0</name>
  <path>/sys/devices/pci0000:20/0000:20:02.0/0000:24:00.0</path>
  <parent>pci_0000_20_02_0</parent>
  <driver>
    <name>vfio-pci</name>
  </driver>
  <capability type='pci'>
    <domain>0</domain>
    <bus>36</bus>
    <slot>0</slot>
    <function>0</function>
    <product id='0x1003'>MT27500 Family [ConnectX-3]</product>
    <vendor id='0x15b3'>Mellanox Technologies</vendor>
    <iommuGroup number='49'>
      <address domain='0x0000' bus='0x24' slot='0x00' function='0x0'/>
    </iommuGroup>
    <numa node='1'/>
    <pci-express>
      <link validity='cap' port='8' speed='8' width='8'/>
      <link validity='sta' speed='8' width='8'/>
    </pci-express>
  </capability>
</device>

5. [root@hp-dl380pg8-16 ~]# cat pci_0000_24_00_0.xml
<hostdev mode='subsystem' type='pci' managed='no'>
  <source>
    <address bus='0x24' slot='0x00' function='0x0'/>
  </source>
</hostdev>

6. [root@hp-dl380pg8-16 ~]# virsh attach-device r6 pci_0000_24_00_0.xml
error: Failed to attach device from pci_0000_24_00_0.xml
error: internal error: unable to execute QEMU command 'device_add': Device initialization failed.
7. Add the information below to the guest's XML:
<hostdev mode='subsystem' type='pci' managed='no'>
  <source>
    <address bus='0x24' slot='0x00' function='0x0'/>
  </source>
</hostdev>

8. [root@hp-dl380pg8-16 ~]# virsh start r6
error: Failed to start domain r6
error: internal error: process exited while connecting to monitor: RHEL-6 compat: ich9-usb-uhci1: irq_pin = 3
RHEL-6 compat: ich9-usb-uhci2: irq_pin = 3
RHEL-6 compat: ich9-usb-uhci3: irq_pin = 3
2016-02-06T05:41:15.253268Z qemu-kvm: -device vfio-pci,host=24:00.0,id=hostdev0,bus=pci.0,addr=0x3: vfio: failed to set iommu for container: Operation not permitted
2016-02-06T05:41:15.253303Z qemu-kvm: -device vfio-pci,host=24:00.0,id=hostdev0,bus=pci.0,addr=0x3: vfio: failed to setup container for group 49
2016-02-06T05:41:15.253314Z qemu-kvm: -device vfio-pci,host=24:00.0,id=hostdev0,bus=pci.0,addr=0x3: vfio: failed to get group 49
2016-02-06T05:41:15.253328Z qemu-kvm: -device vfio-pci,host=24:00.0,id=hostdev0,bus=pci.0,addr=0x3: Device initialization failed.
2016-02-06T05:41:15.253343Z qemu-kvm: -device vfio-pci,host=24:00.0,id=hostdev0,bus=pci.0,addr=0x3: Device 'vfio-pci' could not be initialized

a) The allow_unsafe_assigned_interrupts=1 option to kvm should never, ever, ever be used unless the system requires it. I've made this explicitly clear every time I review the QE test plans.

b) Check dmesg, you're on an hp system and I expect there's an error there complaining that the device is RMRR locked and cannot be used for device assignment. This is a platform issue with the system that can only be resolved by the system vendor. Find a non-HP system to use for testing.

If Alex's suspicions are correct, please retest on a different system and close the BZ if the situation is resolved.

(In reply to Alex Williamson from comment #4)
> a) The allow_unsafe_assigned_interrupts=1 option to kvm should never, ever,
> ever be used unless the system requires it. I've made this explicitly clear
> every time I review the QE test plans.
>
> b) Check dmesg, you're on an hp system and I expect there's an error there
> complaining that the device is RMRR locked and cannot be used for device
> assignment. This is a platform issue with the system that can only be
> resolved by the system vendor. Find a non-HP system to use for testing.

a) Sorry that the pci-assignment test case was not updated in time. I have now updated the test plan and test case for this part.

b) Yes, I got the error:

# dmesg | grep RMRR
[ 2603.224838] vfio-pci 0000:24:00.0: Device is ineligible for IOMMU domain attach due to platform RMRR requirement. Contact your platform vendor.

(In reply to Laine Stump from comment #6)
> If Alex's suspicions are correct, please retest on a different system and
> close the BZ if the situation is resolved.

I will update the result when I get a non-HP system with a Mellanox card next week.

(In reply to Jingjing Shao from comment #8)
> (In reply to Laine Stump from comment #6)
> > If Alex's suspicions are correct, please retest on a different system and
> > close the BZ if the situation is resolved.
>
> I will update the result when I get a non-HP system with a Mellanox card
> next week.

I checked the issue on a Dell R730 with a Mellanox card:

# lspci | grep Eth
01:00.0 Ethernet controller: Broadcom Corporation NetXtreme BCM5720 Gigabit Ethernet PCIe
01:00.1 Ethernet controller: Broadcom Corporation NetXtreme BCM5720 Gigabit Ethernet PCIe
02:00.0 Ethernet controller: Broadcom Corporation NetXtreme BCM5720 Gigabit Ethernet PCIe
02:00.1 Ethernet controller: Broadcom Corporation NetXtreme BCM5720 Gigabit Ethernet PCIe
04:00.0 Ethernet controller: Mellanox Technologies MT27500 Family [ConnectX-3]
04:00.1 Ethernet controller: Mellanox Technologies MT27500/MT27520 Family [ConnectX-3/ConnectX-3 Pro Virtual Function]
04:00.2 Ethernet controller: Mellanox Technologies MT27500/MT27520 Family [ConnectX-3/ConnectX-3 Pro Virtual Function]

# cat hostdev.xml
<hostdev mode='subsystem' type='pci' managed='yes'>
  <source>
    <address domain='0x0' bus='0x04' slot='0x00' function='0x0'/>
  </source>
</hostdev>

# virsh start r7.1
Domain r7.1 started

# virsh dumpxml r7.1 | grep hostdev -A9
#
#
#
# virsh attach-device r7.1 hostdev.xml
Device attached successfully

# virsh dumpxml r7.1 | grep hostdev -A9
<hostdev mode='subsystem' type='pci' managed='yes'>
  <driver name='vfio'/>
  <source>
    <address domain='0x0000' bus='0x04' slot='0x00' function='0x0'/>
  </source>
  <alias name='hostdev0'/>
  <address type='pci' domain='0x0000' bus='0x00' slot='0x03' function='0x0'/>
</hostdev>

# virsh detach-device r7.1 hostdev.xml
Device detached successfully

# virsh dumpxml r7.1 | grep hostdev -A9
#
#
#

*** This bug has been marked as a duplicate of bug 1097907 ***
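
For anyone who lands here with the same "failed to set iommu for container: Operation not permitted" error, the dmesg check from this thread can be scripted before attempting assignment. The following is only a minimal sketch: the function name `rmrr_locked_devices` is hypothetical, and the pattern is taken from the single kernel message quoted above, so other kernel versions may word it differently.

```python
import re

# Assumption: the pattern mirrors the one kernel message quoted in this bug;
# it is not taken from kernel documentation and may not match other kernels.
RMRR_LINE = re.compile(
    r"(?P<addr>[0-9a-fA-F]{4}:[0-9a-fA-F]{2}:[0-9a-fA-F]{2}\.[0-7]): "
    r"Device is ineligible for IOMMU domain attach due to platform RMRR requirement"
)


def rmrr_locked_devices(dmesg_text):
    """Return PCI addresses reported as RMRR-locked, in order of first appearance."""
    addresses = []
    for line in dmesg_text.splitlines():
        match = RMRR_LINE.search(line)
        if match and match.group("addr") not in addresses:
            addresses.append(match.group("addr"))
    return addresses


# The line reported on the HP host in this bug.
sample = (
    "[ 2603.224838] vfio-pci 0000:24:00.0: Device is ineligible for IOMMU "
    "domain attach due to platform RMRR requirement. Contact your platform vendor."
)
print(rmrr_locked_devices(sample))
```

Feeding it the saved output of `dmesg` gives the list of devices the platform has locked. An empty list only means this particular message was not found; it does not guarantee that assignment will succeed.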