
Bug 1302184

Summary: Failed to assign/hot-plug a PF to a guest on a host with a Mellanox card
Product: Red Hat Enterprise Linux 7
Reporter: Jingjing Shao <jishao>
Component: libvirt
Assignee: Laine Stump <laine>
Status: CLOSED DUPLICATE
QA Contact: Virtualization Bugs <virt-bugs>
Severity: medium
Docs Contact:
Priority: medium
Version: 7.2
CC: alex.williamson, dyuan, jishao, mzhan, rbalakri
Target Milestone: rc
Target Release: ---
Hardware: Unspecified
OS: Unspecified
Whiteboard:
Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2016-07-06 07:49:01 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:

Comment 2 Laine Stump 2016-02-05 15:02:39 UTC
Please try separating the unbind of host driver from the device assignment to see which step is failing, i.e. first run:

   virsh nodedev-detach pci_0000_24_00_0

then see if the netdevs have disappeared from "ip link show" output, as well as grabbing the output of "virsh nodedev-dumpxml pci_0000_24_00_0".

If that is successful, try attaching the device with "managed='no'" in the XML.

Comment 3 Jingjing Shao 2016-02-06 05:43:20 UTC
(In reply to Laine Stump from comment #2)
> Please try separating the unbind of host driver from the device assignment
> to see which step is failing, i.e. first run:
> 
>    virsh nodedev-detach pci_0000_24_00_0
> 
> then see if the netdevs have disappeared from "ip link show" output, as well
> as grabbing the output of "virsh nodedev-dumpxml pci_0000_24_00_0".
> 
> If that is successful, try attaching the device with "managed='no'" in the
> XML.

Hi Laine, 

I followed the steps you suggested and still got the errors:


1.[root@hp-dl380pg8-16 ~]# virsh nodedev-detach pci_0000_24_00_0
Device pci_0000_24_00_0 detached


2.[root@hp-dl380pg8-16 ~]# ip link show
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN mode DEFAULT 
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
2: eno1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP mode DEFAULT qlen 1000
    link/ether 28:80:23:9d:46:9c brd ff:ff:ff:ff:ff:ff
3: ens2f0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP mode DEFAULT qlen 1000
    link/ether 90:e2:ba:29:c0:ac brd ff:ff:ff:ff:ff:ff
4: ens2f1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP mode DEFAULT qlen 1000
    link/ether 90:e2:ba:29:c0:ad brd ff:ff:ff:ff:ff:ff
5: eno2: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 qdisc mq state DOWN mode DEFAULT qlen 1000
    link/ether 28:80:23:9d:46:9d brd ff:ff:ff:ff:ff:ff
6: eno3: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 qdisc mq state DOWN mode DEFAULT qlen 1000
    link/ether 28:80:23:9d:46:9e brd ff:ff:ff:ff:ff:ff
7: eno4: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 qdisc mq state DOWN mode DEFAULT qlen 1000
    link/ether 28:80:23:9d:46:9f brd ff:ff:ff:ff:ff:ff
10: virbr0: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 qdisc noqueue state DOWN mode DEFAULT 
    link/ether 52:54:00:9f:ec:02 brd ff:ff:ff:ff:ff:ff
11: virbr0-nic: <BROADCAST,MULTICAST> mtu 1500 qdisc pfifo_fast master virbr0 state DOWN mode DEFAULT qlen 500
    link/ether 52:54:00:9f:ec:02 brd ff:ff:ff:ff:ff:ff


3.[root@hp-dl380pg8-16 ~]# virsh nodedev-list --tree

 +- pci_0000_20_02_0
  |   |
  |   +- pci_0000_24_00_0
  |     



4.[root@hp-dl380pg8-16 ~]# virsh nodedev-dumpxml pci_0000_24_00_0
<device>
  <name>pci_0000_24_00_0</name>
  <path>/sys/devices/pci0000:20/0000:20:02.0/0000:24:00.0</path>
  <parent>pci_0000_20_02_0</parent>
  <driver>
    <name>vfio-pci</name>
  </driver>
  <capability type='pci'>
    <domain>0</domain>
    <bus>36</bus>
    <slot>0</slot>
    <function>0</function>
    <product id='0x1003'>MT27500 Family [ConnectX-3]</product>
    <vendor id='0x15b3'>Mellanox Technologies</vendor>
    <iommuGroup number='49'>
      <address domain='0x0000' bus='0x24' slot='0x00' function='0x0'/>
    </iommuGroup>
    <numa node='1'/>
    <pci-express>
      <link validity='cap' port='8' speed='8' width='8'/>
      <link validity='sta' speed='8' width='8'/>
    </pci-express>
  </capability>
</device>
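(As an aside, not part of the original report: the detach in step 4 can also be verified programmatically by parsing the nodedev-dumpxml output and checking that the bound driver is vfio-pci. A minimal sketch; the XML literal below is an abridged copy of the step-4 output, and in a real script it would come from running `virsh nodedev-dumpxml` itself.)

```python
import xml.etree.ElementTree as ET

# Abridged copy of the "virsh nodedev-dumpxml pci_0000_24_00_0" output above.
dumpxml = """\
<device>
  <name>pci_0000_24_00_0</name>
  <driver>
    <name>vfio-pci</name>
  </driver>
  <capability type='pci'>
    <iommuGroup number='49'>
      <address domain='0x0000' bus='0x24' slot='0x00' function='0x0'/>
    </iommuGroup>
  </capability>
</device>
"""

root = ET.fromstring(dumpxml)
driver = root.findtext("driver/name")                     # "vfio-pci" once detached
group = root.find("capability/iommuGroup").get("number")  # IOMMU group holding the PF
print(driver, group)
```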


5.[root@hp-dl380pg8-16 ~]# cat pci_0000_24_00_0.xml 
 <hostdev mode='subsystem' type='pci' managed='no'>
 <source>
 <address bus='0x24' slot='0x00' function='0x0'/>
 </source>
 </hostdev>


6.[root@hp-dl380pg8-16 ~]# virsh attach-device r6 pci_0000_24_00_0.xml
error: Failed to attach device from pci_0000_24_00_0.xml
error: internal error: unable to execute QEMU command 'device_add': Device initialization failed.


7. Add the following to the guest's XML:
 <hostdev mode='subsystem' type='pci' managed='no'>
 <source>
 <address bus='0x24' slot='0x00' function='0x0'/>
 </source>
 </hostdev>


8.[root@hp-dl380pg8-16 ~]# virsh start r6
error: Failed to start domain r6
error: internal error: process exited while connecting to monitor: RHEL-6 compat: ich9-usb-uhci1: irq_pin = 3
RHEL-6 compat: ich9-usb-uhci2: irq_pin = 3
RHEL-6 compat: ich9-usb-uhci3: irq_pin = 3
2016-02-06T05:41:15.253268Z qemu-kvm: -device vfio-pci,host=24:00.0,id=hostdev0,bus=pci.0,addr=0x3: vfio: failed to set iommu for container: Operation not permitted
2016-02-06T05:41:15.253303Z qemu-kvm: -device vfio-pci,host=24:00.0,id=hostdev0,bus=pci.0,addr=0x3: vfio: failed to setup container for group 49
2016-02-06T05:41:15.253314Z qemu-kvm: -device vfio-pci,host=24:00.0,id=hostdev0,bus=pci.0,addr=0x3: vfio: failed to get group 49
2016-02-06T05:41:15.253328Z qemu-kvm: -device vfio-pci,host=24:00.0,id=hostdev0,bus=pci.0,addr=0x3: Device initialization failed.
2016-02-06T05:41:15.253343Z qemu-kvm: -device vfio-pci,host=24:00.0,id=hostdev0,bus=pci.0,addr=0x3: Device 'vfio-pci' could not be initialized
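(Note the order of those messages: the first failure is at the IOMMU-container stage, "failed to set iommu for container: Operation not permitted", and everything after it is fallout. A hypothetical triage snippet, not from this bug, that keys on that signature to separate platform-level problems from generic device-init failures:)

```python
# Hypothetical triage helper (illustration only): a VFIO failure at the
# IOMMU-container attach step points at the platform (e.g. RMRR-locked
# devices), not at libvirt/QEMU configuration.
IOMMU_SIGNATURE = "failed to set iommu for container"

def is_iommu_container_failure(log_text: str) -> bool:
    """True if the QEMU error log shows VFIO failing at the
    container/IOMMU attach step rather than later in device init."""
    return IOMMU_SIGNATURE in log_text

# Abridged lines from the "virsh start" failure above:
log = ("vfio: failed to set iommu for container: Operation not permitted\n"
       "vfio: failed to setup container for group 49\n"
       "Device 'vfio-pci' could not be initialized\n")
print(is_iommu_container_failure(log))  # True
```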

Comment 4 Alex Williamson 2016-06-29 18:50:13 UTC
a) The allow_unsafe_assigned_interrupts=1 option to kvm should never, ever, ever be used unless the system requires it.  I've made this explicitly clear every time I review the QE test plans.

b) Check dmesg; you're on an HP system, and I expect there's an error there complaining that the device is RMRR locked and cannot be used for device assignment.  This is a platform issue with the system that can only be resolved by the system vendor.  Find a non-HP system to use for testing.

Comment 5 Alex Williamson 2016-06-29 18:52:13 UTC
For reference:

https://access.redhat.com/sites/default/files/attachments/rmrr-wp1.pdf

Comment 6 Laine Stump 2016-06-29 18:55:17 UTC
If Alex's suspicions are correct, please retest on a different system and close the BZ if the situation is resolved.

Comment 7 Jingjing Shao 2016-07-01 12:41:12 UTC
(In reply to Alex Williamson from comment #4)
> a) The allow_unsafe_assigned_interrupts=1 option to kvm should never, ever,
> ever be used unless the system requires it.  I've made this explicitly clear
> every time I review the QE test plans.
> 
> b) Check dmesg, you're on an hp system and I expect there's an error there
> complaining that the device is RMRR locked and cannot be used for device
> assignment.  This is a platform issue with the system that can only be
> resolved by the system vendor.  Find a non-HP system to use for testing.

a) Sorry that the PCI-assignment test case was not updated in a timely manner. I have updated the test plan and test cases for this part.

b) Yes, I got the error:
# dmesg | grep  RMRR
[ 2603.224838] vfio-pci 0000:24:00.0: Device is ineligible for IOMMU domain attach due to platform RMRR requirement.  Contact your platform vendor.

Comment 8 Jingjing Shao 2016-07-01 12:44:40 UTC
(In reply to Laine Stump from comment #6)
> If Alex's suspicions are correct, please retest on a different system and
> close the BZ if the situation is resolved.

I will update the result when I get a non-HP system with a Mellanox card next week.

Comment 9 Jingjing Shao 2016-07-06 07:49:01 UTC
(In reply to Jingjing Shao from comment #8)
> (In reply to Laine Stump from comment #6)
> > If Alex's suspicions are correct, please retest on a different system and
> > close the BZ if the situation is resolved.
> 
> I will update the result when I get a non-HP system with a Mellanox card
> next week.

I checked the issue on a Dell R730 with a Mellanox card:
# lspci | grep Eth
01:00.0 Ethernet controller: Broadcom Corporation NetXtreme BCM5720 Gigabit Ethernet PCIe
01:00.1 Ethernet controller: Broadcom Corporation NetXtreme BCM5720 Gigabit Ethernet PCIe
02:00.0 Ethernet controller: Broadcom Corporation NetXtreme BCM5720 Gigabit Ethernet PCIe
02:00.1 Ethernet controller: Broadcom Corporation NetXtreme BCM5720 Gigabit Ethernet PCIe
04:00.0 Ethernet controller: Mellanox Technologies MT27500 Family [ConnectX-3]
04:00.1 Ethernet controller: Mellanox Technologies MT27500/MT27520 Family [ConnectX-3/ConnectX-3 Pro Virtual Function]
04:00.2 Ethernet controller: Mellanox Technologies MT27500/MT27520 Family [ConnectX-3/ConnectX-3 Pro Virtual Function]

# cat hostdev.xml 
 <hostdev mode='subsystem' type='pci'  managed='yes'>
 <source>
  <address domain='0x0' bus='0x04' slot='0x00' function='0x0'/>
 </source>
</hostdev>


# virsh start r7.1
Domain r7.1 started


# virsh dumpxml r7.1 | grep hostdev -A9
# 
# 
# 


# virsh attach-device  r7.1 hostdev.xml 
Device attached successfully

# virsh dumpxml r7.1 | grep hostdev -A9
    <hostdev mode='subsystem' type='pci' managed='yes'>
      <driver name='vfio'/>
      <source>
        <address domain='0x0000' bus='0x04' slot='0x00' function='0x0'/>
      </source>
      <alias name='hostdev0'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x03' function='0x0'/>
    </hostdev>

# virsh detach-device r7.1  hostdev.xml 
Device detached successfully


# virsh dumpxml r7.1 | grep hostdev -A9
#
#
#

*** This bug has been marked as a duplicate of bug 1097907 ***