Bug 1040626 - Error starting domain: internal error: missing IFLA_VF_INFO in netlink response
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: Red Hat Enterprise Linux 7
Classification: Red Hat
Component: libnl3
Version: 7.0
Hardware: Unspecified
OS: Unspecified
Priority: urgent
Severity: urgent
Target Milestone: rc
Target Release: ---
Assignee: Thomas Graf
QA Contact: Desktop QE
URL:
Whiteboard:
Depends On:
Blocks: 1067873
 
Reported: 2013-12-11 17:51 UTC by Alex Williamson
Modified: 2015-02-23 22:54 UTC
CC List: 18 users

Fixed In Version: libnl3-3.2.21-5.el7
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2014-06-13 09:54:48 UTC
Target Upstream Version:
Embargoed:



Description Alex Williamson 2013-12-11 17:51:14 UTC
Description of problem:

Got the following error attempting to start a domain:

Error starting domain: internal error: missing IFLA_VF_INFO in netlink response

Traceback (most recent call last):
  File "/usr/share/virt-manager/virtManager/asyncjob.py", line 100, in cb_wrapper
    callback(asyncjob, *args, **kwargs)
  File "/usr/share/virt-manager/virtManager/asyncjob.py", line 122, in tmpcb
    callback(*args, **kwargs)
  File "/usr/share/virt-manager/virtManager/domain.py", line 1220, in startup
    self._backend.create()
  File "/usr/lib64/python2.7/site-packages/libvirt.py", line 69

This occurs after adding the following xml fragment to the VM (cold add):

    <interface type='hostdev' managed='yes'>
      <mac address='02:10:91:73:00:00'/>
      <driver name='vfio'/>
      <source>
        <address type='pci' domain='0x0000' bus='0x01' slot='0x10' function='0x0'/>
      </source>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x08' function='0x0'/>
    </interface>

This same fragment works on F20.

If I instead add the device with this fragment:

    <hostdev mode='subsystem' type='pci' managed='yes'>
      <driver name='vfio'/>
      <source>
        <address domain='0x0000' bus='0x01' slot='0x10' function='0x0'/>
      </source>
      <alias name='hostdev0'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x08' function='0x0'/>
    </hostdev>

It works, but now I'm not able to have libvirt program the MAC address of the 82599 VF being assigned.
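A minimal manual workaround sketch, assuming a hypothetical PF name of enp1s0f0 and that the assigned function corresponds to VF index 16 (both placeholders; substitute the real PF and VF number): the VF MAC can be programmed on the PF with iproute2 before starting the guest.

# ip link set enp1s0f0 vf 16 mac 02:10:91:73:00:00
# ip link show enp1s0f0

If the driver accepts the request, the vf 16 entry in the ip link output reports the new MAC.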

Version-Release number of selected component (if applicable):
libvirt-1.1.1-14.el7.x86_64


Comment 1 Alex Williamson 2013-12-11 18:18:03 UTC
Note that the failing xml follows the example provided here:

http://libvirt.org/formatdomain.html#elementsNICSHostdev

Comment 7 Alex Williamson 2014-01-20 22:54:12 UTC
An 82599 supports 64 VFs per PF.  Binary search says that the problem only occurs for 32 or more VFs, same as report in comment 5.
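One way to run that bisect, assuming a hypothetical PF name of enp1s0f0, is the sriov_numvfs sysfs knob (present in the RHEL 7 kernel); the count must be reset to 0 before it can be changed:

# echo 0 > /sys/class/net/enp1s0f0/device/sriov_numvfs
# echo 31 > /sys/class/net/enp1s0f0/device/sriov_numvfs
(with 31 VFs the guest starts)
# echo 0 > /sys/class/net/enp1s0f0/device/sriov_numvfs
# echo 32 > /sys/class/net/enp1s0f0/device/sriov_numvfs
(with 32 VFs the start fails with the IFLA_VF_INFO error)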

Comment 8 Alex Williamson 2014-01-20 22:58:47 UTC
Re-adding tgraf needinfo from comment 3

Comment 11 Alex Williamson 2014-01-20 23:31:23 UTC
Just to confirm: with the *4 fix that went into libnl 1.1.4, I can start the VM with the maximum of 63 VFs configured.
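Whether a given host already carries the RHEL 7 fix can be checked against the Fixed In Version field above:

# rpm -q libnl3
libnl3-3.2.21-5.el7.x86_64

Any libnl3 at 3.2.21-5.el7 or later includes the fix tracked by this bug.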

Comment 16 Thomas Graf 2014-02-26 13:17:00 UTC
*** Bug 1069548 has been marked as a duplicate of this bug. ***

Comment 17 Yulong Pei 2014-02-27 07:55:11 UTC
This bug blocks SR-IOV testing of igb and bnx2x NICs, so I am setting the TestBlocker flag.

Comment 18 Xuesong Zhang 2014-03-06 11:14:25 UTC
Tested with the latest build; this bug can be moved to VERIFIED now.

package version:
libvirt-1.1.1-26.el7.x86_64
qemu-kvm-rhev-1.5.3-52.el7.x86_64
kernel-3.10.0-105.el7.x86_64
libnl3-3.2.21-5.el7.x86_64

steps:
1. Find a host with an 82599 SR-IOV card and create the maximum number of VFs on it, making sure the VF count is larger than 32.
# lspci|grep 82599|wc -l
128

2. Add the following XML to a shut-off guest.
    <interface type='hostdev' managed='yes'>
      <mac address='52:54:00:0e:09:61'/>
      <source>
        <address type='pci' domain='0x0000' bus='0x44' slot='0x1f' function='0x4'/>
      </source>
    </interface>

3. The guest starts up without any error.
# virsh start a
Domain a started

4. Check the guest's dumpxml and make sure the interface is present.
# virsh dumpxml a|grep hostdev -A5
    <interface type='hostdev' managed='yes'>
      <mac address='52:54:00:0e:09:61'/>
      <driver name='vfio'/>
      <source>
        <address type='pci' domain='0x0000' bus='0x44' slot='0x1f' function='0x4'/>
      </source>
      <alias name='hostdev0'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x08' function='0x0'/>
    </interface>

Comment 21 Ludek Smid 2014-06-13 09:54:48 UTC
This request was resolved in Red Hat Enterprise Linux 7.0.

Contact your manager or support representative in case you have further questions about the request.

Comment 23 florin.stingaciu 2015-02-23 21:42:39 UTC
(In reply to Ludek Smid from comment #21)
> This request was resolved in Red Hat Enterprise Linux 7.0.
> 
> Contact your manager or support representative in case you have further
> questions about the request.

I am experiencing this same issue while trying to boot a VM. I'm using a Mellanox ConnectX-3 configured with 8 VFs on a hypervisor running CentOS 7.

01:00.0 Network controller: Mellanox Technologies MT27500 Family [ConnectX-3]
01:00.1 Network controller: Mellanox Technologies MT27500 Family [ConnectX-3 Virtual Function]
01:00.2 Network controller: Mellanox Technologies MT27500 Family [ConnectX-3 Virtual Function]
01:00.3 Network controller: Mellanox Technologies MT27500 Family [ConnectX-3 Virtual Function]
01:00.4 Network controller: Mellanox Technologies MT27500 Family [ConnectX-3 Virtual Function]
01:00.5 Network controller: Mellanox Technologies MT27500 Family [ConnectX-3 Virtual Function]
01:00.6 Network controller: Mellanox Technologies MT27500 Family [ConnectX-3 Virtual Function]
01:00.7 Network controller: Mellanox Technologies MT27500 Family [ConnectX-3 Virtual Function]
01:01.0 Network controller: Mellanox Technologies MT27500 Family [ConnectX-3 Virtual Function]

Here are the relevant package versions:
libnl3-3.2.21-6.el7.x86_64
kernel-3.10.0-123.el7.x86_64
libvirt-1.1.1-29.el7_0.7.x86_64
qemu-kvm-1.5.3-60.el7_0.11.x86_64

The configuration for the PCI interface on the VM:
    <interface type='hostdev' managed='yes'>
      <mac address='52:54:00:c0:34:2b'/>
      <source>
        <address type='pci' domain='0x0000' bus='0x01' slot='0x00' function='0x1'/>
      </source>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x03' function='0x0'/>
    </interface>

This configuration fails upon boot with the following error:
error: internal error: missing IFLA_VF_INFO in netlink response

If I define a PCI device in the following manner, the VM boots up fine and I can see the interface:
    <hostdev mode='subsystem' type='pci' managed='yes'>
      <source>
        <address domain='0x0000' bus='0x01' slot='0x00' function='0x1'/>
      </source>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x06' function='0x0'/>
    </hostdev> 

One thing worth mentioning is that the VFs are on top of an InfiniBand interface. I've been troubleshooting this for a couple of days now without any luck. I've also brought this to the attention of the libvirt mailing list. Any help would be greatly appreciated.

Comment 24 Laine Stump 2015-02-23 22:33:35 UTC
Mellanox cards are a bit different from other SR-IOV cards, and their drivers are (or at least until very recently were) under active development to make them behave more like standard SR-IOV devices. The problem you are experiencing may have the same symptoms as this BZ, but it is not the same problem.
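A quick way to check whether a driver exposes the per-VF data libvirt needs here is to list the PF with iproute2 and look for per-VF lines (enp1s0f0 is a placeholder; the output shown is illustrative):

# ip link show enp1s0f0
2: enp1s0f0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 ...
    vf 0 MAC 00:00:00:00:00:00
    vf 1 MAC 00:00:00:00:00:00

If a PF has VFs enabled but no "vf N MAC ..." lines appear, the kernel is not returning IFLA_VF_INFO for that device, and <interface type='hostdev'> will fail with the same error message for a different reason.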

Comment 25 florin.stingaciu 2015-02-23 22:45:18 UTC
(In reply to Laine Stump from comment #24)
> Mellanox cards are a bit different from other SR-IOV cards, and their
> drivers are (or at least until very recently were) under active development
> to make them behave more like standard SR-IOV devices. The problem you are
> experiencing may have the same symptoms as this BZ, but it is not the same
> problem.

Should I open a new ticket or should I attempt to get in touch with Mellanox?

Comment 26 Laine Stump 2015-02-23 22:54:26 UTC
I would recommend direct communication with Mellanox.

