Bug 1113474 - The VF is failed to be used in macvtap-passthrough network if it has been used in hostdev network
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Enterprise Linux 6
Classification: Red Hat
Component: libvirt
Version: 6.6
Hardware: Unspecified
OS: Unspecified
Priority: medium
Severity: medium
Target Milestone: rc
Target Release: ---
Assignee: Laine Stump
QA Contact: Virtualization Bugs
URL:
Whiteboard:
Keywords:
Depends On:
Blocks:
Reported: 2014-06-26 09:18 UTC by hongming
Modified: 2015-07-22 05:46 UTC (History)
Previously, a virtual function (VF) could not be used in the macvtap-passthrough network if it was previously used in the hostdev network. With this update, libvirt ensures that the VF's MAC address is properly adjusted for the macvtap-passthrough network, which allows the VF to be used properly in the described scenario.
Clone Of:
: 1211758 (view as bug list)
Last Closed: 2015-07-22 05:46:00 UTC


External Trackers
Tracker | ID | Priority | Status | Summary | Last Updated
Red Hat Product Errata | RHBA-2015:1252 | normal | SHIPPED_LIVE | libvirt bug fix update | 2015-07-20 17:50:06 UTC

Description hongming 2014-06-26 09:18:24 UTC
Description of problem:
A VF cannot be used in a macvtap-passthrough network after it has been used in a hostdev network.

Version-Release number of selected component (if applicable):
libvirt-0.10.2-38.el6.x86_64

How reproducible:
100%

Steps to Reproduce:
# rmmod igb

# modprobe igb max_vfs=1

# lspci|grep 82576
0e:00.0 Ethernet controller: Intel Corporation 82576 Gigabit Network Connection (rev 01)
0e:00.1 Ethernet controller: Intel Corporation 82576 Gigabit Network Connection (rev 01)
0f:10.0 Ethernet controller: Intel Corporation 82576 Virtual Function (rev 01)
0f:10.1 Ethernet controller: Intel Corporation 82576 Virtual Function (rev 01)
10:00.0 Ethernet controller: Intel Corporation 82576 Gigabit Network Connection (rev 01)
10:00.1 Ethernet controller: Intel Corporation 82576 Gigabit Network Connection (rev 01)
11:10.0 Ethernet controller: Intel Corporation 82576 Virtual Function (rev 01)
11:10.1 Ethernet controller: Intel Corporation 82576 Virtual Function (rev 01)

# virsh nodedev-list --tree
computer
   |
.....

   |       |   +- pci_0000_0f_10_1
   |       |       |
   |       |       +- net_eth7_62_91_f6_77_d2_1a
......


# virsh net-list
Name                 State      Autostart     Persistent
--------------------------------------------------
default              active     yes           yes
hostdev              active     no            yes
passthrough          active     no            yes

# virsh net-dumpxml hostdev
<network>
   <name>hostdev</name>
   <uuid>7354df99-a04a-ee96-7b55-97a1029ab8bc</uuid>
   <forward mode='hostdev' managed='yes'>
     <address type='pci' domain='0x0000' bus='0x0f' slot='0x10' function='0x1'/>
   </forward>
</network>

# virsh net-dumpxml passthrough
<network>
   <name>passthrough</name>
   <uuid>a0d30d24-96fd-c62c-9de6-48f6afda28f6</uuid>
   <forward dev='eth7' mode='passthrough'>
     <interface dev='eth7'/>
   </forward>
</network>

# virsh start rhel6
Domain rhel6 started

# cat if-hostdev.xml
<interface type='network'>
    <source network='hostdev'/>
</interface>

# virsh attach-device rhel6 if-hostdev.xml  <====== Hotplug the VF (eth7, 0f:10.1) into the rhel6 guest from the hostdev network

Device attached successfully

# virsh net-dumpxml hostdev
<network connections='1'>
   <name>hostdev</name>
   <uuid>7354df99-a04a-ee96-7b55-97a1029ab8bc</uuid>
   <forward mode='hostdev' managed='yes'>
     <address type='pci' domain='0x0000' bus='0x0f' slot='0x10' function='0x1'/>
   </forward>
</network>

# virsh destroy rhel6  <====== Destroy the rhel6 guest and release the VF (eth7, 0f:10.1)

Domain rhel6 destroyed

# virsh net-dumpxml hostdev
<network>
   <name>hostdev</name>
   <uuid>7354df99-a04a-ee96-7b55-97a1029ab8bc</uuid>
   <forward mode='hostdev' managed='yes'>
     <address type='pci' domain='0x0000' bus='0x0f' slot='0x10' function='0x1'/>
   </forward>
</network>

# virsh start rhel6
Domain rhel6 started

# cat if-passthrogh.xml
<interface type='network'>
    <source network='passthrough'/>
</interface>

# virsh attach-device rhel6 if-passthrogh.xml
error: Failed to attach device from if-passthrogh.xml
error: Cannot set interface MAC on 'eth7': Cannot assign requested address




Actual results:
The VF cannot be used in the macvtap-passthrough network after it has been used in the hostdev network.

Expected results:
The VF can be used successfully in the macvtap-passthrough network after it has been used in the hostdev network.


Additional info:

Comment 2 Laine Stump 2014-06-26 10:49:06 UTC
It would be useful to reference Bug 1111455 here, because this one gets at a lower-level cause of the same problem that is behind the other. Apparently, for some reason the VF is not getting reattached to the igbvf driver when the guest has finished using it for PCI passthrough. Since it isn't bound to a netdev driver, it has no network device name, and so it can't be used for macvtap (which accesses the device via its netdev name, not via its PCI address).

Can you try doing a manual reattach of the device with

  virsh nodedev-reattach pci_0000_0f_10_1

to see if that gets back the netdev name?

This may tell us if it's a problem in libvirt's auto-management of device driver attachment, or something lower down, perhaps in the kernel.
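
The check suggested above (is the VF bound to a host driver, and does it have a netdev name?) can also be scripted against sysfs. The following is only a minimal sketch, assuming Python 3 and the usual /sys/bus/pci/devices layout; the helper names nodedev_to_pci_addr and vf_binding are made up for illustration and are not part of libvirt.

```python
import os
import re


def nodedev_to_pci_addr(nodedev_name):
    """Convert a libvirt nodedev name like 'pci_0000_0f_10_1' to '0000:0f:10.1'."""
    m = re.fullmatch(
        r"pci_([0-9a-f]{4})_([0-9a-f]{2})_([0-9a-f]{2})_([0-7])", nodedev_name
    )
    if m is None:
        raise ValueError("not a PCI nodedev name: %r" % nodedev_name)
    domain, bus, slot, func = m.groups()
    return "%s:%s:%s.%s" % (domain, bus, slot, func)


def vf_binding(nodedev_name, sysfs_root="/sys"):
    """Return (driver name or None, sorted list of netdev names) for a PCI device.

    sysfs_root is a parameter only so the function can be tried on a fake tree.
    """
    dev_dir = os.path.join(
        sysfs_root, "bus", "pci", "devices", nodedev_to_pci_addr(nodedev_name)
    )
    # 'driver' is a symlink to the bound driver's directory (e.g. igbvf or
    # pci-stub); it is absent when no driver is bound.
    driver_link = os.path.join(dev_dir, "driver")
    driver = None
    if os.path.islink(driver_link):
        driver = os.path.basename(os.readlink(driver_link))
    # 'net' only exists while a network driver has registered a netdev.
    net_dir = os.path.join(dev_dir, "net")
    netdevs = sorted(os.listdir(net_dir)) if os.path.isdir(net_dir) else []
    return driver, netdevs
```

On a host like the reporter's, vf_binding("pci_0000_0f_10_1") would be expected to report the igbvf driver and a netdev name once the reattach has worked, and no netdev name while the device is still held by pci-stub.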

Comment 3 hongming 2014-06-27 02:41:18 UTC
The results are as follows.

[root@sriov2 images]# virsh attach-device rhel6 if-passthrogh.xml
error: Failed to attach device from if-passthrogh.xml
error: Cannot set interface MAC on 'eth7': Cannot assign requested address

[root@sriov2 images]# virsh nodedev-reattach pci_0000_0f_10_1
Device pci_0000_0f_10_1 re-attached

[root@sriov2 images]# ll /sys/bus/pci/drivers/igbvf/0000:0f:10.1/net
total 0
drwxr-xr-x. 5 root root 0 Jun 27 10:31 eth7

[root@sriov2 images]# virsh attach-device rhel6 if-passthrogh.xml
error: Failed to attach device from if-passthrogh.xml
error: Cannot set interface MAC on 'eth7': Cannot assign requested address

Comment 5 Laine Stump 2015-01-28 19:17:14 UTC
Can you please retest this? I was able to reproduce it once last week on my RHEL6 system, which hadn't been updated in quite a while. But after updating to the latest nightly builds (kernel 2.6.32-526), I can no longer reproduce it.

I hadn't paid attention to the kernel that I was running before, but it may have been the one that was removed during the update (something in the 400s).

Comment 6 hongming 2015-02-10 10:06:04 UTC
I can still reproduce it with the following versions.

# uname -r
2.6.32-528.el6.x86_64

# rpm -q libvirt
libvirt-0.10.2-48.el6.x86_64


# lspci|grep 82576
03:00.0 Ethernet controller: Intel Corporation 82576 Gigabit Network Connection (rev 01)
03:00.1 Ethernet controller: Intel Corporation 82576 Gigabit Network Connection (rev 01)
03:10.0 Ethernet controller: Intel Corporation 82576 Virtual Function (rev 01)
03:10.1 Ethernet controller: Intel Corporation 82576 Virtual Function (rev 01)


# virsh nodedev-list --tree
computer
  |
......
  |   |     
  |   +- pci_0000_03_10_0
  |   |   |
  |   |   +- net_eth3_f6_6f_d0_90_4b_8f

    
# cat if-hostdev.xml 
<interface type='network'>
    <source network='hostdev'/>
    </interface>

# virsh attach-device rhel6 if-hostdev.xml
Device attached successfully

# virsh net-dumpxml hostdev
<network connections='1'>
  <name>hostdev</name>
  <uuid>d58fcb8d-ed2c-74f2-b134-4e2bad788559</uuid>
  <forward mode='hostdev' managed='yes'>
    <address type='pci' domain='0x0000' bus='0x03' slot='0x10' function='0x0'/>
  </forward>
</network>

# virsh destroy rhel6   <====== Release the VF from domain
Domain rhel6 destroyed

# virsh net-dumpxml hostdev
<network>
  <name>hostdev</name>
  <uuid>d58fcb8d-ed2c-74f2-b134-4e2bad788559</uuid>
  <forward mode='hostdev' managed='yes'>
    <address type='pci' domain='0x0000' bus='0x03' slot='0x10' function='0x0'/>
  </forward>
</network>


# virsh start rhel6
Domain rhel6 started


# virsh net-dumpxml passthrough
<network>
  <name>passthrough</name>
  <uuid>d095cca0-a925-3d33-02f3-97528533262c</uuid>
  <forward dev='eth3' mode='passthrough'>
    <interface dev='eth3'/>
  </forward>
</network>

# cat if-passthrough.xml
<interface type='network'>
    <source network='passthrough'/>
    </interface>


# virsh attach-device rhel6 if-passthrough.xml
error: Failed to attach device from if-passthrough.xml
error: Cannot set interface MAC on 'eth3': Cannot assign requested address

Comment 7 Laine Stump 2015-03-23 14:35:56 UTC
Once again I have exactly replicated the steps you provide in comment 6, and there is no error. 

# uname -r
2.6.32-544.el6.x86_64

# rpm -q libvirt
libvirt-0.10.2-50.el6.x86_64

I also have an 82576 card, and also define two networks, one in macvtap passthrough mode and one hostdev, using the same device. I then start a RHEL6 guest with no network devices, and use attach-device to attach a hostdev interface, then destroy the guest, restart it, and use attach-device to attach a macvtap passthrough device (the same VF, as you've done). This all completes successfully.

# ls -l /sys/class/net/eth5
lrwxrwxrwx. 1 root root 0 Mar 23 10:21 /sys/class/net/eth5 -> ../../devices/pci0000:00/0000:00:07.0/0000:03:10.0/net/eth5

# virsh net-dumpxml pass
<network connections='1'>
  <name>pass</name>
  <uuid>d2f151d0-ba2e-9792-4c4b-2721d67964a2</uuid>
  <forward dev='eth5' mode='passthrough'>
    <interface dev='eth5'/>
  </forward>
</network>

# virsh net-dumpxml host
<network>
  <name>host</name>
  <uuid>bc79b4af-7fa3-304a-f11d-141862bcd7f3</uuid>
  <forward mode='hostdev' managed='yes'>
    <address type='pci' domain='0x0000' bus='0x03' slot='0x10' function='0x0'/>
  </forward>
</network>

# cat /tmp/net.xml
    <interface type='network'>
      <source network='host'/>
    </interface>

# cat /tmp/net-macvtap.xml
    <interface type='network'>
      <source network='pass'/>
    </interface>

# virsh start RHEL6-nonet
Domain RHEL6-nonet started

# virsh attach-device RHEL6-nonet /tmp/net.xml
Device attached successfully

# virsh destroy RHEL6-nonet
Domain RHEL6-nonet destroyed

# virsh start RHEL6-nonet
Domain RHEL6-nonet started

# virsh attach-device RHEL6-nonet /tmp/net-macvtap.xml
Device attached successfully

===

The error message you get is interesting:

error: Cannot set interface MAC on 'eth3': Cannot assign requested address

That is a different message than you would get if the device was still bound to the pci-stub driver:

 error: Cannot get interface MAC on 'eth3': No such device

The only suggestion I have at this time is that if you can give me access to the machine you're using for testing, maybe I can figure out what is different from my testing system.

Comment 9 min zhan 2015-03-31 03:08:59 UTC
Adding the needinfo flag back, as it seems to have been removed in comment 6.

Comment 16 Laine Stump 2015-04-08 18:59:10 UTC
Vlad: This failure occurs on QE's machine with 82576 cards, but not on mine. libvirt is able to get the current mac address (SIOCGIFHWADDR) but SIOCSIFHWADDR fails with EADDRNOTAVAIL (99). I've checked and the device is properly re-attached to the igbvf driver and is visible by the system.

What are the possible reasons for EADDRNOTAVAIL in the 2.6.32-whatever kernel when trying to set MAC address on a VF that will be used for macvtap passthrough mode? I've verified that it *isn't* the one reason I know of - it is not a multicast address; I've tried several different addresses, so I don't think it's anything particular about this address.
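
For reference, the two ioctls mentioned above can be exercised outside libvirt. This is a minimal sketch, assuming Python 3 on 64-bit Linux (where struct ifreq is 40 bytes); the function names are made up for illustration, and set_mac needs root. It is not how libvirt itself issues the calls.

```python
import fcntl
import socket
import struct

SIOCGIFHWADDR = 0x8927  # get hardware (MAC) address
SIOCSIFHWADDR = 0x8924  # set hardware (MAC) address
ARPHRD_ETHER = 1        # sa_family value for Ethernet hardware addresses
IFREQ_SIZE = 40         # assumption: sizeof(struct ifreq) on 64-bit Linux


def mac_to_bytes(mac):
    """'52:54:00:e2:ad:ad' -> 6 raw bytes."""
    raw = bytes(int(part, 16) for part in mac.split(":"))
    if len(raw) != 6:
        raise ValueError("expected 6 octets: %r" % mac)
    return raw


def bytes_to_mac(raw):
    """6 raw bytes -> 'xx:xx:xx:xx:xx:xx'."""
    return ":".join("%02x" % b for b in raw)


def pack_ifreq_hwaddr(ifname, mac):
    """Build a struct ifreq: 16-byte name, then a sockaddr holding the MAC."""
    req = struct.pack("16sH14s", ifname.encode(), ARPHRD_ETHER, mac_to_bytes(mac))
    return req.ljust(IFREQ_SIZE, b"\0")


def get_mac(ifname):
    """Read the MAC with SIOCGIFHWADDR (this is the call that succeeded)."""
    req = struct.pack("16s", ifname.encode()).ljust(IFREQ_SIZE, b"\0")
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as s:
        res = fcntl.ioctl(s.fileno(), SIOCGIFHWADDR, req)
    return bytes_to_mac(res[18:24])  # sa_data starts at offset 18


def set_mac(ifname, mac):
    """Write the MAC with SIOCSIFHWADDR; raises OSError on failure.

    In this bug the OSError would carry errno 99 (EADDRNOTAVAIL).
    """
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as s:
        fcntl.ioctl(s.fileno(), SIOCSIFHWADDR, pack_ifreq_hwaddr(ifname, mac))
```

Catching OSError around set_mac("eth7", ...) and printing its errno would be one way to confirm that the failure is EADDRNOTAVAIL rather than, say, ENODEV.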

Comment 17 Laine Stump 2015-04-08 19:33:48 UTC
I also tried attaching the VF to the guest as a simple <hostdev> (no setting of MAC address before or after letting qemu assign it to the guest, as is done with <interface type='hostdev'>), and then using it for macvtap passthrough, and in this case there was no problem. So it appears that there is something else that happens when the MAC address is set or reset for hostdev that prevents future setting of the MAC for macvtap passthrough.

Comment 18 Vlad Yasevich 2015-04-09 16:18:13 UTC
(In reply to Laine Stump from comment #16)
> Vlad: This failure occurs on QE's machine with 82576 cards, but not on mine.
> libvirt is able to get the current mac address (SIOCGIFHWADDR) but
> SIOCSIFHWADDR fails with EADDRNOTAVAIL (99). I've checked and the device is
> properly re-attached to the igbvf driver and is visible by the system.
> 
> What are the possible reasons for EADDRNOTAVAIL in the 2.6.32-whatever
> kernel when trying to set MAC address on a VF that will be used for macvtap
> passthrough mode? I've verified that it *isn't* the one reason I know of -
> it is not a multicast address; I've tried several different addresses, so I
> don't think it's anything particular about this address.

The only time we get this error from the driver is when it thinks the mac
address is invalid (has bit 0 of the highest order byte set).

QE:  Can you please check the dmesg log to see if there are any messages from the
card?  There are some very fine interactions between VFs and PFs when messing around with MAC addresses.

-vlad
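
The validity rule described above can be checked by hand for any candidate address. A minimal sketch, assuming Python 3; the function names are illustrative only, and the extra all-zero check mirrors what the kernel's is_valid_ether_addr() also rejects.

```python
def parse_mac(mac):
    """'52:54:00:e2:ad:ad' -> list of 6 integers."""
    octets = [int(part, 16) for part in mac.split(":")]
    if len(octets) != 6 or any(o < 0 or o > 255 for o in octets):
        raise ValueError("not a MAC address: %r" % mac)
    return octets


def is_multicast(mac):
    """True if bit 0 of the first octet is set (group/multicast address)."""
    return bool(parse_mac(mac)[0] & 0x01)


def is_zero(mac):
    """True for 00:00:00:00:00:00."""
    return not any(parse_mac(mac))


def is_valid_unicast(mac):
    """A MAC a driver would accept as a station address: not multicast, not zero."""
    return not is_multicast(mac) and not is_zero(mac)
```

All of the addresses quoted in this report (for example 52:54:00:e2:ad:ad and 22:eb:b0:ce:30:dc) pass this check, which is consistent with comment 16's observation that the address itself is not the problem.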

Comment 19 Laine Stump 2015-04-09 17:39:11 UTC
Aha!

[just prior to assigning to the guest with PCI passthrough]
[MAC address is set with netlink RTM_SETLINK message to PF giving VF#]
kernel: pci-stub 0000:0f:10.3: claimed by stub
kernel: igb 0000:0e:00.1: setting MAC 52:54:00:e2:ad:ad on VF 1
kernel: igb 0000:0e:00.1: Reload the VF driver to make this change effective.
kernel: pci-stub 0000:0f:10.3: enabling device (0000 -> 0002)

[finished with PCI passthrough -]
[MAC address is set with netlink RTM_SETLINK message to PF giving VF#]
kernel: igb 0000:0e:00.1: setting MAC 22:eb:b0:ce:30:dc on VF 1
kernel: igb 0000:0e:00.1: Reload the VF driver to make this change effective.
kernel: igbvf 0000:0f:10.3: enabling device (0000 -> 0002)
kernel: igb 0000:0e:00.1: VF 1 attempted to override administratively set MAC address
kernel: Reload the VF driver to resume operations
kernel: igbvf 0000:0f:10.3: Intel(R) 82576 Virtual Function
kernel: igbvf 0000:0f:10.3: Address: 22:eb:b0:ce:30:dc

[attempt to set MAC address for macvtap passthrough (using ioctl(SIOCSIFHWADDR))]
kernel: igb 0000:0e:00.1: VF 1 attempted to override administratively set MAC address
Apr  8 15:30:32 sriov2 kernel: Reload the VF driver to resume operations


Bug 1164224 (filed against NM in RHEL7 for some reason) reports some similar behavior, and it appears that the "attempted to override" message comes from here:

http://lxr.free-electrons.com/source/drivers/net/ethernet/intel/igb/igb_main.c#L6095

        retval = -EINVAL;
        if (!(vf_data->flags & IGB_VF_FLAG_PF_SET_MAC))
                retval = igb_set_vf_mac_addr(adapter, msgbuf, vf);
        else
                dev_warn(&pdev->dev,
                         "VF %d attempted to override administratively set MAC address\nReload the VF driver to resume operations\n",
                         vf);

So one item is that for macvtap passthrough for some reason we end up using SIOCSIFHWADDR directly on the VF, but for PCI passthrough we use RTM_SETLINK sent to the PF giving it the VF# as an arg. But from the kernel logs it looks like the *reset* of the MAC address via netlink after finishing with PCI passthrough is *also* failing, but we just coincidentally get the correct MAC again because the VF driver is being reattached to the device, resulting in a re-initialization. Odd that netlink doesn't seem to return an error though (or could it be that's just a warning when doing the set via netlink?)

I'm going to try to force libvirt to always use netlink, even for macvtap, and see if that changes the outcome.
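
For anyone checking their own host for the same pattern, the kernel messages quoted above can be picked out of dmesg or /var/log/messages mechanically. A minimal sketch, assuming Python 3 and that the igb messages have exactly the wording shown in this comment; scan_kernel_log is a made-up helper name.

```python
import re

# Patterns are based only on the log lines quoted in this comment.
SET_RE = re.compile(r"igb (\S+): setting MAC ([0-9a-f:]{17}) on VF (\d+)")
OVERRIDE_RE = re.compile(
    r"igb (\S+): VF (\d+) attempted to override administratively set MAC address"
)


def scan_kernel_log(lines):
    """Return (admin_mac, rejected).

    admin_mac maps (PF PCI address, VF number) to the last MAC set via the PF.
    rejected lists (PF PCI address, VF number, admin MAC at that time or None)
    for every 'attempted to override' warning.
    """
    admin_mac = {}
    rejected = []
    for line in lines:
        m = SET_RE.search(line)
        if m:
            admin_mac[(m.group(1), int(m.group(3)))] = m.group(2)
            continue
        m = OVERRIDE_RE.search(line)
        if m:
            key = (m.group(1), int(m.group(2)))
            rejected.append((key[0], key[1], admin_mac.get(key)))
    return admin_mac, rejected
```

Feeding it the lines above yields one administratively set MAC for VF 1 on 0000:0e:00.1 and two rejected override attempts against it.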

Comment 20 Laine Stump 2015-04-09 18:20:20 UTC
Okay, and now I can reproduce on my system. The difference was that I hadn't ifup'ed the PF device.

Comment 21 Vlad Yasevich 2015-04-09 19:35:19 UTC
So, there are 2 ways to set the VF mac address:
 1) Using netlink through the PF device, similar to
      ip link set dev eth0 vf 0 mac <addr>
    Doing it this way will set the "admin" flag and will forbid you from
    overriding the mac address using other means.
    One side-effect of this type of operation is that mac address does
    not appear changed on the netdev device (ex: eth5).
 2) Doing it through ioctl or netlink on the VF itself.
      ip link set dev eth5 address <addr>
    This will set the mac on the eth5 device and send a message to the PF
    to update the VF table.  However, if the mac was already set through the PF
    (option 1 above), an error is reported.

It also looks like igb and ixgbe devices behave similarly, with one difference:
whereas the igb device returns an error before setting the MAC address of the
device, ixgbe sets the MAC address of the device and then prints the error, thus possibly leaving the VF in an unreachable state.

libvirt should probably stick to a single way of programming the MAC address.

-vlad
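
The interaction between the two methods can be illustrated with a toy model. This is only a sketch in Python 3 of the igb behaviour as described in this comment and in comment 19, not driver code; the class and method names are invented, and the choice of EADDRNOTAVAIL simply mirrors the errno libvirt saw.

```python
import errno


class VfMacState:
    """Toy model of the PF-side bookkeeping for a single VF (igb behaviour)."""

    def __init__(self, mac):
        self.mac = mac
        self.pf_set_mac = False  # stands in for IGB_VF_FLAG_PF_SET_MAC

    def set_via_pf(self, mac):
        """Option 1: 'ip link set dev <pf> vf <n> mac <addr>' (netlink to the PF)."""
        self.mac = mac
        self.pf_set_mac = True  # from now on the address is 'administrative'

    def set_via_vf(self, mac):
        """Option 2: ioctl or netlink on the VF's own netdev."""
        if self.pf_set_mac:
            # igb refuses before changing anything.
            raise OSError(
                errno.EADDRNOTAVAIL,
                "VF attempted to override administratively set MAC address",
            )
        self.mac = mac


vf = VfMacState("22:eb:b0:ce:30:dc")
vf.set_via_vf("ca:44:02:fa:35:4f")   # fine: the PF never set a MAC
vf.set_via_pf("52:54:00:e2:ad:ad")   # hostdev assignment, as in comment 19
vf.set_via_pf("22:eb:b0:ce:30:dc")   # restore after the guest is done
try:
    vf.set_via_vf("ca:44:02:fa:35:4f")  # macvtap passthrough attempt
    outcome = "accepted"
except OSError as e:
    outcome = errno.errorcode[e.errno]
```

In this model, once the PF path has been used, only the PF path keeps working, which is the motivation for sticking to one method.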

Comment 22 Laine Stump 2015-04-10 19:03:24 UTC
I've tried modifying libvirt to always use netlink RTM_SETLINK (sent for the PF, with vf# as a parameter) whenever possible:

 https://www.redhat.com/archives/libvir-list/2015-April/msg00422.html

and this appears to solve the problem of setting the MAC address.

There is an additional problem though - when the VF is re-attached to the host driver after hostdev passthrough is finished, it of course becomes ~IFF_UP. So we can successfully setup macvtap passthrough, but no traffic will pass until the VF is set IFF_UP. libvirt should be doing this automatically.
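
Whether the VF is currently IFF_UP can be confirmed from sysfs before and after the attach. A minimal sketch, assuming Python 3 and that /sys/class/net/<dev>/flags holds the interface flags as a hex string; is_admin_up is a made-up helper name.

```python
import os

IFF_UP = 0x1  # interface is administratively up


def is_admin_up(ifname, sysfs_root="/sys"):
    """Return True if the IFF_UP bit is set in the netdev's flags.

    sysfs_root is a parameter only so the function can be tried on a fake tree.
    """
    path = os.path.join(sysfs_root, "class", "net", ifname, "flags")
    with open(path) as f:
        flags = int(f.read().strip(), 16)  # file content looks like '0x1003'
    return bool(flags & IFF_UP)
```

If this returns False for the VF after the macvtap device has been attached, the guest will see no traffic until the VF is brought up, for example with "ip link set dev <vf> up".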

Comment 23 Laine Stump 2015-04-14 19:03:01 UTC
The following patch posted upstream solves the problem with IFF_UP:

https://www.redhat.com/archives/libvir-list/2015-April/msg00510.html

Note that putting this patch into RHEL6 would require backporting the patches for Bug 1081464, which carries a danger of regression, so I will be proposing a much simpler downstream-only patch for RHEL6 instead.

Comment 24 Laine Stump 2015-04-21 18:01:01 UTC
These two patches have been pushed upstream:

commit cb3fe38c74bd1fb4ef76c64c045cf48467a9d259
Author: Laine Stump <laine@laine.org>
Date:   Fri Apr 10 12:19:49 2015 -0400

    util: set MAC address for VF via netlink message to PF+VF# when possible

commit 38172ed894ab20e0ee0072da6f695e4c97aecc4a
Author: Laine Stump <laine@laine.org>
Date:   Mon Apr 13 13:26:54 2015 -0400

    qemu: set macvtap physdevs online when macvtap is set online

Comment 27 Hu Jianwei 2015-05-05 10:20:43 UTC
Verified as below:

[root@sriov1 jiahu]# rpm -q libvirt qemu-kvm-rhev
libvirt-0.10.2-54.el6.x86_64
qemu-kvm-rhev-0.12.1.2-2.471.el6.x86_64

[root@sriov1 jiahu]# virsh net-dumpxml hostdev
<network>
  <name>hostdev</name>
  <uuid>d58fcb8d-ed2c-74f2-b134-4e2bad788559</uuid>
  <forward mode='hostdev' managed='yes'>
    <address type='pci' domain='0x0000' bus='0x03' slot='0x10' function='0x0'/>
  </forward>
</network>

[root@sriov1 jiahu]# virsh net-dumpxml passthrough
<network>
  <name>passthrough</name>
  <uuid>d095cca0-a925-3d33-02f3-97528533262c</uuid>
  <forward dev='eth3' mode='passthrough'>
    <interface dev='eth3'/>
  </forward>
</network>

[root@sriov1 jiahu]# sh pv.sh 
 PF: +- eth0 10.66.6.158  03:00.0 Ethernet controller: Intel Corporation 82576 Gigabit Network Connection (rev 01)
 VF: |    +- eth3 N/A 03:10.0 Ethernet controller: Intel Corporation 82576 Virtual Function (rev 01)
 VF: |    +- eth5 N/A 03:10.2 Ethernet controller: Intel Corporation 82576 Virtual Function (rev 01)
 PF: +- eth1 N/A 03:00.1 Ethernet controller: Intel Corporation 82576 Gigabit Network Connection (rev 01)
 VF: |    +- eth4 N/A 03:10.1 Ethernet controller: Intel Corporation 82576 Virtual Function (rev 01)
 VF: |    +- eth6 N/A 03:10.3 Ethernet controller: Intel Corporation 82576 Virtual Function (rev 01)
[root@sriov1 jiahu]# virsh list 
 Id    Name                           State
----------------------------------------------------
 15    r6-mig                         running
 
[root@sriov1 jiahu]# cat if-hostdev.xml 
<interface type='network'>
    <source network='hostdev'/>
</interface>
[root@sriov1 jiahu]# virsh attach-device r6-mig if-hostdev.xml 
Device attached successfully

[root@sriov1 jiahu]# virsh net-dumpxml hostdev
<network connections='1'>
  <name>hostdev</name>
  <uuid>d58fcb8d-ed2c-74f2-b134-4e2bad788559</uuid>
  <forward mode='hostdev' managed='yes'>
    <address type='pci' domain='0x0000' bus='0x03' slot='0x10' function='0x0'/>
  </forward>
</network>

[root@sriov1 jiahu]# virsh destroy r6-mig
Domain r6-mig destroyed

[root@sriov1 jiahu]# virsh net-dumpxml hostdev
<network>
  <name>hostdev</name>
  <uuid>d58fcb8d-ed2c-74f2-b134-4e2bad788559</uuid>
  <forward mode='hostdev' managed='yes'>
    <address type='pci' domain='0x0000' bus='0x03' slot='0x10' function='0x0'/>
  </forward>
</network>

[root@sriov1 jiahu]# virsh start r6-mig
Domain r6-mig started

[root@sriov1 jiahu]# cat if-passthrogh.xml 
<interface type='network'>
    <source network='passthrough'/>
</interface>
[root@sriov1 jiahu]# virsh attach-device r6-mig if-passthrogh.xml 
Device attached successfully

[root@sriov1 jiahu]# virsh net-dumpxml passthrough
<network connections='1'>
  <name>passthrough</name>
  <uuid>d095cca0-a925-3d33-02f3-97528533262c</uuid>
  <forward dev='eth3' mode='passthrough'>
    <interface dev='eth3' connections='1'/>
  </forward>
</network>

52: macvtap0@eth3: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UNKNOWN qlen 500
    link/ether ca:44:02:fa:35:4f brd ff:ff:ff:ff:ff:ff
    macvtap  mode passthru

[root@sriov1 jiahu]# virsh destroy r6-mig
Domain r6-mig destroyed

[root@sriov1 jiahu]# virsh net-dumpxml passthrough
<network>
  <name>passthrough</name>
  <uuid>d095cca0-a925-3d33-02f3-97528533262c</uuid>
  <forward dev='eth3' mode='passthrough'>
    <interface dev='eth3'/>
  </forward>
</network>

We get the expected results; moving to VERIFIED.

Comment 29 errata-xmlrpc 2015-07-22 05:46:00 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://rhn.redhat.com/errata/RHBA-2015-1252.html

