Bug 1401837 - MAC address of VF is not reset by libvirt since ixgbe driver does not accept 00:00:00:00:00
Keywords:
Status: CLOSED NOTABUG
Alias: None
Product: Red Hat Enterprise Linux 7
Classification: Red Hat
Component: kernel
Version: 7.3
Hardware: x86_64
OS: Linux
Priority: medium
Severity: medium
Target Milestone: pre-dev-freeze
Target Release: 7.3
Assignee: Ken Cox
QA Contact: LiLiang
URL:
Whiteboard:
Depends On:
Blocks: 1415609
Reported: 2016-12-06 08:43 UTC by Dan Kenigsberg
Modified: 2021-09-24 05:53 UTC (History)
CC List: 26 users

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of: 1302166
Environment:
Last Closed: 2019-01-18 22:15:12 UTC
Target Upstream Version:
Embargoed:


Attachments
Script to clear MAC addresses from unused VFs (1.82 KB, application/x-shellscript), 2016-12-08 18:59 UTC, Steve Dobbelstein


Links
System | ID | Private | Priority | Status | Summary | Last Updated
Red Hat Bugzilla | 1302166 | 0 | medium | CLOSED | MAC address of VF is not editable even when attached to host | 2021-02-22 00:41:40 UTC
Red Hat Bugzilla | 1341248 | 0 | unspecified | CLOSED | igb driver forbids resetting VF MAC address back to 00:00:00:00:00:00, which was its original value | 2021-09-24 05:53:45 UTC
Red Hat Bugzilla | 1415609 | 0 | high | CLOSED | libvirt fails to remove guest mac from VFs of certain drivers | 2021-09-24 05:39:43 UTC

Internal Links: 1302166 1341248 1415609

Description Dan Kenigsberg 2016-12-06 08:43:18 UTC
+++ This bug was initially created as a clone of Bug #1302166 +++

Description of problem:
MAC addresses are left on hypervisor VFs after a VM has been removed and the MAC returned to the pool, causing potential MAC conflicts.

"""
Intel x520 Dual Port 10GbE SFP+ adapters installed.
# uname -a
Linux vs-host 3.10.0-514.el7.x86_64 #1 SMP Wed Oct 19 11:24:13 EDT 2016 x86_64 x86_64 x86_64 GNU/Linux
# modinfo ixgbe
filename:       /lib/modules/3.10.0-514.el7.x86_64/kernel/drivers/net/ethernet/intel/ixgbe/ixgbe.ko
version:        4.4.0-k-rh7.3

Looking at my script to wipe the MAC addresses when my system gets in a state with duplicate addresses, I see that it sets the MAC address to an unused, non-zero address so that there are not duplicates of valid MAC addresses.  I initially wrote the script to set the MAC addresses to 00:00:00:00:00:00, but that fails.
# ip link set ens9f0 vf 26 mac 00:00:00:00:00:00
RTNETLINK answers: Invalid argument

The MAC addresses are initially set to zeros, so it appears libvirt is getting a failure when it tries to restore the MAC address to its previous value when the VM is shut down.
"""

Comment 1 Ken Cox 2016-12-08 14:52:41 UTC
How are you enabling the VFs?  I assume you are using the max_vfs module parameter, which will result in the VFs coming up with zero mac addresses.  The max_vfs module parameter is deprecated and is noted as such in the system log.

Try using the sysfs interface, e.g. 

    echo 10 > /sys/class/net/<dev>/device/sriov_numvfs

remove the max_vfs module parameter before doing this.

Comment 2 Jay Turner 2016-12-08 15:29:50 UTC
(In reply to Ken Cox from comment #1)
> How are you enabling the VFs?  I assume you are using the max_vfs module
> parameter, which will result in the VFs coming up with zero mac addresses. 
> The max_vfs module parameter is deprecated and is noted as such in the
> system log.
> 
> Try using the sysfs interface, e.g. 
> 
>     echo 10 > /sys/class/net/<dev>/device/sriov_numvfs
> 
> remove the max_vfs module parameter before doing this.

Not sure if this was directed at me, but yes, we are using the sysfs interface for enabling VFs.  And yes, when the VFs come up, they have all-zero MACs.  But one is not able to reset the MAC to all zeros after relinquishing.

Comment 3 Ken Cox 2016-12-08 15:42:13 UTC
What is the output of 'lspci -nv' for this device? For example:
    lspci -nv -s 08:00.01

Comment 4 Steve Dobbelstein 2016-12-08 18:30:00 UTC
# lspci -nv -s 11:00.0
11:00.0 0200: 8086:10fb (rev 01)
        Subsystem: 8086:7a12
        Physical Slot: 9
        Flags: bus master, fast devsel, latency 0, IRQ 148, NUMA node 0
        Memory at da400000 (64-bit, non-prefetchable) [size=1M]
        I/O ports at 5fc0 [disabled] [size=32]
        Memory at da2fc000 (64-bit, non-prefetchable) [size=16K]
        Capabilities: [40] Power Management version 3
        Capabilities: [50] MSI: Enable- Count=1/1 Maskable+ 64bit+
        Capabilities: [70] MSI-X: Enable+ Count=64 Masked-
        Capabilities: [a0] Express Endpoint, MSI 00
        Capabilities: [100] Advanced Error Reporting
        Capabilities: [140] Device Serial Number 90-e2-ba-ff-ff-3e-3e-44
        Capabilities: [150] Alternative Routing-ID Interpretation (ARI)
        Capabilities: [160] Single Root I/O Virtualization (SR-IOV)
        Kernel driver in use: ixgbe
        Kernel modules: ixgbe

Comment 5 Steve Dobbelstein 2016-12-08 18:59:42 UTC
Created attachment 1229601
Script to clear MAC addresses from unused VFs

Dan Kenigsberg asked that I share the script I use for clearing the MAC addresses from the virtual functions.  I have enhanced the script so that it only clears the MAC addresses from VFs that are not currently being used by VMs.  The script is provided in hope that it may be useful as a workaround to anyone else that is experiencing this problem.  It is provided as-is with no warranty.  It has had limited testing on only one system.

Comment 6 Ken Cox 2017-02-08 17:31:17 UTC
Is this script working for you?

On the VF interfaces, the mac addresses should be initialized to a valid address by some administrative function.  All zeros is not a valid mac address so the 'Invalid argument' error is correct in this case.  When the VF is enabled, dmesg spits out a message indicating that the mac address needs to be initialized.

ixgbe 0000:08:00.0: VF 62 has no MAC address assigned, you may have to assign one manually

Comment 8 Dan Kenigsberg 2017-02-08 19:34:51 UTC
We ended up taking a slightly different approach (similar to what comment 6 suggests): https://gerrit.ovirt.org/#/c/71029/39/lib/vdsm/network/api.py@89
We're replacing the all-zero mac with 02:00:00:00:00:01 right after num_vfs is set.

If all-zeros is not a valid mac, it should not be the first thing userspace sees on a fresh VF.
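
A minimal sketch of that placeholder approach, for illustration only. It assumes the placeholder is applied through the PF using the `ip link set <pf> vf <n> mac <addr>` form shown in the description; the linked VDSM change may do this differently. The helper name `placeholder_commands` is hypothetical, and the code only builds the commands rather than running them.

```python
from typing import List

# Placeholder value quoted in this comment; bit 1 of the first octet is set,
# so it is a locally administered address.
PLACEHOLDER_MAC = "02:00:00:00:00:01"


def placeholder_commands(pf: str, num_vfs: int,
                         mac: str = PLACEHOLDER_MAC) -> List[List[str]]:
    """Return one `ip link set <pf> vf <n> mac <mac>` argv per VF index.

    Hypothetical helper: nothing is executed here, the caller decides
    whether to print the commands or hand them to subprocess.run().
    """
    if num_vfs < 0:
        raise ValueError("num_vfs must be non-negative")
    return [
        ["ip", "link", "set", pf, "vf", str(vf), "mac", mac]
        for vf in range(num_vfs)
    ]


if __name__ == "__main__":
    # Dry run: print the commands instead of executing them, since running
    # them for real needs root and SR-IOV hardware.
    for argv in placeholder_commands("ens9f0", 2):
        print(" ".join(argv))
```

Keeping command construction separate from execution makes the logic easy to check on a machine without any VFs.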

Comment 9 Laine Stump 2017-02-08 19:59:54 UTC
Ken,

I haven't tried the script, but this:

> All zeros is not a valid mac address

(along with some misconceptions on my/others' parts) is the real source of the several bug reports relating to all-0 MAC addresses.

The problem is that:

1) some versions of some SRIOV netdev drivers initialize the MAC address stored for the VF (which will be set as the actual MAC address the next time the device is bound to a new driver) as 00:00:00:00:00:00. (As of current rhel 7, the drivers I'm aware of that do this are igb, ixgbe, and enic (I think I'm remembering that name correctly - the Cisco cards). Also possibly the mlx driver does this; I don't have access to a box with mlx hardware right now, but I remember a similar BZ was filed for them.)

2) in order to preserve the current state of the device, libvirt (or some other random piece of code) saves this MAC address (after retrieving it via a netlink RTM_GETLINK to the PF device, providing the VF#).

3) at some later time, in order to restore the previous state of the device, libvirt (or some other piece of code) simply puts the same VF# and previously-retrieved MAC address into a netlink RTM_SETLINK message and sends it off.

If the address was previously non-0, or if the driver allows setting a MAC address of all 0 (enic does, and I think mlx now does as well), then the previous state of the device is restored. If not, then the MAC address that had been set while libvirt (i.e. a virtual machine) was using the device remains in place and causes problems somewhere down the line.
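
The save/set/restore sequence above can be sketched as a toy simulation. Nothing here talks to a real device or to netlink: `FakePF` is a hypothetical stand-in for a PF whose driver initializes each VF's stored MAC to all zeros and, depending on `accepts_zero`, either allows or rejects writing zeros back (the latter being the ixgbe behavior reported in this bug).

```python
import errno

ZERO_MAC = "00:00:00:00:00:00"


class FakePF:
    """Hypothetical stand-in for a PF that stores one admin MAC per VF."""

    def __init__(self, num_vfs: int, accepts_zero: bool) -> None:
        self.accepts_zero = accepts_zero
        self.vf_macs = [ZERO_MAC] * num_vfs  # fresh VFs start out all-zero

    def get_vf_mac(self, vf: int) -> str:
        return self.vf_macs[vf]

    def set_vf_mac(self, vf: int, mac: str) -> None:
        if mac == ZERO_MAC and not self.accepts_zero:
            # Mirrors "RTNETLINK answers: Invalid argument" from the report.
            raise OSError(errno.EINVAL, "Invalid argument")
        self.vf_macs[vf] = mac


def use_and_release(pf: FakePF, vf: int, guest_mac: str) -> str:
    """Save the VF MAC, assign the guest MAC, then try to restore it.

    Returns the MAC left on the VF after the restore attempt.
    """
    saved = pf.get_vf_mac(vf)        # step 2: remember the current state
    pf.set_vf_mac(vf, guest_mac)     # the VM starts using the VF
    try:
        pf.set_vf_mac(vf, saved)     # step 3: put the old value back
    except OSError:
        pass                         # restore rejected: guest MAC stays behind
    return pf.get_vf_mac(vf)
```

With `accepts_zero=False` the function returns the guest MAC, i.e. the stale address that later causes conflicts; with `accepts_zero=True` it returns the all-zero address and the original state is restored.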

I can understand if an all-0 MAC address on an actual device connected to the network causes some problem that requires forbidding it. But if that's the case, then the drivers shouldn't come up with the MAC addresses set to all-0 in the first place. And if they allow it at initialization, then they should allow having it set back to its initial value at a later time.

Note that Bug 1341248 is the same bug but for igb. I was *sure* there was a similar bug for mlx, but I can't find it.

=======================================

About my own misconceptions - a long time ago, in a galaxy far away, I made the assumption that whatever MAC address set via RTM_SETLINK to PFnetdev+VF# would immediately take effect in VFnetdev, in other words that it was just a different way of setting the same value. I much later realized that the MAC address set via PFnetdev+VF# is just stored away for later, and doesn't take effect until the next time the device is bound to a new driver. The effect is still as-intended in the case of assigning a VF to a virtual machine, since we always unbind from the host net driver and re-bind to vfio-pci.

For macvtap passthrough mode, though, this ends up producing incorrect results - the MAC address at PFnetdev+VF# is changed, but that has no effect on the MAC address that really matters - the one currently in use for VFnetdev.

I have a plan to fix *that* part of it in libvirt, but everything still depends on being able to set either MAC address with any value that it had been set to in the past. So, as Dan suggests - if it's initialized to 00:00:00:00:00:00 then we should be able to *set* it to 00:00:00:00:00:00; if we can't set it to that value, then it shouldn't be initialized to that value.

(I haven't looked, but I *think* the all 0 MAC address might be used in some SRIOV drivers as a flag to indicate "set the VFnetdev MAC address to a random value" - that's what seems to happen for igb and ixgbe anyway (but not enic; don't know about mlx).)

Comment 10 Laine Stump 2017-02-12 19:25:53 UTC
I found the Mellanox bug on the same topic: Bug 1302166. Already fixed.

Also, VDSM has asked libvirt for a workaround that sets the MAC to 00:00:00:00:00:01 when attempts to set to 00:00:00:00:00:00 fail. Bug 1415609. I referred them back to this BZ and to Bug 1341248.

Comment 11 Steve Dobbelstein 2017-02-13 16:20:38 UTC
In response to comment #10, I believe the MAC address should be reset to 02:00:00:00:00:01 as Dan does in comment #8 and not 00:00:00:00:00:01.  Bit 1 (0-based) in the first byte should be set to indicate that it is a locally administered MAC address.  If bit 1 is zero that means the address is a globally administered address.  The 00:00:00 OUI is a valid registered global address owned by Xerox.  We shouldn't be resetting the MAC address to one that is owned by a specific vendor.
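
The bit test described above can be sketched in a few lines. The function names are illustrative and not taken from any library; the check assumes the usual colon-separated notation and the convention that bit 1 (0-based) of the first octet marks a locally administered address.

```python
from typing import List


def parse_mac(mac: str) -> List[int]:
    """Parse 'aa:bb:cc:dd:ee:ff' into six integers (illustrative helper)."""
    parts = mac.split(":")
    if len(parts) != 6 or any(len(p) != 2 for p in parts):
        raise ValueError(f"not a colon-separated MAC address: {mac!r}")
    return [int(p, 16) for p in parts]


def is_locally_administered(mac: str) -> bool:
    """True if bit 1 (0-based) of the first octet is set."""
    return bool(parse_mac(mac)[0] & 0x02)


if __name__ == "__main__":
    for candidate in ("02:00:00:00:00:01", "00:00:00:00:00:01"):
        print(candidate, is_locally_administered(candidate))
```

Under this test 02:00:00:00:00:01 is locally administered, while 00:00:00:00:00:01 is not, since it falls under the 00:00:00 OUI.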

Comment 12 Laine Stump 2017-02-13 21:04:00 UTC
Well, ideally it should be allowed to set the administrative MAC address to 00:00:00:00:00:00 since that is its initial value, and it has the special meaning of "set the VF MAC address to a random value when its driver is reloaded" (i.e., if the address is *all* 0's, then that address will not actually be used as-is).

For the hack I mentioned of setting a non-0 address, you make a good point - if we set it to 00:00:00:00:00:01, then the VF *will* be set to 00:00:00:00:00:01 (although hopefully the interface will never actually be ifup'ed with that MAC). But of course, the entire point of this BZ is that it shouldn't be necessary to have such a hack at all :-)

Comment 13 Ken Cox 2017-02-15 19:50:28 UTC
The all-zero mac address that the driver comes up with is essentially an uninitialized value that is not usable and must be set to some valid value before the interface is used.  Since we don't have the mac address for these VFs in a PROM somewhere, it is an administrative function to assign valid mac addresses. Allowing the mac address to be 'set' to all-zeros seems like more of a hack to me than good policy.  The ixgbe driver specifically prohibits setting the mac to all-zeros because it is an invalid mac address, but also because part of the design of the ixgbe nic includes logic to report errors if packets are sent with all-zero source addresses.  This logic in the driver helps to ensure that doesn't happen by disallowing the all-zeros mac to be set.

In the ixgbevf driver, the all-zero mac address is not really a 'special meaning', it is an uninitialized address.  The test in the ixgbevf driver is not looking for a special meaning, but making sure it has a valid mac address.  The test in the driver is:

    if (!is_valid_ether_addr(netdev->dev_addr))

so it is making sure the address is a valid address.  As mentioned before, all-zeros is not valid and will cause the nic to report errors.  Of course, picking some random mac address isn't a very good approach either and could cause other problems with duplicate mac addresses also.  
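
For reference, the kernel's `is_valid_ether_addr()` helper treats an address as valid when it is neither multicast (which also covers broadcast) nor all-zero. A rough userspace equivalent, written in Python purely for illustration; the kernel helper works on raw bytes, and the string parsing here is an addition for this sketch:

```python
def is_valid_ether_addr(mac: str) -> bool:
    """Approximate the kernel check on a colon-separated MAC string."""
    parts = mac.split(":")
    if len(parts) != 6:
        raise ValueError(f"expected six octets, got {len(parts)}: {mac!r}")
    octets = bytes(int(p, 16) for p in parts)

    is_multicast = bool(octets[0] & 0x01)  # group bit; broadcast sets it too
    is_zero = not any(octets)              # every octet is 0x00
    return not (is_multicast or is_zero)


if __name__ == "__main__":
    for candidate in ("00:00:00:00:00:00", "02:00:00:00:00:01",
                      "ff:ff:ff:ff:ff:ff"):
        print(candidate, is_valid_ether_addr(candidate))
```

Note that this check says nothing about the locally administered bit: 00:00:00:00:00:01 passes it even though it falls under a vendor OUI.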

In the absence of any policy mandating that NIC drivers accept invalid addresses, it makes more sense to use a different mechanism to manage the mac addresses.  This is most likely going to continue to be a problem in libvirt with new drivers.

I'm not sure where the code is that initially enables the VFs but I assume it is in some startup script.  Couldn't that same script initialize the mac addresses to some valid locally administered address?  It seems that something like that would work across all drivers as they are without requiring all of the drivers to change.

Comment 14 Laine Stump 2017-03-29 01:48:32 UTC
All the libvirt patches discussed here and in Bug 1415609 and Bug 1341248 have been pushed, including the patch to set to 02:00:00:00:00:00 when attempts at 00:00:00:00:00:00 fail (although that now rarely ever happens due to the change in save/restore algorithm). So definitely close this BZ with whatever disposition you think is most appropriate (my opinion is "WONTFIX", although yours may be "NOTABUG" :-)
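
The fallback described above can be sketched as follows. This is not libvirt's code (libvirt is written in C and sets the address over netlink); it is a simplified illustration that shells out to `ip link set <pf> vf <n> mac <addr>`, the command form used in the description. The names `restore_vf_mac` and `ip_set_vf_mac`, and the injectable `set_mac` parameter, are hypothetical.

```python
import subprocess
from typing import Callable

ZERO_MAC = "00:00:00:00:00:00"
FALLBACK_MAC = "02:00:00:00:00:00"  # fallback value named in this comment


def ip_set_vf_mac(pf: str, vf: int, mac: str) -> None:
    """Set a VF's administrative MAC through its PF (needs root)."""
    subprocess.run(
        ["ip", "link", "set", pf, "vf", str(vf), "mac", mac],
        check=True,
    )


def restore_vf_mac(pf: str, vf: int, saved_mac: str,
                   set_mac: Callable[[str, int, str], None] = ip_set_vf_mac
                   ) -> str:
    """Restore saved_mac; if all-zero is rejected, use FALLBACK_MAC.

    Returns the MAC that was actually applied. Failures for any address
    other than all-zero are re-raised, since there is no sensible fallback.
    """
    try:
        set_mac(pf, vf, saved_mac)
        return saved_mac
    except (subprocess.CalledProcessError, OSError):
        if saved_mac != ZERO_MAC:
            raise
        set_mac(pf, vf, FALLBACK_MAC)
        return FALLBACK_MAC
```

Passing the setter in as a parameter lets the fallback path be exercised with a stub on a machine that has no SR-IOV hardware.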

Comment 15 John Feeney 2019-01-18 22:15:12 UTC
This bugzilla has been open for a while without any activity. I notice that the previous comment suggests it can be closed so I will take the suggestion.

If you feel this is not a proper response, please re-open, with a justification, and accept my apologies for closing it.

 John

