Bug 2123413 - persistentnetnamesconfig.py generates broken network naming file when having a bond
Summary: persistentnetnamesconfig.py generates broken network naming file when having ...
Keywords:
Status: NEW
Alias: None
Product: Red Hat Enterprise Linux 7
Classification: Red Hat
Component: leapp-repository
Version: 7.9
Hardware: All
OS: Linux
medium
medium
Target Milestone: rc
: ---
Assignee: Leapp Notifications Bot
QA Contact: upgrades-and-conversions
URL:
Whiteboard:
Depends On:
Blocks:
TreeView+ depends on / blocked
 
Reported: 2022-09-01 15:25 UTC by Renaud Métrich
Modified: 2023-07-31 07:12 UTC (History)
3 users (show)

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed:
Target Upstream Version:
Embargoed:


Attachments (Terms of Use)


Links
System ID Private Priority Status Summary Last Updated
Red Hat Issue Tracker OAMG-7509 0 None None None 2022-09-01 15:32:45 UTC
Red Hat Issue Tracker RHELPLAN-133034 0 None None None 2022-09-01 15:32:43 UTC

Description Renaud Métrich 2022-09-01 15:25:50 UTC
Description of problem:

When a system has a different naming for network interfaces between RHEL7 and RHEL8, persistentnetnamesconfig.py generates network ".link" files to make sure the RHEL7 naming is kept once upgraded.

The code bases itself on the MAC address of the network interface.
Unfortunately when having a bond, the MAC address for the 2nd slave is not computed correctly: it's the MAC address of the bond (hence 1st slave) that is returned instead of the Permanent MAC address (the MAC address of the physical card).
This generates a broken ".link" file which ends up having the network card be renamed by RHEL8 through using the standard scheme, as shown in the example below:

Interfaces on RHEL7 (all Mellanox cards, which get renamed):

- bond0, MAC 0c:42:a1:1c:b6:d6 with slaves
  - eno1, MAC 0c:42:a1:1c:b6:d6
  - enp59s0f0, MAC 0c:42:a1:24:95:96 but seen as "0c:42:a1:1c:b6:d6" because of the bond
- enp59s0f1, MAC 0c:42:a1:24:95:97

Files created by Leapp for network interfaces:

- /etc/systemd/network/10-leapp-eno1.link (OK)

  [Match]
  MACAddress=0c:42:a1:1c:b6:d6
  [Link]
  Name=eno1

- /etc/systemd/network/10-leapp-enp59s0f0.link (KO because slave of the bond)

  [Match]
  MACAddress=0c:42:a1:1c:b6:d6
  [Link]
  Name=enp59s0f0

- /etc/systemd/network/10-leapp-enp59s0f1.link (OK) 

  [Match]
  MACAddress=0c:42:a1:24:95:97
  [Link]
  Name=enp59s0f1

Interfaces as seen on RHEL8:

- eno1 kept as is
- enp59s0f0 becomes ens1f0 because standard naming is used (/etc/systemd/network/10-leapp-enp59s0f0.link doesn't match because of the MAC address generated in the file)
- enp59s0f1 kept as is because of /etc/systemd/network/10-leapp-enp59s0f1.link

Digging further, it appears the following code is "faulty":

/usr/share/leapp-repository/repositories/system_upgrade/common/libraries/persistentnetnames.py:
-------- 8< ---------------- 8< ---------------- 8< ---------------- 8< --------
  1 import pyudev
 :
 34 def interfaces():
 35     """
 36     Generator which produces an Interface objects containing assorted interface properties relevant for network na    ming
 37     """ 
 38     for dev in physical_interfaces():
 39         attrs = {}
 40 
 41         try:
 42             attrs['name'] = dev.sys_name
 43             attrs['devpath'] = dev.device_path
 44             attrs['driver'] = dev['ID_NET_DRIVER']
 45             attrs['vendor'] = dev['ID_VENDOR_ID']
 46             attrs['pci_info'] = PCIAddress(**pci_info(dev['ID_PATH']))
 47             attrs['mac'] = dev.attributes.get('address')
 :
-------- 8< ---------------- 8< ---------------- 8< ---------------- 8< --------

On line 47, the code relies on **address** attribute which is the NON-PERMANENT MAC when part of a bond.

Because udev has no knowledge of the Permanent Address, the only solution I found is to extract the MAC address from the ID_NET_NAME_MAC property instead, something like this patch below:
-------- 8< ---------------- 8< ---------------- 8< ---------------- 8< --------
# diff -u /usr/share/leapp-repository/repositories/system_upgrade/common/libraries/persistentnetnames.py.orig /usr/share/leapp-repository/repositories/system_upgrade/common/libraries/persistentnetnames.py
--- /usr/share/leapp-repository/repositories/system_upgrade/common/libraries/persistentnetnames.py.orig    2022-08-17 16:57:19.000000000 +0200
+++ /usr/share/leapp-repository/repositories/system_upgrade/common/libraries/persistentnetnames.py    2022-09-01 16:57:58.635000000 +0200
@@ -1,4 +1,5 @@
 import pyudev
+import re
 
 from leapp.libraries.stdlib import api
 from leapp.models import Interface, PCIAddress
@@ -44,9 +45,11 @@
             attrs['driver'] = dev['ID_NET_DRIVER']
             attrs['vendor'] = dev['ID_VENDOR_ID']
             attrs['pci_info'] = PCIAddress(**pci_info(dev['ID_PATH']))
-            attrs['mac'] = dev.attributes.get('address')
-            if isinstance(attrs['mac'], bytes):
-                attrs['mac'] = attrs['mac'].decode()
+            assert dev['ID_NET_NAME_MAC'].startswith('enx')
+            attrs['mac'] = ':'.join(re.findall('..', dev['ID_NET_NAME_MAC'][3:]))
+            #attrs['mac'] = dev.attributes.get('address')
+            #if isinstance(attrs['mac'], bytes):
+            #    attrs['mac'] = attrs['mac'].decode()
         except Exception as e:  # pylint: disable=broad-except
             # FIXME(msekleta): We should probably handle errors more granularly
             # Maybe we should inhibit upgrade process at this point
-------- 8< ---------------- 8< ---------------- 8< ---------------- 8< --------

It's ugly/hacky but seems to work but probably DOESN'T cover all cases, specially non-ethernet network interfaces.
Maybe udev needs enhancement to store the permanent address somewhere.

Version-Release number of selected component (if applicable):

leapp-upgrade-el7toel8-0.16.0-8.el7_9.noarch

How reproducible:

Always

Steps to Reproduce:
1. Add 2 network interfaces in a VM and create a bond

  enp7s0: 52:54:00:cf:e9:a8
  enp8s0: 52:54:00:23:46:01

2. Execute "leapp pre-upgrade"
3. Check DB content (/var/lib/leapp/leapp.db)

Actual results: (same MACs for enp7s0 and enp8s0)

{"devpath": "/devices/pci0000:00/0000:00:02.6/0000:07:00.0/net/enp7s0", "driver": "e1000e", "mac": "52:54:00:cf:e9:a8", "name": "enp7s0", "pci_info": {"bus": "07", "device": "00", "domain": "0000", "function": "0"}, "vendor": "0x8086"},
{"devpath": "/devices/pci0000:00/0000:00:02.7/0000:08:00.0/net/enp8s0", "driver": "e1000e", "mac": "52:54:00:cf:e9:a8", "name": "enp8s0", "pci_info": {"bus": "08", "device": "00", "domain": "0000", "function": "0"}, "vendor": "0x8086"}

Expected results: (permanent MAC for enp8s0)

{"devpath": "/devices/pci0000:00/0000:00:02.6/0000:07:00.0/net/enp7s0", "driver": "e1000e", "mac": "52:54:00:cf:e9:a8", "name": "enp7s0", "pci_info": {"bus": "07", "device": "00", "domain": "0000", "function": "0"}, "vendor": "0x8086"},
{"devpath": "/devices/pci0000:00/0000:00:02.7/0000:08:00.0/net/enp8s0", "driver": "e1000e", "mac": "52:54:00:23:46:01", "name": "enp8s0", "pci_info": {"bus": "08", "device": "00", "domain": "0000", "function": "0"}, "vendor": "0x8086"}

Additional info:

I don't know if this applies to Leapp 8 -> 9

Comment 3 Petr Stodulka 2022-09-02 10:34:59 UTC
Hi Renaud! Thank you for the detailed investigation and a proposal for a possible solution which is still improvement even when it probably does not fix all cases. Great job!
This is related to bug 1919382 which led to the current KI & workaround:
  https://access.redhat.com/documentation/en-us/red_hat_enterprise_linux/8/html-single/upgrading_from_rhel_7_to_rhel_8/index#known-issues_troubleshooting (search bonding)
The current workaround:
    # LEAPP_NO_NETWORK_RENAMING=1 leapp upgrade

The workaround is not ideal as it skips any actions regarding NIC names, keeping the solution purely on users.
The proposed solution could be an improvement as people could miss the documented workaround prior the upgrade
and also it could enable handling of NIC names in some additional cases with bonding present.


@msekleta : Michal, wdyt about that? Any idea how it could be improved?

Comment 4 Renaud Métrich 2022-09-02 10:58:56 UTC
I think LEAPP_NO_NETWORK_RENAMING=1 cannot be used because this will end up having new RHEL8 names once upgraded, hence break the current ifcfg-XXX files anyway.

Comment 5 Michal Sekletar 2023-02-01 17:50:59 UTC
I think the problem with what Renaud is suggesting is that ID_NET_NAME_MAC don't have to be set. udev's net_id builtin will not set up this variable in case the address which it finds is not permanent (addr_assign_type != 0). Hence if MAC is changed and then something (tool/admin) calls "udevadm trigger" then ID_NET_NAME_MAC will be dropped.

Maybe we should also take into account value of addr_assign_type and if the address isn't permanent we wouldn't use it, instead we could fallback to "ethtool -P" as that should return true hardware address. All of that has a problem that there might be cases when MAC addresses are changed on purpose and not because of bonds and such. We may want to introduce configuration option for this and set it to false by default. Here is a pseudo-code of new logic that we might introduce,

if has_permanent_mac(iface)
  use(iface.mac)
else if is_bond_slave_or_similar(iface)  // are there other cases except bonding? 
  use(ethtool_fallback(iface))
else if config_use_virtual_macs(config)
  use(iface.mac)
else
  log_and_skip(iface)

Btw, this split through the cracks terribly, you guys have to annoy me more if I don't reply in reasonable time!


Note You need to log in before you can comment on or make changes to this bug.