Description of problem: When a system has a different naming for network interfaces between RHEL7 and RHEL8, persistentnetnamesconfig.py generates network ".link" files to make sure the RHEL7 naming is kept once upgraded. The code bases itself on the MAC address of the network interface. Unfortunately when having a bond, the MAC address for the 2nd slave is not computed correctly: it's the MAC address of the bond (hence 1st slave) that is returned instead of the Permanent MAC address (the MAC address of the physical card). This generates a broken ".link" file which ends up having the network card be renamed by RHEL8 through using the standard scheme, as shown in the example below: Interfaces on RHEL7 (all Mellanox cards, which get renamed): - bond0, MAC 0c:42:a1:1c:b6:d6 with slaves - eno1, MAC 0c:42:a1:1c:b6:d6 - enp59s0f0, MAC 0c:42:a1:24:95:96 but seen as "0c:42:a1:1c:b6:d6" because of the bond - enp59s0f1, MAC 0c:42:a1:24:95:97 Files created by Leapp for network interfaces: - /etc/systemd/network/10-leapp-eno1.link (OK) [Match] MACAddress=0c:42:a1:1c:b6:d6 [Link] Name=eno1 - /etc/systemd/network/10-leapp-enp59s0f0.link (KO because slave of the bond) [Match] MACAddress=0c:42:a1:1c:b6:d6 [Link] Name=enp59s0f0 - /etc/systemd/network/10-leapp-enp59s0f1.link (OK) [Match] MACAddress=0c:42:a1:24:95:97 [Link] Name=enp59s0f1 Interfaces as seen on RHEL8: - eno1 kept as is - enp59s0f0 becomes ens1f0 because standard naming is used (/etc/systemd/network/10-leapp-enp59s0f0.link doesn't match because of the MAC address generated in the file) - enp59s0f1 kept as is because of /etc/systemd/network/10-leapp-enp59s0f1.link Digging further, it appears the following code is "faulty": /usr/share/leapp-repository/repositories/system_upgrade/common/libraries/persistentnetnames.py: -------- 8< ---------------- 8< ---------------- 8< ---------------- 8< -------- 1 import pyudev : 34 def interfaces(): 35 """ 36 Generator which produces an Interface objects containing assorted interface properties relevant for network na ming 37 """ 38 for dev in physical_interfaces(): 39 attrs = {} 40 41 try: 42 attrs['name'] = dev.sys_name 43 attrs['devpath'] = dev.device_path 44 attrs['driver'] = dev['ID_NET_DRIVER'] 45 attrs['vendor'] = dev['ID_VENDOR_ID'] 46 attrs['pci_info'] = PCIAddress(**pci_info(dev['ID_PATH'])) 47 attrs['mac'] = dev.attributes.get('address') : -------- 8< ---------------- 8< ---------------- 8< ---------------- 8< -------- On line 47, the code relies on **address** attribute which is the NON-PERMANENT MAC when part of a bond. Because udev has no knowledge of the Permanent Address, the only solution I found is to extract the MAC address from the ID_NET_NAME_MAC property instead, something like this patch below: -------- 8< ---------------- 8< ---------------- 8< ---------------- 8< -------- # diff -u /usr/share/leapp-repository/repositories/system_upgrade/common/libraries/persistentnetnames.py.orig /usr/share/leapp-repository/repositories/system_upgrade/common/libraries/persistentnetnames.py --- /usr/share/leapp-repository/repositories/system_upgrade/common/libraries/persistentnetnames.py.orig 2022-08-17 16:57:19.000000000 +0200 +++ /usr/share/leapp-repository/repositories/system_upgrade/common/libraries/persistentnetnames.py 2022-09-01 16:57:58.635000000 +0200 @@ -1,4 +1,5 @@ import pyudev +import re from leapp.libraries.stdlib import api from leapp.models import Interface, PCIAddress @@ -44,9 +45,11 @@ attrs['driver'] = dev['ID_NET_DRIVER'] attrs['vendor'] = dev['ID_VENDOR_ID'] attrs['pci_info'] = PCIAddress(**pci_info(dev['ID_PATH'])) - attrs['mac'] = dev.attributes.get('address') - if isinstance(attrs['mac'], bytes): - attrs['mac'] = attrs['mac'].decode() + assert dev['ID_NET_NAME_MAC'].startswith('enx') + attrs['mac'] = ':'.join(re.findall('..', dev['ID_NET_NAME_MAC'][3:])) + #attrs['mac'] = dev.attributes.get('address') + #if isinstance(attrs['mac'], bytes): + # attrs['mac'] = attrs['mac'].decode() except Exception as e: # pylint: disable=broad-except # FIXME(msekleta): We should probably handle errors more granularly # Maybe we should inhibit upgrade process at this point -------- 8< ---------------- 8< ---------------- 8< ---------------- 8< -------- It's ugly/hacky but seems to work but probably DOESN'T cover all cases, specially non-ethernet network interfaces. Maybe udev needs enhancement to store the permanent address somewhere. Version-Release number of selected component (if applicable): leapp-upgrade-el7toel8-0.16.0-8.el7_9.noarch How reproducible: Always Steps to Reproduce: 1. Add 2 network interfaces in a VM and create a bond enp7s0: 52:54:00:cf:e9:a8 enp8s0: 52:54:00:23:46:01 2. Execute "leapp pre-upgrade" 3. Check DB content (/var/lib/leapp/leapp.db) Actual results: (same MACs for enp7s0 and enp8s0) {"devpath": "/devices/pci0000:00/0000:00:02.6/0000:07:00.0/net/enp7s0", "driver": "e1000e", "mac": "52:54:00:cf:e9:a8", "name": "enp7s0", "pci_info": {"bus": "07", "device": "00", "domain": "0000", "function": "0"}, "vendor": "0x8086"}, {"devpath": "/devices/pci0000:00/0000:00:02.7/0000:08:00.0/net/enp8s0", "driver": "e1000e", "mac": "52:54:00:cf:e9:a8", "name": "enp8s0", "pci_info": {"bus": "08", "device": "00", "domain": "0000", "function": "0"}, "vendor": "0x8086"} Expected results: (permanent MAC for enp8s0) {"devpath": "/devices/pci0000:00/0000:00:02.6/0000:07:00.0/net/enp7s0", "driver": "e1000e", "mac": "52:54:00:cf:e9:a8", "name": "enp7s0", "pci_info": {"bus": "07", "device": "00", "domain": "0000", "function": "0"}, "vendor": "0x8086"}, {"devpath": "/devices/pci0000:00/0000:00:02.7/0000:08:00.0/net/enp8s0", "driver": "e1000e", "mac": "52:54:00:23:46:01", "name": "enp8s0", "pci_info": {"bus": "08", "device": "00", "domain": "0000", "function": "0"}, "vendor": "0x8086"} Additional info: I don't know if this applies to Leapp 8 -> 9
Hi Renaud! Thank you for the detailed investigation and a proposal for a possible solution which is still improvement even when it probably does not fix all cases. Great job! This is related to bug 1919382 which led to the current KI & workaround: https://access.redhat.com/documentation/en-us/red_hat_enterprise_linux/8/html-single/upgrading_from_rhel_7_to_rhel_8/index#known-issues_troubleshooting (search bonding) The current workaround: # LEAPP_NO_NETWORK_RENAMING=1 leapp upgrade The workaround is not ideal as it skips any actions regarding NIC names, keeping the solution purely on users. The proposed solution could be an improvement as people could miss the documented workaround prior the upgrade and also it could enable handling of NIC names in some additional cases with bonding present. @msekleta : Michal, wdyt about that? Any idea how it could be improved?
I think LEAPP_NO_NETWORK_RENAMING=1 cannot be used because this will end up having new RHEL8 names once upgraded, hence break the current ifcfg-XXX files anyway.
I think the problem with what Renaud is suggesting is that ID_NET_NAME_MAC don't have to be set. udev's net_id builtin will not set up this variable in case the address which it finds is not permanent (addr_assign_type != 0). Hence if MAC is changed and then something (tool/admin) calls "udevadm trigger" then ID_NET_NAME_MAC will be dropped. Maybe we should also take into account value of addr_assign_type and if the address isn't permanent we wouldn't use it, instead we could fallback to "ethtool -P" as that should return true hardware address. All of that has a problem that there might be cases when MAC addresses are changed on purpose and not because of bonds and such. We may want to introduce configuration option for this and set it to false by default. Here is a pseudo-code of new logic that we might introduce, if has_permanent_mac(iface) use(iface.mac) else if is_bond_slave_or_similar(iface) // are there other cases except bonding? use(ethtool_fallback(iface)) else if config_use_virtual_macs(config) use(iface.mac) else log_and_skip(iface) Btw, this split through the cracks terribly, you guys have to annoy me more if I don't reply in reasonable time!