Description of problem:
kdump creates a udev rule that renames a NIC by MAC address if the NIC name matches eth*. On Azure, the master and slave NICs have the same MAC address and name. After the master NIC is renamed to kdump-eth0, udev tries to rename the slave NIC to kdump-eth0 as well, and the following errors are logged:

May 17 22:37:08 localhost.localdomain systemd-udevd[284]: eth0: Failed to rename network interface 3 from 'eth0' to 'kdump-eth0': File exists
May 17 22:37:08 localhost.localdomain systemd-udevd[284]: eth0: Failed to process device, ignoring: File exists

Version-Release number of selected component (if applicable):

How reproducible:
always

Steps to Reproduce:
Check the steps on https://bugzilla.redhat.com/show_bug.cgi?id=1950932.

Actual results:

Expected results:

Additional info:
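The collision described above can be replayed with a small Python sketch. It is a toy model, not udev or kernel code: `FakeLinkTable` and `replay` are hypothetical names, the MAC address is made up, and the ifindex 2 for the master NIC is an assumption (the log only shows that interface 3 failed). It assumes the slave NIC shows up as eth0 after the master has already been renamed.

```python
import errno

# Hypothetical MAC shared by the master (netvsc) and slave (VF) NIC.
SHARED_MAC = "00:0d:3a:00:00:01"


class FakeLinkTable:
    """Toy stand-in for the kernel's interface table (not a real API)."""

    def __init__(self):
        self.names = {}  # ifindex -> interface name

    def add(self, ifindex, name):
        self.names[ifindex] = name

    def rename(self, ifindex, new_name):
        # Like the kernel, refuse a name that another interface already holds.
        for other, name in self.names.items():
            if other != ifindex and name == new_name:
                raise OSError(errno.EEXIST, "File exists")
        self.names[ifindex] = new_name


def replay(events, rule_mac, rule_name):
    """Apply a 'rename by MAC' rule to each (ifindex, name, mac) add event."""
    table = FakeLinkTable()
    failures = []
    for ifindex, name, mac in events:
        table.add(ifindex, name)
        if mac == rule_mac:
            try:
                table.rename(ifindex, rule_name)
            except FileExistsError as exc:
                failures.append((ifindex, name, exc.strerror))
    return table.names, failures


# Master appears first and is renamed; the slave then appears with the
# same MAC, matches the same rule, and the second rename fails.
names, failures = replay(
    [(2, "eth0", SHARED_MAC), (3, "eth0", SHARED_MAC)],
    rule_mac=SHARED_MAC,
    rule_name="kdump-eth0",
)
```

In this model the master ends up as kdump-eth0 and the slave keeps eth0, which matches the "scary but harmless" outcome in the log: the rule matches by MAC alone, so it cannot tell the two NICs apart.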
Hi Dexuan,

To make the scary but harmless message disappear, I need some input from you. It seems an exception needs to be made for hv_netvsc NICs, i.e., don't add the "kdump-" prefix to a hv_netvsc NIC whose name matches eth*. But for the case of multiple hv_netvsc NICs on one system, as shown in https://bugzilla.redhat.com/show_bug.cgi?id=1720157, I'm not sure about the correct fix.

Say we have two hv_netvsc NICs, one with MAC address A and the other with MAC address B. In the first kernel, A has the name eth0 (its slave NIC has the name eth1) and B has the name eth2 (its slave NIC has the name eth3). Is it possible that A gets renamed to eth2 and B gets renamed to eth0 in the 2nd kernel?
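A minimal sketch of the exception floated above, in Python for illustration only. `kdump_ifname` is a hypothetical helper, not the kexec-tools implementation, and it encodes just the idea as stated: leave hv_netvsc NICs alone, prefix other NICs whose name matches eth*. It does not address the multiple-NIC question raised in the same comment.

```python
import fnmatch


def kdump_ifname(name: str, driver: str) -> str:
    """Return the name a NIC would get in the kdump kernel (hypothetical)."""
    # Proposed exception: never prefix a hv_netvsc NIC, so the MAC-based
    # rename rule is not needed for it and cannot collide with its VF.
    if driver == "hv_netvsc":
        return name
    # Behaviour described in this bug: eth* names get the "kdump-" prefix.
    if fnmatch.fnmatchcase(name, "eth*"):
        return "kdump-" + name
    return name
```

For example, `kdump_ifname("eth0", "hv_netvsc")` leaves the name unchanged, while `kdump_ifname("eth0", "e1000e")` yields the prefixed name (the e1000e driver is only an example of a non-Hyper-V NIC).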
Hi Coiby,

I'm not really familiar with how the NICs are managed and configured by NetworkManager or other programs. IMO, from the kernel/driver's perspective, renaming the NICs is not an issue in itself; we just need to make sure NetworkManager or other programs don't try to configure an IP on a VF-based network interface, i.e. the netvsc network interfaces have the IPs, and the VF network interfaces should never have IPs.

In my experience, the interface names of the netvsc NICs are always ethX, but the interface names of the VF NICs can be ethX (e.g. in RHEL 7.x?) or enPXXXXXX (e.g. in RHEL 8.x?), depending on the distro name/version. I don't know exactly how the interface names of VF NICs on Azure/Hyper-V are determined in different distro names/versions.

If we only consider the interface names of the netvsc NICs, there can be race conditions in both the first kernel and the second kernel that make the naming inconsistent, e.g. given a VM with 4 netvsc NICs (A/B/C/D), their interface names can be eth0/eth1/eth2/eth3, but the names can change to eth0/eth2/eth1/eth3 after a reboot. I described one of the race conditions here: https://bugzilla.redhat.com/show_bug.cgi?id=1906870#c29

> Is it possible that A gets renamed to eth2 and B gets renamed to eth0 in the 2nd kernel?

IMO this is possible, though it may be difficult to reproduce in practice.
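To illustrate why a bare interface name is not a stable identifier here, and why the MAC alone is ambiguous on Azure, the following Python sketch looks up the netvsc interface by MAC plus driver across two boots with swapped names. Everything in it is illustrative: `Nic` and `find_netvsc` are hypothetical names, the MAC addresses are made up, and mlx5_core is just an example VF driver.

```python
from typing import List, NamedTuple, Optional


class Nic(NamedTuple):
    name: str
    mac: str
    driver: str


MAC_A = "00:0d:3a:aa:aa:aa"  # hypothetical
MAC_B = "00:0d:3a:bb:bb:bb"  # hypothetical


def find_netvsc(nics: List[Nic], mac: str) -> Optional[Nic]:
    """Pick the interface that should carry the IP for this MAC.

    The VF shares the MAC with its netvsc master, so the driver is
    needed to tell them apart.
    """
    matches = [n for n in nics if n.mac == mac and n.driver == "hv_netvsc"]
    return matches[0] if len(matches) == 1 else None


# First kernel: A is eth0, B is eth2 (VFs are eth1 and eth3).
first_boot = [
    Nic("eth0", MAC_A, "hv_netvsc"), Nic("eth1", MAC_A, "mlx5_core"),
    Nic("eth2", MAC_B, "hv_netvsc"), Nic("eth3", MAC_B, "mlx5_core"),
]
# Second kernel: the race swapped the names of A and B.
second_boot = [
    Nic("eth2", MAC_A, "hv_netvsc"), Nic("eth3", MAC_A, "mlx5_core"),
    Nic("eth0", MAC_B, "hv_netvsc"), Nic("eth1", MAC_B, "mlx5_core"),
]
```

Looking up MAC_A finds the netvsc NIC in both boots even though its name differs, whereas looking up "eth0" by name would return NIC A in the first boot and NIC B in the second.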
(In reply to Dexuan Cui from comment #2)
> Hi Coiby, I'm not really familiar with how the NICs are managed and
> configured by NetworkManager or other programs. IMO, from the
> kernel/driver's perspective, renaming the NICs itself is not an issue; we
> just need to make sure NetworkManager or other programs don't try to
> configure an IP to a VF-based network interface, i.e. the netvsc network
> interfaces have the IPs, and the VF network interfaces should never have IPs.
>
> According to my experience, the interface names of the netvsc NICs are
> always ethX, but the interface names of the VF NICs can be ethX (e.g. in
> RHEL 7.x?) or enPXXXXXX (e.g. in RHEL 8.x?), depending on the distro
> name/version. I don't know how exactly the interface names of VF NICs on
> Azure/Hyper-V are determined in different distro names/versions.
>
> If we only consider the interface names of the netvsc NICs, there can be race
> conditions in both the first kernel and the second kernel that make the
> naming inconsistent, e.g. given a VM with 4 netvsc NICs (A/B/C/D), their
> interface names can be eth0/eth1/eth2/eth3, but the names can change to
> eth0/eth2/eth1/eth3 after the reboots. I described one of the race conditions
> here: https://bugzilla.redhat.com/show_bug.cgi?id=1906870#c29

Thanks for the explanation! This race condition is exactly the reason why kexec-tools adds the "kdump-" prefix to the NIC name. kexec-tools needs to know which NIC is used for transferring the vmcore to a remote FS, so adding the "kdump-" prefix helps us tell exactly which NIC kexec-tools needs. I will figure out a solution for https://bugzilla.redhat.com/show_bug.cgi?id=1958587, which could fix this bug as well.

> > Is it possible that A gets renamed to eth2 and B gets renamed to eth0 in the 2nd kernel?
> IMO this is possible, though it may be difficult to reproduce this in
> practice.

Thanks for confirming it!
The kexec-tools version on RHEL 9.2:

[azureuser@controller-vm ~]$ rpm -qa kexec-tools
kexec-tools-2.0.25-7.el9.x86_64
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory (kexec-tools bug fix and enhancement update), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHBA-2023:2463