Description of problem:
I have a system with 2 network ports on the mainboard and 4 more ports on a quad network adapter card. Depending on the timing, udev and its renaming orgy run into race conditions with the driver init code, which can cause "lost" interfaces.

Version-Release number of selected component (if applicable):
udev-173-3.fc16.x86_64
kernel-3.1.6-1.fc16.x86_64

How reproducible:
Race condition. Happened once out of 4 boot sequences so far...

Steps to Reproduce:
1. Reboot the system.

Actual results:
Only five network interfaces were visible:
- eth0 and eth1 for the interfaces on the mainboard
- eth2, eth3 and eth4 for interfaces on the quad card
- eth5 was missing

Expected results:
- eth0 and eth1 on the mainboard
- eth2, eth3, eth4 and eth5 on the quad card.

Additional info:
Closer inspection showed that the missing interface was present, although under the unexpected name "rename7"; I was able to bring it up as needed using a "ip link set rename7 name eth5 ; ifup eth5" command sequence.
The following extract from the system log shows how the kernel and udev step on each other's feet when naming / renaming interfaces:

[ 11.140816] igb 0000:06:00.0: eth0: (PCIe:2.5Gb/s:Width x4) 90:e2:ba:02:be:54
[ 11.149784] igb 0000:06:00.0: eth0: PBA No: E91609-005
[ 11.170289] e1000e 0000:09:00.0: eth1: (PCI Express:2.5GT/s:Width x1) 00:30:48:d5:7b:2c
[ 11.170291] e1000e 0000:09:00.0: eth1: Intel(R) PRO/1000 Network Connection
[ 11.170368] e1000e 0000:09:00.0: eth1: MAC: 3, PHY: 8, PBA No: 0101FF-0FF
[ 11.321775] e1000e 0000:0a:00.0: eth2: (PCI Express:2.5GT/s:Width x1) 00:30:48:d5:7b:2d
[ 11.330970] e1000e 0000:0a:00.0: eth2: Intel(R) PRO/1000 Network Connection
[ 11.339630] e1000e 0000:0a:00.0: eth2: MAC: 3, PHY: 8, PBA No: 0101FF-0FF
[ 11.353905] udevd[710]: renamed network interface eth0 to rename2
[ 11.365854] udevd[724]: renamed network interface eth2 to rename4
[ 11.381840] udevd[712]: renamed network interface eth1 to eth0
[ 11.415701] udevd[710]: renamed network interface rename2 to eth2
[ 11.415922] igb 0000:06:00.1: eth1: (PCIe:2.5Gb/s:Width x4) 90:e2:ba:02:be:55
[ 11.416004] igb 0000:06:00.1: eth1: PBA No: E91609-005
[ 11.616251] igb 0000:07:00.0: eth4: (PCIe:2.5Gb/s:Width x4) 90:e2:ba:02:be:56
[ 11.624478] igb 0000:07:00.0: eth4: PBA No: E91609-005
[ 11.658062] udevd[709]: renamed network interface eth3 to eth4
[ 11.861296] igb 0000:07:00.1: rename7: (PCIe:2.5Gb/s:Width x4) 90:e2:ba:02:be:57
[ 11.870485] igb 0000:07:00.1: rename7: PBA No: E91609-005
[ 11.904426] udevd[709]: renamed network interface eth3 to rename7
[ 11.922283] udevd[712]: renamed network interface eth1 to eth3
[ 11.966126] udevd[724]: renamed network interface rename4 to eth1
[ 102.021874] udevd[709]: error changing net interface name rename7 to eth4: File exists
[ 106.506673] bonding: bond0: Adding slave eth2.
[ 106.595914] bonding: bond0: enslaving eth2 as a backup interface with a down link.
[ 106.657064] bonding: bond0: Adding slave eth3.
[ 106.746950] bonding: bond0: enslaving eth3 as a backup interface with a down link.
[ 106.794387] bonding: bond0: Adding slave eth4.
[ 106.885032] bonding: bond0: enslaving eth4 as a backup interface with a down link.
[ 106.996730] network[1339]: Bringing up interface bond0:r ERROR : [/etc/sysconfig/network-scripts/ifup-eth] Device eth5 does not seem to be present, delaying initialization.
[ 108.901785] igb: eth3 NIC Link is Up 1000 Mbps Full Duplex, Flow Control: RX
[ 108.919273] bonding: bond0: link status definitely up for interface eth3, 1000 Mbps full duplex.
[ 109.050373] igb: eth2 NIC Link is Up 1000 Mbps Full Duplex, Flow Control: RX
[ 109.060950] igb: eth4 NIC Link is Up 1000 Mbps Full Duplex, Flow Control: RX
[ 109.129793] bonding: bond0: link status definitely up for interface eth2, 1000 Mbps full duplex.
[ 109.140443] bonding: bond0: link status definitely up for interface eth4, 1000 Mbps full duplex.

Maybe it would be better if udev waited until all network drivers have completed their initialization sequence, before it starts renaming? Also, I wonder why the Intel network driver chooses "rename7" as interface name here (see 11.861296 time stamp).
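For anyone triaging similar logs, the following is a minimal sketch (not part of udev) that scans udevd messages for interfaces left parked under a "renameN" name. It assumes, based only on the log above, that "renameN" is a temporary name an interface holds until its target name becomes free; the function name and the regular expressions are illustrative choices.

```python
import re

# Message formats as they appear in the udevd lines quoted above.
RENAMED = re.compile(r"renamed network interface (\S+) to (\S+)")
FAILED = re.compile(r"error changing net interface name (\S+) to (\S+): (.+)")


def stuck_interfaces(log_lines):
    """Return {temporary_name: (wanted_name, reason)} for renames that never completed."""
    parked = set()   # interfaces currently holding a temporary renameN name
    stuck = {}
    for line in log_lines:
        match = RENAMED.search(line)
        if match:
            old, new = match.groups()
            parked.discard(old)              # the old name is gone after a rename
            if new.startswith("rename"):     # assumption: renameN is a temporary name
                parked.add(new)
            continue
        match = FAILED.search(line)
        if match:
            old, wanted, reason = match.groups()
            if old in parked:
                stuck[old] = (wanted, reason.strip())
    return stuck


SAMPLE = [
    "[ 11.353905] udevd[710]: renamed network interface eth0 to rename2",
    "[ 11.415701] udevd[710]: renamed network interface rename2 to eth2",
    "[ 11.904426] udevd[709]: renamed network interface eth3 to rename7",
    "[ 102.021874] udevd[709]: error changing net interface name rename7 to eth4: File exists",
]

print(stuck_interfaces(SAMPLE))
```

On the four sample lines, taken from the extract above, only rename7 is reported, together with the name it was waiting for (eth4) and the error text.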
The automatic udev network renaming is removed in rawhide already, and will not exist in future releases. It causes more problems than it solves. I doubt the problems in earlier releases can or will ever be fixed properly. Sorry for the mess. If you need predictable names, please edit the rules file to name the devices with names other than the kernel names (not ethX). We cannot operate in the same namespace as the kernel and expect it to work.
The network renaming feature is proving troublesome for certain deployment scenarios, with the effects extending beyond Fedora. Kay, would you be so kind as to advise which version of udev no longer contains this feature - or perhaps point to the relevant commit hash?
It was disabled in Fedora 17, and Fedora 18 will have systemd's udev which will not provide the old and racy renaming logic triggered by udev. Use biosdevname, or HWADDR= in the sysconfig scripts, or write your own udev rules which rename the devices. But better never try to use the ethX or any other kernel namespace, name the devices after their function like internal, dmz, or whatever fits, but trying to keep ethX stable can never work reliably, and will not be supported by any future tool.
Kay Sievers wrote: > It was disabled in Fedora 17 I have the same issue on Fedora 17.
(In reply to comment #4)
> Kay Sievers wrote:
>
> > It was disabled in Fedora 17
>
> I have the same issue on Fedora 17.

Sure, you do, if you have old rules files which try to rename kernel-created interface names to other names in the same kernel ethX namespace. None of this can work any longer. The rules file needs to be manually removed, or edited to contain names other than ethX as target names. Sorry, this can only be solved manually; there is no way to mess with user config from RPM.
> Kay Sievers quoted/wrote:
>>> It was disabled in Fedora 17
>> I have the same issue on Fedora 17.
> Sure, you do, if you have old rules files which try to rename kernel-created
> interface names to other names in the same kernel ethX namespace.
> This can all no longer work. The rules file needs to be manually removed,
> or edited to contain names other than ethX as target names.
> Sorry, this can only be solved manually, there is no way to mess from RPM
> with user config.

There is no user config. It is a freshly installed Fedora 17. In most cases, after a boot I have the following interfaces:
| em1
| lo
| p2p1
| p2p2
| p2p3
| p2p4
| p4p1

After other reboots I have the following interfaces (example):
| em1
| lo
| p2p1
| p2p3
| p2p4
| p4p1
| rename2
Hmm, if it's there, what's the content of: /etc/udev/rules.d/70-persistent-net.rules ? If it isn't there, some other rule is trying that, check: grep NAME= /etc/udev/rules.d/*.rules /lib/udev/rules.d/*.rules If there is only 60-net.rules left, check your: /etc/sysconfig/network-scripts/ifcfg-* files, if they contain instructions to rename things to kernel names. These need to be fixed then, we cannot rename *to* ethX, only *from*.
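The grep check above can be narrowed to the case that matters here: rules that assign a target name inside the kernel's ethN namespace. Below is a rough sketch, assuming the usual udev rule syntax where NAME="..." assigns and NAME=="..." only matches. The sample rules are hypothetical entries written for illustration, not taken from any attachment in this bug.

```python
import re

# NAME="x" or NAME:="x" assigns a name; NAME=="x" is only a match and is skipped.
ASSIGN = re.compile(r'\bNAME\s*:?=(?!=)\s*"([^"]*)"')
KERNEL_NAME = re.compile(r"eth\d+")


def kernel_namespace_targets(rules_text):
    """Return (line_number, target) for rules that rename *to* an ethN name."""
    hits = []
    for lineno, line in enumerate(rules_text.splitlines(), start=1):
        stripped = line.strip()
        if not stripped or stripped.startswith("#"):
            continue  # skip blank lines and comments
        for target in ASSIGN.findall(stripped):
            if KERNEL_NAME.fullmatch(target):
                hits.append((lineno, target))
    return hits


# Hypothetical rules file content, for illustration only.
SAMPLE_RULES = "\n".join([
    '# using NAME= in a comment must not count',
    'SUBSYSTEM=="net", ACTION=="add", ATTR{address}=="00:30:48:d5:7b:2c", KERNEL=="eth*", NAME="eth0"',
    'SUBSYSTEM=="net", ACTION=="add", ATTR{address}=="90:e2:ba:02:be:54", NAME="internal"',
    'NAME=="?*", GOTO="netdevicename_end"',
])

print(kernel_namespace_targets(SAMPLE_RULES))
```

Only the second line of the sample is flagged. The rule naming a device "internal" and the NAME=="?*" match are left alone, which mirrors the advice that renaming *from* ethX is fine while renaming *to* ethX is not.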
Kay Sievers wrote: > Hmm, if it's there, what's the content of: > /etc/udev/rules.d/70-persistent-net.rules > ? It's not there. > If it isn't there, some other rule is trying that, check: > grep NAME= /etc/udev/rules.d/*.rules /lib/udev/rules.d/*.rules The result: | /lib/udev/rules.d/10-dm.rules:KERNEL=="device-mapper", NAME="mapper/control" | /lib/udev/rules.d/71-biosdevname.rules:NAME=="?*", GOTO="netdevicename_end" | /lib/udev/rules.d/71-biosdevname.rules:# using NAME= instead of setting INTERFACE_NAME, so that persistent | /lib/udev/rules.d/71-biosdevname.rules:PROGRAM="/sbin/biosdevname --policy physical -i %k", NAME="%c", OPTIONS+="string_escape=replace"
What's the output of: biosdevname -d ?
Created attachment 678986 [details] Output of "biosdevname -d"
There seem to be no other sources of device naming on the system than biosdevname. One possible explanation would be that biosdevname returns identical names for two different devices. The debug output looks suspicious in that there are two e1000 devices with consecutive MAC address numbers, but one of them gets an onboard name, and the other one doesn't. Re-assigning to biosdevname, as it seems to be the only active component here.
This message is a reminder that Fedora 16 is nearing its end of life. Approximately 4 (four) weeks from now Fedora will stop maintaining and issuing updates for Fedora 16. It is Fedora's policy to close all bug reports from releases that are no longer maintained. At that time this bug will be closed as WONTFIX if it remains open with a Fedora 'version' of '16'. Package Maintainer: If you wish for this bug to remain open because you plan to fix it in a currently maintained version, simply change the 'version' to a later Fedora version prior to Fedora 16's end of life. Bug Reporter: Thank you for reporting this issue and we are sorry that we may not be able to fix it before Fedora 16 is end of life. If you would still like to see this bug fixed and are able to reproduce it against a later version of Fedora, you are encouraged to click on "Clone This Bug" and open it against that version of Fedora. Although we aim to fix as many bugs as possible during every release's lifetime, sometimes those efforts are overtaken by events. Often a more recent Fedora release includes newer upstream software that fixes bugs or makes them obsolete. The process we are following is described here: http://fedoraproject.org/wiki/BugZappers/HouseKeeping
Hi, could you please attach the output of the following from the system:
1. dmidecode
2. biosdecode
3. lspci -tv and lspci -xxxvvv
4. The content of /etc/sysconfig/network-scripts/ifcfg-* (corresponding to comment #10)

The issue is that one of the interfaces is named as 'renameN' across multiple reboots. Is the understanding correct ?
Created attachment 684472 [details] Output of "dmidecode"
Created attachment 684473 [details] Output of "biosdecode"
Created attachment 684474 [details] Output of "lspci -tv"
Created attachment 684475 [details] Output of "lspci -xxxvvv"
Created attachment 684476 [details] Content of "/etc/sysconfig/network-scripts/ifcfg-*"
Narendra K. wrote: > The issue is that one of the interfaces is named as 'renameN' across > multiple reboots. Is the understanding correct ? Yes, this is correct.
(In reply to comment #6)
> > Kay Sievers quoted/wrote:
> There is no user config. It is a freshly installed Fedora 17. In most
> cases, after a boot I have the following interfaces:
>
> | em1
> | lo
> | p2p1
> | p2p2
> | p2p3
> | p2p4
> | p4p1

Looking at the attached 'dmidecode' output, the above names seem to be correct. Biosdevname depends on BIOS-provided SMBIOS type 41 records to name onboard interfaces and type 9 records to name add-in interfaces. In the absence of type 9 records, biosdevname uses the 'slot #' from the PCI 'SltCap' structure of the parent device.

The issue description states that the system has two onboard network interfaces. But the 'dmidecode' output shows that the system has only one 'type 41' record. So biosdevname has named only one interface as 'em1'.

The 'dmidecode' output shows that there are no type 9 records in the system for the add-in network adapters. From the attached 'lspci -xxxvvv' output, observe Slot #2 and Slot #4 (being used to name p2p1 and p4p1):

00:01.1 PCI bridge: Intel Corporation Ivy Bridge PCI Express Root Port (rev 09) (prog-if 00 [Normal decode])
[...]
SltCap: AttnBtn- PwrCtrl- MRL- AttnInd- PwrInd- HotPlug- Surprise-
------------> Slot #2, PowerLimit 75.000W; Interlock- NoCompl+

00:1c.4 PCI bridge: Intel Corporation 6 Series/C200 Series Chipset Family PCI Express Root Port 5 (rev b5) (prog-if 00 [Normal decode])
[...]
SltCap: AttnBtn- PwrCtrl- MRL- AttnInd- PwrInd- HotPlug- Surprise-
------------> Slot #4, PowerLimit 10.000W; Interlock- NoCompl+

> After other reboots I have the following interfaces (example):
>
> | em1
> | lo
> | p2p1
> | p2p3
> | p2p4
> | p4p1
> | rename2

It would be helpful to know

1. Is the issue seen if 'biosdevname' version '0.4.0' is used ?
It is available for download at the following link - http://linux.dell.com/biosdevname/biosdevname-0.4.0/biosdevname-0.4.0.tar.gz
a) Please uninstall 'biosdevname-0.4.1' from Fedora 17 (yum remove biosdevname)
b) Delete the '/etc/udev/rules.d/70-persistent-net.rules' file, if any
c) tar zxvf biosdevname-0.4.0.tar.gz
   cd biosdevname-0.4.0
   ./configure
   make && make install
"pciutils-devel" and "zlib-devel" must be installed for compilation to succeed.

2. I observe that the issue description mentions ethN names on Fedora 16. Fedora 16 also has biosdevname. Was biosdevname=0 passed in Fedora 16 to get ethN names ? If yes, it would be great if you could pass 'biosdevname=0' to Fedora 17 and verify if the issue is seen with ethN names also. This would eliminate/confirm biosdevname from the scenario. (On a fresh install of Fedora 17 it is enough to pass 'biosdevname=0' to get eth names. But on an already installed system with 'em' names, all the corresponding ifcfg-* files need to be altered to suit the 'eth' names, and any existing '70-persistent-net.rules' needs to be removed.)
Narendra K wrote:
> 1. Is the issue seen if 'biosdevname' version '0.4.0' is used ?

No.

I have installed "biosdevname" version "0.4.0". The issue is not seen after ten reboots with ten checks. The installation of "biosdevname" version "0.4.0" was the solution for the problem on Fedora 17.

The interface names have changed with the "old" "biosdevname":
| em1
| lo
| p2p4
| p2p5
| p2p6
| p2p7
| p4p1
(In reply to comment #21) > Narendra K wrote: > > > 1. Is the issue seen if 'biosdevname' version '0.4.0' is used ? > > No. > > I have installed "biosdevname" version "0.4.0". The issue is not seen after > ten reboots with ten checks. The installation of "biosdevname" version > "0.4.0" was the solution for the problem on Fedora 17. > Thanks. Could you please share the findings from trying Point 2 from comment #20 ?
Created attachment 686918 [details] Biosdevname 0.5.0 Can you try the newest version of biosdevname?
Narendra K. wrote:
> Thanks. Could you please share the findings from trying Point 2 from comment
> #20 ?

Where do I have to write "biosdevname=0" to test this?
(In reply to comment #24)
> Narendra K. wrote:
>
> > Thanks. Could you please share the findings from trying Point 2 from comment
> > #20 ?
>
> Where do I have to write "biosdevname=0" to test this?

It needs to be added as a kernel command line parameter in GRUB. (Please ensure that all the relevant ifcfg-em* and ifcfg-p* files are modified to suit the ethN naming.) For a fresh install, it needs to be passed to the installer (like any other parameter).
Jordan Hargrave wrote:
> Biosdevname 0.5.0
> Can you try the newest version of biosdevname?

Done. "biosdevname" version "0.5.0" also seems to have the same problem. The interface names after three reboots:
| em1
| lo
| p2p1
| p2p2
| p2p3
| p4p1
| rename7

I have changed back to "biosdevname" version "0.4.0" and have made 10 reboots and 10 checks again. The problem no longer exists.
(In reply to comment #20)
[...]
> 2. I observe that the issue description mentions ethN names on Fedora 16.
> Fedora 16 also has biosdevname. Was biosdevname=0 passed in Fedora 16 to get
> ethN names ? If yes, it would be great if you could pass 'biosdevname=0' to
> Fedora 17 and verify if the issue is seen with ethN names also. This would
> eliminate/confirm biosdevname from the scenario. ( On a fresh install of
> Fedora 17 it is enough to pass 'biosdevname=0' to get eth names. But on an
> already installed system with 'em' names, all the corresponding ifcfg-* files
> need to be altered to suit the 'eth' names and any existing
> '70-persistent-net.rules' needs to be removed)

I have added "biosdevname=0" as a kernel parameter. After ten reboots and ten checks the interface naming was always:
| eth0
| eth1
| eth2
| eth3
| eth4
| eth5
| lo

Now I am using "biosdevname" version "0.4.0" again without problems.
Seeing the same problem with Fedora 18. For me it messes up if the machine is rebooted, but not if it is shut down and booted again. I'm using a twin-port Intel card which uses the e1000e driver.
Fedora 16 changed to end-of-life (EOL) status on 2013-02-12. Fedora 16 is no longer maintained, which means that it will not receive any further security or bug fix updates. As a result we are closing this bug. If you can reproduce this bug against a currently maintained version of Fedora please feel free to reopen this bug against that version. Thank you for reporting this bug and we are sorry it could not be fixed.
*** Bug 911012 has been marked as a duplicate of this bug. ***
I just came across this issue in Fedora 17, on a system with 4 ethernet connections in a single PCI slot. In my case, udevd was trying to rename the 4th device to em3, even though the 3rd device already had the name em3 and biosdevname returns em4 for that device. I am not sure how often this occurs because the 4th device is not used.

From /var/log/messages:
Mar 11 14:25:52 spiega48 NetworkManager[1286]: <info> (rename5): bringing up device.
Mar 11 14:25:52 spiega48 udevd[775]: error changing net interface name rename5 to em3: Device or resource busy
Mar 11 14:25:52 spiega48 NetworkManager[1286]: <info> (rename5): preparing device.

# /sbin/biosdevname --policy physical -i rename5
em4
# lspci|grep Ether
04:00.0 Ethernet controller: NetXen Incorporated NX3031 Multifunction 1/10-Gigabit Server Adapter (rev 42)
04:00.1 Ethernet controller: NetXen Incorporated NX3031 Multifunction 1/10-Gigabit Server Adapter (rev 42)
04:00.2 Ethernet controller: NetXen Incorporated NX3031 Multifunction 1/10-Gigabit Server Adapter (rev 42)
04:00.3 Ethernet controller: NetXen Incorporated NX3031 Multifunction 1/10-Gigabit Server Adapter (rev 42)
Same here... on Fedora 17 x86_64.

# rpm -q biosdevname kernel
biosdevname-0.4.1-1.fc17.x86_64
kernel-3.7.9-101.fc17.x86_64

From dmesg output:
[ 109.900717] ixgbe 0000:04:00.0 p1p1: NIC Link is Up 10 Gbps, Flow Control: RX/TX
[ 109.900842] IPv6: ADDRCONF(NETDEV_CHANGE): p1p1: link becomes ready
[ 110.001483] ixgbe 0000:04:00.0 p1p1: NIC Link is Down
[ 110.801560] ixgbe 0000:04:00.0 p1p1: NIC Link is Up 10 Gbps, Flow Control: RX/TX
[238340.364987] device p1p1 entered promiscuous mode
[238340.420691] device p1p1 left promiscuous mode
[238469.243312] device p1p1 entered promiscuous mode
[238469.285261] device rename4 entered promiscuous mode
[239729.406751] device p1p1 left promiscuous mode
[239729.444706] device rename4 left promiscuous mode

# biosdevname --policy physical -i rename4
p1p2

This is clearly a problem for many people across several releases of Fedora. Can we get some traction on this?
Could this be related to BZ # 831955 ? If so, the version of dracut I'm using is dracut-018-105.git20120927.fc17.noarch
Hello, Could you please try the latest biosdevname from the repository here - http://linux.dell.com/cgi-bin/gitweb/gitweb.cgi?p=biosdevname.git;a=summary ?
Is there any kind of debug logging that could be enabled to trace this during boot? Could the presence of HWADDR in ifcfg-* files somehow be interfering?
(In reply to Orion Poplawski from comment #35) > Is there any kind of debug logging that could be enabled to trace this > during boot? > Enabling udev debugging might provide more details. > Could the presence of HWADDR in ifcfg-* files somehow be interfering? I am not sure how this could have an impact unless what biosdevname suggests and DEVICE= name as configured in ifcfg-* file differ. It would be great if you could give the latest biosdevname from here a try and let us know the result - http://linux.dell.com/cgi-bin/gitweb/gitweb.cgi?p=biosdevname.git;a=summary ?
While looking into the issue, it seems there is a possibility of the 'addslot' function returning the same value for two or more interfaces (same dev->index_in_slot), causing the interfaces on a given PCI slot to get the same port number. This could trigger a rename to a different namespace (such as renameN).

get_pci_devices
  set_pci_slots
    addslot (point 1 below)

if (dev->physical_slot == 0) {
        dev->embedded_index_valid = 1;
        dev->embedded_index = addslot(state, 0);
} else if (dev->physical_slot != PHYSICAL_SLOT_UNKNOWN) {
        dev->index_in_slot = addslot(state, dev->physical_slot); <------- 1
}

Looking into this further.
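To illustrate why a duplicated index_in_slot matters, here is a small model of the suspected failure. It is not biosdevname's code: the p<slot>p<index> format is inferred from the names seen in this bug (p2p1, p4p1), and the device list with its repeated index is made up for the example.

```python
def proposed_name(slot, index_in_slot):
    """Build an add-in card name in the p<slot>p<port> style seen in this bug."""
    return f"p{slot}p{index_in_slot}"


def find_collisions(devices):
    """Map each proposed name to the PCI addresses claiming it, keeping only duplicates.

    `devices` is a list of (pci_address, slot, index_in_slot) tuples.
    """
    claims = {}
    for pci_address, slot, index_in_slot in devices:
        claims.setdefault(proposed_name(slot, index_in_slot), []).append(pci_address)
    return {name: owners for name, owners in claims.items() if len(owners) > 1}


# Hypothetical quad-port card in slot 2 where two ports were given index 3.
DEVICES = [
    ("0000:06:00.0", 2, 1),
    ("0000:06:00.1", 2, 2),
    ("0000:07:00.0", 2, 3),
    ("0000:07:00.1", 2, 3),  # duplicate index: would also be named p2p3
]

print(find_collisions(DEVICES))
```

If two ports are handed the same proposed name, only one rename can succeed, and the other interface would stay under whatever name it held at that moment. That would be consistent with the renameN leftovers reported in the comments above, though the thread does not confirm this as the cause.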
Created attachment 769344 [details] test patch
Hello, could you please try the test patch from comment #38 and share the results ?
Created attachment 774360 [details] /var/log/messages from boot Sorry for the delay, the machine with the problem here is our main firewall so I can't reboot it too much. First a note - after updating biosdevname or modifying /etc/udev/udev.conf, you need to recreate the initramfs in order to get the changes active at boot. I tried one reboot with a patched biosdevname and udev_log="info" and it worked, so so far so good. I'll keep a close eye on it going forward. I'm attaching the contents of /var/log/messages from the boot. Perhaps the udev info there will be instructive.
Created attachment 774460 [details] test rpm
Hello Rudolf/Wolfgang, Could you please give the test rpm from comment #41 a try and share the results ?
(In reply to Orion Poplawski from comment #40)
> Created attachment 774360 [details]
> /var/log/messages from boot
>
> Sorry for the delay, the machine with the problem here is our main firewall
> so I can't reboot it too much.
>
> First a note - after updating biosdevname or modifying /etc/udev/udev.conf,
> you need to recreate the initramfs in order to get the changes active at
> boot.
>
> I tried one reboot with a patched biosdevname and udev_log="info" and it
> worked, so so far so good. I'll keep a close eye on it going forward.
>

Thank you for testing the patch. With the patch applied to biosdevname, was the issue seen ? While testing the patch, it would also be useful to test without udev debugging enabled (without udev_log="info" passed; debugging is useful once the issue is seen). Also, I have attached a test rpm containing the patch in comment #41. Thought it might be useful.
(In reply to Narendra K from comment #42) > Could you please give the test rpm from comment #41 a try and share the > results ? Sorry, I have no machine with many ethernet-interfaces to test this at this time.
FWIW - I have since upgraded to F19 + the patched biosdevname and have not seen this problem since with a handful of reboots.
This message is a reminder that Fedora 18 is nearing its end of life. Approximately 4 (four) weeks from now Fedora will stop maintaining and issuing updates for Fedora 18. It is Fedora's policy to close all bug reports from releases that are no longer maintained. At that time this bug will be closed as WONTFIX if it remains open with a Fedora 'version' of '18'. Package Maintainer: If you wish for this bug to remain open because you plan to fix it in a currently maintained version, simply change the 'version' to a later Fedora version prior to Fedora 18's end of life. Thank you for reporting this issue and we are sorry that we may not be able to fix it before Fedora 18 is end of life. If you would still like to see this bug fixed and are able to reproduce it against a later version of Fedora, you are encouraged to change the 'version' to a later Fedora version prior to Fedora 18's end of life. Although we aim to fix as many bugs as possible during every release's lifetime, sometimes those efforts are overtaken by events. Often a more recent Fedora release includes newer upstream software that fixes bugs or makes them obsolete.
Fedora 18 changed to end-of-life (EOL) status on 2014-01-14. Fedora 18 is no longer maintained, which means that it will not receive any further security or bug fix updates. As a result we are closing this bug. If you can reproduce this bug against a currently maintained version of Fedora please feel free to reopen this bug against that version. If you are unable to reopen this bug, please file a new report against the current release. If you experience problems, please add a comment to this bug. Thank you for reporting this bug and we are sorry it could not be fixed.