Bug 531074

Summary: ethernet device renaming fails
Product: [Fedora] Fedora
Reporter: Bill Nottingham <notting>
Component: udev
Assignee: Harald Hoyer <harald>
Status: CLOSED WONTFIX
QA Contact: Fedora Extras Quality Assurance <extras-qa>
Severity: medium
Docs Contact:
Priority: low
Version: 12
CC: a3, dipak.dudhabhate, dougsland, dpicard, eric-bugs2, felix, gansalmon, gianluca.cecchi, harald, itamar, jason.donald.burgess, jcm, jcm, jik, jonathan, jvillalo, kelvin, kernel-maint, narf, onion, oron, pasik, redhat-bugzilla, rvokal, seansh, thatch45, thieme.reis, trygve.l.thorkildson, www05, yaneti
Target Milestone: ---
Keywords: Reopened
Target Release: ---
Hardware: All
OS: Linux
Whiteboard:
Fixed In Version: udev-145-19.fc12
Doc Type: Bug Fix
Last Closed: 2010-12-03 22:40:46 EST
Type: ---

Description Bill Nottingham 2009-10-26 16:00:58 EDT
Description of problem:

I have in this box:
00:19.0 Ethernet controller: Intel Corporation 82566DC Gigabit Network Connection (rev 02) - 00:16:76:D6:CA:82
07:04.0 Ethernet controller: Intel Corporation 82557/8/9/0/1 Ethernet Pro 100 (rev 05) - 00:02:B3:9D:C0:AE
07:05.0 Ethernet controller: Intel Corporation 82557/8/9/0/1 Ethernet Pro 100 (rev 05) - 00:02:B3:9D:C0:AF

My udev persistent net rules are:

# PCI device 0x8086:0x104b (e1000e) (custom name provided by external tool)
SUBSYSTEM=="net", ACTION=="add", DRIVERS=="?*", ATTR{address}=="00:16:76:d6:ca:82", ATTR{type}=="1", KERNEL=="eth*", NAME="eth0"

# PCI device 0x8086:0x1229 (e100)
SUBSYSTEM=="net", ACTION=="add", DRIVERS=="?*", ATTR{address}=="00:02:b3:9d:c0:af", ATTR{type}=="1", KERNEL=="eth*", NAME="eth2"

# PCI device 0x8086:0x1229 (e100)
SUBSYSTEM=="net", ACTION=="add", DRIVERS=="?*", ATTR{address}=="00:02:b3:9d:c0:ae", ATTR{type}=="1", KERNEL=="eth*", NAME="eth1"

With the switch to non-debugging kernels, this sporadically fails when e100 is loaded first:
udevd-work[761]: error changing netif name eth2 to _rename: File exists

Normally, this works as follows:
e1000e: Copyright (c) 1999-2008 Intel Corporation.
e1000e 0000:00:19.0: PCI INT A -> GSI 20 (level, low) -> IRQ 20
e1000e 0000:00:19.0: pci_enable_pcie_error_reporting failed 0xfffffffb 
e100: Intel(R) PRO/100 Network Driver, 3.5.24-k2-NAPI
e100: Copyright(c) 1999-2006 Intel Corporation
e100 0000:07:04.0: PCI INT A -> GSI 21 (level, low) -> IRQ 21
e100 0000:07:04.0: PME# disabled
e100: eth0: e100_probe: addr 0x90001000, irq 21, MAC addr 00:02:b3:9d:c0:ae
e100 0000:07:05.0: PCI INT A -> GSI 22 (level, low) -> IRQ 22
e100 0000:07:05.0: PME# disabled
e100: eth1: e100_probe: addr 0x90000000, irq 22, MAC addr 00:02:b3:9d:c0:af
0000:00:19.0: eth2: (PCI Express:2.5GB/s:Width x1) 00:16:76:d6:ca:82
0000:00:19.0: eth2: Intel(R) PRO/1000 Network Connection
0000:00:19.0: eth2: MAC: 6, PHY: 6, PBA No: ffffff-0ff  
udev: renamed network interface eth0 to eth1
udev: renamed network interface eth2 to eth0
udev: renamed network interface _rename to eth2
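
As a rough illustration of how the log lines above could arise, here is a toy model in Python. It is not udev's code: a dict stands in for the kernel's interface table, and the assumption (inferred only from the messages in this report) is that a device whose target name is taken parks itself under the single shared name "_rename" until the target is free. The helper names and the two interleavings are made up for the example.

```python
"""Toy model of interface renaming through one shared placeholder name."""

TEMP = "_rename"  # the placeholder name that shows up in the messages above


def rename(table, old, new, log):
    """Rename old -> new; fail the way EEXIST would if new is in use."""
    if new in table:
        raise FileExistsError(
            f"error changing netif name {old} to {new}: File exists")
    table[new] = table.pop(old)
    log.append(f"renamed network interface {old} to {new}")


def try_rename(table, current, target, log):
    """One attempt by a worker; returns the name the device ends up with."""
    if target not in table:
        rename(table, current, target, log)
        return target
    # Target is taken: park under the shared placeholder to free our name.
    rename(table, current, TEMP, log)
    return TEMP


def kernel_names():
    # Kernel-assigned names from the "normal" boot log, keyed to MAC suffixes.
    return {"eth0": "c0:ae", "eth1": "c0:af", "eth2": "ca:82"}


# Interleaving 1: only one device ever needs the placeholder.
good_table, good_log = kernel_names(), []
parked = try_rename(good_table, "eth1", "eth2", good_log)  # c0:af parks
try_rename(good_table, "eth0", "eth1", good_log)           # c0:ae takes eth1
try_rename(good_table, "eth2", "eth0", good_log)           # ca:82 takes eth0
rename(good_table, parked, "eth2", good_log)               # c0:af finishes

# Interleaving 2: a second device needs the placeholder while it is occupied.
bad_table, bad_log = kernel_names(), []
try_rename(bad_table, "eth1", "eth2", bad_log)             # c0:af parks
try:
    try_rename(bad_table, "eth2", "eth0", bad_log)         # eth0 still taken
    collision = None
except FileExistsError as exc:
    collision = str(exc)

print("\n".join(good_log))
print(collision)
```

In the first interleaving the last three log entries match the three "renamed network interface" lines above; in the second, the model fails with the same text as the udevd-work error. Whether the real failure follows this exact ordering is a guess.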

This works with kernel-2.6.31.5-91.rc1.fc12.x86_64.

Version-Release number of selected component (if applicable):

kernel-2.6.31.5-96.fc12.x86_64

How reproducible:

50%

Steps to Reproduce:
1. Reboot
Comment 1 Bug Zapper 2009-11-16 09:22:23 EST
This bug appears to have been reported against 'rawhide' during the Fedora 12 development cycle.
Changing version to '12'.

More information and reason for this action is here:
http://fedoraproject.org/wiki/BugZappers/HouseKeeping
Comment 2 Alexandre Thieme Reis 2009-11-22 11:59:06 EST
I have the same problem
Comment 3 Jonathan McDowell 2009-11-25 14:44:51 EST
I have this problem on a clean Fedora 12 install.
Three eth devices - they were eth0, eth1, eth2.
The boot process gives the error described above and I end up
with _rename, eth0 and eth2.

Not sure what the workaround is... eth2 does seem to have an ipv6 address
but no ipv4 address, and is not successfully doing anything, so I am pretty hosed right now. How do I get _rename back to being called eth1??

This seems a significant problem to me since it happened on a simple install from the FC12 install DVD iso with no user tweaking.
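
One way to see which device ended up under which name is to compare the MAC addresses in the persistent-net rules with what the kernel currently exposes. The sketch below is only an illustration: the function names are invented, the sample state is hypothetical, and the only system interface it relies on is the per-interface "address" file under /sys/class/net. It also assumes each MAC appears on one interface only, which is not true with bonding or VLANs.

```python
import re
from pathlib import Path

# Pulls the MAC and the wanted name out of one persistent-net style rule line.
RULE_RE = re.compile(r'ATTR\{address\}=="([0-9A-Fa-f:]+)".*\bNAME="([^"]+)"')


def parse_rules(text):
    """Return {mac: wanted_name} from udev rule text, skipping comments."""
    wanted = {}
    for line in text.splitlines():
        line = line.strip()
        if line.startswith("#"):
            continue
        match = RULE_RE.search(line)
        if match:
            wanted[match.group(1).lower()] = match.group(2)
    return wanted


def current_names(sys_net="/sys/class/net"):
    """Return {mac: current_name} for the interfaces present in sysfs."""
    found = {}
    root = Path(sys_net)
    if not root.is_dir():
        return found
    for entry in root.iterdir():
        address = entry / "address"
        try:
            if address.is_file():
                found[address.read_text().strip().lower()] = entry.name
        except OSError:
            continue  # interface vanished or is unreadable; skip it
    return found


def mismatches(wanted, actual):
    """Return {mac: (current_name, wanted_name)} where the two differ."""
    return {mac: (actual[mac], name)
            for mac, name in wanted.items()
            if mac in actual and actual[mac] != name}


SAMPLE_RULES = '''
# PCI device 0x8086:0x104b (e1000e)
SUBSYSTEM=="net", ACTION=="add", DRIVERS=="?*", ATTR{address}=="00:16:76:d6:ca:82", ATTR{type}=="1", KERNEL=="eth*", NAME="eth0"
SUBSYSTEM=="net", ACTION=="add", DRIVERS=="?*", ATTR{address}=="00:02:b3:9d:c0:af", ATTR{type}=="1", KERNEL=="eth*", NAME="eth2"
SUBSYSTEM=="net", ACTION=="add", DRIVERS=="?*", ATTR{address}=="00:02:b3:9d:c0:ae", ATTR{type}=="1", KERNEL=="eth*", NAME="eth1"
'''

# Hypothetical post-boot state: one device stuck under the placeholder name.
# On a real system, pass current_names() instead of this dict.
sample_actual = {
    "00:16:76:d6:ca:82": "eth0",
    "00:02:b3:9d:c0:af": "eth2",
    "00:02:b3:9d:c0:ae": "_rename",
}
wanted = parse_rules(SAMPLE_RULES)
stuck = mismatches(wanted, sample_actual)
print(stuck)
```

If a device is stuck as _rename and the wanted name is free, running "ip link set dev _rename name eth1" as root, with the interface down, is the usual manual rename. That is general iproute2 usage and has not been tested against this bug.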
Comment 4 Jonathan McDowell 2009-11-25 16:36:15 EST
Hmm... seemed to go away after rebooting - now all happy. But I don't know why it happened or how to recover again if it comes back :-)
Comment 5 chippey 2009-11-29 02:15:45 EST
Same problem.  Rebooting does not help.  Seems to be random over my 6 interfaces (eth0-eth5), but it happens every start up.
Comment 6 Alexandre Thieme Reis 2009-11-29 06:33:55 EST
Upgrading to the udev package from the updates-testing repository solves this problem.
Comment 7 Alexandre Thieme Reis 2009-12-01 07:05:31 EST
Correction: upgrading to the udev package from the updates-testing repository does NOT solve this problem. On boot it sometimes works and sometimes does not.
Comment 8 Pasi Karkkainen 2009-12-05 08:35:07 EST
I'm also seeing this problem on F12.
Comment 9 Pasi Karkkainen 2009-12-05 08:40:43 EST
I just tried this:

# yum --enablerepo=updates-testing update udev
..
No Packages marked for Update

So I'm already running the latest udev package. This problem happens on _most_ reboots..
Comment 10 Nathan G. Grennan 2009-12-11 02:23:59 EST
I am seeing this bug too on a server with four ethernet adapters.
Comment 11 Pasi Karkkainen 2009-12-13 11:22:56 EST
My F12 box with this problem has four NICs as well. Sometimes renaming fails for eth0, sometimes for eth2.
Comment 12 John Villalovos 2009-12-14 14:10:18 EST
I found this workaround on another site:
---------------------------------------------------------------
Use /etc/udev/rules.d/10-local.rules (yes, that number in the filename is important).

# http://www.reactivated.net/writing_u...#example-netif
# Match based on MAC address:
# udevinfo -a -p /sys/class/net/eth0 | grep -i address
# udevinfo -a --name /dev/ttyUSB0
# http://forums.gentoo.org/viewtopic-t-489863.html (howto guide)
# http://forums.gentoo.org/viewtopic-t-512201.html
# MAC addresses must be in lower-case.
SUBSYSTEM=="net", ATTR{address}=="00:1e:60:51:fa:a0", NAME="eth0"
SUBSYSTEM=="net", ATTR{address}=="00:1c:60:52:07:1c", NAME="eth1"
---------------------------------------------------------------

This worked for me.  Not sure what the underlying issue is.
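
For anyone writing such rules by hand, here is a small Python sketch that produces rule lines in the same shape as the two above and lower-cases the MAC, since the note says the addresses must be lower-case. The function name net_rule is made up for the example. It only generates the rule text and does nothing about whatever is causing the rename failure.

```python
import re

# Six colon-separated hex pairs, already lower-cased.
MAC_RE = re.compile(r"^[0-9a-f]{2}(:[0-9a-f]{2}){5}$")


def net_rule(mac, name):
    """Build one MAC-to-name rule line in the style quoted above."""
    mac = mac.strip().lower()
    if not MAC_RE.match(mac):
        raise ValueError(f"not a MAC address: {mac!r}")
    return f'SUBSYSTEM=="net", ATTR{{address}}=="{mac}", NAME="{name}"'


rules = [
    net_rule("00:1E:60:51:FA:A0", "eth0"),  # upper-case input is normalised
    net_rule("00:1c:60:52:07:1c", "eth1"),
]
print("\n".join(rules))
```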
Comment 13 Bill Nottingham 2009-12-14 14:29:33 EST
That seems odd, as that's the same sort of rules that are already in 70-persistent-net.rules.
Comment 14 Trygve Thorkildson 2009-12-16 10:55:27 EST
I cleared the NIC descriptions from 70-persistent-net.rules and found that 70-persistent-net.rules is no longer updated during a reboot. I added and removed cards with no change in 70-persistent-net.rules. I tried the 10-local.rules workaround with intermittent results. It seemed to reduce the frequency of the _rename problem, but it still occasionally occurs.
Comment 15 Fran Taylor 2009-12-17 03:45:18 EST
This is happening to me on two different systems.  The behavior is random.  

Sometimes the machine boots up fine.  

Sometimes it takes a very long time to boot, the network interface names are messed up, and networking just doesn't work at all because none of the interface names line up with their defined MAC addresses.  

It seems to happen more frequently on fresh power-ups and less frequently on reboots, but maybe that's just my bad luck.

This is a show-stopper bug for me, it is cutting badly into my limited time.  If it isn't fixed soon I will be forced to downgrade to Fedora 11.
Comment 16 Gianluca Cecchi 2009-12-17 04:21:46 EST
I have this problem too, after upgrading from F11.
A question: I have in my ifcfg-eth* files something like this:
HWADDR=xxxxx
NM_CONTROLLED=no
IPADDR=.....

etc

Why does udev try to change the assignment at all?
What changed from F11? Where can I find docs about these changes?
I thought that if you have HWADDR in the ifcfg-eth file, this drives the assignment...
Is that no longer the case?
Comment 17 Gianluca Cecchi 2009-12-17 05:19:23 EST
Perhaps there is some problem with the cache directory /dev/.udev?
In my case, after removing it completely, making the desired changes to the 70-persistent-net.rules file and rebooting, it seems OK.
The .udev directory gets regenerated, so probably if you later make any change to your interface config, you have to delete the /dev/.udev tree again?
Comment 18 Trygve Thorkildson 2009-12-18 06:57:44 EST
I have tried both workarounds suggested and still have the same _rename problem. I am going back to Fedora 11!!
Comment 19 Thomas S Hatch 2009-12-18 17:05:51 EST
This bug is a serious show stopper, and none of these workarounds seem to fix it.
I upgraded my kernel and udev to rawhide, but the problem persists.
Comment 20 Kelvin J. Hill 2009-12-20 05:28:26 EST
You can force the order in which modules are loaded into the kernel to probe ethernet cards. What you do is edit /lib/modules/`uname -r`/modules.dep and make the first ethernet driver a dependency of the second one, and so on.

While there is actually no real dependency, it can alter the timing such that they always enumerate in a known order. This prevents udev having to perform the rename function as long as the 70-persistent-net.rules file is already in the order you desire.

Your mileage may vary.
Comment 21 Fran Taylor 2009-12-22 09:06:01 EST
I tried the suggestion in #20, but I still had problems.

It appears that the NIC drivers are loaded in parallel threads, so they contend for the device names.

Perhaps this comes from an overzealous attempt to speed up boot times?

I was able to eliminate the contention by staggering the loading of the modules.

You can do this in /etc/modprobe.conf, for example:

install e1000 sleep 2 ; /sbin/modprobe --ignore-install e1000

This will delay the loading of the e1000 module by 2 seconds, allowing the other driver to load first.
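
With more than two drivers the same idea extends to a series of increasing delays. The Python sketch below just prints "install" lines in the form shown above; the function name, the 2-second step and the driver order are illustrative assumptions, not anything taken from a tested configuration.

```python
def staggered_install_lines(modules, step_seconds=2):
    """Return modprobe 'install' lines that delay each later module.

    modules is the desired load order; the first one is left alone and
    each following one sleeps step_seconds longer than the previous one.
    """
    lines = []
    for index, module in enumerate(modules):
        delay = index * step_seconds
        if delay == 0:
            continue  # first module loads normally, no override needed
        lines.append(
            f"install {module} sleep {delay} ; "
            f"/sbin/modprobe --ignore-install {module}"
        )
    return lines


# Hypothetical order: let e100 load first, then e1000 two seconds later.
lines = staggered_install_lines(["e100", "e1000"])
print("\n".join(lines))
```

This only changes timing, so it narrows the window for the race rather than removing it.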
Comment 22 Thomas S Hatch 2009-12-22 10:30:30 EST
I got a workaround working, see bug 544357
https://bugzilla.redhat.com/show_bug.cgi?id=544357
Comment 23 Bill Nottingham 2010-01-04 14:56:59 EST
(In reply to comment #21)
> I tried the suggestion in #20, but I still had problems.
> 
> It appears that the NIC drivers are loaded in parallel threads, so they contend
> for the device names.

This is known; the udev device renaming rules account for this and should handle the situation properly and rename the devices to their persistent names. For some reason, that's not working.
Comment 24 John Villalovos 2010-01-04 15:02:51 EST
So my workaround in comment #12 did not keep working for me.  So I am back to square one.
Comment 25 Thomas S Hatch 2010-01-04 15:44:17 EST
John, my workaround works well, it is hackish, but it works.  Check the link in comment #22
Comment 26 David Picard 2010-01-12 11:47:55 EST
I have the same problem with 7 ethernet devices in my machine - something appears to initialize the devices with the wrong settings before udev reads my network rules.

My tulip cards have the lower device names (eth0-eth3) while my sky2 devices have the upper device names (eth4-eth6)

To avoid this issue, I added the following content to a file in /etc/modprobe.d to force the tulip drivers to load first:

install sky2 /sbin/modprobe tulip; /bin/sleep 2; /sbin/modprobe --ignore-install sky2

... and execute the following on every reboot (thinking of making an init script for this with just the stop option):

rm -rf /dev/.udev

It is pretty clear that udev no longer applies the configuration settings specified in the rules, which is a very serious defect. Since the comments here indicate that /dev/.udev is a cache, it seems likely that we have several potential problems:
1) The write to cache is corrupted - the devices are initialized with an erroneously saved device name from the cache.
2) Device names are not read from the cache and some arbitrary mechanism is used to issue device names.
3) The load from cache may not accommodate configuration changes in the udev rules.d configuration.
4) Some portion of the code is aggressively forcing consecutive numbering of devices without considering whether or not all devices of a class are loaded - possibly an old defect dating from when threaded loading of device modules was introduced (assuming the comments in this and other discussions regarding threading are correct).

Note that I have the same issue with my webcams and TV card not maintaining consistent video* naming despite also having rules defined regarding the device naming and also having device ordering set in the modprobe configuration for those modules that support it.
Comment 27 Eric Hopper 2010-03-01 20:38:46 EST
Thank you, the comments here have helped.  I got my 5 ethernet card setup working, and I would direct you to this comment on Bug #544357 (which is related) to learn how:

https://bugzilla.redhat.com/show_bug.cgi?id=544357#c7
Comment 28 Fedora Update System 2010-03-19 07:13:36 EDT
udev-145-16.fc12 has been submitted as an update for Fedora 12.
http://admin.fedoraproject.org/updates/udev-145-16.fc12
Comment 29 Fedora Update System 2010-03-23 19:39:14 EDT
udev-145-19.fc12 has been pushed to the Fedora 12 testing repository.  If problems still persist, please make note of it in this bug report.
 If you want to test the update, you can install it with 
 su -c 'yum --enablerepo=updates-testing update udev'.  You can provide feedback for this update here: http://admin.fedoraproject.org/updates/udev-145-19.fc12
Comment 30 Fedora Update System 2010-03-25 18:34:38 EDT
udev-145-19.fc12 has been pushed to the Fedora 12 stable repository.  If problems still persist, please make note of it in this bug report.
Comment 31 Dipak Dudhabhate 2010-05-12 12:14:43 EDT
Even with udev-145-19.fc12 I am getting the same problem.
Comment 32 Dipak Dudhabhate 2010-05-12 13:25:19 EDT
Even with udev-145-20.fc12 I am getting the same problem. None of the workarounds mentioned above worked for me.
Comment 33 Thomas S Hatch 2010-05-12 13:32:26 EDT
Same problem here on Fedora 13, this bug needs to be reopened!  Also, my earlier workaround no longer works because dracut loads modules that are blacklisted, which is bogus.
Comment 34 Felix Kaechele 2010-05-13 05:38:07 EDT
Then how about just reopening the bug? If the person responsible thinks it should be closed again he can do so. Don't be so shy ;)
Comment 35 Dipak Dudhabhate 2010-05-15 04:39:19 EDT
My problem is solved: I updated the kernel-firmware rpm. There was a problem with the Broadcom network firmware, which was solved by installing the updated rpm.

Thanks to Fedora for the updates.
Comment 36 Oron Peled 2010-10-03 18:10:45 EDT
* This looks like a duplicate of bug 544357. It also appears in Debian
  lenny (http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=555960)
  However...

* One of my clients had a testing server with 9 ethernet cards (of two
  different types -- two different drivers):
   - The bug recurred many times on this machine (was F12, not updated)
   - After updating to udev-145-21 the situation improved, but there were
     still cases where the bug occurred.
   - I rebuilt a new RPM with the upstream patch:
         http://git.kernel.org/?p=linux/hotplug/udev.git;a=commit;h=09c03103028011935044bbade29a602925898f27
         [rename interface to <src>-<dest>, if <dest> taken
          from 2010-08-10, by Harald Hoyer]
   - That 3-liner patch seems to finally kill this bug.

* I suggest applying this small patch until the Fedora package is built from an
  upstream version that already includes the fix.
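
To show why a per-device placeholder of the form <src>-<dest> might help, here is a toy Python model. It is not the upstream patch, whose contents are not reproduced in this report; only the one-line description above is used. A dict stands in for the kernel's interface table, the MAC suffixes come from the original description, and the ordering modelled is the worst case where every worker makes its first attempt before any parked device retries.

```python
def settle(kernel_names, wanted, temp_name):
    """Simulate rename workers and return the final name -> MAC table.

    kernel_names: {name: mac} as assigned by the kernel.
    wanted: {mac: persistent_name}; every wanted name must be unique.
    temp_name: function (current, target) -> placeholder name to park under.
    Raises FileExistsError if two devices pick the same placeholder.
    """
    table = dict(kernel_names)
    # Pass 1: each worker makes its first attempt before anyone retries.
    for current, mac in kernel_names.items():
        target = wanted[mac]
        if current == target:
            continue
        if target in table:
            placeholder = temp_name(current, target)
            if placeholder in table:
                raise FileExistsError(
                    f"error changing netif name {current} to "
                    f"{placeholder}: File exists")
            table[placeholder] = table.pop(current)
        else:
            table[target] = table.pop(current)
    # Pass 2: parked devices retry; their targets have been vacated by now
    # (this relies on the wanted names being unique).
    for name, mac in list(table.items()):
        if name != wanted[mac]:
            table[wanted[mac]] = table.pop(name)
    return table


kernel = {"eth0": "c0:ae", "eth1": "c0:af", "eth2": "ca:82"}
wanted = {"ca:82": "eth0", "c0:af": "eth2", "c0:ae": "eth1"}

# One shared placeholder: the second device that needs to park collides.
try:
    settle(kernel, wanted, lambda current, target: "_rename")
    shared_error = None
except FileExistsError as exc:
    shared_error = str(exc)

# Per-device placeholder in the <src>-<dest> form: no two devices share it.
final = settle(kernel, wanted, lambda current, target: f"{current}-{target}")

print(shared_error)
print(final)
```

In this model the shared name fails as soon as two devices need to park at once, while the <src>-<dest> names cannot clash because no two devices have the same source name. A real implementation would also have to respect the kernel's interface name length limit and wait for the target name to become free.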
Comment 37 Bug Zapper 2010-11-04 05:05:48 EDT
This message is a reminder that Fedora 12 is nearing its end of life.
Approximately 30 (thirty) days from now Fedora will stop maintaining
and issuing updates for Fedora 12.  It is Fedora's policy to close all
bug reports from releases that are no longer maintained.  At that time
this bug will be closed as WONTFIX if it remains open with a Fedora 
'version' of '12'.

Package Maintainer: If you wish for this bug to remain open because you
plan to fix it in a currently maintained version, simply change the 'version' 
to a later Fedora version prior to Fedora 12's end of life.

Bug Reporter: Thank you for reporting this issue and we are sorry that 
we may not be able to fix it before Fedora 12 is end of life.  If you 
would still like to see this bug fixed and are able to reproduce it 
against a later version of Fedora please change the 'version' of this 
bug to the applicable version.  If you are unable to change the version, 
please add a comment here and someone will do it for you.

Although we aim to fix as many bugs as possible during every release's 
lifetime, sometimes those efforts are overtaken by events.  Often a 
more recent Fedora release includes newer upstream software that fixes 
bugs or makes them obsolete.

The process we are following is described here: 
http://fedoraproject.org/wiki/BugZappers/HouseKeeping
Comment 38 Oron Peled 2010-11-06 12:48:00 EDT
The test system I mentioned in comment 36 has not been upgraded, so I cannot
test this in newer releases for now.

However, two data points for anyone reading it later:
 * My client found that the renaming problem recurred when the boot
   happened while the network interfaces were connected (every 2-3 boots).
   So even with the patch linked in comment 36, the solution is not perfect.
   [my guess is that the connection changed the timing and as a result
    made the race condition show]

 * Matt Domsch's slides from LPC-2010 suggest the problem is still not solved
   even in newest kernels/udev:
     http://domsch.com/linux/lpc2010/lpc2010-network-device-naming.pdf
Comment 39 Jon Masters 2010-11-18 03:33:57 EST
Matt's talk is mostly focused on naming supplied by the hardware, not this bug.
Comment 40 Bug Zapper 2010-12-03 22:40:46 EST
Fedora 12 changed to end-of-life (EOL) status on 2010-12-02. Fedora 12 is 
no longer maintained, which means that it will not receive any further 
security or bug fix updates. As a result we are closing this bug.

If you can reproduce this bug against a currently maintained version of 
Fedora please feel free to reopen this bug against that version.

Thank you for reporting this bug and we are sorry it could not be fixed.