Bug 491432 - Reboots causes unpredictable enumeration of Ethernet drivers causes eth0 to be random
Keywords:
Status: CLOSED WONTFIX
Alias: None
Product: Red Hat Enterprise Linux 5
Classification: Red Hat
Component: kernel
Version: 5.3
Hardware: All
OS: Linux
Priority: low
Severity: urgent
Target Milestone: rc
: ---
Assignee: Bill Nottingham
QA Contact: Red Hat Kernel QE team
URL:
Whiteboard:
Depends On:
Blocks: 533192
 
Reported: 2009-03-20 22:53 UTC by Mick Russom
Modified: 2018-12-02 17:50 UTC (History)

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2013-10-15 19:27:03 UTC
Target Upstream Version:
Embargoed:



Description Mick Russom 2009-03-20 22:53:22 UTC
Description of problem:

I have several systems experiencing this issue, but the "worst case" system at the moment has two dual-port PCI-X e1000 cards and two on-board e1000e PCI-Express NICs.

What happens is maddening. If the /etc/sysconfig/hwconf is removed, and kudzu is not run, and the 6 interfaces each have a corresponding script:
/etc/sysconfig/network-scripts/ifcfg-eth0
/etc/sysconfig/network-scripts/ifcfg-eth1
/etc/sysconfig/network-scripts/ifcfg-eth2
/etc/sysconfig/network-scripts/ifcfg-eth3
/etc/sysconfig/network-scripts/ifcfg-eth4
/etc/sysconfig/network-scripts/ifcfg-eth5

with each one looking like (eth0 as an example):
DEVICE=eth0
BOOTPROTO=none
ONBOOT=yes
NETMASK=255.255.255.0
IPADDR=172.30.0.1
TYPE=Ethernet

(note intentional omission of HWADDR)

And the modprobe.conf looks like:
alias eth0 e1000e
alias eth1 e1000e
alias eth2 e1000
alias eth3 e1000
alias eth4 e1000
alias eth5 e1000


Now what happens defies explanation. The only way I can be sure to make the first-listed-in-PCI-order e1000e device always eth0 is to remake the initrd and add e1000e to the initrd.

If I do not do that, eth0 might be the onboard NIC, but it can also be the first port on either of the dual-port cards. So eth0, eth2 and eth4 all take turns being eth0.

This was not an issue in the EL-5.1 timeframe. This is certainly new to the EL-5.2 timeframe and thus far is not possible to control because the system does not load eth0's driver alias in order, and the enumeration of the PCI devices causes what seems to be random labeling of the first interface.

This is a particular problem in that identical hardware systems will come up with different eth0 interfaces, and a single system without /etc/sysconfig/hwconf (which calls "eth0" something different from what was originally in the /etc/sysconfig/network-scripts/ifcfg-eth0 HWADDR directive) can come up with a different eth0 from one reboot to the next.

For my needs, I want as little auto-magic as possible. I can live with PCI-bus order and driver load order determining the order in which Ethernet devices appear. I cannot use the software in a state where what is "eth0" is randomly determined and the only way I can nail it down is to run kudzu, generate a hwconf AND brand the network interfaces in the network-scripts with HWADDR. I've been using EL since 6.2 and this is the first time I've ever encountered a situation where the enumeration rules (which are ancient at this point) are overridden by something.


Version-Release number of selected component (if applicable):
EL-5.2 series. 


How reproducible:
The random enumeration of PCI devices and the subsequent assigning of eth0 to a random NIC is not on every reboot, and does not seem tied to a power off scenario vs a simple reset or warm reboot. Over the course of 0 reboots , eth0 should have moved between 2-3 times. 


Steps to Reproduce:
1. Install a system with a mix of e1000e and e1000 nics.
2. Remove /etc/sysconfig/hwconf ; manually assign IPs in /etc/sysconfig/network-scripts/ifcfg-ethX, make sure HWADDR is not present.
3. Be sure the order of the drivers is correct in /etc/modprobe.conf
4. Reboot and look for eth0 being assigned to a random PCI-device.

Actual results:
Results are eth0 moves. 

Expected results:
For every version of Redhat previous to EL 5.2, the enumeration of eth0 was 
a) First loaded driver
b) That first loaded driver attaches itself to the first device in the PCI-bus-order and calls that eth0

Additional info:
SysVinit-2.86-14
initscripts-8.45.19.1.EL-1
kernel-2.6.18-92.1.22.el5
net-tools-1.60-78.el5
system-config-network-1.3.99.10-2.el5
udev-095-14.16.el5

Comment 1 Prarit Bhargava 2009-03-26 17:51:21 UTC
Hi Mick,

>(note intentional omission of HWADDR)

You *must* specify the HWADDR field in the ifcfg-* files in order to have persistent ethernet naming.

While it may have worked in the past there is no guarantee (in RHEL) that the timing of device driver loading will have consistency from release-to-release.

P.

Comment 2 lejeczek 2009-03-26 18:12:58 UTC
yeah, me too, I'm having this problem in f10
it's a jetway mobo JNC62K MCP78S [GeForce 8200], and both nics get swapped around every once in a while (randomly, not with each and every reboot)
I'm not sure if iftab didn't help in my case, cannot confirm though
and no, it did not work: I've always had HWADDR in ifcfg*, and when init brought the network up, errors about MAC address misconfiguration were raised
cheers

Comment 3 Mick Russom 2009-03-27 05:18:46 UTC
@  Prarit Bhargava 
This is a bug. If udev is broken and disobeys /etc/modprobe.conf, can Redhat then please document udev such that we can have instructions as to how to write rules in order to un-break udev's broken-ness.

Prarit; you can't seriously suggest that RANDOM instantiation of interfaces is something that is a characteristic of an "Enterprise" OS? Seriously, I needed to make a custom initrd to forceload drivers to get past this. It's really very undesirable, undocumented, buggy behavior.

Comment 4 Mick Russom 2009-03-27 05:19:24 UTC
If this bug is closed again, I am never renewing my RHEL Premium server license again.

Comment 5 Prarit Bhargava 2009-03-27 10:33:53 UTC
(In reply to comment #3)
> @  Prarit Bhargava 
> This is a bug. If udev is broken and disobeys /etc/modprobe.conf, can Redhat
> then please document udev such that we can have instructions as to how to
> write rules in order to un-break udev's broken-ness.
> 

Mick, I certainly understand your point, however, please note that specifying the HWADDR is *required* in order to maintain ethernet naming.  This has changed upstream and AFAIK will no longer be the case in upcoming RHEL6.

> Prarit; you can't seriously suggest that RANDOM instantiation of interfaces is
> something that is a characteristic of an "Enterprise" OS? Seriously, I needed
> to make a custom initrd to forceload drivers to get past this, it's really very
> undesirable, undocumented buggy behavior.  

I'm not.  And that's why RHEL5 *requires* the HWADDR field be specified.

In the past, we've seen ethernet ports "flip" because of a minor change to a driver which caused a delay in the module load and port activation.  Obviously, this is undesired, so RHEL requires the HWADDR to be specified for persistent ethernet naming.

The RHEL5 Deployment Guide notes that:

"HWADDR=<MAC-address>
    where <MAC-address> is the hardware address of the Ethernet device in the form AA:BB:CC:DD:EE:FF. This directive is useful for machines with multiple NICs to ensure that the interfaces are assigned the correct device names regardless of the configured load order for each NIC's module."
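For readers who want to script this, here is a minimal sketch (not taken from the Deployment Guide) that reads an interface's MAC address from sysfs and appends a matching HWADDR= line to an ifcfg file that lacks one. The function names and the SYSFS_ROOT override are hypothetical, and the sketch assumes the interface currently carries the name you want to pin and is not yet enslaved to a bond, since bonding can change the reported address.

```shell
#!/bin/bash
# Sketch only: pin the CURRENT name of an interface to its MAC address.
# SYSFS_ROOT is a hypothetical override for trying this against a scratch
# directory; on a real system it defaults to /sys.

hwaddr_line() {
    local dev="$1"
    local addr_file="${SYSFS_ROOT:-/sys}/class/net/$dev/address"
    if [ ! -r "$addr_file" ]; then
        echo "no such interface: $dev" >&2
        return 1
    fi
    # sysfs reports lower-case hex; the ifcfg examples in this report
    # use upper-case, so convert for consistency.
    echo "HWADDR=$(tr 'a-f' 'A-F' < "$addr_file")"
}

# Append the HWADDR line to an ifcfg file only if none is present yet.
pin_ifcfg() {
    local dev="$1" cfg="$2" line
    grep -q '^HWADDR=' "$cfg" && return 0
    line=$(hwaddr_line "$dev") || return 1
    echo "$line" >> "$cfg"
}
```

Something like `pin_ifcfg eth0 /etc/sysconfig/network-scripts/ifcfg-eth0` would be the intended use, after confirming by hand that eth0 is currently the port you expect.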

Also, the Red Hat Knowledge Base contains a mini-FAQ on this subject:

http://kbase.redhat.com/faq/docs/DOC-15331

P.

Comment 6 Prarit Bhargava 2009-03-27 10:37:35 UTC
Oops -- I didn't answer your question about udev.  I'll ping internally to see if there are some rules about setting udev up, or if the only option is to specify the HWADDR field.

Just curious though, Mick, why aren't you specifying the HWADDR field?

P.

Comment 7 Prarit Bhargava 2009-03-27 10:43:45 UTC
"@  Prarit Bhargava 
This is a bug. If udev is broken and disobeys /etc/modprobe.conf, can Redhat
then please document udev such that we can have instructions as to how to
write rules in order to un-break udev's broken-ness."

It turns out what I suspected was correct.  udev simply renames the interfaces with temporary names, and the network scripts (ifup) actually permanently name the interfaces.  That naming happens when the network scripts look at the HWADDR field -- so udev rules won't help you here.

P.

Comment 8 Mick Russom 2009-03-29 05:49:46 UTC
No, udev is doing something where it inserts the e1000e and e1000 drivers in random order.

Period. If you close this bug again, I'm not only not going to buy RHEL, but I'm going to really go on to expose this as a reason NOT to buy support from RH to others. 

It's simple:
1) /etc/rc.d/init.d/network stop
2) remove e1000, e1000e, or in general all network modules in a given system
3) /etc/rc.d/init.d/network start

Now, in this situation, the modprobe.conf order WORKS (not like when udev probes the drivers in) and eth0 is ALWAYS the same for a given system (based on pci-order and driver instantiation order in modprobe.conf). 
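The three steps above can be sketched as a small script. This is only an illustration of the sequence described, not a tested procedure: the function name and the RUN hook are hypothetical, and RUN defaults to "echo" so that the commands are printed rather than executed.

```shell
#!/bin/bash
# Sketch of the manual reload sequence. RUN is a hypothetical hook: by
# default the commands are only printed; export RUN= (empty) to really
# run them as root.
RUN="${RUN-echo}"

reload_nics() {
    # Arguments: the network driver modules to unload, e.g. e1000 e1000e.
    local mod
    $RUN /etc/rc.d/init.d/network stop
    for mod in "$@"; do
        $RUN /sbin/rmmod "$mod"
    done
    # On start, ifup loads drivers through the modprobe.conf aliases,
    # one interface at a time.
    $RUN /etc/rc.d/init.d/network start
}
```

Calling `reload_nics e1000 e1000e` prints the stop, rmmod and start commands in order. Running it for real over an SSH session on one of the affected interfaces would of course drop the connection.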

You are WRONG. I can prove this, and I've been using Redhat Enterprise since 6.2E (enterprise) and I KNOW based on now NINE years of using Redhat what the correct behavior is. And now Mister Johnny Come Lately is telling me NOT A BUG?

Seriously, do you have customer-relations people there or is this place just a bunch of kids doing internships?

Comment 9 Prarit Bhargava 2009-03-29 14:06:00 UTC
Hi Mick,

I'm sorry you're finding this frustrating.

The RHEL5.2 Release Notes contains some additional information that you will find useful:

https://www.redhat.com/docs/en-US/Red_Hat_Enterprise_Linux/5.2/html/Release_Notes/RELEASE-NOTES-U2-x86-en.html

"In this release, the Intel 82540 – 82547 network cards are supported by the e1000 driver, while the following network cards are supported by the e1000e driver:

    * Intel 82571 – 82573
    * Intel 82562
    * Intel 82566
    * Intel 80003eslan

In Red Hat Enterprise Linux 5.1, the e1000 driver supported all the aforementioned network cards.

If your system contains a combination of both types of network cards, ethernet devices may be enumerated in a different order in this release. You may need to modify your network configuration after installation in order to remap each card's hardware address to specific device names if you need to preserve a particular order.

Note that properly configured systems should not have device names changed on upgrade. To ensure this, the network configuration files (/etc/sysconfig/network-scripts/ifcfg-[device name]) should contain the HWADDR parameter, binding the device name to a specific hardware address."


Please note as well that the version of udev in RHEL5 does NOT support persistent net device naming; however, upstream (i.e., Fedora) does.

P.

Comment 10 Andy Gospodarek 2009-03-30 14:15:49 UTC
Mick, I've read your description and comments and understand your frustration that what you want to work does not.  Prarit's description in comment #9 of why you are seeing this in 5.2 and it did not appear in 5.1 is absolutely correct.  The movement of PCI-e devices from e1000 to e1000e in 5.2 caused this.  If I didn't make the decision to move those devices you would be yelling *much* louder about the performance of the PCI-e devices that were still using e1000.

In an effort to understand your configuration better, can you tell me why you are doing the following

> 2. Remove /etc/sysconfig/hwconf ; manually assign IPs in
> /etc/sysconfig/network-scripts/ifcfg-ethX, make sure HWADDR is not present.

I'm honestly not trying to be confrontational, but to understand why this is helpful for your particular setup.  From what I've seen everything works fine with these files and configuration options in place, but if there is a good reason for you to delete them we would like to understand it so we can:

1.  Work around the issue in this update with the hopes of finding a solution in the next RHEL update.

and

2.  Be as sure as we can be that we create RHEL6 test cases to cover the issue so you or anyone else doesn't have to deal with this again.

Thanks!

Comment 11 static 2009-04-02 18:40:50 UTC
I have a similar issue.  I have a few firewalls deployed that use tg3 (2 internal interfaces on Dell 1950) and igb (4 port card Intel 1000/Pro PT).  After upgrading to 5.3 from 5.2 I noticed that at random on bootup eth0 comes up as tg3 and other times it comes up as igb.  This of course is really bad for a firewall/router with multiple interfaces.

I finally decided to just create a script that I put into rc.local.  It does a dmesg and greps to see if the igb driver is assigned to eth0.  If it is not then the script ifdowns all the interfaces and then rmmod tg3 igb.  It then does a modprobe on tg3 to force that driver to initialize as eth0 and eth1 and then modprobe igb for the other 4 ports.
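A minimal sketch of the dmesg check described above, assuming the two kernel message formats quoted in comment 13 of this report ("igb <pci-address>: ethN: ..." and "ethN: Tigon3 ..."). The function name is hypothetical, and only igb and tg3 are recognised. As noted later in this thread, dmesg shows only the name the kernel first assigned, not a later rename driven by HWADDR.

```shell
#!/bin/bash
# Sketch only: report which driver first claimed a given ethN name,
# reading dmesg-style text on standard input.

eth_driver() {
    local dev="$1"
    awk -v dev="$dev" '
        # igb format:  igb 0000:04:00.0: eth0: (PCIe:2.5Gb/s:Width x4) ...
        $1 == "igb" && $3 == (dev ":")    { print "igb"; found = 1; exit }
        # tg3 format:  eth0: Tigon3 [partno(BCM95721) ...
        $1 == (dev ":") && $2 == "Tigon3" { print "tg3"; found = 1; exit }
        END { exit !found }
    '
}
```

A check in rc.local could then look like `[ "$(dmesg | eth_driver eth0)" = "tg3" ] || reload_drivers`, where reload_drivers stands for whatever site-specific ifdown/rmmod/modprobe sequence is wanted.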

I do not have the HWADDR put into the ifcfg-eth* files because of issues I had in previous versions of centos that would renumber the interfaces(I don't remember the details unfortunately because it was several years ago).

Is the HWADDR now required to make sure that eth0 comes up with the correct network port as assigned by modprobe?  For example, modprobe.conf has eth0 assigned to tg3.

What determines which module gets loaded first?  I am just confused why eth0's driver in modprobe.conf wouldn't get loaded first always.

Comment 12 static 2009-04-02 19:47:50 UTC
Putting in HWADDR seems to make tg3 load first all the time now.  It turns out I had HWADDR on all the other firewalls except this one because this one was a cold standby.  So the ifcfg-eth* was just copied from the other server and the HWADDR was removed(because it would not be correct on the secondary member).

It would be nice to understand why HWADDR is needed and to have it documented.  After searching the net this bug report was the only post I could find on the subject.

Comment 13 static 2009-04-02 21:25:26 UTC
Well I was wrong.  The order still gets messed up sometimes on reboot even with HWADDR correctly setup in the ifcfg-eth* files.

Grep for eth from dmesg immediately after reboot when the order is correct:
eth0: Tigon3 [partno(BCM95721) rev 4201 PHY(5750)] (PCI Express) 10/100/1000Base-T Ethernet 00:21:9b:fc:2a:05
eth0: RXcsums[1] LinkChgREG[0] MIirq[0] ASF[1] WireSpeed[1] TSOcap[1]
eth0: dma_rwctrl[76180000] dma_mask[64-bit]
eth1: Tigon3 [partno(BCM95721) rev 4201 PHY(5750)] (PCI Express) 10/100/1000Base-T Ethernet 00:21:9b:fc:2a:06
eth1: RXcsums[1] LinkChgREG[0] MIirq[0] ASF[1] WireSpeed[1] TSOcap[1]
eth1: dma_rwctrl[76180000] dma_mask[64-bit]
igb 0000:04:00.0: eth2: (PCIe:2.5Gb/s:Width x4) 00:1b:21:28:0c:b0
igb 0000:04:00.0: eth2: PBA No: d96950-006
igb 0000:04:00.1: eth3: (PCIe:2.5Gb/s:Width x4) 00:1b:21:28:0c:b1
igb 0000:04:00.1: eth3: PBA No: d96950-006
igb 0000:05:00.0: eth4: (PCIe:2.5Gb/s:Width x4) 00:1b:21:28:0c:b4
igb 0000:05:00.0: eth4: PBA No: d96950-006
igb 0000:05:00.1: eth5: (PCIe:2.5Gb/s:Width x4) 00:1b:21:28:0c:b5
igb 0000:05:00.1: eth5: PBA No: d96950-006


Grep for eth from dmesg immediately after reboot when the order is messed up:
igb 0000:04:00.0: eth0: (PCIe:2.5Gb/s:Width x4) 00:1b:21:28:0c:b0
igb 0000:04:00.0: eth0: PBA No: d96950-006
igb 0000:04:00.1: eth1: (PCIe:2.5Gb/s:Width x4) 00:1b:21:28:0c:b1
igb 0000:04:00.1: eth1: PBA No: d96950-006
igb 0000:05:00.0: eth1: (PCIe:2.5Gb/s:Width x4) 00:1b:21:28:0c:b4
igb 0000:05:00.0: eth1: PBA No: d96950-006
igb 0000:05:00.1: eth2: (PCIe:2.5Gb/s:Width x4) 00:1b:21:28:0c:b5
igb 0000:05:00.1: eth2: PBA No: d96950-006
eth4: Tigon3 [partno(BCM95721) rev 4201 PHY(5750)] (PCI Express) 10/100/1000Base-T Ethernet 00:21:9b:fc:2a:05
eth4: RXcsums[1] LinkChgREG[0] MIirq[0] ASF[1] WireSpeed[1] TSOcap[1]
eth4: dma_rwctrl[76180000] dma_mask[64-bit]
eth5: Tigon3 [partno(BCM95721) rev 4201 PHY(5750)] (PCI Express) 10/100/1000Base-T Ethernet 00:21:9b:fc:2a:06
eth5: RXcsums[1] LinkChgREG[0] MIirq[0] ASF[1] WireSpeed[1] TSOcap[1]
eth5: dma_rwctrl[76180000] dma_mask[64-bit]

[root@host ~]# grep HWADDR /etc/sysconfig/network-scripts/ifcfg-*
/etc/sysconfig/network-scripts/ifcfg-eth0:HWADDR=00:21:9B:FC:2A:05
/etc/sysconfig/network-scripts/ifcfg-eth1:HWADDR=00:21:9B:FC:2A:06
/etc/sysconfig/network-scripts/ifcfg-eth2:HWADDR=00:1B:21:28:0C:B0
/etc/sysconfig/network-scripts/ifcfg-eth3:HWADDR=00:1B:21:28:0C:B1
/etc/sysconfig/network-scripts/ifcfg-eth4:HWADDR=00:1B:21:28:0C:B4
/etc/sysconfig/network-scripts/ifcfg-eth5:HWADDR=00:1B:21:28:0C:B5

[root@host ~]# more /etc/modprobe.conf 
alias net-pf-10 off
alias ipv6 off
alias eth0 tg3
alias eth1 tg3
alias eth2 igb
alias eth3 igb
alias eth4 igb
alias eth5 igb
alias scsi_hostadapter mptbase
alias scsi_hostadapter1 mptsas
alias scsi_hostadapter2 ata_piix

Comment 14 Mick Russom 2009-04-03 02:35:39 UTC
@Andy Gospodarek regarding the "why" behind this.

I have for many years maintained a central image used on a large network of lab machines. This allows us to re-deploy the hard drive images for this farm of computers. The current image is actually based on Redhat 7.3, with a large number of fixes aimed at mitigating NESSUS-identified security issues, and upgrades to the various applications. The image has in place a large amount of driver support for storage subsystems based on a 2.4.30 kernel, along with other customizations such as PF_RING.

Now in the case of HWADDR, kudzu and dynamic initrds, I've found ways to support the various hardware configurations with RHEL 5.00-5.1. However, recently, despite having a beefy initrd to support the storage, I've noticed that my configuration, which used to support the myriad of hardware in the farm, has resulted in random eth0 assignment.

Now the problem is that I set up everything to work with a small amount of post-deployment work. However, I don't see a clear path to script up the post-install changes to fix up the eth? assignment for the various machines. Before, we had a system in place where eth0 in the various configurations (all e1000/e1000e and some tg3 interfaces) always did the right thing by default, and either DHCP or a quick post-install static address assignment would take care of getting eth0 connectible.

Now I need to figure out a way to make eth0 the first lan interface on the various motherboards that is connected to the management network. Before, our myriad of configs led to a very deterministic way to do this, and with DHCP as the default on eth0, the image could be deployed to these machines and receive an upgraded or refreshed image without issue.

Now I would have to go and either attempt to do kudzu or write some script that gets all the ordering right per machine and assigns eth0 and puts in the HWADDR. It's certainly doable, but the previous default eth0 behavior was highly desirable, consistent and consistently picked LAN1 (first lan on motherboard) as eth0 across lots of different platforms.

The /etc/sysconfig/network-scripts/ifcfg-eth* files are pre-populated with configuration for eth0 through eth5, and this would take care of assigning a dummy-ip to each of the extra interfaces and then the extra interfaces could be configured post install.

With each machine, it became "common knowledge" which of the ethX interfaces was what on the various machines, and this always seemed to reflect PCI device order; e.g., lspci | grep -i net, where the order of that list is the order in which eth assignments were made.
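To make the "PCI order" idea concrete, here is a rough sketch that lists the physical NICs sorted by PCI address and proposes ethN names in that order, together with each MAC address. It assumes /sys/class/net/<if>/device is a symlink whose last path component is the PCI address, and it uses pure bus order, which simplifies the older behaviour (driver load order first, then bus order within a driver). The function name and the SYSFS_ROOT override are hypothetical.

```shell
#!/bin/bash
# Sketch only: propose ethN names in PCI bus order.
# Prints one line per NIC: "<proposed-name> <pci-address> <mac>".
# SYSFS_ROOT is a hypothetical override for testing; default is /sys.

pci_order_map() {
    local sysfs="${SYSFS_ROOT:-/sys}" d pci mac n=0
    for d in "$sysfs"/class/net/*; do
        # lo, bonds and other virtual devices have no "device" link.
        [ -e "$d/device" ] || continue
        pci=$(basename "$(readlink "$d/device")")
        echo "$pci $(cat "$d/address")"
    done | LC_ALL=C sort | while read -r pci mac; do
        echo "eth$n $pci $mac"
        n=$((n + 1))
    done
}
```

The output could be fed into a post-install step that writes the matching HWADDR= lines, which would give every machine of the same model the same port-to-name layout without maintaining MAC lists by hand.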

Lately, because of udev or whatever is behind this, eth0 is randomly assigned if there is no MAC address, and I've tried a number of things to get the systems to respect /etc/modprobe.conf and the order of interfaces there.

Interestingly, if the network is stopped, the ethernet modules are removed, and then the network is restarted, the thing works perfectly. It's something that happens only at boot that causes this eth0 moving-around issue, and it's really hard to pin down why this behavior is random. I don't get why anything would ever prefer to be random and choose to ignore the hints in /etc/modprobe.conf.

Ultimately the problem is udev's general disregard for /etc/modprobe.conf, and between /etc/sysconfig/hwconf, HWADDR and /etc/modprobe.conf, it seems that all three need to agree before any level of consistency is  achieved.

My biggest problem with the kudzu route is there is no control of what /etc/sysconfig/hwconf refers to as eth0.

For installing RHEL, whatever system is in place seems to work, however, when the normal mechanisms are abandoned in favor of scalability without having to install each machine individually, massive problems arise in the system in place since 5.2. I have not confirmed that 5.3 retains this issue, but it seems based on other comments here it does.

Let's all agree that /etc/modprobe.conf must mean something in the long run. /etc/modules.conf and /etc/modprobe.conf always meant something in the past.

Comment 15 Mick Russom 2009-04-03 02:43:44 UTC
In the past, the following have been offered as solutions for this problem:

in /etc/udev/rules.d/10-local.rules  :

SUBSYSTEM=="pci", SYSFS{class}=="0x020000",             OPTIONS="ignore_device"

I may try this, in addition, these have been suggested:
1. probe driver for eth0 in rc.sysinit before udev is started
2. have udev ignore network cards (ifup will probe in right order)
3. use netdev=irq,io,eth0 and netdev=...,eth1 kernel parameters

I like having udev ignore NIC cards, and if I can work around this by having udev butt out, it seems the ifup scripts in /etc/sysconfig/network-scripts do the right thing and respect /etc/modprobe.conf.

What's really frustrating here is that udev is supposed to manage /dev/, and it's trying to be too clever here, since ethernet devices don't even appear in /dev/ anywhere.

Comment 16 Mick Russom 2009-04-03 03:00:31 UTC
Quick comment here ; 

If one doesn't like the udev-invoked renaming of devices, and one wants to simply stop anything from happening in terms of modprobing ethernet drivers until the normal, long-standing ifup scripts start to do their thing, I had an idea for a workaround. It relates to point 2 above and extends the concept behind the rule above (ignore_device).

In RHEL 5.x , the file that controls udev's behavior regarding this is:
/etc/udev/rules.d/60-net.rules 

ACTION=="add", SUBSYSTEM=="net", IMPORT{program}="/lib/udev/rename_device"
SUBSYSTEM=="net", RUN+="/etc/sysconfig/network-scripts/net.hotplug"

My plan is to replace /lib/udev/rename_device (which is part of the initscripts package) with a shell script "/bin/true". According to the source, rename_device.c, there is an attempt to manage the case where HWADDR is not present, but I'm not sure that what it is doing is what I want. This for me would be an acceptable workaround long term. The idea is to pinpoint the exact culprit as to how devices are modprobed out of order and outside the purview of /etc/modprobe.conf; if this is the exact place where this is going on, let me know.

My current understanding is:
a) initrd loads modules
b) rc.sysinit goes before udev, and could load modules
c) udev does its thing, which in turn brings in rename_device, which may be causing the issue
d) the rest of the scripts are run (including the ifup scripts).

This information would be useful in a general document - the exact steps of the boot procedure, at least in regards to network devices, in addition to its usefulness in the context of this bug.

Comment 17 Bill Nottingham 2009-04-03 04:26:46 UTC
(In reply to comment #16)
> My current understanding is:
> a) initrd loads modules
> b) rc.sysinit goes before udev, and could load modules
> c) udev does its thing, which in turn brings rename_device in which may be
> causing the issue
> d) the rest of the scripts are run (including the ifup scripts).
> 
> This information would be useful in a general document - the exact steps of the
> boot procedure, at least in regards to network devices, in addition to its
> usefulness in the context of this bug.  

What happens is:

1) initrd loads modules. Generally not network ones, unless you've specifically configured it to do so.
2) udev loads modules. It loads them in parallel. This means that if you have multiple network drivers in the machine, they will race against each other for device assignment.
3) when a network device is created, /lib/udev/rename_device is run to attempt to coerce the device name into whatever device name is mapped to that MAC address in the ifcfg files. Otherwise, you just get the order they happen to initialize in (which can change from boot to boot)
4) rest of boot runs (during which you can do other things)

Due to the fact udev does not load modules sequentially (#2), you *will* get non-deterministic ethernet device ordering if you have multiple network drivers in the machine. It's inevitable. So, we use HWADDR= in the ifcfg files (documented insufficiently in sysconfig.txt) to accomplish that mapping. You could also use udev rules directly (with NAME=), or /etc/mactab and nameif, if you wanted.
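As an illustration of the /etc/mactab route, the sketch below derives mactab-style lines ("<name> <MAC>", the format nameif reads) from the HWADDR= entries already present in the ifcfg files. The function name is hypothetical, and it assumes HWADDR values are written bare, without quotes.

```shell
#!/bin/bash
# Sketch only: turn HWADDR= entries in ifcfg-eth* files into
# "<name> <mac>" lines suitable for /etc/mactab.

ifcfg_to_mactab() {
    local dir="${1:-/etc/sysconfig/network-scripts}" f name mac
    for f in "$dir"/ifcfg-eth*; do
        [ -f "$f" ] || continue
        name="${f##*/ifcfg-}"
        mac=$(sed -n 's/^HWADDR=//p' "$f" | head -n 1)
        # Files without HWADDR are skipped; they cannot be pinned.
        if [ -n "$mac" ]; then
            echo "$name $mac"
        fi
    done
}
```

Redirecting the output to a scratch file and reviewing it before installing it as /etc/mactab is the safer way to use this, since a wrong line would rename the wrong port.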

modprobe.conf aliases are only used as a hint to load modules if the device isn't there when ifup runs. Given that a modprobe.conf configured like:

alias eth0 e1000
alias eth1 e1000e
alias eth2 e1000

wouldn't ever work without additional configuration such as HWADDR (as loading e1000 for eth0 would initialize the second interface as eth1, not eth2), it's not really a good mechanism compared to something that reads the configuration.

With respect to comment #13, the 'ethX: ...' messages in dmesg can't be used as the sole mechanism of determining what devices ended up as what, if HWADDR is used - you do not get kernel messages when the device is named to match the configured HWADDR.
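Since dmesg does not show the rename, one way to see what actually ended up where is to compare each configured HWADDR with the address the running system reports for that name. A minimal sketch follows; the function name and the SYSFS_ROOT and IFCFG_DIR overrides are hypothetical, and interfaces enslaved to a bond may report the bond's address instead of their own.

```shell
#!/bin/bash
# Sketch only: compare HWADDR= in each ifcfg-ethN with the MAC that the
# running system reports for ethN. Returns non-zero on any mismatch.

check_names() {
    local sysfs="${SYSFS_ROOT:-/sys}"
    local dir="${IFCFG_DIR:-/etc/sysconfig/network-scripts}"
    local f name want have status=0
    for f in "$dir"/ifcfg-eth*; do
        [ -f "$f" ] || continue
        name="${f##*/ifcfg-}"
        # sysfs reports lower-case hex, so normalise the configured value.
        want=$(sed -n 's/^HWADDR=//p' "$f" | head -n 1 | tr 'A-F' 'a-f')
        [ -n "$want" ] || continue
        have=$(cat "$sysfs/class/net/$name/address" 2>/dev/null)
        if [ "$want" = "$have" ]; then
            echo "$name ok $have"
        else
            echo "$name MISMATCH configured=$want actual=${have:-none}"
            status=1
        fi
    done
    return $status
}
```

Run after boot, this gives the view that the dmesg grep in comment #13 could not: the names as they stand once rename_device has done its work.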

Comment 18 Mick Russom 2009-04-03 06:43:07 UTC
"It loads them in parallel. This means that if you have multiple network drivers in the machine, they will race against each other for device assignment."

udev is severely broken and has a horrible bug. It's that simple.

If this is to remain unpatched, we (the growing throngs of people who are being affected by this bug) need a way to allow for the old behavior to prevail.


Showing people how the old system "breaks down" in the case of:
alias eth0 e1000
alias eth1 e1000e
alias eth2 e1000

Is not an excuse for udev being written with severe flaws with regard to network device instantiation.

Comment 19 Mick Russom 2009-04-03 09:00:29 UTC
"udev loads modules. It loads them in parallel. This means that if you have
multiple network drivers in the machine, they will race against each other for
device assignment."

How can this be stopped? Writing rules? I cannot find out how udev when invoked is finding the ethernet devices and loading the drivers, I want to prevent this.

How can this be prevented?

Comment 20 static 2009-04-05 19:48:58 UTC
in /etc/udev/rules.d/10-local.rules  :

SUBSYSTEM=="pci", SYSFS{class}=="0x020000",             OPTIONS="ignore_device"

Mick, Thanks for that.  That seems to keep udev from loading the modules so that ifup scripts do it which always brings up the interfaces in order.


"With respect to comment #13, the 'ethX: ...' messages in dmesg can't be used as
the sole mechanism of determining what devices ended up as what, if HWADDR is
used - you do not get kernel messages when the device is named to match the
configured HWADDR."

Bill,  thanks for that.  The order was probably fixed by renaming after I put the HWADDR in but I assumed that dmesg would show the changes.  That explains why things worked but the dmesg output showed a different assignment that was REALLY confusing the @#$@# out of me.  I was really getting frustrated over not knowing how dmesg showed the interface assigned incorrectly but it worked.  To make sure there is no confusion though I decided to put in the 10-local.rules fix so that what dmesg shows is correct.

This change in udev definitely caught me by surprise.... and I have been using redhat in production since 4.x days.

Comment 21 Mick Russom 2009-04-06 17:35:26 UTC
We should all really sit down and think about the sanity of overriding what the kernel thinks is a device name and what the rest of the system has been fooled into thinking. The idea that the kernel knows a device as "eth0" but the system just feels like calling it eth1 or whatever seems really untenable as things get more complicated. A user "thinks" eth0 is doing one thing, but it's been aliased to something else.

I guess I'm getting old, but the discipline of dealing with naming conventions that are deterministic seems a lot more sane than this new era of feel-good abstraction simply to rename something from one meaningless name to another meaningless name. (FreeBSD/Solaris is probably closer to sane on this issue, drivername+instancenumber = ifname)

Comment 22 Mick Russom 2009-04-06 17:42:41 UTC
@Adam Gibson:

Let's assume "SUBSYSTEM=="pci", SYSFS{class}=="0x020000",             OPTIONS="ignore_device"" works.

This is pure voodoo, I don't know where this came from, and I read the sources and the docs on udev. If RedHat wants to do this whole udev/nameif mess, the very least that could be expected is that driver instantiation be clearly documented, which it isn't.

Comment 23 Chad Farmer 2009-05-16 00:02:39 UTC
Anyone had a problem with Ethernet bonding and the ifcfg HWADDR strategy?  

As interfaces are added to a bond, SYSFS{address} contents may be changed from the actual MAC address of the interface to the "effective" MAC address of the bond.

Comment 27 Jason Landstrom 2009-08-11 14:16:26 UTC
So, is there a work around that is known to work?

Comment 28 James M. Leddy 2009-08-11 16:37:38 UTC
(In reply to comment #27)
> So, is there a work around that is known to work?  

Please refer to comment #5.  It is working as designed.  If you would like the design to be changed, escalate through your support representative.

Comment 29 Andy Gospodarek 2009-08-11 17:28:16 UTC
(In reply to comment #27)
> So, is there a work around that is known to work?  

udev rules can also be used if you are unable to have HWADDR in ifcfg-ethX files.  If you have systems that are identical, then udev rules based on pci bus/device/functions would work too.

Comment 30 Chad Farmer 2009-08-11 17:54:24 UTC
I believe the sequence in Mick Russom's comment #16 is correct.  There are several work arounds depending on your preference.

a) initrd loads modules.

So use mkinitrd to customize the initrd to load the one driver you want loaded first.

b) rc.sysinit goes before udev, and could load modules.

You could add lines to load Ethernet drivers in the sequence you want, possibly separated by "sleep 1" to give them time to initialize.

c) udev does its thing.  

udev loads Ethernet drivers concurrently, so it is guaranteed to mess things up if it loads drivers.  If you have not already loaded all Ethernet drivers before udev, add the udev rule "10-local.rules, ignore_device" described above.

d) the rest of the scripts are run (including the ifup scripts).

At this point options in /etc/modprobe.conf are honored.  Alias statement names are not very effective.  If the named driver is not loaded when the alias is processed, it will be loaded, but it will grab a set of names in sequence.

If you want a really ugly workaround, add an install rule like the following to modprobe.conf.  This makes the load of e1000e occur only after igb is loaded. 

# Force igb load before e1000e so built-in Ethernet is eth0.
install e1000e /sbin/modprobe igb; sleep 1; /sbin/modprobe --ignore-install e1000e

I believe that using the MAC address in ifcfg-ethN to rename the interface works.  I don't like the side-effect that an Ethernet interface will be called by both its original name and its renamed name at different times in /var/log/messages.

You can use any of these techniques to have a repeatable sequence from boot to boot.  This sequence is likely to change if NICs are added, removed or replaced.
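Option (b) above, loading the drivers one at a time before udev gets to them, could be sketched like this. The function name and the MODPROBE and SETTLE_SECS variables are hypothetical, and the one-second pause is the same guess as the "sleep 1" suggested above rather than a measured value.

```shell
#!/bin/bash
# Sketch only: load network drivers strictly one after another so the
# first driver claims the lowest ethN names.
MODPROBE="${MODPROBE:-/sbin/modprobe}"

load_in_order() {
    # Arguments: driver modules in the desired order, e.g. tg3 igb.
    local mod
    for mod in "$@"; do
        $MODPROBE "$mod" || return 1
        # Give the driver time to register its interfaces before the
        # next one is loaded.
        sleep "${SETTLE_SECS:-1}"
    done
}
```

A line such as `load_in_order tg3 igb` early in rc.sysinit would be the intended use. Whether a fixed sleep is long enough depends on the hardware, so the result is still worth checking after boot.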

Comment 31 James Ralston 2009-08-18 15:21:46 UTC
Just out of curiosity, how has upstream (Fedora / planned RHEL6) changed such that HWADDR (is no longer / will no longer be) necessary?

Comment 32 James M. Leddy 2009-08-18 16:31:18 UTC
(In reply to comment #31)
Almost all major distros are using persistent-net.rules and persistent-net-generator.rules files in udev.  These are included in the udev package in fedora.

Comment 33 Bill Nottingham 2009-08-20 17:47:52 UTC
Correct - that just moved the HWADDR <-> device mapping to a different file, though.

Comment 36 Tim 2009-12-03 05:10:19 UTC
So I'm in a similar situation to others. We have several thousand systems deployed across Asia. Each has an identical software build, but may have slightly different hardware based on age and requirements. Each has 4-6 Ethernet ports. Generally the systems are installed by engineers who do not have access to the OS. We need to ensure that the internal system name for the Ethernet interfaces match what is printed on the external hardware ports, otherwise, well, it would be very bad.

The external ports themselves are always found to be in bus order, so it seems like the best method to get what we need would be what Andy described in comment 29: write rules matching on PCI domain/bus/device/function. Could you give an example of such a rule? I've been playing around. Something like

SUBSYSTEM=="net", ACTION=="add", DEVPATH=="/sys/bus/pci/devices/0000:05:08.0", KERNEL=="eth*", NAME="eth4"

doesn't seem to work. I wonder what would?

It seems as though a set of rules like this could solve the problems people are talking about. Ultimately, though, I think a long-term solution would be for udev to treat permanently wired network interfaces in the same special way that it handles /sys/disk/*, and give them fixed names based on bus order. I can't think of a good reason to give dynamic names to permanently wired devices.

Thanks for any hints!

Comment 37 Tim 2009-12-03 05:19:19 UTC
Well, having said that, this seems to work:

SUBSYSTEM=="net",ACTION=="add",BUS=="pci",KERNEL=="eth*",KERNELS=="0000:05:08.0",NAME="eth4"

so now we can do:

SUBSYSTEM=="net",ACTION=="add",BUS=="pci",KERNEL=="eth*",KERNELS=="0000:01:00.0",NAME="eth0"
SUBSYSTEM=="net",ACTION=="add",BUS=="pci",KERNEL=="eth*",KERNELS=="0000:02:00.0",NAME="eth1"
SUBSYSTEM=="net",ACTION=="add",BUS=="pci",KERNEL=="eth*",KERNELS=="0000:03:00.0",NAME="eth2"
SUBSYSTEM=="net",ACTION=="add",BUS=="pci",KERNEL=="eth*",KERNELS=="0000:04:00.0",NAME="eth3"
SUBSYSTEM=="net",ACTION=="add",BUS=="pci",KERNEL=="eth*",KERNELS=="0000:05:08.0",NAME="eth4"

I wonder if there are any negative consequences to such a scheme?

Thanks!
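
For anyone who wants to generate such a rule set rather than type it, here is a minimal sketch. It assumes the rule form reported as working in this comment is also accepted by your udev version; pci_name_rule is a hypothetical helper name, and the PCI addresses are just the ones from the example above.

```shell
#!/bin/bash
# Sketch: print one udev naming rule per PCI address, in the form reported
# as working in this comment. pci_name_rule is an illustrative helper name.

pci_name_rule() {
    local pci_addr="$1" name="$2"
    printf 'SUBSYSTEM=="net",ACTION=="add",BUS=="pci",KERNEL=="eth*",KERNELS=="%s",NAME="%s"\n' \
        "${pci_addr}" "${name}"
}

# Names are handed out in the order the addresses are listed.
unit=0
for addr in 0000:01:00.0 0000:02:00.0 0000:03:00.0 0000:04:00.0 0000:05:08.0; do
    pci_name_rule "${addr}" "eth${unit}"
    unit=$((unit + 1))
done
```

The output could be redirected into a file under /etc/udev/rules.d/. Each rule is printed on a single line, which avoids the line wrapping seen when rules are pasted into a comment.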

Comment 38 Andy Gospodarek 2009-12-07 23:04:45 UTC
(In reply to comment #37)
> Well, having said that, this seems to work:
> 
> SUBSYSTEM=="net",ACTION=="add",BUS=="pci",KERNEL=="eth*",KERNELS=="0000:05:08.0",NAME="eth4"
> 
> so now we can do:
> 
> SUBSYSTEM=="net",ACTION=="add",BUS=="pci",KERNEL=="eth*",KERNELS=="0000:01:00.0",NAME="eth0"
> SUBSYSTEM=="net",ACTION=="add",BUS=="pci",KERNEL=="eth*",KERNELS=="0000:02:00.0",NAME="eth1"
> SUBSYSTEM=="net",ACTION=="add",BUS=="pci",KERNEL=="eth*",KERNELS=="0000:03:00.0",NAME="eth2"
> SUBSYSTEM=="net",ACTION=="add",BUS=="pci",KERNEL=="eth*",KERNELS=="0000:04:00.0",NAME="eth3"
> SUBSYSTEM=="net",ACTION=="add",BUS=="pci",KERNEL=="eth*",KERNELS=="0000:05:08.0",NAME="eth4"
> 
> I wonder if there are any negative consequences to such a scheme?
> 
> Thanks!  

Tim, I'm glad to hear that is working!

The only way those rules might create a problem is if you were to move one of the cards to a different slot.

Comment 41 Jason Haar 2010-06-27 23:34:29 UTC
I have the same issue with RHEL 4.7 running a custom kernel (2.6.28.1).

Several times now we have experienced issues with Dell 1950 servers where after a reboot, the MAC address of eth0 "moves" to eth1

These interfaces are the standard dual-nic that comes with Dell servers - Broadcom NetXtreme II. They are built into our kernel for our own reasons (ie not modules), so the thing I can't get my head around is how rebooting a server with the same hardware and the same kernel can trigger such flip-flopping behaviour... We have to send a technician out to these sites to swap the Ethernet cables over!

I have HWADDR defined within ifcfg-eth0, but it doesn't help when the MAC appears to migrate between interfaces (remember: dual-NIC card). ifrename doesn't seem to work either, and RHEL 4 doesn't use udev to the same extent as RHEL 5, so all the above udev talk doesn't map to RHEL 4.

Any ideas how I can *guarantee* that the eth0 "hardware details" I see after the first reboot remain that way forever? If I created a new kernel with bnx2 as a module, could I "modprobe" eth0 with a MAC address, thereby getting the guarantee I'm after?

Thanks

Jason

Comment 42 Greg Bradner 2010-07-14 21:30:41 UTC
This should be reopened. udev is broken. I can't believe anyone really thinks a random assignment is a solution.  Nor are udev rule hacks (comment #38): hardware changes, as mentioned.
And I don't see how moving the random assignment to a different file fixes this bug. (comment #33)
And why is no one helping Jason Haar? (comment #41)

Comment 43 Chad Farmer 2010-07-14 23:59:23 UTC
Just to add to the problem description...

I believe that udev is loading the required Ethernet drivers concurrently, so that two or more Ethernet drivers are initializing and asking for "the next" ethN name more or less at the same time.

It is possible that udev is the victim of an improvement in device driver initialization.  In the past, I believe that the thread executing modprobe called the driver's init routine and the driver init routine did not return until the PCI survey and device name assignment was complete.  Then another driver would be loaded.

In RHEL 5, igb_init_module and e1000e_init_module just register themselves as a pci driver and return.  I presume that an independent kernel thread will actually do the init (survey and create devices).  But meanwhile, the thread loading drivers loads the next driver that registers itself and returns.  The result is that one or more Ethernet drivers could be initializing concurrently, by design.  This makes the allocation of the next ethN name a race.  And while a particular system may get the same result for months, this can suddenly change for too many reasons to list.  This description is somewhat conjecture, so it would be nice to have this confirmed by someone who really knows.

The HWADDR "fix" in RHEL5 doesn't fix the initial random assignment of eth names.  When the network service is starting, it checks the HWADDR value in the ifcfg file and if it does not match, it renames Ethernet interfaces until the interface with the matching MAC address has the name of the ifcfg-ethN file.  This puts the right name on each interface with a few minor issues.  First, /var/log/messages is inconsistent because eth0 starts on one interface and then is "moved" to another interface.  So grepping for "eth0" gives messages about two different interfaces.  Second, replacing a defective NIC must include editing the MAC addresses in the ifcfg files.  Third, in order to free an interface name (e.g. eth0) the existing interface is given a non-eth name.  If that interface is not configured, it does not get renamed back to some ethN name.  This is OK, but it can cause confusion when trying to configure a previously unused interface.

Note that the old situation was also not ideal.  Since names were assigned sequentially as found, adding or removing NICs renamed interfaces, requiring that the ifcfg files be adjusted.  And, NIC failures where a NIC became unresponsive to the point of not getting configured would cause all the subsequent interfaces to be renamed.

I do not have a great solution to the general problem.  You can configure udev to not load Ethernet device drivers by adding to /etc/udev/rules.d/10-local.rules the line
SUBSYSTEM=="pci", SYSFS{class}=="0x020000",             OPTIONS="ignore_device"

Even when not done by udev, the Ethernet drivers still seem to get loaded automatically when the network service starts, and perhaps are loaded further apart in time.

You can really kludge /etc/modprobe.conf with something like:
install e1000e /sbin/modprobe igb; /sbin/modprobe --ignore-install e1000e
that installs e1000e by first loading igb.  However, if modprobe is returning before names are allocated, you might need
install e1000e /sbin/modprobe igb; sleep 2; /sbin/modprobe --ignore-install e1000e

I think using udev rules to assign names to PCI addresses [37] is the better solution.  Ultimately, you want to configure a physical port on the box so that it can be connected to a specific network.  A person with a wire needs to know where to plug it in.  If the BIOS is rational, a PCI address will consistently refer to a specific physical Ethernet port.  Hopefully, the PCI address is permanently associated with a slot, even if it is empty.  And machines of the same model will all be the same.  Assigning names by PCI address means that adding or removing NICs will not change the existing NIC configuration.  And replacing a NIC does not require any changes to an ifcfg file.  The down side is that it can initially be a challenge to figure out what PCI address corresponds to which built-in Ethernet adapter or PCI slot.
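
One way to get started on that mapping is to ask sysfs which device backs each existing interface. The following is a minimal sketch, assuming a kernel that exposes /sys/class/net/<iface>/device as a symlink to the underlying device; list_net_pci and its optional sysfs-root argument are hypothetical names added for illustration.

```shell
#!/bin/bash
# Sketch: print "<interface> <bus address>" for every network interface
# that is backed by a real device.
# list_net_pci and its optional sysfs root argument are illustrative names.

list_net_pci() {
    local sysfs="${1:-/sys}" dev target
    for dev in "${sysfs}"/class/net/*; do
        # Virtual interfaces (lo, bonds, ...) have no "device" link; skip them.
        [ -L "${dev}/device" ] || continue
        target=$(readlink "${dev}/device")
        # The last path component is the bus address, e.g. 0000:05:08.0
        # for a PCI NIC (non-PCI devices show their own bus ID here).
        echo "${dev##*/} ${target##*/}"
    done
}

list_net_pci
```

This only tells you the address of each current name; matching an address to a physical jack still means plugging in a cable and watching which interface gets link.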

Comment 44 Greg Bradner 2010-07-20 16:59:57 UTC
I assume by the lack of RedHat responses this will not be reopened as a bug.

Comment 45 static 2010-09-08 20:40:11 UTC
I use the udev rule hack that ignores 0x020000, which seems to work.  As extra safety I created a script, launched at boot from rc.local, that runs dmesg and greps the output to make sure the Ethernet interfaces are assigned the way I want.  If they are not, it brings down all interfaces, removes all Ethernet modules, inserts the Ethernet modules one at a time (sleeping 1 second between insertions) and then brings the interfaces up again.

These are firewall systems so they must come up in a consistent state for security reasons.
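
A boot-time check of this kind might look roughly like the sketch below. Note that it is not the script described above: instead of grepping dmesg it reads the driver symlink in sysfs, so it only verifies which driver owns each name, not which port. check_drivers, its arguments and the example pairs are hypothetical.

```shell
#!/bin/bash
# Sketch: verify that each interface name is owned by the expected driver.
# Usage: check_drivers SYSFS_ROOT IFACE=DRIVER [IFACE=DRIVER ...]
# check_drivers and the example pairs below are illustrative only.

check_drivers() {
    local sysfs="$1" pair iface want have rc=0
    shift
    for pair in "$@"; do
        iface="${pair%%=*}"
        want="${pair#*=}"
        have=""
        if [ -L "${sysfs}/class/net/${iface}/device/driver" ]; then
            have=$(readlink "${sysfs}/class/net/${iface}/device/driver")
            have="${have##*/}"
        fi
        if [ "${have}" != "${want}" ]; then
            echo "${iface}: expected ${want}, found ${have:-nothing}" >&2
            rc=1
        fi
    done
    return "${rc}"
}

# Example: on a mismatch, this is where the interfaces would be brought
# down and the drivers reloaded one at a time.
if ! check_drivers /sys eth0=e1000e eth2=e1000; then
    echo "interface order is wrong; reload the drivers in sequence here" >&2
fi
```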

Comment 46 Tim 2010-09-10 03:27:27 UTC
Kind of disturbing to see that this is still open, but I thought I would share the interim solution I've been using:

(1) Since we have no movable or hot-pluggable NICs in our hardware platforms, we require the manufacturers to arrange the external network ports so that they are in (breadth-first) order on the PCI bus(ses). For many vendors this does seem to be the default, but YMMV, and we have found that some order left to right while others order right to left!

Note that if you have only one driver, there is a boot-time parameter pci=bfsort that will cause PCI devices to be probed in breadth-first order (usually it's depth first), but if you have multiple drivers, all bets are probably off.

(2) Edit /lib/udev/75-persistent-net-generator.rules to insert a custom rule between two existing ones:

# S/390 uses id matches only, do not use MAC address match
SUBSYSTEMS=="ccwgroup", ENV{COMMENT}="S/390 $driver device at $id", ENV{MATCHID}="$id", ENV{MATCHDRV}="$driver", ENV{MATCHADDR}=""

# Custom rule to pin interface names based on bus order
SUBSYSTEMS=="pci", KERNEL=="eth*", PROGRAM="vsn-ethername $id", RESULT=="?*", ENV{MATCHID}="$id", ENV{MATCHADDR}="", ENV{INTERFACE_NAME}="$result", ENV{COMMENT}="VarioSecure Ethernet $id ($attr{address}) [$kernel->$result]"

# see if we got enough data to create a rule
ENV{MATCHADDR}=="", ENV{MATCHID}=="", ENV{INTERFACE_NAME}=="", GOTO="persistent_net_generator_end"

(3) Add the vsn-ethername shell script that actually does the work of assigning the names. It looks like I can't attach files to comments, so I will append the script in-line below.

(4) That's it, except to note that if you already have a file /etc/udev/rules.d/70-persistent-net.rules you might need to delete it before trying this. Oh, and make sure you have non-network access to the device just in case!

Surprisingly, there does not appear to be any standard governing mappings between externally labelled network ports and any data structure (like bus order) that would allow software to uniquely map external ports to internal names. There is some work being done on this in the SMBIOS working group. See the discussion here:

http://linux.dell.com/wiki/index.php/Oss/libnetdevname

The idea seems to be that software will be able to query the BIOS for port naming information, so something like udev, or an external script, should eventually be able to do this in an unambiguous way. Last I checked, though, the standard was not yet public.

Here's the script. Even though it is just a tiny hack, I had to tack on tons of license stuff to placate administrative people.

----------------------------------------------------------------------

#!/bin/bash

# vsn-ethername
# Copyright (c) 2010 VarioSecure Networks, Inc.

# This program is free software; you can redistribute it and/or modify it
# under the terms and conditions of the GNU General Public License,
# version 2, as published by the Free Software Foundation.
# 
# This program is distributed in the hope it will be useful, but WITHOUT
# ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
# FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License for
# more details.
# 
# You should have received a copy of the GNU General Public License along with
# this program; if not, write to the Free Software Foundation, Inc.,
# 51 Franklin St - Fifth Floor, Boston, MA 02110-1301 USA.
#
# Contact Information: Me <tim>

# Synthesize an Ethernet interface name from the bus order of the devices,
# because that's the way the external ports are physically ordered on the box.
# This appears to be the best way to go. Otherwise we are at the mercy of 
# udev, the particular way the kernel is built, and the positions of the
# planets, which can easily vary over time.
#
# Note that one side-effect of this method is that devices retain the same
# names even if no driver is available for them. But this is what you want
# isn't it? A persistent mapping from physical ports to names, regardless
# of what software is available? The disaster is if the system's internal
# names for the ports don't match what is painted over the physical jacks
# on the box, so that is what we aim to avoid.
#
# Put this script in /lib/udev and call from 75-persistent-net-generator.rules
# with something like this:
#
# SUBSYSTEMS=="pci",
#       KERNEL=="eth*",
#       PROGRAM="vsn-ethername $id",
#       RESULT=="?*",
#       ENV{MATCHID}="$id",
#       ENV{MATCHADDR}="",
#       ENV{INTERFACE_NAME}="$result",
#       ENV{COMMENT}="Ethernet $id ($attr{address}) [$kernel->$result]"
#
# Note that if we can't determine a good name, we just return a null string.
# In that case the default rules apply...

if [ -z "${1}" ]; then
    echo "Ethernet device PCI ID required" >&2
    echo 
    exit 0
fi

shopt -s nullglob
UNIT=0
NAME=
for FILE in /sys/bus/pci/devices/*; do
    if [ -L "${FILE}" ]; then
        if [ -f "${FILE}/class" ]; then
            CLASS=$(cat "${FILE}/class")
            if [ "${CLASS}" = "0x020000" ]; then
                DEVICE=${FILE##*/}
                if [ "${DEVICE}" = "${1}" ]; then
                    NAME="eth${UNIT}"
                    break
                fi
                ((UNIT++))
            fi
        fi
    fi
done
shopt -u nullglob
echo "${NAME}"

----------------------------------------------------------------------

Tim

Comment 47 Mick Russom 2010-09-10 05:33:19 UTC
I would ask that this get CLOSED with WONTFIX, calling this NOTABUG is really, really wrong. 

The tens of people here and the many that I know in person who have encountered this think it's a big ugly bug and we all have various ways to work around this.

Comment 49 Isvel Lopez 2011-02-05 21:36:41 UTC
Hi everybody, I have a similar problem. I run Debian 5.0.5 64-bit (kernel 2.6.26-2-amd64) on VMware ESXi 4.1. The hardware has 4 network adapters and Debian can see all 4. My problem is that every time I shut Debian down with init 0, the IPs change their physical port. I have tried a lot of things: udev rules that change the interface names, aliases in /etc/modprobe.d/aliases, and assigning fixed MAC addresses manually in VMware, but nothing helps. Every time I use "init 0" the IPs change position. Another thing: if I use init 6, the IPs stay where they were last time. I don't know what else to do, please help me.

Comment 50 Daniel Voina 2011-03-16 15:13:56 UTC
@Tim: I am not sure that the method you've described in comment #46 is accurate for RHEL 5.3. udev-095-14.20 does not contain the file 75-persistent-net-generator.rules. Are you sure that you have tested with the same udev release?

Comment 51 Isvel Lopez 2011-03-16 15:19:56 UTC
Hi everybody, I solved my problem from comment #49. It was a VMware problem: I just created a Virtual Switch for each Physical Interface, and that's it.

Comment 52 Rui Ferrao 2011-05-13 16:23:58 UTC
And how can I pass the HWADDR to anaconda during my kickstart install to make sure that I have consistent naming across several machines?

Comment 53 ricky 2011-08-22 03:11:13 UTC
I have a Dell R610 with 4 onboard Broadcom gig ports and 2 add on dual port PCI intel gig cards. modprobe.conf is configured as such

alias eth0 bnx2
alias eth1 bnx2
alias eth2 bnx2
alias eth3 bnx2
alias eth4 igb
alias eth5 igb
alias eth6 igb
alias eth7 igb

The system is configured to boot the onboard Broadcom cards as eth0 eth1 eth2 eth3 and the PCI Intel cards as eth4 to eth7. On some occasions on reboot, the Intel cards became eth0 to eth3. Will specifying HWADDR in ifcfg-ethX really solve the issue with the Intel card being initialised first on some reboots?

thanks for any ideas.

Comment 54 Andy Gospodarek 2011-08-22 14:25:41 UTC
(In reply to comment #53)
> Will specifying HWADDR in ifcfg-ethx really solve
> the issue with Intel card being initialised first on some reboots ?
> 

Yes, that should be all you need to do to resolve this on RHEL5.

Comment 55 ricky 2011-08-22 15:43:17 UTC
(In reply to comment #54)
> (In reply to comment #53)
> > Will specifying HWADDR in ifcfg-ethx really solve
> > the issue with Intel card being initialised first on some reboots ?
> > 
> Yes, that should be all you need to do to resolve this on RHEL5.

Thanks. 

But to understand it a bit more: if an Intel card is initialised first and, on reading ifcfg-eth0, finds that the HWADDR value is different, how would it know that it needs to be set up as eth3? Does it scan the ifcfg-ethX files until it finds its HWADDR value?

Comment 56 Andy Gospodarek 2011-08-22 17:56:44 UTC
Response sent to ricky after a private email:

"As modules are loaded, udev will rename the devices if needed.  If there
is an HWADDR match in an ifcfg-ethX file, the device name used in that
file will be the new name of that interface.

"There are other ways to name the devices (like special udev rules), but
specifying the device MAC address in ifcfg-ethX is the easiest way to do
this."

Comment 57 Jason Haar 2011-08-22 20:48:49 UTC
You need to re-read the previous comments. A lot of us *are* putting HWADDR details into the ifcfg-ethXX configs - but RH is ignoring it! 

e.g I have a dual-NIC Dell server with ifcfg-eth0 and ifcfg-eth1 with appropriate HWADDR entries. Only eth0 is wired into the network. I reboot and suddenly the MAC address of eth0 is on eth1. As eth0 is now WRONG, we've lost the server - as the Ethernet cable is now plugged into a disabled interface. We have to send a technician over to re-cable it.

Comment 58 Brent Woodruff 2011-10-03 21:02:42 UTC
On RHEL 6.1, the only way we have been able to get consistent naming is to use udev to name devices based on MAC address. Using this method, we have been able to name interfaces whatever we wish, including having eth0 and eth2 on one dual port while eth1 is on another card entirely.

What we do during kickstart in %post is remove /etc/udev/rules.d/70-persistent-net.rules and create a file /etc/udev/rules.d/80-satellite-net.rules containing, e.g.:
KERNEL=="eth*", ATTR{address}=="MAC GOES HERE", NAME="eth1"
KERNEL=="eth*", ATTR{address}=="MAC GOES HERE", NAME="eth0"

There's no need to list interfaces which are not configured, as long as there is no corresponding ifcfg- file. There is also no need for /etc/modprobe.conf aliases. If you have the OPTIONS="ignore_device" rule discussed in earlier fixes, remove that since you do want udev to handle NICs with this method.

If you are using cobbler/spacewalk/satellite to provision machines, then you can use the same snippet we do somewhere in %post:

#for $iface, $config in $interfaces.items()
echo "KERNEL==\"eth*\", ATTR{address}==\"$config['mac_address']\", NAME=\"$iface\"" >> /etc/udev/rules.d/80-satellite-net.rules
#end for
rm -f /etc/udev/rules.d/70-persistent-net.rules

Maybe there is a better/cleaner way but this works for us so far.
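
For sites without cobbler/spacewalk/satellite templating, the same kind of rule can be produced with plain shell. This is a minimal sketch, assuming the rule form shown in this comment; mac_name_rule is a hypothetical helper and the MAC address below is a made-up placeholder.

```shell
#!/bin/bash
# Sketch: print a MAC-based udev naming rule in the form used in this comment.
# mac_name_rule is an illustrative name; the MAC below is a placeholder.

mac_name_rule() {
    local mac name="$2"
    # sysfs reports MAC addresses in lowercase and udev compares the string
    # as written, so normalise the address before emitting the rule.
    mac=$(printf '%s' "$1" | tr 'A-F' 'a-f')
    printf 'KERNEL=="eth*", ATTR{address}=="%s", NAME="%s"\n' "${mac}" "${name}"
}

mac_name_rule 00:1A:2B:3C:4D:5E eth0
```

The output would be appended to a rules file such as /etc/udev/rules.d/80-satellite-net.rules, one call per configured interface.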

Comment 59 petre rodan 2012-03-31 15:11:32 UTC
I had the same problem of having the ethernet devices mixed up after booting a new (3.3.0) kernel, so I wrote a simple script that takes care of reordering them either based on the PCI bus id or a file containing eth names and mac addresses.

it can be downloaded from https://github.com/rodan/fix_eth_order

it's a standalone solution that was meant for circumventing any udev or modprobe-related hacks. tested on gentoo servers with monolithic kernels and static /dev.

Comment 60 J 2013-06-20 14:08:34 UTC
This bug is especially annoying when dealing with virtual boxes and vagrant.  Every time you vagrant up it is a crap shoot as to whether or not eth0 and eth1 end up on the correct network (in my case one is on a NAT and the other is on a host-only with a static ip).   You can't assign the HWADDR b/c the MAC address gets reset every time you bring up a new test environment.  

Also we're using RHEL 6.x and seeing this problem.

Comment 62 Andrius Benokraitis 2013-10-15 19:27:03 UTC
No additional minor releases are planned for Production Phase 2 in Red Hat Enterprise Linux 5, and therefore Red Hat is closing this bugzilla as it does not meet the inclusion criteria as stated in:
https://access.redhat.com/site/support/policy/updates/errata/#Production_2_Phase

