Bug 187550 - Network interfaces assignment is unpredictable at each boot
Status: CLOSED WONTFIX
Alias: None
Product: Fedora
Classification: Fedora
Component: initscripts
Version: 5
Hardware: i686
OS: Linux
Priority: medium
Severity: medium
Target Milestone: ---
Assignee: Bill Nottingham
QA Contact: Brock Organ
Whiteboard: bzcl34nup
Duplicates: 188454
Reported: 2006-03-31 20:53 UTC by Olivier Benghozi
Modified: 2014-03-17 02:59 UTC (History)

Doc Type: Bug Fix
Last Closed: 2008-05-06 15:43:53 UTC


Attachments
This is the config, without the HWADDR= lines (10.26 KB, application/octet-stream)
2006-07-07 14:21 UTC, Karsten Hahn
config files for the last comment (18.03 KB, application/x-bzip2)
2006-07-18 11:03 UTC, Karsten Hahn

Description Olivier Benghozi 2006-03-31 20:53:16 UTC
I have two Intel PCI NICs (driver e100) and one PCI 3Com (driver 3c59x), using
an old PII-400 (Asus P3B-F MotherBoard, Intel chipset) with PCI and AGP buses.

1) Usually, linux (Fedora core 1-4) always detected the two Intel as eth0 and
eth1, and the 3com as eth2. It wasn't OK for me so my modprobe.conf contains the
proper alias lines (alias eth0 3c59x, alias eth1 e100, alias eth2 e100). It
worked for years like this. Since Fedora Core 5, this file has no effect at all;
the system acts as if the file doesn't even exist, whatever I put in it (3com is
"always" eth2).

2) OK, let's use 3com as eth2. I modified modprobe.conf correspondingly.
It's still not OK. 2/3 of the time, eth0 & eth1 are the Intel cards, and eth2
is the 3com card. 1/3 of the time, eth0 is 3c59x and eth1/2 are e100. Of course
I can no longer access the box in this case. I guess the new driver loading
scheme for network drivers via the udev system (if I understood the release
notes) may create this problem (not reliable / not predictable).



-------------------------------------------
Additional Information: Some Dmesg messages:
-------------------------------------------

************** 3com is eth0 (OK for me, but not normal case - Fedora has always
detected it as eth2 in all previous versions)**************

Mar 28 20:10:13 limace kernel: ACPI: PCI Interrupt Link [LNKC] enabled at IRQ 10
Mar 28 20:10:13 limace kernel: ACPI: PCI Interrupt 0000:00:0a.0[A] -> Link
[LNKC] -> GSI 10 (level, low) -> IRQ 10
Mar 28 20:10:13 limace kernel: 3c59x: Donald Becker and others.
www.scyld.com/network/vortex.html
Mar 28 20:10:13 limace kernel: 0000:00:0a.0: 3Com PCI 3c905C Tornado at
e0834000. Vers LK1.1.19
Mar 28 20:10:13 limace kernel: e100: Intel(R) PRO/100 Network Driver, 3.5.10-k2-NAPI
Mar 28 20:10:13 limace kernel: e100: Copyright(c) 1999-2005 Intel Corporation
Mar 28 20:10:13 limace kernel: ACPI: PCI Interrupt Link [LNKD] enabled at IRQ 5
Mar 28 20:10:13 limace kernel: ACPI: PCI Interrupt 0000:00:09.0[A] -> Link
[LNKD] -> GSI 5 (level, low) -> IRQ 5
Mar 28 20:10:13 limace kernel: e100: eth1: e100_probe: addr 0xd7000000, irq 5,
MAC addr 00:50:8B:5B:C6:69
Mar 28 20:10:13 limace kernel: ACPI: PCI Interrupt 0000:00:0d.0[A] -> Link
[LNKD] -> GSI 5 (level, low) -> IRQ 5
Mar 28 20:10:13 limace kernel: e100: eth2: e100_probe: addr 0xd6000000, irq 5,
MAC addr 00:50:8B:5A:88:9D
Mar 28 20:10:13 limace kernel: piix4_smbus 0000:00:04.3: Found 0000:00:04.3 device
Mar 28 20:10:13 limace kernel: ACPI: PCI Interrupt 0000:00:0e.0[A] -> Link
[LNKC] -> GSI 10 (level, low) -> IRQ 10





********************** Normal version (eth0 & 1 are e100, eth2 is 3c59x)
**************
Mar 31 22:15:29 limace kernel: e100: Intel(R) PRO/100 Network Driver, 3.5.10-k2-NAPI
Mar 31 22:15:29 limace kernel: e100: Copyright(c) 1999-2005 Intel Corporation
Mar 31 22:15:29 limace kernel: ACPI: PCI Interrupt Link [LNKD] enabled at IRQ 5
Mar 31 22:15:29 limace kernel: ACPI: PCI Interrupt 0000:00:09.0[A] -> Link
[LNKD] -> GSI 5 (level, low) -> IRQ 5
Mar 31 22:15:29 limace kernel: e100: eth0: e100_probe: addr 0xd7000000, irq 5,
MAC addr 00:50:8B:5B:C6:69
Mar 31 22:15:29 limace kernel: ACPI: PCI Interrupt 0000:00:0d.0[A] -> Link
[LNKD] -> GSI 5 (level, low) -> IRQ 5
Mar 31 22:15:29 limace kernel: e100: eth1: e100_probe: addr 0xd6000000, irq 5,
MAC addr 00:50:8B:5A:88:9D
Mar 31 22:15:29 limace kernel: ACPI: PCI Interrupt Link [LNKC] enabled at IRQ 10
Mar 31 22:15:29 limace kernel: ACPI: PCI Interrupt 0000:00:0a.0[A] -> Link
[LNKC] -> GSI 10 (level, low) -> IRQ 10
Mar 31 22:15:29 limace kernel: 3c59x: Donald Becker and others.
www.scyld.com/network/vortex.html
Mar 31 22:15:29 limace kernel: 0000:00:0a.0: 3Com PCI 3c905C Tornado at
e0896000. Vers LK1.1.19
Mar 31 22:15:29 limace kernel: piix4_smbus 0000:00:04.3: Found 0000:00:04.3 device
Mar 31 22:15:29 limace kernel: ACPI: PCI Interrupt 0000:00:0e.0[A] -> Link
[LNKC] -> GSI 10 (level, low) -> IRQ 10
Mar 31 22:15:29 limace kernel: USB Universal Host Controller Interface driver v2.3
Mar 31 22:15:29 limace kernel: ACPI: PCI Interrupt 0000:00:04.2[D] -> Link
[LNKD] -> GSI 5 (level, low) -> IRQ 5






****************** Another variant of unworking driver loading *******************
Mar 31 20:17:35 limace kernel: piix4_smbus 0000:00:04.3: Found 0000:00:04.3 device
Mar 31 20:17:35 limace kernel: e100: Unknown symbol mii_ethtool_sset
Mar 31 20:17:35 limace kernel: e100: Unknown symbol mii_link_ok
Mar 31 20:17:35 limace kernel: e100: Unknown symbol mii_check_link
Mar 31 20:17:35 limace kernel: e100: Unknown symbol mii_nway_restart
Mar 31 20:17:35 limace kernel: e100: Unknown symbol generic_mii_ioctl
Mar 31 20:17:35 limace kernel: e100: Unknown symbol mii_ethtool_gset
Mar 31 20:17:35 limace kernel: e100: Unknown symbol mii_ethtool_sset
Mar 31 20:17:35 limace kernel: e100: Unknown symbol mii_link_ok
Mar 31 20:17:35 limace kernel: e100: Unknown symbol mii_check_link
Mar 31 20:17:35 limace kernel: e100: Unknown symbol mii_nway_restart
Mar 31 20:17:35 limace kernel: e100: Unknown symbol generic_mii_ioctl
Mar 31 20:17:35 limace kernel: e100: Unknown symbol mii_ethtool_gset
   ------------------> My comment: the system tried to load e100 driver for
operating the 3com device!
Mar 31 20:17:35 limace kernel: ACPI: PCI Interrupt Link [LNKC] enabled at IRQ 10
Mar 31 20:17:35 limace kernel: ACPI: PCI Interrupt 0000:00:0a.0[A] -> Link
[LNKC] -> GSI 10 (level, low) -> IRQ 10
Mar 31 20:17:35 limace kernel: 3c59x: Donald Becker and others.
www.scyld.com/network/vortex.html
Mar 31 20:17:35 limace kernel: 0000:00:0a.0: 3Com PCI 3c905C Tornado at
e0840000. Vers LK1.1.19
Mar 31 20:17:35 limace kernel: ACPI: PCI Interrupt 0000:00:0e.0[A] -> Link
[LNKC] -> GSI 10 (level, low) -> IRQ 10
Mar 31 20:17:35 limace kernel: USB Universal Host Controller Interface driver v2.3
Mar 31 20:17:35 limace kernel: ACPI: PCI Interrupt Link [LNKD] enabled at IRQ 5
Mar 31 20:17:35 limace kernel: ACPI: PCI Interrupt 0000:00:04.2[D] -> Link
[LNKD] -> GSI 5 (level, low) -> IRQ 5
   [........... other things ............]
Mar 31 20:17:35 limace kernel: ip_tables: (C) 2000-2006 Netfilter Core Team
Mar 31 20:17:35 limace kernel: Netfilter messages via NETLINK v0.30.
Mar 31 20:17:35 limace kernel: ip_conntrack version 2.4 (4095 buckets, 32760
max) - 232 bytes per conntrack
Mar 31 20:17:35 limace kernel: e100: Intel(R) PRO/100 Network Driver, 3.5.10-k2-NAPI
Mar 31 20:17:35 limace kernel: e100: Copyright(c) 1999-2005 Intel Corporation
Mar 31 20:17:35 limace kernel: ACPI: PCI Interrupt 0000:00:09.0[A] -> Link
[LNKD] -> GSI 5 (level, low) -> IRQ 5
Mar 31 20:17:35 limace kernel: e100: eth1: e100_probe: addr 0xd7000000, irq 5,
MAC addr 00:50:8B:5B:C6:69
Mar 31 20:17:35 limace kernel: ACPI: PCI Interrupt 0000:00:0d.0[A] -> Link
[LNKD] -> GSI 5 (level, low) -> IRQ 5
Mar 31 20:17:35 limace kernel: e100: eth2: e100_probe: addr 0xd6000000, irq 5,
MAC addr 00:50:8B:5A:88:9D
    ------------------> this time the e100 is finally loaded via the launch of
the network scripts and modprobe.conf file assignments.

Comment 1 Hann-Huei Chiou 2006-04-01 06:54:56 UTC
I have the same (similar) problems, too.

Two configurations:
(a) i386
As kudzu (hwconf) reported:
eth0 e1000     (private)
eth1 8139too   (public)

(b) x86_64
As kudzu (hwconf) reported:
eth0 forcedeth (unused)
eth1 e1000     (public)

I am used to naming the public interface eth0, so I swapped the two
with alias directives in modprobe.conf:
(a)
alias eth0 8139too
alias eth1 e1000
(b)
alias eth0 e1000
alias eth1 forcedeth

It worked as in RH9/FC1 to FC4. Unfortunately, it seems that
FC5 uses the order in hwconf, not modprobe.conf.

If I stop the network service, rmmod these nic modules,
and run "ifup eth0 ; ifup eth1", the correct modules are
loaded.

Now I have to edit all related network scripts to "fix" this.

Any suggestions?


Comment 2 Sorin Sbarnea 2006-04-08 13:05:05 UTC
I have the same problem. My system has 4 network interfaces: one e1000
(embedded on the motherboard) and 3 PCI cards using 8139too. It seems that at
boot time the additional cards are swapped.

Comment 3 Tomasz Kepczynski 2006-04-21 22:29:10 UTC
It looks like I also have this problem, with an additional twist. I have
2 onboard NICs (r8169 and rhine II); modprobe lists eth0 as r8169 and
eth1 as via-rhine. It looks like eth0 is always brought up as expected
but the rhine gets a device name of the form devX where X is a number (5 digits
if I remember correctly).
I worked around this by executing
# ip link show dev eth0
# ip link show dev eth1
early in rc.sysinit (when /etc/sysconfig/network is included) and
it worked but with lots of ugly selinux complaints.




Comment 4 Harald Hoyer 2006-04-24 14:01:48 UTC
Why don't you bind the interface name to the MAC address? Either with
system-config-network or with HWADDR in ifcfg-

Comment 5 Tomasz Kepczynski 2006-04-24 18:06:49 UTC
Because according to /usr/share/doc/initscripts-8.31.1/sysconfig.txt
it does not work as intended with MACADDR.

Comment 6 Olivier Benghozi 2006-04-24 20:19:03 UTC
Well, I already found a workaround: in my case, the only proper workaround is
putting both NIC drivers in /etc/modprobe.d/blacklist. It prevents the loading
of these drivers by udev and instead uses the classical system; it suddenly
makes my system work, so maybe something has been forgotten in the udev system?

Of course it's not an acceptable solution, only a temporary workaround.
Udev is not expected to generate such a problem; its adoption was expected to
prevent this. Interfaces that don't physically move or change should not have
their name/IP/order moved or changed at each boot: naming of static interfaces
should be automatically static across reboots. Of course the udev system should
work properly without such a workaround.

Comment 7 Harald Hoyer 2006-04-25 02:22:07 UTC
    HWADDR=
      ethernet hardware address for this device


not MACADDR!!!
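
For illustration, an ifcfg file that binds a name to a card with HWADDR might
look like the sketch below. The MAC is the one the logs in the description show
for the e100 card at 0000:00:09.0; the file name and the ONBOOT and BOOTPROTO
values are assumptions for the example, not taken from anyone's actual
configuration.

```shell
# /etc/sysconfig/network-scripts/ifcfg-eth1 (illustrative file name)
DEVICE=eth1
# HWADDR ties this configuration to the card with this hardware address.
HWADDR=00:50:8B:5B:C6:69
# Assumed values, adjust to the real setup:
ONBOOT=yes
BOOTPROTO=dhcp
```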

Comment 8 Tomasz Kepczynski 2006-04-25 05:05:46 UTC
I understand this and I NEED to change MAC address as described here:
    MACADDR=
      Set the hardware address for this device to this.
      Use of this in conjunction with HWADDR= may cause
      ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
      unintended behavior.
      ^^^^^^^^^^^^^^^^^^^^
And this is why I don't use HWADDR.

Comment 9 Harald Hoyer 2006-04-25 09:31:16 UTC
ah ok... sorry... 

Comment 10 Warren Togami 2006-04-25 15:00:59 UTC
*** Bug 188454 has been marked as a duplicate of this bug. ***

Comment 11 Warren Togami 2006-04-25 15:10:35 UTC
This is possibly a duplicate of Bug #188955.  Please test the initscripts in FC5
updates-testing to see if it solves this issue for you.  If that package solves
your problem, then it is an issue where udev is loading the drivers in a random
order even before it has a chance to rename the devices.  HWADDR does not help
in this case at all.

Comment 12 Olivier Benghozi 2006-04-26 18:11:39 UTC
I installed this initscripts-8.31.2-1.i386.rpm.
I removed both lines in /etc/modprobe.d/blacklist.
I didn't modify /etc/modprobe.conf.

I rebooted several times. Most of the time e100 is loaded first, sometimes 3c59x
is loaded first. So, it didn't correct the problem for me; it looks like it
didn't change the system behavior in fact.

Comment 13 Bill Nottingham 2006-04-26 18:39:56 UTC
symbol: do you have ifcfg files for both interfaces with HWADDR in them?

Comment 14 Sorin Sbarnea 2006-05-16 10:18:31 UTC
I think this update might have solved the problem but I can't guarantee this.
Booted 3 times and it's ok. 

Comment 15 Sorin Sbarnea 2006-06-27 10:11:00 UTC
The use of initscripts-8.31.2-1.i386.rpm does NOT solve the problem. Please tell
me what information I need to submit.

Comment 16 Bill Nottingham 2006-06-29 03:35:51 UTC
Do you have ifcfg-XXX files for all your interfaces, with appropriate HWADDR set?

Comment 17 Karsten Hahn 2006-07-06 07:29:29 UTC
We have this problem here too. It seems that this bug occurs even with the
drivers blacklisted for D-LINK cards (eight sundance driver) with two tg3
onboard cards. Intel cards (four e1000) instead of the sundance cards seem to
work in the same server (HP Proliant) when blacklisted.
Please advise if you need further information.

Comment 18 Bill Nottingham 2006-07-06 15:10:43 UTC
1) make sure you're using the latest updates-testing initscripts
2) make sure you have HWADDR= in all your ifcfg files for all the interfaces

Comment 19 Karsten Hahn 2006-07-07 12:08:09 UTC
okay, I tried again:
1. New initscripts alone did not work
2. New initscripts + bind to MAC address (HWADDR=) did not work
3. initscripts/HWADDR= and blacklisting in /etc/modprobe.d/blacklist did not work
4. initscripts/HWADDR=/blacklisting and adding udev rules to /etc/udev/rules.d/
did work (one rule like the following for each eth device:
KERNEL="eth*", ID="0000:03:01.0", NAME="eth0")
5. after reverting some of the stuff above: adding udev rules and blacklisting
in /etc/modprobe.d/blacklist (but without new initscripts and HWADDR=) works also

Comment 20 Bill Nottingham 2006-07-07 13:57:07 UTC
What do your config files look like?

Direct udev renaming isn't really reliable, as it will fail if there is already
a device at the name you're trying to rename.
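
The failure mode described here (a rename fails when the target name is already
taken) is why swapping two names needs an intermediate name. A minimal sketch,
assuming iproute2's `ip link set ... name` and a hypothetical temporary name
`swaptmp0`. It only prints the commands instead of running them, and it is not
claimed to be what initscripts or udev do internally.

```shell
#!/bin/sh
# swap_plan A B [TMP]
# Hypothetical helper: print the commands that would swap the names of
# interfaces A and B by going through a temporary name. Nothing is executed.
swap_plan() {
    a=$1
    b=$2
    tmp=${3:-swaptmp0}   # assumed to be unused on the system
    # An interface generally has to be down before it can be renamed.
    echo "ip link set dev $a down"
    echo "ip link set dev $b down"
    # Move A out of the way first, so that B can take A's name.
    echo "ip link set dev $a name $tmp"
    echo "ip link set dev $b name $a"
    echo "ip link set dev $tmp name $b"
}

swap_plan eth0 eth2
```

Renaming eth2 straight to eth0 would be refused while eth0 still exists, which
is the case the intermediate step avoids.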

Comment 21 Karsten Hahn 2006-07-07 14:21:44 UTC
Created attachment 132055 [details]
This is the config, without the HWADDR= lines

Thanks a lot for your time. Please advise if you need anything else.

Comment 22 Bill Nottingham 2006-07-07 14:25:53 UTC
While you still may be running into bugs, there's no way it can work without the
HWADDR lines. I wonder if the bonding usage is affecting this.

Comment 23 Karsten Hahn 2006-07-07 15:44:36 UTC
I'll try on Monday without the bonding interfaces and with the HWADDR. But I
still want to avoid binding the interfaces to a MAC address, because this
complicates our procedures for replacing a defective NIC drastically.

Comment 24 Bill Nottingham 2006-07-07 15:47:23 UTC
It's impossible to fix without the HWADDR, as udev can (and will) load the
modules in arbitrary order.

Comment 25 Sorin Sbarnea 2006-07-07 16:37:02 UTC
I confirm that it works if I write HWADDR for each adapter, but I think the bug
should remain open because this is just a workaround.

Comment 26 Bill Nottingham 2006-07-07 17:34:27 UTC
There is no other way to enforce ordering; udev loads modules as it finds them
on the PCI bus.

Comment 27 Sorin Sbarnea 2006-07-07 18:53:33 UTC
In this case I think the naming algorithm is not quite good: it should
generate the same name for the same adapter in all cases. Anyway, the
workaround is good, but how about writing HWADDR directly on first
assignment? "No user intervention" is a good thing.

Comment 28 Karsten Hahn 2006-07-07 19:18:50 UTC
So, what you are saying is, if I don't change my hardware, the ordering of
devices should always stay the same, even without the HWADDR, the modprobe
blacklist, and the rules in udev/rules.d/. And even if I replace a NIC with
absolutely the same type in the same PCI slot, the ordering should not change.
Right? 
Just want to make sure that I understand the bug correctly and not waste any
time testing the wrong stuff.

Comment 29 Bill Nottingham 2006-07-07 19:21:54 UTC
(In reply to comment #28)
> So, what you are saying is, if I don't change my hardware, the ordering of
> devices should always stay the same, even without the HWADDR, the modprobe
> blacklist, and the rules in udev/rules.d/. And even if I replace a NIC with
> absolutely the same type in the same PCI slot, the ordering should not change.
> Right? 

It shouldn't, as long as udev's walk of sysfs for the device tree uses the same
algorithm. It may not match the order that the installer set them up in, or the
order that they were loaded in previous releases though.



Comment 30 Olivier Benghozi 2006-07-07 19:36:35 UTC
The problem is that it changes, sometimes at each boot, on systems where the
physical ordering didn't change in any way.

Forcing ordering should be automatic, either in udev or in the Red Hat system.
Udev is expected to follow a deterministic order when loading drivers, based on
appropriate information; it does not, so there's a bug.
However, the system can automatically save ethX & MAC for automatic ordering.
Some of these features seem to be already implemented in kudzu, but a solid
solution is yet to be defined.
Anyway, if we look at Fedora as a giant beta test or prerelease for Red Hat
corporate OSes, we immediately see that the current behavior is unacceptable:
erratic by default, it becomes rigid if one is obliged to manually configure
HWADDR.
So the need for either a patch to udev or a semi-scripted solution does exist.

Comment 31 Sorin Sbarnea 2006-07-07 20:36:03 UTC
I think comment #30 sums up my opinion.

Comment 32 Bill Nottingham 2006-07-07 20:42:24 UTC
OK, just for tracking purposes, I've created a bug for making sure devices have
HWADDR automatically written (bug 197984). If you'd like to open a bug for udev
changing the order it loads devices in, please do. This bug will then be for
tracking problems when all the devices have HWADDR listed and still do not come
up right.
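
As a rough sketch of what recording HWADDR automatically could involve, the
shell function below reads each ethX address from sysfs and prints a matching
HWADDR= line. The function name `list_nics` and its optional directory argument
are hypothetical, and this is not the mechanism tracked in bug 197984. It
reports the address currently set on the interface, which may differ from the
card's own address once bonding has changed it.

```shell
#!/bin/sh
# list_nics [DIR]
# Hypothetical helper: print "ethX HWADDR=<MAC>" for every ethX entry
# under DIR (default: /sys/class/net).
list_nics() {
    root=${1:-/sys/class/net}
    for dir in "$root"/eth*; do
        # Skip entries without a readable address file; this also covers
        # the case where the glob matched nothing.
        [ -r "$dir/address" ] || continue
        name=${dir##*/}
        # sysfs reports the MAC in lower case; upper-case it to match the
        # style of the addresses in the kernel logs above.
        mac=$(tr 'a-f' 'A-F' < "$dir/address")
        echo "$name HWADDR=$mac"
    done
}

list_nics
```

Each printed HWADDR= value could then be copied into the corresponding
ifcfg-ethX file by hand.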

Comment 33 Karsten Hahn 2006-07-18 11:01:29 UTC
I have some new information:
We exchanged the quad-port D-LINK cards (sundance driver) on three servers with
dual port HP cards (e1000 driver). Now we have three servers with identical
hardware configuration (2 onboard tg3 NICs and 6 e1000).
We configured all three servers manually with the same network configuration,
different IP addresses, no manual udev rules, no bind to MAC address, but
modprobe blacklisting tg3 and e1000.
This works on two of the three servers, but not on the third: 2 e1000 ports (one
card) are listed on the PCI bus, but udev does not recognize them. And the
really strange thing is that one MAC from the other 4 e1000 cards is used twice
by udev (for eth3 and eth5).
Now I get the feeling that this is not an initscripts bug but a udev bug.

Comment 34 Karsten Hahn 2006-07-18 11:03:13 UTC
Created attachment 132593 [details]
config files for the last comment

sorry, forgot the attachment with the config

Comment 35 Karsten Hahn 2006-07-18 13:02:21 UTC
oh and I also forgot to mention that (as you mentioned before) the network
config seems to work with a correct HWADDR= line in each ifcfg file. 
And it seems also that the double MAC address is caused by bonding, but should
the bonding really change the MAC address as seen by udev?

Comment 36 Bill Nottingham 2006-07-18 18:51:07 UTC
The bonding should change the MAC address, but it should not generate new udev
events. Hm, I'll have to test the rename_device code w.r.t. bonding.

Comment 37 Bug Zapper 2008-04-04 02:29:25 UTC
Fedora apologizes that these issues have not been resolved yet. We're
sorry it's taken so long for your bug to be properly triaged and acted
on. We appreciate the time you took to report this issue and want to
make sure no important bugs slip through the cracks.

If you're currently running a version of Fedora Core between 1 and 6,
please note that Fedora no longer maintains these releases. We strongly
encourage you to upgrade to a current Fedora release. In order to
refocus our efforts as a project we are flagging all of the open bugs
for releases which are no longer maintained and closing them.
http://fedoraproject.org/wiki/LifeCycle/EOL

If this bug is still open against Fedora Core 1 through 6, thirty days
from now, it will be closed 'WONTFIX'. If you can reproduce this bug in
the latest Fedora version, please change to the respective version. If
you are unable to do this, please add a comment to this bug requesting
the change.

Thanks for your help, and we apologize again that we haven't handled
these issues to this point.

The process we are following is outlined here:
http://fedoraproject.org/wiki/BugZappers/F9CleanUp

We will be following the process here:
http://fedoraproject.org/wiki/BugZappers/HouseKeeping to ensure this
doesn't happen again.

And if you'd like to join the bug triage team to help make things
better, check out http://fedoraproject.org/wiki/BugZappers

Comment 38 Bug Zapper 2008-05-06 15:43:51 UTC
This bug is open for a Fedora version that is no longer maintained and
will not be fixed by Fedora. Therefore we are closing this bug.

If you can reproduce this bug against a currently maintained version of
Fedora please feel free to reopen this bug against that version.

Thank you for reporting this bug and we are sorry it could not be fixed.

