Bug 187550
Summary: | Network interfaces assignment is unpredictable at each boot
---|---
Product: | [Fedora] Fedora
Component: | initscripts
Version: | 5
Hardware: | i686
OS: | Linux
Status: | CLOSED WONTFIX
Severity: | medium
Priority: | medium
Reporter: | Olivier Benghozi <olivier.benghozi+redhatbugzilla>
Assignee: | Bill Nottingham <notting>
QA Contact: | Brock Organ <borgan>
CC: | hattenator+bugzilla, karsten.hahn, koala, olivier.baudron, rvokal, sbeh2006, tomek, triage
Whiteboard: | bzcl34nup
Doc Type: | Bug Fix
Last Closed: | 2008-05-06 15:43:53 UTC

Description
Olivier Benghozi
2006-03-31 20:53:16 UTC
I have the same (similar) problems, too. Two configurations:

(a) i386, as kudzu (hwconf) reported:
    eth0 e1000 (private)
    eth1 8139too (public)

(b) x86_64, as kudzu (hwconf) reported:
    eth0 forcedeth (unused)
    eth1 e1000 (public)

I am used to naming the public interface eth0, so I swapped both with the alias directive in modprobe.conf:

(a)
    alias eth0 8139too
    alias eth1 e1000

(b)
    alias eth0 e1000
    alias eth1 forcedeth

It worked this way from RH9/FC1 to FC4. Unfortunately, it seems that FC5 uses the order in hwconf, not modprobe.conf. If I stop the network service, rmmod these NIC modules, and run "ifup eth0 ; ifup eth1", the correct modules are loaded. Now I have to edit all related network scripts to "fix" this. Any suggestions?

I have the same problem. In my system I have 4 network interfaces: one e1000 (embedded on the motherboard) and 3 PCI cards using 8139too. It seems that at boot time the additional cards are swapped.

It looks like I also have this problem, with an additional twist. I have 2 onboard NICs (r8169 and Rhine II); modprobe lists eth0 as r8169 and eth1 as via-rhine. It looks like eth0 is always brought up as expected, but the Rhine gets a device name of the form devX, where X is a number (5 digits, if I remember correctly). I worked around this by executing

    # ip link show dev eth0
    # ip link show dev eth1

early in rc.sysinit (when /etc/sysconfig/network is included), and it worked, but with lots of ugly SELinux complaints.

Why don't you bind the interface name to the MAC address? Either with system-config-network or with HWADDR in ifcfg-

Because according to /usr/share/doc/initscripts-8.31.1/sysconfig.txt it does not work as intended with MACADDR.

Well, I already found a workaround: in my case, the only proper workaround is putting both NIC drivers in /etc/modprobe.d/blacklist. It prevents udev from loading these drivers and uses the classical system instead; it suddenly makes my system work, so maybe something has been forgotten in the udev system?
Of course it's not an acceptable solution, only a temporary workaround. Udev is not expected to cause such a problem; its adoption was expected to prevent it. Interfaces that don't physically move or change should not have their name/IP/order moved or changed at each boot: naming of static interfaces should automatically be static across reboots. Of course the udev system should work properly without such a workaround.

HWADDR= ethernet hardware address for this device

not MACADDR!!!

I understand this and I NEED to change the MAC address as described here:

    MACADDR=
    Set the hardware address for this device to this.
    Use of this in conjunction with HWADDR= may cause
    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
    unintended behavior.
    ^^^^^^^^^^^^^^^^^^^^

And this is why I don't use HWADDR.

ah ok... sorry...

*** Bug 188454 has been marked as a duplicate of this bug. ***

This is possibly a duplicate of Bug #188955. Please test the initscripts in FC5 updates-testing to see if it solves this issue for you. If that package solves your problem, then it is an issue where udev is loading the drivers in a random order even before it has a chance to rename the devices. HWADDR does not help in this case at all.

I installed this initscripts-8.31.2-1.i386.rpm. I removed both lines in /etc/modprobe.d/blocked. I didn't modify /etc/modprobe.conf. I rebooted several times. Most of the time e100 is loaded first; sometimes 3c59x is loaded first. So it didn't correct the problem for me; in fact, it looks like it didn't change the system behavior at all.

symbol: do you have ifcfg files for both interfaces with HWADDR in them?

I think this update might have solved the problem, but I can't guarantee this. Booted 3 times and it's OK.

The use of initscripts-8.31.2-1.i386.rpm does NOT solve the problem. Please tell me what information I need to submit.

Do you have ifcfg-XXX files for all your interfaces, with appropriate HWADDR set?

We have this problem here too.
It seems that this bug occurs even with the drivers blacklisted for D-LINK cards (eight ports, sundance driver) with two tg3 onboard cards. Intel cards (four e1000) instead of the sundance cards seem to work in the same server (HP ProLiant) when blacklisted. Please advise if you need further information.

1) make sure you're using the latest updates-testing initscripts
2) make sure you have HWADDR= in all your ifcfg files for all the interfaces

Okay, I tried again:
1. New initscripts alone did not work
2. New initscripts + bind to MAC address (HWADDR=) did not work
3. initscripts/HWADDR= and blacklisting in /etc/modprobe.d/blacklist did not work
4. initscripts/HWADDR=/blacklisting and adding udev rules to /etc/udev/rules.d/ did work (one rule like the following for each eth device: KERNEL="eth*", ID="0000:03:01.0", NAME="eth0")
5. After reverting some of the stuff above: adding udev rules and blacklisting in /etc/modprobe.d/blacklist (but without new initscripts and HWADDR=) also works

What do your config files look like? Direct udev renaming isn't really reliable, as it will fail if there is already a device at the name you're trying to rename to.

Created attachment 132055 [details]
This is the config, without the HWADDR= lines
Thanks a lot for your time. Please advise if you need anything else.
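For comparing what each boot actually produced, a small shell sketch like the following can print one line per physical interface with its name, PCI bus ID (the kind of value used in the ID= udev rules quoted above), bound driver, and MAC address. It assumes the usual sysfs layout, where /sys/class/net/<name>/device is a symlink to the underlying device; the function name nic_inventory and its optional directory argument are made up for illustration and are not part of initscripts or udev.

```shell
# Print "<name> <bus-id> <driver> <mac>" for each interface that has a
# backing hardware device. The optional argument (a sysfs-style directory)
# is a hypothetical convenience for testing; it defaults to /sys/class/net.
nic_inventory() {
    net_root="${1:-/sys/class/net}"
    for dev in "$net_root"/*; do
        # Virtual interfaces (lo, bond0, ...) have no "device" link; skip them.
        [ -e "$dev/device" ] || continue
        name=$(basename "$dev")
        # The last path component of the link target is the bus ID.
        bus_id=$(basename "$(readlink "$dev/device")")
        # The driver link points at the bound kernel driver's directory.
        driver=$(basename "$(readlink "$dev/device/driver")")
        mac=$(cat "$dev/address")
        printf '%s %s %s %s\n' "$name" "$bus_id" "$driver" "$mac"
    done
}
```

Saving the output of nic_inventory after each reboot and comparing the files shows whether a given bus ID kept the same ethX name.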
While you still may be running into bugs, there's no way it can work without the HWADDR lines. I wonder if the bonding usage is affecting this.

I'll try on Monday without the bonding interfaces and with the HWADDR. But I still want to avoid binding the interfaces to a MAC address, because this drastically complicates our procedures for replacing a defective NIC.

It's impossible to fix without the HWADDR, as udev can (and will) load the modules in arbitrary order.

I confirm that it works if I write HWADDR for each adapter, but I think the bug should remain open because this is just a workaround.

There is no other way to enforce ordering; udev loads modules as it finds them on the PCI bus.

In this case I think the naming algorithm is not quite good - it should generate the same name for the same adapter in all cases. Anyway - the workaround is good, but how about writing HWADDR directly on first assignment? "No user intervention" is a good thing.

So, what you are saying is, if I don't change my hardware, the ordering of devices should always stay the same, even without the HWADDR, the modprobe blacklist, and the rules in udev/rules.d/. And even if I replace a NIC with absolutely the same type in the same PCI slot, the ordering should not change. Right? Just want to make sure that I understand the bug correctly and not waste any time testing the wrong stuff.

(In reply to comment #28)
> So, what you are saying is, if I don't change my hardware, the ordering of
> devices should always stay the same, even without the HWADDR, the modprobe
> blacklist, and the rules in udev/rules.d/. And even if I replace a NIC with
> absolutely the same type in the same PCI slot, the ordering should not change.
> Right?

It shouldn't, as long as udev's walk of sysfs for the device tree uses the same algorithm. It may not match the order that the installer set them up in, or the order that they were loaded in previous releases, though.
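The suggestion above of recording HWADDR automatically can be approximated by hand. The sketch below prints a HWADDR= line for each interface from its sysfs address file, ready to be copied into the matching ifcfg file. list_hwaddr and its directory argument are hypothetical names, not an existing tool, and the output should only be trusted before bonding is brought up, since enslaved interfaces may report the bond's MAC address instead of their own.

```shell
# Print "ifcfg-<name>: HWADDR=<MAC>" for every interface found under a
# sysfs-style directory. The optional argument is an assumption made for
# testing; on a real system it defaults to /sys/class/net.
list_hwaddr() {
    net_root="${1:-/sys/class/net}"
    for dev in "$net_root"/*; do
        # Skip entries without a readable hardware address.
        [ -r "$dev/address" ] || continue
        name=$(basename "$dev")
        # Loopback has no meaningful hardware address.
        [ "$name" = "lo" ] && continue
        # Upper-case the hex digits for readability.
        mac=$(tr 'a-f' 'A-F' < "$dev/address")
        printf 'ifcfg-%s: HWADDR=%s\n' "$name" "$mac"
    done
}
```

Each printed value goes into the ifcfg file named on the left; as noted in the thread, combining HWADDR= with MACADDR= in the same file may cause unintended behavior.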
The problem is that it changes, sometimes at each boot, on systems where the physical ordering didn't change in any way. Forcing ordering should be automatic, either in udev or in the Red Hat system. Udev is expected to follow a deterministic order when loading drivers, based on appropriate information; it does not, so there's a bug. However, the system can automatically save ethX and MAC for automatic ordering. Some of these features seem to be already implemented in kudzu, but a solid solution is yet to be defined. Anyway, if we look at Fedora as a giant beta test or prerelease for Red Hat's corporate OSes, we immediately see that the current behavior is unacceptable: erratic by default, it becomes rigid if one is obliged to configure HWADDR manually. So the need for either a patch to udev or a semi-scripted solution does exist.

I think comment #30 sums up my opinion.

OK, just for tracking purposes, I've created a bug for making sure devices have HWADDR automatically written (bug 197984). If you'd like to open a bug for udev changing the order it loads devices in, please do. This bug will then be for tracking problems when all the devices have HWADDR listed and still do not come up right.

I have some new information: We exchanged the quad-port D-LINK cards (sundance driver) on three servers with dual-port HP cards (e1000 driver). Now we have three servers with identical hardware configuration (2 onboard tg3 NICs and 6 e1000). We configured all three servers manually with the same network configuration, different IP addresses, no manual udev rules, no binding to MAC address, but modprobe blacklisting tg3 and e1000. This works on two of the three servers, but not on the third: 2 e1000 ports (one card) are listed on the PCI bus, but udev does not recognize them. And the really strange thing is that one MAC from the other 4 e1000 cards is used twice by udev (for eth3 and eth5). Now I get the feeling that this is not an initscripts bug but a udev bug.

Created attachment 132593 [details]
config files for the last comment
sorry, forgot the attachment with the config
Oh, and I also forgot to mention that (as you mentioned before) the network config seems to work with a correct HWADDR= line in each ifcfg file. And it also seems that the double MAC address is caused by bonding, but should the bonding really change the MAC address as seen by udev?

The bonding should change the MAC address, but it should not generate new udev events. Hm, I'll have to test the rename_device code w.r.t. bonding.

Fedora apologizes that these issues have not been resolved yet. We're sorry it's taken so long for your bug to be properly triaged and acted on. We appreciate the time you took to report this issue and want to make sure no important bugs slip through the cracks.

If you're currently running a version of Fedora Core between 1 and 6, please note that Fedora no longer maintains these releases. We strongly encourage you to upgrade to a current Fedora release. In order to refocus our efforts as a project we are flagging all of the open bugs for releases which are no longer maintained and closing them. http://fedoraproject.org/wiki/LifeCycle/EOL

If this bug is still open against Fedora Core 1 through 6 thirty days from now, it will be closed 'WONTFIX'. If you can reproduce this bug in the latest Fedora version, please change to the respective version. If you are unable to do this, please add a comment to this bug requesting the change.

Thanks for your help, and we apologize again that we haven't handled these issues to this point.

The process we are following is outlined here: http://fedoraproject.org/wiki/BugZappers/F9CleanUp

We will be following the process here: http://fedoraproject.org/wiki/BugZappers/HouseKeeping to ensure this doesn't happen again. And if you'd like to join the bug triage team to help make things better, check out http://fedoraproject.org/wiki/BugZappers

This bug is open for a Fedora version that is no longer maintained and will not be fixed by Fedora. Therefore we are closing this bug.
If you can reproduce this bug against a currently maintained version of Fedora, please feel free to reopen this bug against that version. Thank you for reporting this bug, and we are sorry it could not be fixed.