Bug 208740 - network interface improperly renamed.
Summary: network interface improperly renamed.
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Fedora
Classification: Fedora
Component: initscripts
Version: 6
Hardware: All
OS: Linux
Priority: medium
Severity: medium
Target Milestone: ---
Assignee: Bill Nottingham
QA Contact: Brock Organ
URL:
Whiteboard:
Depends On:
Blocks: FC6Blocker
Reported: 2006-10-01 11:21 UTC by David Woodhouse
Modified: 2014-03-17 03:02 UTC

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2006-12-06 18:13:09 UTC
Type: ---
Embargoed:


Attachments
1. patch to increase verbosity (1.56 KB, patch)
2006-11-23 16:20 UTC, Wilfried Weissmann
2. patch, handle renaming of temporary devices (3.63 KB, patch)
2006-11-23 16:42 UTC, Wilfried Weissmann
a slightly different patch for this (1.99 KB, patch)
2006-11-28 23:14 UTC, Bill Nottingham

Description David Woodhouse 2006-10-01 11:21:35 UTC
If network driver modules get reloaded in the wrong order so that the device
which should be 'eth0' is actually 'eth1', we rename the devices.

Unfortunately, this can result in the device which _was_ 'eth0' ending up with a
name like '__tmp1804289383'. After we're done renaming, we shouldn't leave it
with a device name like that -- we should put it back to eth%d.

NetworkManager and dhclient both fail on devices like '__tmp1804289383':

Oct  1 11:56:41 pegasos dhclient: Bind socket to interface: No such device

Renaming it with one fewer digit (i.e. '__tmp180428938') makes it work with
dhclient, although hal and NetworkManager seem to get utterly confused when we
rename devices -- I tried renaming it back to 'eth3' and NetworkManager was
still in a loop eating 100% CPU trying to do SIOCGIFFLAGS on a device starting
'__tmp180428938...' and ending in complete garbage.
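
A note on the name length: '__tmp1804289383' is 15 characters long, which is the longest name that fits in a Linux interface-name buffer (IF_NAMESIZE, also spelled IFNAMSIZ, is 16 including the terminating NUL). That matches the observation that dropping one digit helps, although it does not show where dhclient or NetworkManager actually go wrong. A minimal sketch of the arithmetic follows; the helper names ifname_fits and ifname_headroom are hypothetical and are not taken from any of the programs discussed here.

```c
#include <assert.h>
#include <net/if.h>   /* IF_NAMESIZE (16 on Linux; IFNAMSIZ is the same value) */
#include <string.h>

/* Returns 1 if name fits in an interface-name buffer, i.e. it has at
 * most IF_NAMESIZE - 1 characters plus the terminating NUL. */
int ifname_fits(const char *name)
{
    assert(name != NULL);
    return strlen(name) < IF_NAMESIZE;
}

/* Returns how many characters shorter than the maximum the name is.
 * 0 means the name sits exactly at the limit; negative means too long. */
int ifname_headroom(const char *name)
{
    assert(name != NULL);
    return (int)(IF_NAMESIZE - 1) - (int)strlen(name);
}
```

With these helpers, '__tmp1804289383' fits but leaves no headroom, while '__tmp180428938' leaves one character. Code that copies the name into a buffer sized without room for the NUL would only misbehave on the first of the two.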

Comment 1 Bill Nottingham 2006-10-02 13:52:22 UTC
kudzu (should) clean this up for any unconfigured interfaces on boot. Do you
have it installed and enabled?

Comment 2 David Woodhouse 2006-10-02 14:04:40 UTC
No, kudzu was disabled. Will undo the configuration change and try again with
kudzu enabled.


Comment 3 David Woodhouse 2006-10-02 14:15:47 UTC
Yes, that makes it work.

Comment 4 David Woodhouse 2006-10-07 09:09:40 UTC
... sometimes. I just saw it again even with kudzu running. I've put back my
ifcfg-eth1 for now.

Comment 5 Bill Nottingham 2006-10-09 17:22:23 UTC
So, kudzu only does this the first time it sees a device, as it changes the name
and then writes a config file. After that, it assumes the config file is still
there (or changed by s-c-network if the user wanted it changed.)

Comment 6 Wilfried Weissmann 2006-11-23 16:20:50 UTC
Created attachment 141998
1. patch to increase verbosity

Comment 7 Wilfried Weissmann 2006-11-23 16:42:26 UTC
Created attachment 142000
2. patch, handle renaming of temporary devices

(sorry, patch "141998: 1. patch to increase verbosity" belongs to this one. i
just figured out that i failed to transmit both patches together in one note.
apply both patches on top of each other!)

this patch renames any __tmpXXX devices to the devicenames that are specified
in the configuration. i cannot rename these temporary names back to their
original names if they were never configured, because their names are already
occupied by configured devices. my solution for this was to generate a new
device name that can also be used later on. (but it is not recommended!) new
devices that were never configured but have to be renamed because of a
devicename clash are now called newnetXXXXXX. the 6 digits are chosen
semirandomly (the generator is seeded by calling srand()) to prevent clashes of
newnet names.

this change solves any __tmp1804289383 problems that are floating around that i
can think of (+ bugs #214817 #209009 #210780). btw: this bug appears in FC6
now. the patches apply against initscripts-8.45.5-1 version of this
distribution.

for production use you might want to reduce the verbosity of my patch.

regards,
wilfried

Comment 8 Bill Nottingham 2006-11-27 19:05:36 UTC
So, for patch #1, you're logging in a situation where there's no logging daemon.
Not sure how well that helps. 

As to replacing one particular temporary name with another, I don't think that's
particularly useful. You may want to look at the current updates/updates-testing
initscripts and kudzu - this should fix it so that config files are always
written for new devices, and any __tmp devices are renamed there. In the
meantime, any machine bitten by this might need to manually add a configuration.


Comment 9 Wilfried Weissmann 2006-11-28 10:16:47 UTC
I just installed initscripts-8.45.6-1 and kudzu-1.2.57.3-1 from updates-testing
but the problem remains.

About logging: With LOG_CONS I also log to the console. But that was just for
debugging and illustrating the problem.

About renaming: I only rename to "newnet" if no name for that device was found
in the configuration. With the patch applied after a Fedora install you will not
have a "newnet" device, because all network devices have configured names and
you get properly named devices. Without the patch you have a "__tmp" device
whenever network device names must be exchanged, no matter whether the names
were configured or not.
With the patch you only end up with a "newnet" device if you add new hardware
and the new device names are in conflict with the already configured devices. A
better name would be nice, but I was not sure whether stripping the digits off
the end of the original device name and then, at the end of rename_device,
appending the next free number to that name would do the job. (Thinking about
vlan, tuntap, and inconsistent naming conventions in wlan drivers, ...) It is
far from perfect but it is some progress (and it fixed my problem, so I posted
it to bugzilla).

Regards,
Wilfried

Comment 10 Bill Nottingham 2006-11-28 18:22:41 UTC
Which specific problem remains?

If it's just that you have __tmpXXX instead of newnet*, I'm not sure that's a
useful change - it's just exchanging one temporary name for another.

Comment 11 David Woodhouse 2006-11-28 18:48:38 UTC
(In reply to comment #10)
> Which specific problem remains?
> 
> If it's just that you have __tmpXXX instead of newnet*, I'm not sure that's a
> useful change - it's just exchanging one temporary name for another.

As long as the new name is one character shorter, it means that dhclient and
NetworkManager should no longer crap themselves :)

Comment 12 Bill Nottingham 2006-11-28 18:57:14 UTC
Well, that's silly. They should be fixed. :)

Comment 13 Wilfried Weissmann 2006-11-28 22:03:03 UTC
To clarify my statement: initscripts-8.45.6-1 and kudzu-1.2.57.3-1 from
updates-testing do not fix this bug. I still get __tmpXXX devices. I have 4
ethernet devices:

1x 3Com (3c59x)
2x Realtek (8139too)
1x VIA Rhine (via-rhine)

They are detected by anaconda in this order:

3com => eth0
1. Realtek => eth1
2. Realtek => eth2
VIA Rhine => eth3

The ifcfg-* files are written accordingly. No old bak files or rpmsave or
whatever got in the way (I think there was some other bug about that, but that
one did not bite me). This is a fresh install.

One of these devices always gets a __tmpXXX name, with the original FC6 install
or with the kudzu and initscripts rpms from updates-testing. The reason is that
the order in which the modules are loaded is different when the installed
system boots, and the renaming that has to be done to get consistent device
names forms a "loop". Something like:

before udev magic => after udev magic:
--------------------------------------
eth0 => eth1
eth1 => eth2
eth2 => eth3
eth3 => eth0

rename_device fails in this case and leaves a __tmpXXX device.

My patch fixes this case. With my patch applied I get all ethX devices with
their proper names. That is: eth[0-3]. No newnet in this case.
I would get newnet devices (and only then!) when I add a new ethernet device and
the devicename of the new device is inside the renaming "loop". In this case
this would mean that if the new device is eth4 after the module load then it is
not part of the loop and we have no newnet device.
Otherwise we do. I agree with you that this is not so great. I could include
in my patch something like this instead of the "newnet" part if you want:

rename_new_unconfigured_dev(newdevice) {
  /* e.g. "eth3" -> "eth" */
  fixedname=truncate_trailing_digits(newdevice);
  /* append the first number that is not already taken */
  for(i=0;1;i++) {
    if(device_exists(fixedname+itoa(i)) == false) {
      fixedname=fixedname+itoa(i);
      return fixedname;
    }
  }
}

This way we would always have proper names. I cannot think of a case where this
would break. Maybe there is a weird case with vlans, bonding or whatever. I have
not done complex network configuration on Redhat servers. You people know better.
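
For reference, the pseudocode in this comment could be written as compilable C roughly as follows. This is only a sketch under stated assumptions: pick_free_name, demo_exists and the fixed table of taken names are hypothetical stand-ins for illustration, and none of this comes from the attached patches.

```c
#include <assert.h>
#include <ctype.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical stand-in for a real "does this interface exist?" lookup:
 * pretends that eth0..eth3 are already taken. */
static const char *demo_taken[] = { "eth0", "eth1", "eth2", "eth3" };

int demo_exists(const char *name)
{
    for (size_t i = 0; i < sizeof demo_taken / sizeof demo_taken[0]; i++)
        if (strcmp(demo_taken[i], name) == 0)
            return 1;
    return 0;
}

/* Strips the trailing digits from newdevice and writes the first free
 * "<base><n>" name into out. Returns 0 on success, or -1 if the
 * candidate name does not fit in outlen bytes or no free name is found. */
int pick_free_name(const char *newdevice, int (*exists)(const char *),
                   char *out, size_t outlen)
{
    assert(newdevice != NULL && exists != NULL && out != NULL);

    size_t base = strlen(newdevice);
    while (base > 0 && isdigit((unsigned char)newdevice[base - 1]))
        base--;                               /* "eth3" -> "eth" */

    for (int i = 0; i < 10000; i++) {         /* arbitrary upper bound */
        int n = snprintf(out, outlen, "%.*s%d", (int)base, newdevice, i);
        if (n < 0 || (size_t)n >= outlen)
            return -1;                        /* would be truncated */
        if (!exists(out))
            return 0;                         /* first free number wins */
    }
    return -1;
}
```

With the stand-in table, pick_free_name("eth2", demo_exists, buf, sizeof buf) fills buf with "eth4". On a live system, a predicate built on if_nametoindex() from <net/if.h>, which returns 0 when no interface has the given name, could replace the stand-in.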

Comment 14 Bill Nottingham 2006-11-28 23:14:34 UTC
Created attachment 142340
a slightly different patch for this

So, in parallel I was working on an I-thought-unrelated issue that turned out to
be exactly this (doh). Here's what I'm using now, which is a
similar-idea-but-different patch approach. I've integrated the srand() - that
was obviously missing in general.
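
On the srand() point: as far as I know, 1804289383 is the first value glibc's rand() returns when the generator has never been seeded, which would explain why the same '__tmp1804289383' name shows up in every report. A minimal sketch of seeding before building a temporary name is below. The function names seed_tmp_names and make_tmp_name are hypothetical, and this is not the code from the attached patch.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/* Seed the generator once, before the first temporary name is made.
 * The caller chooses the seed. */
void seed_tmp_names(unsigned int seed)
{
    srand(seed);
}

/* Writes "__tmp<number>" into out, using the next rand() value.
 * Returns 0 on success, or -1 if the name does not fit in outlen bytes. */
int make_tmp_name(char *out, size_t outlen)
{
    assert(out != NULL);
    int n = snprintf(out, outlen, "__tmp%d", rand());
    return (n < 0 || (size_t)n >= outlen) ? -1 : 0;
}
```

A caller might seed with something that varies between runs, for example a value derived from the current time and the process ID. Without any seeding, every run starts from the same sequence and so produces the same temporary names.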

Some comments, which explain the logic in this patch:

1) we should already know which temporary devices we want to rename, and what
to rename them to, because those are the ones we had to skip renaming since
they conflicted with a device already in the chain - this eliminates the need
to re-look up hwaddrs in the rename-of-temporary-devices loop
2) the reason I keep using the __tmpXXXXX name is because that is used by kudzu
to find newly added temporary devices (which it then renames to something sane
when it writes configs for them)

Does this patch work for you?
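
To illustrate why a second pass over remembered temporary names resolves the rename "loop" from comment 13, here is a small in-memory simulation. It is a simplified two-pass scheme written for this note, not the logic of the attached patch: struct dev, rename_all and the index-based '__tmp<i>' names are all assumptions made for illustration, and no real interfaces are touched.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define NAME_LEN 16

struct dev {
    char name[NAME_LEN];   /* current name */
    const char *wanted;    /* configured name, or NULL if unconfigured */
    int pending;           /* set in pass 1 when a second rename is owed */
};

/* Does any other device want the name that devs[self] currently holds? */
static int name_wanted_by_other(const struct dev *devs, int n, int self)
{
    for (int i = 0; i < n; i++)
        if (i != self && devs[i].wanted &&
            strcmp(devs[i].wanted, devs[self].name) == 0)
            return 1;
    return 0;
}

void rename_all(struct dev *devs, int n)
{
    assert(devs != NULL);

    /* Pass 1: park every misnamed device, and every unconfigured device
     * that blocks a configured name, on a temporary name. Remember which
     * parked devices have a configured target. */
    for (int i = 0; i < n; i++) {
        int misnamed = devs[i].wanted &&
                       strcmp(devs[i].name, devs[i].wanted) != 0;
        int in_the_way = !devs[i].wanted && name_wanted_by_other(devs, n, i);
        if (misnamed || in_the_way) {
            snprintf(devs[i].name, NAME_LEN, "__tmp%d", i);
            devs[i].pending = misnamed;
        }
    }

    /* Pass 2: every target is free now, so finish the remembered renames.
     * Unconfigured devices keep their temporary name. */
    for (int i = 0; i < n; i++) {
        if (devs[i].pending) {
            snprintf(devs[i].name, NAME_LEN, "%s", devs[i].wanted);
            devs[i].pending = 0;
        }
    }
}
```

Fed the four-device loop from comment 13 (eth0 wants eth1, eth1 wants eth2, eth2 wants eth3, eth3 wants eth0), every device ends up with its configured name. An unconfigured device that was in the way stays on its temporary name, which fits point 2 above, since kudzu looks for those names.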

Comment 15 Bill Nottingham 2006-11-29 00:23:25 UTC
Added in 8.48-1, 8.45.7-1.

Comment 16 Wilfried Weissmann 2006-11-29 09:21:28 UTC
I applied your patch against 8.45.6-1. I could not find 8.45.7-1 in the
repository. All network devices have correct names now. I simulated new devices
by removing entries in kudzu's hwconf and the corresponding files in
network-scripts. This works too. The new devices are configured with dhcp.
This one works! :)

Comment 17 Fedora Update System 2006-11-29 12:26:23 UTC
initscripts-8.45.7-1 has been pushed for fc6, which should resolve this issue.  If these problems are still present in this version, then please make note of it in this bug report.

Comment 18 Fedora Update System 2006-12-06 17:40:22 UTC
initscripts-8.45.7-1 has been pushed for fc6, which should resolve this issue.  If these problems are still present in this version, then please make note of it in this bug report.

