Bug 437692

Summary: Network Interfaces Mysteriously Wiped Out After Yum Update
Product: [Fedora] Fedora
Reporter: Chris Spencer <chrisspen>
Component: initscripts
Assignee: Bill Nottingham <notting>
Status: CLOSED CURRENTRELEASE
QA Contact: Fedora Extras Quality Assurance <extras-qa>
Severity: urgent
Priority: low
Version: 8
CC: dcbw, maurizio.antillon, rvokal, trevor
Target Milestone: ---   
Target Release: ---   
Hardware: All   
OS: Linux   
Doc Type: Bug Fix
Last Closed: 2008-11-26 19:09:45 UTC
Attachments:
  Description                                                  Flags
  dmesg from machine 1 right after 1st boot into F8            none
  ifcfgs & udev from machine 1 right after 1st boot into F8    none
  dmesg from machine 2 right after 1st boot into F8            none
  ifcfgs & udev from machine 2 right after 1st boot into F8    none

Description Chris Spencer 2008-03-16 15:05:48 UTC
Description of problem:
After updating my system via yum then rebooting, I found all of my network
interfaces had been renamed to *.bak, and none would start during boot.

How reproducible:
So far, it has happened only once.

Additional info:
ifup and ifdown now complain of errors when starting/stopping an interface,
presumably due to the *.bak renaming, e.g.

[localhost]# ifup eth0.bak

Determining IP information for eth0.../sbin/dhclient-script: configuration for
eth0 not found. Continuing with defaults.
/etc/sysconfig/network-scripts/network-functions: line 78: eth0: No such file or
directory
External network device eth0 is not ready. Aborting..
/sbin/dhclient-script: configuration for eth0 not found. Continuing with defaults.
/etc/sysconfig/network-scripts/network-functions: line 78: eth0: No such file or
directory
Firewall started
 done.
[localhost]# ifdown eth0.bak
/sbin/dhclient-script: configuration for eth0 not found. Continuing with defaults.
/etc/sysconfig/network-scripts/network-functions: line 78: eth0: No such file or
directory
Firewall started

Comment 1 Chris Spencer 2008-03-16 15:10:00 UTC
This is the list of everything that Yum updated, just prior to the problem:
Mar 15 23:29:50 Updated: coreutils - 6.9-16.fc8.i386
Mar 15 23:29:51 Updated: e2fsprogs-libs - 1.40.4-2.fc8.i386
Mar 15 23:29:53 Updated: krb5-libs - 1.6.2-13.fc8.i386
Mar 15 23:29:53 Updated: hal-libs - 0.5.10-1.fc8.2.i386
Mar 15 23:29:54 Updated: audit-libs - 1.6.8-2.fc8.i386
Mar 15 23:29:55 Updated: net-snmp-libs - 1:5.4.1-6.fc8.i386
Mar 15 23:29:55 Updated: nautilus-extensions - 2.20.0-9.fc8.i386
Mar 15 23:30:20 Installed: kernel - 2.6.24.3-12.fc8.i686
Mar 15 23:30:20 Updated: eject - 2.1.5-6.fc8.i386
Mar 15 23:30:21 Updated: taglib - 1.5-1.fc8.i386
Mar 15 23:30:22 Updated: fuse-libs - 2.7.3-2.fc8.i386
Mar 15 23:30:24 Updated: imlib2 - 1.4.0-6.fc8.i386
Mar 15 23:30:25 Updated: boost - 1.34.1-7.fc8.i386
Mar 15 23:30:26 Updated: libtirpc - 0.1.7-15.fc8.i386
Mar 15 23:30:51 Updated: nautilus - 2.20.0-9.fc8.i386
Mar 15 23:30:53 Updated: fuse - 2.7.3-2.fc8.i386
Mar 15 23:30:54 Updated: audit - 1.6.8-2.fc8.i386
Mar 15 23:30:55 Updated: audit-libs-python - 1.6.8-2.fc8.i386
Mar 15 23:30:57 Updated: hal - 0.5.10-1.fc8.2.i386
Mar 15 23:30:58 Updated: krb5-workstation - 1.6.2-13.fc8.i386
Mar 15 23:31:02 Updated: e2fsprogs - 1.40.4-2.fc8.i386
Mar 15 23:31:03 Updated: rsyslog - 2.0.2-3.fc8.i386
Mar 15 23:31:06 Updated: mercurial - 0.9.5-6.fc8.i386
Mar 15 23:31:07 Updated: synaptics - 0.14.6-2.fc8.i386
Mar 15 23:31:08 Updated: cpio - 2.9-7.fc8.i386
Mar 15 23:31:10 Updated: cmake - 2.4.8-1.fc8.i386
Mar 15 23:31:11 Updated: gtk-nodoka-engine - 0.6.2-1.fc8.i386
Mar 15 23:31:12 Updated: xterm - 234-1.fc8.i386
Mar 15 23:31:13 Installed: python-genshi - 0.4.4-2.fc8.noarch
Mar 15 23:31:15 Installed: python-paste - 1.4.2-1.fc8.noarch
Mar 15 23:31:17 Updated: smolt - 1.1.1.1-1.fc8.noarch
Mar 15 23:31:17 Updated: smolt-firstboot - 1.1.1.1-1.fc8.noarch
Mar 15 23:31:20 Updated: kernel-headers - 2.6.24.3-12.fc8.i386
Mar 15 23:31:25 Updated: tzdata - 2007k-2.fc8.noarch
Mar 15 23:31:27 Updated: tzdata-java - 2007k-2.fc8.noarch
Mar 15 23:31:29 Updated: setroubleshoot-plugins - 2.0.4-4.fc8.noarch
Mar 15 23:31:47 Installed: kernel-devel - 2.6.24.3-12.fc8.i686
Mar 15 23:31:48 Updated: perl-libs - 4:5.8.8-36.fc8.i386
Mar 15 23:31:57 Updated: perl - 4:5.8.8-36.fc8.i386
Mar 15 23:32:41 Updated: evolution - 2.12.3-3.fc8.i386
Mar 15 23:32:58 Updated: kdelibs - 6:3.5.9-5.fc8.i386
Mar 15 23:32:59 Installed: perl-Date-Manip - 5.48-1.fc8.noarch
Mar 15 23:33:07 Updated: evolution-help - 2.12.3-3.fc8.i386
Mar 15 23:33:09 Updated: net-snmp - 1:5.4.1-6.fc8.i386
Mar 15 23:33:30 Erased: perl-DateManip

Comment 2 Dan Williams 2008-03-17 20:26:18 UTC
initscripts, not NM

Comment 3 Bill Nottingham 2008-03-17 20:31:42 UTC
The only reason they would be removed is if, at some boot, the modules would not
load properly, giving the indication that the devices were no longer present.

Does moving them back solve the problem? (You can't really run ifup with .bak
files, as the scripts are designed to explicitly ignore them in most cases.)

Comment 4 Chris Spencer 2008-03-17 23:49:09 UTC
How do I move them back? There's no option in system-config-network to change
their name, and searching for "eth0.bak" gives me three different files.

Comment 5 Bill Nottingham 2008-03-18 02:36:16 UTC
for foo in /etc/sysconfig/network-scripts/ifcfg-*.bak ; do mv "$foo" "${foo%.bak}" ; done

Comment 6 Chris Spencer 2008-03-18 11:37:52 UTC
Thanks Bill. I see similar files in /etc/sysconfig/networking/devices and
/etc/sysconfig/networking/profiles/default. Should I move these as well, or will
those be automatically updated?

Comment 7 Bill Nottingham 2008-03-18 15:25:02 UTC
Moving those won't hurt.
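
A more defensive variant of the loop from comment 5 might look like the sketch below. It is not part of initscripts: restore_bak and the DRY_RUN variable are hypothetical names chosen for illustration. It assumes the only files of interest are named ifcfg-*.bak, and it refuses to overwrite an ifcfg file that already exists under the original name.

```shell
# Sketch: restore ifcfg-*.bak files to their original names.
# restore_bak is a hypothetical helper, not part of initscripts.
# Usage: restore_bak DIR...   (set DRY_RUN=1 to only print the planned moves)
restore_bak() {
    local dir f target
    for dir in "$@"; do
        for f in "$dir"/ifcfg-*.bak; do
            [ -e "$f" ] || continue          # glob matched nothing in this dir
            target="${f%.bak}"               # strip the trailing .bak
            if [ -e "$target" ]; then
                # A config with the original name already exists; keep both.
                echo "skip: $target already exists" >&2
                continue
            fi
            if [ -n "${DRY_RUN:-}" ]; then
                echo "would move $f -> $target"
            else
                mv -- "$f" "$target"
            fi
        done
    done
}
```

To cover the three directories mentioned above, it could be called as: restore_bak /etc/sysconfig/network-scripts /etc/sysconfig/networking/devices /etc/sysconfig/networking/profiles/default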

Comment 8 Trevor Cordes 2008-03-24 23:18:15 UTC
This bug happens to me ALL the time.  I manage about 2 dozen router/firewalls
based on F8.  This bug hits me often upon updates and/or reboots!  These are
boxes with 2-4 NICs.  Whatever does this calculation is getting it WRONG on a
very regular basis.

These systems worked 100% fine with F5 but since the upgrade to F8 I have to
sweat each time I reboot remotely!  There is no specific or common hardware, a
whole swath of different NICs, etc.  The hardware is all 100% fine as I can go
onsite and ifup or rmmod/ifup the NICs and they work as expected, after I rename
the files back.

Besides just naming them to .bak and making the machine useless (remotely), the
scripts (or udev?) also sometimes swap the NICs around, like naming 0 to bak and
1 to 0.  It's driving me mental!  It's entirely non-deterministic!  Sometimes
it's fine.

This is a serious, serious, serious problem.  I mean to provide much more info
but it will have to wait until I have a mo.  I'll provide dmesg output which
clearly shows the whatever messing up on figuring out the NICs.

All my ifcfg scripts have the hwaddr parm set correctly.  modprobe.conf has the
eth aliases set properly.  I don't know what else to do.

PLEASE PLEASE make an option like "neverrename=1"!!  Even if my eth0 died I
would never ever want eth1 to be renamed to eth0!!  That makes no sense for a
firewall/router, and in fact is a huge security hole!

Argh!


Comment 9 Trevor Cordes 2008-03-24 23:22:24 UTC
Oh, I do all administration by hand, via the files, never with s-c-n or any gui
tools.  I try to disable as many of those automatic things as I can.

Comment 10 Bill Nottingham 2008-03-25 02:09:02 UTC
Trevor - the simple solution to your issue would be to disable the kudzu service.
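
For reference, disabling the service on a SysV-style Fedora system would typically be done with chkconfig. The wrapper below is only a sketch: disable_kudzu and DRY_RUN are hypothetical names, and it assumes the service is registered under the name kudzu.

```shell
# Sketch: keep kudzu from running at boot and show the resulting state.
# disable_kudzu is a hypothetical wrapper around the standard chkconfig tool.
# Set DRY_RUN=1 to print the commands instead of running them.
disable_kudzu() {
    local run=""
    if [ -n "${DRY_RUN:-}" ]; then
        run="echo"
    fi
    $run chkconfig kudzu off       # remove kudzu from all runlevels
    $run chkconfig --list kudzu    # confirm every runlevel now shows "off"
}
```

Running it with DRY_RUN=1 first shows what would be executed without touching the system.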

Comment 11 Trevor Cordes 2008-03-25 13:00:16 UTC
Kudzu is what renames these files?  Kudzu is indeed running, and I guess I don't
really need it.  I'll try that, thanks!

Anyhow, I upgraded 2 boxes last night from FC5 to F8 and this bug hit on both
boxes, but in slightly different ways.  I saved the dmesg output, should I
attach it?  Whatever is causing this gets it all horribly wrong.  It's worse
than doing nothing at all.


Comment 12 Bill Nottingham 2008-03-25 15:12:00 UTC
Would need at a minimum /etc/sysconfig/network-scripts/ifcfg-* and
/etc/udev/rules.d/70-persistent-net.rules.
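
One way to gather those files into a single attachment is sketched below. collect_netcfg and the ROOT override are hypothetical names used only for illustration; the sketch assumes GNU tar and that none of the file names contain whitespace. Because the glob is ifcfg-*, any renamed ifcfg-*.bak files are picked up as well.

```shell
# Sketch: bundle the ifcfg files and the persistent-net rules into a tarball.
# collect_netcfg and ROOT are hypothetical, for illustration only.
# Usage: collect_netcfg OUTPUT.tar.gz   (ROOT defaults to /)
collect_netcfg() {
    local out="$1" root="${ROOT:-/}" files
    # List paths relative to $root so the archive has no leading slash.
    files=$(cd "$root" && ls -d etc/sysconfig/network-scripts/ifcfg-* \
                etc/udev/rules.d/70-persistent-net.rules 2>/dev/null || true)
    if [ -z "$files" ]; then
        echo "nothing to collect under $root" >&2
        return 1
    fi
    # Word splitting on $files is intended: one path per line, no spaces expected.
    tar -czf "$out" -C "$root" $files
}
```

The ifcfg files can contain keys or passwords on some setups, so it is worth reviewing the archive before attaching it to a public bug.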

Comment 13 Trevor Cordes 2008-03-27 05:30:11 UTC
OK, here's all the details in the following attachments.  Everything from box #1
is suffixed with "1", and likewise for 2.


Comment 14 Trevor Cordes 2008-03-27 05:32:52 UTC
Created attachment 299275 [details]
dmesg from machine 1 right after 1st boot into F8

Comment 15 Trevor Cordes 2008-03-27 05:33:28 UTC
Created attachment 299276 [details]
ifcfgs & udev from machine 1 right after 1st boot into F8

Comment 16 Trevor Cordes 2008-03-27 05:35:07 UTC
Created attachment 299277 [details]
dmesg from machine 2 right after 1st boot into F8

Comment 17 Trevor Cordes 2008-03-27 05:35:36 UTC
Created attachment 299278 [details]
ifcfgs & udev from machine 2 right after 1st boot into F8

Comment 18 Trevor Cordes 2008-03-27 05:43:53 UTC
Notes:

Ignore the "fw99d" type dmesg output, it's just my wacky firewall debug output.

In all cases I was able to easily ifup the interfaces after renaming the files
back to what they were before the boot.

What I mean by "after 1st boot into F8" is I had just yum upgraded from F7 and
after the successful update I rebooted from F7 to F8.  The bug never hits in F7,
AFAIK, it's always in F8.  I'm not saying the bug only hits on the 1st boot into
F8, I've had this bug hit all over the place, though whether it's tied to
intra-F8 yum updates, I can't be sure.  Certainly yum updating is more
"dangerous" than just rebooting.  Perhaps kernel, udev, or kudzu updates cause
the code to run again and rethink the interfaces, making the bug more likely to
happen after an update.

Since setting all my 25 boxes to kudzu OFF, the problem has not recurred, but
it's only been 1-2 days, so we'll see what happens over the next while.

Thanks for all your help, sorry to be grouchy, I'm not angry at people, I'm
angry at having to drive 50 km at 3am to attach a head to a headless box and
bring up interfaces manually!

I think someone seriously needs to think about the security implications of
arbitrarily renaming ifcfg's.  Again, what if an internal (trusted) and external
(internet) interface are swapped?  Scary.


Comment 19 Bill Nottingham 2008-03-27 14:00:52 UTC
Oh, I have an idea.

Was this an update to base Fedora 8, or an update to Fedora 8 + updates? For a
while there was a Fedora 8 kernel update that would break this code, causing
it to think that network adapters had disappeared when they had not.

Comment 20 Trevor Cordes 2008-03-28 11:05:38 UTC
This was a "live" update via yum to F8+updates.  In other words, I was running
F7, I did the tricks with fedora-release and did "yum upgrade", which
immediately put in the newest everything, including the kernel.  So it was out of
the gate running 2.6.24.3-34.fc8 when it next rebooted.

I've seen this bug since day one upgrading boxes to F8, ie: across the gamut of
early to late kernel versions.  And I swear I remember seeing this bug hit often
after normal F8 version X to F8 version Y updates.  I think, but am not so sure,
that even just plain reboots (no updates) triggers it sometimes.

I'm loath to try to reproduce this (reenable kudzu) but certainly if it occurs
again (regardless of kudzu) I will post here.  If there are some direct ideas to
try I could possibly try them on a less-critical system with someone onsite.


Comment 21 Trevor Cordes 2008-03-28 11:06:31 UTC
Oh, I've also done a couple of boxes fresh from F8 DVD media (no upgrades, just
blank hard disk installs) and I've seen this behaviour there also.  It's not
just yum upgrading.


Comment 22 Matteo Corti 2008-03-28 13:12:42 UTC
I've experienced the same problem on 2 out of the 3 machines I have with Fedora 8. The
correct configuration was moved to the .bak file and the interfaces were set up
with DHCP.

kudzu was running on all of them (I have since disabled it).

Comment 23 Bill Nottingham 2008-03-28 15:53:01 UTC
From a kernel perspective, you really want either kernel-2.6.24.3-47.fc8 or
later, or kernel-2.6.23.15 or earlier.
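
The version window described above can be checked mechanically. The following is a rough sketch, not an official tool: kernel_in_bad_window and ver_le are hypothetical helper names, and it assumes a `sort` that supports -V (version sort), which the coreutils 6.9 listed in comment 1 probably predates. The upstream part of the version (before the first "-") is compared against 2.6.23.15 so that any release of that kernel counts as "earlier".

```shell
# Sketch: classify a Fedora 8 kernel version against the ranges given above.
# kernel_in_bad_window and ver_le are hypothetical helpers.
# Assumes `sort -V` (version sort) is available.
ver_le() {   # succeeds if $1 <= $2 in version order
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$1" ]
}

kernel_in_bad_window() {   # $1 = version-release, e.g. the output of uname -r
    local ver="$1" upstream="${1%%-*}"
    if ver_le "$upstream" "2.6.23.15"; then
        return 1           # 2.6.23.15 or earlier: outside the window
    fi
    if ver_le "2.6.24.3-47.fc8" "$ver"; then
        return 1           # 2.6.24.3-47.fc8 or later: outside the window
    fi
    return 0               # anything in between is suspect
}
```

For example, `kernel_in_bad_window "$(uname -r)" && echo "kernel is in the suspect range"`. By this check, both 2.6.24.3-12.fc8 (comment 1) and 2.6.24.3-34.fc8 (comment 20) fall inside the window.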

Comment 24 Trevor Cordes 2008-03-29 11:54:47 UTC
What are the Red Hat or kernel.org Bugzilla numbers for the bug (comment #23)?

I don't think I've tested one with 3-50 and kudzu on yet.  I'm pretty sure this
has happened to me on pre-kernel-2.6.23.15, but I can't be sure.


Comment 25 Bill Nottingham 2008-03-31 16:54:01 UTC
It's a configuration issue, not a bug, per se. A needed configuration setting
was not set.

Comment 26 Bug Zapper 2008-11-26 10:10:43 UTC
This message is a reminder that Fedora 8 is nearing its end of life.
Approximately 30 (thirty) days from now Fedora will stop maintaining
and issuing updates for Fedora 8.  It is Fedora's policy to close all
bug reports from releases that are no longer maintained.  At that time
this bug will be closed as WONTFIX if it remains open with a Fedora 
'version' of '8'.

Package Maintainer: If you wish for this bug to remain open because you
plan to fix it in a currently maintained version, simply change the 'version' 
to a later Fedora version prior to Fedora 8's end of life.

Bug Reporter: Thank you for reporting this issue and we are sorry that 
we may not be able to fix it before Fedora 8 is end of life.  If you 
would still like to see this bug fixed and are able to reproduce it 
against a later version of Fedora please change the 'version' of this 
bug to the applicable version.  If you are unable to change the version, 
please add a comment here and someone will do it for you.

Although we aim to fix as many bugs as possible during every release's 
lifetime, sometimes those efforts are overtaken by events.  Often a 
more recent Fedora release includes newer upstream software that fixes 
bugs or makes them obsolete.

The process we are following is described here: 
http://fedoraproject.org/wiki/BugZappers/HouseKeeping

Comment 27 Bill Nottingham 2008-11-26 19:09:45 UTC
This is resolved in later releases by not configuring network devices, and mainly appeared on Fedora 8 due to a broken kernel config that existed during a particular update period. Closing.

Comment 28 Trevor Cordes 2009-04-07 06:20:03 UTC
This problem is back in F10 and it's worse than ever.  Now you have to make sure you get the MAC addresses just so in the ifcfg's AND write your own /etc/udev/rules.d/70-persistent-net.rules with exactly the correct syntax (and I mean exactly!) or udev does Very Stupid Things with the renames, basically duplicating your entire interface set and ignoring your ifcfg settings (because the numbers no longer match anything) and making remote machines unreachable.

I now have to (at least!) do the following to all boxes to get the right interfaces on the correct cards without stupid unwanted renaming:

1. Remove all eth aliases from modprobe.conf (used to be req'd in older F's)

2. Make all ifcfg's just so with the correct MACs

3. Make a precise /etc/udev/rules.d/70-persistent-net.rules with the MACs and full line syntax (there was some bug regarding this)

4. rpm -e NetworkManager

5. rpm -e kudzu

Only after all that can I safely reboot and hope to reach a remote machine.
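
Steps 2 and 3 above can be kept in sync by deriving the udev rules from the ifcfg files. The sketch below is an assumption-laden illustration: gen_net_rules is a hypothetical helper, and the rule layout is modelled on the lines udev's own generator typically wrote in that era, so it should be compared against a known-good rule on the target system before use. MACs are lowercased because the sysfs address attribute that udev matches against is lowercase.

```shell
# Sketch: print one persistent-net rule per ifcfg file that sets both
# DEVICE and HWADDR. gen_net_rules is a hypothetical helper; the rule
# layout is an assumption based on udev-generated rules of that era.
gen_net_rules() {   # $1 = directory holding ifcfg-* files
    local f dev mac
    for f in "$1"/ifcfg-*; do
        # Skip backups and rpm leftovers, which the initscripts also ignore.
        case "$f" in *.bak|*~|*.rpmsave|*.rpmnew|*.orig) continue ;; esac
        [ -f "$f" ] || continue
        dev=$(sed -n 's/^DEVICE=//p' "$f" | tr -d "\"'" | head -n 1)
        mac=$(sed -n 's/^HWADDR=//p' "$f" | tr -d "\"'" | head -n 1 | tr 'A-F' 'a-f')
        # Files without a MAC (for example ifcfg-lo) produce no rule.
        if [ -z "$dev" ] || [ -z "$mac" ]; then
            continue
        fi
        printf 'SUBSYSTEM=="net", ACTION=="add", DRIVERS=="?*", ATTR{address}=="%s", ATTR{type}=="1", KERNEL=="eth*", NAME="%s"\n' "$mac" "$dev"
    done
}
```

The output of `gen_net_rules /etc/sysconfig/network-scripts` would be reviewed by hand and only then written to /etc/udev/rules.d/70-persistent-net.rules.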

This is insane.  Any server or >1 NIC computer will run into this.

I should just be able to put my MACs in ifcfg and maybe some parameter somewhere like "I_KNOW_WHAT_I_AM_DOING_WITH_IFS_DONT_MESS_WITH_IT_DARNIT=1" and not worry about it.  I don't know why with each release this gets harder to achieve.  I mean, if I have 3 NIC's, how hard does it have to be to have F know I don't want an eth3 (or 4, or 5...)?