Bug 61169 (IT_72965) - Kudzu doesn't work as expected when removing NICs
Summary: Kudzu doesn't work as expected when removing NICs
Keywords:
Status: CLOSED ERRATA
Alias: IT_72965
Product: Red Hat Enterprise Linux 3
Classification: Red Hat
Component: kudzu
Version: 3.0
Hardware: i686
OS: Linux
Priority: high
Severity: high
Target Milestone: ---
Assignee: Bill Nottingham
QA Contact: David Lawrence
URL:
Whiteboard:
Duplicates (4): 62732 70279 114517 128276
Depends On:
Blocks: 132991
Reported: 2002-03-14 19:16 UTC by Danny Trinh
Modified: 2014-03-17 02:26 UTC (History)
CC List: 8 users

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2005-06-14 16:07:09 UTC
Target Upstream Version:
Embargoed:


Attachments
Attached is JAguar info: dmidecode, dump_pirq, mptable, etc. (23.71 KB, application/octet-stream)
  2002-03-14 19:21 UTC, Danny Trinh
The files you need (26.48 KB, application/octet-stream)
  2002-03-14 20:20 UTC, Danny Trinh
Content of ifcfg-eth8 (39 bytes, text/plain)
  2002-04-05 22:59 UTC, Danny Trinh
Issue-10027.zip (188.16 KB, application/octet-stream)
  2003-09-10 15:28 UTC, Larry Troan
Network configuration - valid state (LOM enabled) (30.00 KB, application/octet-stream)
  2005-03-29 23:13 UTC, jordan hargrave
Network state - after LOM disable (20.00 KB, application/octet-stream)
  2005-03-29 23:16 UTC, jordan hargrave
/var/log/messages (1.31 KB, text/plain)
  2005-05-05 07:15 UTC, Charles Rose
Network status before the LOM is disabled (20.00 KB, application/octet-stream)
  2005-05-09 11:42 UTC, Charles Rose
Network status after the LOM is disabled (20.00 KB, application/octet-stream)
  2005-05-09 11:43 UTC, Charles Rose


Links
System | ID | Private | Priority | Status | Summary | Last Updated
Red Hat Product Errata | RHBA-2004:509 | 0 | normal | SHIPPED_LIVE | Updated kudzu packages | 2004-12-21 05:00:00 UTC
Red Hat Product Errata | RHBA-2005:125 | 0 | normal | SHIPPED_LIVE | kudzu bug fix update | 2005-05-19 04:00:00 UTC

Description Danny Trinh 2002-03-14 19:16:55 UTC
From Bugzilla Helper:
User-Agent: Mozilla/4.0 (compatible; MSIE 5.01; Windows 98)

Description of problem:
After installing Hampton B2 on Jaguar and rebooting, I shut down the server,
removed 5 dual PRO100 NIC cards, and then rebooted. Kudzu started and asked to
remove all 5 dual PRO100 NIC cards (I said yes). The boot process continued and
tried to initialize the NIC cards that had been removed.

Version-Release number of selected component (if applicable):


How reproducible:
Always

Steps to Reproduce:
1. Jaguar + onboard ROMB + onboard NIC cards + PERC3/QC + PRO1000XT + 5 dual
PRO100 NIC cards.
2. Install Hampton B2.
3. Reboot.
4. Run ifconfig to check that all NICs are working OK.
5. Shut down the server and remove the 5 dual PRO100 NIC cards.
6. Reboot.
7. Watch the boot process.
8. Check the /etc/sysconfig/network-scripts directory.
9. Check the /etc/modules.conf file.
	

Actual Results:  Kudzu doesn't modify and clean up all the necessary files.

Expected Results:  It should work without any errors.

Additional info:

Comment 1 Danny Trinh 2002-03-14 19:21:27 UTC
Created attachment 48506 [details]
Attached is JAguar info: dmidecode, dump_pirq, mptable, etc.

Comment 2 Bill Nottingham 2002-03-14 19:35:13 UTC
Was that modules.conf file from before or after the removal?

I'd need the modules.conf from both before and after, as well as the
/etc/sysconfig/hwconf from before.

Comment 3 Danny Trinh 2002-03-14 20:02:31 UTC
After the removal. I will upload all necessary files later.

Comment 4 Danny Trinh 2002-03-14 20:20:10 UTC
Created attachment 48524 [details]
The files you need

Comment 5 Danny Trinh 2002-03-21 17:55:51 UTC
not fixed in beta3

Comment 6 Bill Nottingham 2002-03-22 06:23:50 UTC
Fixed in kudzu-0.99.48-1; it should remove all the relevant aliases, renumber
any aliases higher than the one removed, and rename the ifcfg-<foo> files of any
interfaces that are renumbered.
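
As a rough illustration of the behavior described above (not kudzu's actual code): the sketch below models the `alias ethN <driver>` entries of modules.conf as a Python dictionary, drops the removed interfaces, and closes the gaps in the numbering. The function name `renumber_after_removal` and the sample data are assumptions made for the example.

```python
import re

def renumber_after_removal(aliases, removed):
    """Drop removed interfaces and close the gaps in ethN numbering.

    aliases: dict mapping interface name -> driver, a hypothetical stand-in
             for the 'alias ethN <driver>' lines in /etc/modules.conf.
    removed: set of interface names whose hardware is gone.
    Returns (new_aliases, renames); renames maps old name -> new name for
    every surviving interface whose number changed.
    """
    def index(name):
        # Sort numerically so that eth10 comes after eth2.
        return int(re.fullmatch(r"eth(\d+)", name).group(1))

    survivors = sorted((n for n in aliases if n not in removed), key=index)
    new_aliases = {}
    renames = {}
    for new_index, old_name in enumerate(survivors):
        new_name = f"eth{new_index}"
        new_aliases[new_name] = aliases[old_name]
        if new_name != old_name:
            # The matching ifcfg-<old_name> file would be renamed as well.
            renames[old_name] = new_name
    return new_aliases, renames

# Hypothetical example: two e100 ports are removed from a four-port system.
aliases = {"eth0": "e1000", "eth1": "e100", "eth2": "e100", "eth3": "tg3"}
new_aliases, renames = renumber_after_removal(aliases, {"eth1", "eth2"})
print(new_aliases)
print(renames)
```

In this model the surviving tg3 port moves from eth3 to eth1, so its ifcfg file would have to follow it.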

Comment 7 Danny Trinh 2002-04-05 22:58:53 UTC
This doesn't work as it is supposed to. See the attachment of ifcfg-eth8.

Comment 8 Danny Trinh 2002-04-05 22:59:40 UTC
Created attachment 52486 [details]
Content of ifcfg-eth8

Comment 9 Danny Trinh 2002-04-05 23:01:19 UTC
Also, please see the bug # 62732

Comment 10 Bill Nottingham 2002-04-09 16:56:52 UTC
*** Bug 62732 has been marked as a duplicate of this bug. ***

Comment 11 Jay Turner 2002-04-17 15:20:54 UTC
Deferring this, as it doesn't look like it's going to get fixed in this release.

Comment 12 Larry Troan 2003-01-12 05:02:05 UTC
IS THIS FIXED IN GINGIN (8.1)?

Comment 13 Bill Nottingham 2003-01-13 19:55:26 UTC
No.

Comment 14 Larry Troan 2003-01-21 20:15:32 UTC
Do we have a target fix date or release since it looks like we missed gingin?

Comment 15 Bill Nottingham 2003-01-21 20:16:44 UTC
No. Fixing this the right way involves reimplementing the way we treat network
interfaces systemwide.

Comment 16 Larry Troan 2003-01-29 17:06:09 UTC
Opened Feature Tracker #795 to keep this on the radar screen.

Comment 17 Bill Nottingham 2003-02-28 01:58:05 UTC
*** Bug 70279 has been marked as a duplicate of this bug. ***

Comment 18 Larry Troan 2003-07-01 12:47:10 UTC
Danny, have you tested this against Taroon Alpha4? Is it still a problem? 

Comment 19 Bill Nottingham 2003-07-01 14:33:28 UTC
It will behave the same under Taroon A4 as it did under previous releases.

Comment 20 Bill Nottingham 2003-08-01 20:16:22 UTC
*** Bug 85372 has been marked as a duplicate of this bug. ***

Comment 23 Bill Nottingham 2003-08-15 05:02:05 UTC
Fixed in the combination of:

kudzu-1.1.16-x
initscripts-7.29-x

Requirements for configurations for this to be handled correctly:
- HWADDR=<hardware address> in ifcfg-<whatever>
- kernel driver support for ethtool GDRVINFO for any drivers in use

Note that now in the case where there is eth0, eth1, and eth2 and eth1 is
removed, you will have eth0 and eth2, *not* eth0 and eth1. 
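
The HWADDR requirement above can be checked with something like the following. This is a minimal sketch, not part of kudzu or initscripts: the helper names `parse_ifcfg` and `missing_hwaddr`, the sample file contents, and the placeholder MAC address are all assumptions. It does not cover the ethtool GDRVINFO requirement, which depends on the kernel driver.

```python
def parse_ifcfg(text):
    """Parse the KEY=value lines of an ifcfg-style file into a dict."""
    settings = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, value = line.split("=", 1)
        settings[key.strip()] = value.strip().strip('"')
    return settings

def missing_hwaddr(configs):
    """Return the names of configs that lack a non-empty HWADDR entry."""
    return sorted(
        name for name, text in configs.items()
        if not parse_ifcfg(text).get("HWADDR")
    )

# Hypothetical file contents; the MAC address is a placeholder.
configs = {
    "ifcfg-eth0": "DEVICE=eth0\nBOOTPROTO=dhcp\nHWADDR=00:11:22:33:44:55\nONBOOT=yes\n",
    "ifcfg-eth2": "DEVICE=eth2\nBOOTPROTO=dhcp\nONBOOT=yes\n",
}
print(missing_hwaddr(configs))
```

Any file reported by such a check has nothing to tie its device name to a specific card, so it falls outside the configurations this fix handles.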


Comment 24 Larry Troan 2003-09-10 15:27:26 UTC
FROM ISSUE TRACKER 
Event posted 09-09-2003 03:29pm by dtrinh with duration of 0.00
issue-10027.zip
It still doesn't work right on Taroon B2 (kernel 421).
Server tested: Merlot + 24Gb + 2 onboard NICs

When I insert 1 dual e100, 3 dual e1000, 1 dual Broadcom 1000, and 1 e1000 NICs
and reboot, kudzu sees them all but fails to generate ifcfg-ethX and
modules.conf correctly.
When I removed all of the above NICs, leaving only the 2 onboard NICs, kudzu
removed all ifcfg-ethX files and cleaned up all ethX entries in modules.conf. I
had to edit the hwconf file and rerun kudzu for it to see the 2 onboard NICs.
Attached is a zip of log files.
Conclusion: It's worse than when I opened this issue. I am raising this issue to
severity 2.

Status set to: Waiting on Tech
File uploaded: issue-10027.zip
Severity set to: High

Comment 25 Larry Troan 2003-09-10 15:28:15 UTC
Created attachment 94377 [details]
Issue-10027.zip

Comment 26 Larry Troan 2003-09-10 15:29:56 UTC
CHANGING PRODUCT AND VERSION TO REFLECT CURRENT TAROON WORK.

Comment 27 Bill Nottingham 2003-09-10 17:04:01 UTC
Note that taking a month to test means your problem may not get fixed.

Comment 28 Bill Nottingham 2003-09-10 20:04:29 UTC
Which version of kudzu was this with?

Comment 29 Larry Troan 2003-09-12 14:46:46 UTC
FROM ISSUE TRACKER
Event posted 09-11-2003 11:49am by dtrinh with duration of 0.00
I retested this morning with kudzu-1.1.20-1.

- When I insert all those NICs, kudzu seems to see all NICs correctly, but the
tg3 and e1000 modules are loaded (the e100 module is not loaded ????). That is
why modules.conf is missing the 2 entries for the dual e100 NIC, and the
ifcfg-ethX files are not created for it either. If I manually run "insmod e100"
and rerun kudzu, then modules.conf gets the 2 entries for the dual e100 NIC and
the ifcfg-ethX files are created as well. I have tested with the hugemem-423 and
smp-423 kernels with the same results.

- Removing NICs seems to work fine this time.

Comment 32 Bill Nottingham 2003-09-15 21:10:50 UTC
That attachment is not really useful at all.

Comment 33 Bill Nottingham 2003-09-15 21:15:29 UTC
OK, so, this works for me when testing with smaller numbers of cards (I don't
have that kind of hardware on hand.)  It *could* be something related to having
more than 10 interfaces, but the code doesn't have any obvious problems with
that on visual inspection.

Comment 34 Larry Troan 2003-09-17 16:33:10 UTC
NOTE THAT BUG 85372 WAS MARKED AS A DUP OF THIS BUG. This bug is against RHEL3.
BUG 85372 IS AGAINST 2.1 WITH TARGET QU3.

Comment 35 Bill Nottingham 2003-09-17 16:37:53 UTC
Bug 85372 is against RHEL 2.1; it describes the same issue though, and that
issue will not be fixed for RHEL 2.1.

Comment 36 Larry Troan 2003-09-19 20:25:41 UTC
FROM ISSUE TRACKER
Event posted 09-18-2003 05:00pm by dtrinh with duration of 0.00
OK, I tried with 1 dual e100 NIC and 1 dual e1000 NIC; it behaves the same way
(didn't load the e100 module).


Comment 37 Bill Nottingham 2003-09-19 20:53:35 UTC
So, you simultaneously add a dual e100 and dual e1000? I'm still not certain
that we have such hardware.

Comment 38 Larry Troan 2003-10-07 14:28:22 UTC
FROM ISSUE TRACKER
Event posted 09-30-2003 11:31am by dtrinh with duration of 0.00        
I replaced a dual e100 NIC and a dual e1000 NIC with 2 e100 NICs and 2 e1000
NICs. It works fine with the 2.4.21-3.EL kernel. So the problem occurs only with
the dual e100 NIC card.


Status set to: Waiting on Tech

Comment 39 Larry Troan 2003-10-14 14:47:18 UTC
Bill doesn't have a dual port NIC and can't recreate the bug with single port NICs.

Comment 42 Larry Troan 2003-10-16 19:31:38 UTC
Per Dell, neither MUSTFIX nor SHOULDFIX for Update1. Removing 106472.

Comment 43 Larry Troan 2003-11-04 16:16:40 UTC
Bill has the dual port NICs. RH will try to fix but not high priority
since neither MUSTFIX nor SHOULDFIX for U1.

Comment 44 Larry Troan 2003-11-04 16:19:28 UTC
Oops! Need to put this back in ASSIGNED status. Since U1 cutoff has
passed, can't add to MUST FIX list at this time.

Comment 46 Bill Nottingham 2003-12-15 16:18:10 UTC
Please read comment #22 and comment #35. Removing.

Comment 47 Larry Troan 2003-12-16 14:37:38 UTC
Bill, per your note above, WONTFIX for 2.1 (Comment #35):
Bug 85372 is against RHEL 2.1; it describes the same issue though, and
that issue will not be fixed for RHEL 2.1. 

The dual bug is per RH Engineering request to open a bug for each
level that experiences a problem :-(  I agree our multi-release bug
handling is busted... 

Per comment #43:
Bill has the dual port NICs. RH will try to fix but not high priority
since neither MUSTFIX or SHOULDFIX for U1.

So........ Status on fixing for RHEL3 (this bug)?

Comment 48 Bill Nottingham 2003-12-16 15:50:46 UTC
As of yet unable to reproduce under RHEL3.

Comment 50 Bill Nottingham 2004-01-29 04:49:22 UTC
*** Bug 114517 has been marked as a duplicate of this bug. ***

Comment 52 Larry Troan 2004-02-18 18:23:31 UTC
FROM BUG 61169 (RHEL2.1)
Additional Comment #13 From Bill Nottingham (notting)  on
2004-01-28 14:13 -------

If you would like some more detailed explanation, just off the top of
my head, fixing this properly requires changing (significantly, in
many respects):

- kudzu
- the init scripts
- the network configuration tools
- ethtool
- the kernel drivers
- anaconda
- *and requires rewriting all existing configurations*

It's just completely out of scope for an update release.

BUG CLOSED AS WONTFIX.... This bug looks like a DUP of Bug 61169. 
Should we close as WONTFIX for RHEL 2.1 ????


Comment 53 Bill Nottingham 2004-07-21 03:25:39 UTC
*** Bug 128276 has been marked as a duplicate of this bug. ***

Comment 54 Amit Bhutani 2004-07-21 03:34:36 UTC
Bug 128276 is not tied to Intel Dual Port 10/100 NICs. Furthermore, it
is a RHEL 2.1 issue, not a RHEL 3 one. I am not sure if we should
necessarily tie these two issues as duplicates.

Comment 55 Bill Nottingham 2004-07-21 03:37:50 UTC
Well, this started out as the RHEL 2.1 bug.

RHEL 2.1 is WONTFIX for this in any case.

Comment 56 Bill Nottingham 2004-08-27 22:37:14 UTC
Please try kudzu-1.1.22.5-1, available at:

http://people.redhat.com/notting/kudzu/


Comment 58 jordan hargrave 2004-09-08 18:46:58 UTC
New kudzu seems to fix the issue on RHEL3, U3 (8/16 release)


Comment 60 jordan hargrave 2004-11-30 15:35:08 UTC
Kudzu is now no longer working in Update4 beta1

Comment 61 John Flanagan 2004-12-21 14:22:00 UTC
An advisory has been issued which should help the problem 
described in this bug report. This report is therefore being 
closed with a resolution of ERRATA. For more information
on the solution and/or where to find the updated files, 
please follow the link below. You may reopen this bug report 
if the solution does not work for you.

http://rhn.redhat.com/errata/RHBA-2004-509.html


Comment 62 jordan hargrave 2005-01-21 19:41:43 UTC
Most kudzu fixes in RHEL4 kernel have been resolved, reopening on
RHEL3 for code to be merged.

Comment 63 Amit Bhutani 2005-02-07 19:36:57 UTC
Please refer to comment #60. Kudzu is still buggy in RHEL 3. Can the 
person putting the issue in the "modified" state, please list what 
version of Kudzu resolves this bug and where that version of Kudzu 
can be found ?

Comment 64 Bill Nottingham 2005-02-07 19:45:07 UTC
kudzu-1.1.22.11-1 solves comment #62, namely:

 Most kudzu fixes in RHEL4 kernel have been resolved, reopening on
 RHEL3 for code to be merged.

It's currently scheduled for U5.


Comment 65 Amit Bhutani 2005-02-07 19:49:17 UTC
Since we do not have access to the U5 Beta yet, any chance that you 
can post that version of Kudzu on a people page ? We promise to 
provide testing feedback ;-)

Comment 68 jordan hargrave 2005-03-29 23:11:56 UTC
Kudzu is still broken in U5 Beta1
PE1800 with the following configuration:

Kudzu version: kudzu-1.1.22.11-1
kudzu-devel-1.1.22.11-1
---- hwconf : device,hwaddr,driver,pciaddr
eth0,00:0F:1F:FA:36:71,e1000,3:7.0 (Onboard Intel Gigabit)
eth1,00:04:23:08:7E:6F,e1000,2:5.1 (Dual port Intel Gigabit)
eth2,00:04:23:08:7E:6E,e1000,2:5.0 (Dual port Intel Gigabit)
eth3,00:10:18:04:5A:0E,tg3,3:9.0   (Broadcom)
----- kernel : device,hwaddr,driver,pciaddr
eth0,00:0F:1F:FA:36:71,e1000,3:7.0
eth1,00:04:23:08:7E:6F,e1000,2:5.1
eth2,00:04:23:08:7E:6E,e1000,2:5.0
eth3,00:10:18:04:5A:0E,tg3,3:9.0
---- modprobe.conf : device,driver
eth0,e1000
eth1,e1000
eth2,e1000
eth3,tg3
---- network-scripts : device,hwaddr
eth0,00:0F:1F:FA:36:71
eth1,00:04:23:08:7E:6F
eth2,00:04:23:08:7E:6E
eth3,00:10:18:04:5A:0E

If the onboard Gigabit NIC is disabled in the BIOS, after kudzu runs on the 
next boot the configuration is:
Kudzu version: kudzu-1.1.22.11-1
kudzu-devel-1.1.22.11-1
---- hwconf : device,hwaddr,driver,pciaddr
eth0,00:04:23:08:7E:6E,e1000,2:5.0
eth1,00:04:23:08:7E:6F,e1000,2:5.1
eth3,00:10:18:04:5A:0E,tg3,3:9.0
----- kernel : device,hwaddr,driver,pciaddr
eth1,00:04:23:08:7E:6F,e1000,2:5.1
eth2,00:04:23:08:7E:6E,e1000,2:5.0
eth3,00:10:18:04:5A:0E,tg3,3:9.0
---- modprobe.conf : device,driver
eth1,e1000
eth2,e1000
eth3,tg3
---- network-scripts : device,hwaddr
eth1,00:04:23:08:7E:6F
eth2,00:04:23:08:7E:6E
eth3,00:10:18:04:5A:0E

So eth2 has moved to eth0, but kudzu does not update the other settings
properly.
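
The mismatch can be made explicit by grouping the listings above by hardware address and reporting every address on which the sources disagree about the device name. This is only a sketch for reading the tables; the function name `compare_sources` is an assumption, and the data is copied from the post-change listings in this comment.

```python
def compare_sources(sources):
    """Report hardware addresses whose device name differs between sources.

    sources: dict mapping a source label -> {device name: hardware address}.
    Returns {hwaddr: {label: device}} for the addresses in dispute.
    """
    by_hwaddr = {}
    for label, devices in sources.items():
        for device, hwaddr in devices.items():
            by_hwaddr.setdefault(hwaddr, {})[label] = device
    return {
        hwaddr: names
        for hwaddr, names in by_hwaddr.items()
        if len(set(names.values())) > 1
    }

# State after the onboard NIC was disabled, as listed above.
after_lom_disable = {
    "hwconf": {
        "eth0": "00:04:23:08:7E:6E",
        "eth1": "00:04:23:08:7E:6F",
        "eth3": "00:10:18:04:5A:0E",
    },
    "kernel": {
        "eth1": "00:04:23:08:7E:6F",
        "eth2": "00:04:23:08:7E:6E",
        "eth3": "00:10:18:04:5A:0E",
    },
    "network-scripts": {
        "eth1": "00:04:23:08:7E:6F",
        "eth2": "00:04:23:08:7E:6E",
        "eth3": "00:10:18:04:5A:0E",
    },
}
print(compare_sources(after_lom_disable))
```

With this data only 00:04:23:08:7E:6E is in dispute: hwconf calls it eth0 while the kernel listing and network-scripts call it eth2.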


Comment 69 jordan hargrave 2005-03-29 23:13:36 UTC
Created attachment 112436 [details]
Network configuration - valid state (LOM enabled)

Comment 70 jordan hargrave 2005-03-29 23:16:24 UTC
Created attachment 112437 [details]
Network state - after LOM disable

Comment 71 Bill Nottingham 2005-03-30 03:06:15 UTC
Not a bug. What you say the kernel 'sees' as eth0 will be renamed to eth2 on
interface bringup; as it hasn't changed, the device name will not change.
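
A minimal sketch of the renaming idea, assuming each ifcfg file carries a HWADDR: every device the kernel enumerates is matched by hardware address to the name its configuration asks for. The function name `plan_renames` and the boot-time enumeration order used below are assumptions for illustration; the actual mechanism in initscripts is not shown in this report.

```python
def plan_renames(kernel_names, configured_names):
    """Map each kernel-assigned name to the name its HWADDR is configured for.

    kernel_names: {name assigned at boot: hardware address}
    configured_names: {hardware address (upper case): name from the ifcfg file}
    Returns {current name: configured name} for devices that need renaming.
    Devices without a matching configuration keep their kernel name.
    """
    plan = {}
    for current, hwaddr in kernel_names.items():
        wanted = configured_names.get(hwaddr.upper(), current)
        if wanted != current:
            plan[current] = wanted
    return plan

# Assumed enumeration order at boot with the onboard NIC disabled.
kernel_names = {
    "eth0": "00:04:23:08:7E:6E",
    "eth1": "00:04:23:08:7E:6F",
    "eth2": "00:10:18:04:5A:0E",
}
# Names taken from the network-scripts listing in comment 68.
configured_names = {
    "00:04:23:08:7E:6F": "eth1",
    "00:04:23:08:7E:6E": "eth2",
    "00:10:18:04:5A:0E": "eth3",
}
print(plan_renames(kernel_names, configured_names))
```

Under these assumptions eth0 becomes eth2 and eth2 becomes eth3, so each card keeps the name it had before the onboard NIC was disabled. Applying such a plan would need care where a target name is still in use, which the sketch does not handle.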

Comment 72 Charles Rose 2005-05-05 07:13:09 UTC
On a PE1600, with one onboard Intel gigabit interface, one Intel Pro100S card
and one Intel Gigabit card, all interfaces start up fine.

Here is the status before hardware change:
Kernel version: 2.4.21-32.ELsmp
Kudzu version: kudzu-1.1.22.11-1
---- hwconf : device,hwaddr,driver,pciaddr
eth0,00:C0:9F:40:15:C1,e1000,0:2.0
eth1,00:0E:0C:2F:0F:7A,e1000,1:6.0
eth2,00:0E:0C:51:98:9C,e100,0:6.0
----- kernel : device,hwaddr,driver,pciaddr
eth0,00:C0:9F:40:15:C1,e1000,0:2.0
eth1,00:0E:0C:2F:0F:7A,e1000,1:6.0
eth2,00:0E:0C:51:98:9C,e100,0:6.0
---- modprobe.conf : device,driver
eth0,e1000
eth1,e1000
eth2,e100

We disabled the onboard interface in the BIOS and booted the OS. Kudzu detected
the change, and we removed the hardware config. During network startup both
eth1 AND eth2 *FAIL* to start. Running /etc/init.d/network restart repeatedly
brings up eth1, but eth2 never comes up.

Here is the status AFTER hardware change:
Kernel version: 2.4.21-32.ELsmp
Kudzu version: kudzu-1.1.22.11-1
---- hwconf : device,hwaddr,driver,pciaddr
eth0,00:0E:0C:2F:0F:7A,e1000,1:6.0
eth2,00:0E:0C:51:98:9C,e100,0:6.0
----- kernel : device,hwaddr,driver,pciaddr
eth0,00:0E:0C:2F:0F:7A,e1000,1:6.0
eth1,00:0E:0C:51:98:9C,e100,0:6.0
---- modprobe.conf : device,driver
eth1,e1000
eth2,e100

I will attach Network state info before and after the LOM was disabled.

Comment 73 Charles Rose 2005-05-05 07:15:31 UTC
Created attachment 114048 [details]
/var/log/messages

failed to bring up eth2

Comment 74 Bill Nottingham 2005-05-05 15:34:29 UTC
What do the ifcfg files look like?

Comment 75 Charles Rose 2005-05-09 11:42:18 UTC
Created attachment 114152 [details]
Network status before the LOM is disabled

Comment 76 Charles Rose 2005-05-09 11:43:16 UTC
Created attachment 114153 [details]
Network status after the LOM is disabled

Comment 77 Bill Nottingham 2005-05-09 18:25:20 UTC
Network config files do not have HWADDR included; configuration is incorrect to
start out with.

Leaving as modified, re comment #65.

Comment 78 Bill Nottingham 2005-05-09 18:37:39 UTC
If you know what wrote the ifcfg files (i.e., if they were originally written by
the installer, etc.), you may want to file a separate bug that HWADDR isn't put
in the config files.

Comment 79 Charles Rose 2005-05-18 03:40:53 UTC
Opened Issue Tracker 72965 - RHEL3 U5: Anaconda does not set HWADDR in ifcfg-eth
config files

Comment 81 Tim Powers 2005-05-19 17:17:40 UTC
An advisory has been issued which should help the problem
described in this bug report. This report is therefore being
closed with a resolution of ERRATA. For more information
on the solution and/or where to find the updated files,
please follow the link below. You may reopen this bug report
if the solution does not work for you.

http://rhn.redhat.com/errata/RHBA-2005-125.html


Comment 82 Charles Rose 2005-06-14 12:51:57 UTC
Opened Issue Tracker 74404 to continue discussion on this defect.

Comment 83 Bill Nottingham 2005-06-14 16:07:09 UTC
Closing, again. Please open *NEW* bugs on new issues.

Comment 84 Bill Nottingham 2005-06-14 16:08:40 UTC
To be more precise, the previous 82 comments probably have very little to do
with any new problems that are seen at this point, and therefore aren't really
relevant for new issues, bugzilla, etc.

