Bug 429968
| Summary: | Anaconda stage 1 installer does NOT work with network installs on ports above eth10 | ||||||||||
|---|---|---|---|---|---|---|---|---|---|---|---|
| Product: | Red Hat Enterprise Linux 5 | Reporter: | Bill Hayes <bill.hayes> | ||||||||
| Component: | anaconda | Assignee: | Dave Cantrell <dcantrell> | ||||||||
| Status: | CLOSED ERRATA | QA Contact: | Alexander Todorov <atodorov> | ||||||||
| Severity: | high | Docs Contact: | |||||||||
| Priority: | high | ||||||||||
| Version: | 5.2 | CC: | andriusb, dchapman, ddomingo, glen.foster, maurizio.antillon, rick.bieber, rick.hester, tao | ||||||||
| Target Milestone: | rc | Keywords: | Reopened | ||||||||
| Target Release: | --- | ||||||||||
| Hardware: | ia64 | ||||||||||
| OS: | Linux | ||||||||||
| Whiteboard: | |||||||||||
| Fixed In Version: | RHBA-2008-0397 | Doc Type: | Bug Fix | ||||||||
| Doc Text: | Story Points: | --- | |||||||||
| Clone Of: | Environment: | ||||||||||
| Last Closed: | 2008-05-21 15:32:44 UTC | Type: | --- | ||||||||
| Regression: | --- | Mount Type: | --- | ||||||||
| Documentation: | --- | CRM: | |||||||||
| Verified Versions: | Category: | --- | |||||||||
Description
Bill Hayes
2008-01-24 00:45:04 UTC
See Issue Tracker: https://enterprise.redhat.com/issue-tracker/?module=issues&action=view&tid=161661

Are we sure this isn't a dupe of 303681? I committed a patch on February 5th to
fix that:

commit 0dcf8192c048324b718c3b0c2d212d1dfa584ac4
Author: David Cantrell <dcantrell>
Date: Tue Feb 5 12:15:36 2008 -1000

    Use libnl to read MAC and IP addresses (#303681).

    This patch reduces nl.c in libisys to just what we need to talk
    to libnl. libnl provides the netlink cache for interfaces and should
    allow us to see all NICs in the system and gather the MAC and IP
    addresses for each.

Can someone try a current RHEL 5.2 nightly on a system with more than 10 NICs?
*** This bug has been marked as a duplicate of 303681 ***

Ronald, as per the previous comment, it looks like this issue is resolved. Clearing the requires_release_notes flag. Please reset the requires_release_notes flag if this issue is unresolved and needs to be documented (please include workarounds, if any).

I am re-opening this bug as it is _not_ the same as the issue it was closed as a dup of. More info to follow shortly....

The problem appears to be with "high numbered tg3 devices". I.e. when a tg3
device has an eth number of > ~10, anaconda stage1 is unable to configure
it as the device to install over.

There has been much confusion on this issue. I am NOT saying that you cannot
set up the device to be configured for the installed OS. I am saying that in
stage 1, if you want to do an http/nfs/ftp etc. install over this device, THAT is
what fails (sorry for being blunt, but there has been much confusion here).

What happens is you are presented with the list of available network devices as
expected; in this case the tg3 devices are eth12 and eth13. When I select
either of them, then on the next page choose to configure with dhcp and click OK,
it tries to get a dhcp address and then just goes back to that same page (and yes,
our network is configured to provide dhcp to these devices).

If I select one of the other devices it works OK.

If we remove some of the other devices so the same tg3 cards are still there
but have lower ethX numbers, they work OK.

Looking at the anaconda logfile I find this:
16:53:23 ERROR : nic_by_name: no interface named eth13 found
16:53:23 CRITICAL: dhcp_nic: net_get_by_name(eth13) failed
16:53:23 DEBUG : dhcp: DHCP configuration failed
16:53:28 DEBUG : waiting for link eth13...
16:53:28 DEBUG : 0 seconds.
16:53:28 DEBUG : sleep (nicdelay) for 0 secs first
16:53:28 DEBUG : continuing...
The device, however, does appear to be there. This is from dropping to a shell in
anaconda stage2:
sh-3.2# ifconfig eth13
eth13 Link encap:Ethernet HWaddr 00:17:A4:99:8F:CA
BROADCAST MULTICAST MTU:1500 Metric:1
RX packets:0 errors:0 dropped:0 overruns:0 frame:0
TX packets:0 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:1000
RX bytes:0 (0.0 b) TX bytes:0 (0.0 b)
Interrupt:75
and from dmesg all appears to be OK:
<6>eth13: Tigon3 [partno(BCM95700A6) rev 2100 PHY(5704)] (PCIX:66MHz:64-bit) 1
10/100/1000Base-T Ethernet 00:17:a4:99:8f:ca
<6>eth13: RXcsums[1] LinkChgREG[0] MIirq[0] ASF[0] WireSpeed[1] TSOcap[1]
<6>eth13: dma_rwctrl[769f0000] dma_mask[64-bit]
I will attach the full anaconda log as an attachment.
Created attachment 295540 [details]
logfile from install showing errors on eth12 and eth13
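
For anyone triaging a similar log, the failing ports can be pulled out of the anaconda log mechanically. The following is a minimal sketch, assuming the error text matches the `nic_by_name: no interface named ... found` lines quoted above; the function name `failed_interfaces` is hypothetical and not part of anaconda or libdhcp.

```python
import re

# Pattern taken from the error lines quoted in this report.
NIC_ERROR = re.compile(r"nic_by_name: no interface named (\S+) found")


def failed_interfaces(log_text):
    """Return interface names reported missing, first-seen order, no duplicates."""
    seen = []
    for name in NIC_ERROR.findall(log_text):
        if name not in seen:
            seen.append(name)
    return seen


# Excerpt copied from the log lines in the comment above.
sample = """16:53:23 ERROR : nic_by_name: no interface named eth13 found
16:53:23 CRITICAL: dhcp_nic: net_get_by_name(eth13) failed
16:53:23 DEBUG : dhcp: DHCP configuration failed"""

print(failed_interfaces(sample))  # ['eth13']
```

To run it against a full log, read the attached file into a string and pass it in place of `sample`.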
Doug, I think I have found the problem you are hitting. I need some time to work up some patches (for anaconda and libdhcp), but I'll post here when I have more info.

Yes, this issue has been confusing to me. Mostly because I have a lot of networking related bug reports for RHEL5 right now and all of the bug reports are valid, but the reporters are also hitting other networking bugs I already know about...hence, thinking some are dupes and some aren't.

This particular failure is happening in nic_get_links() in nic.c in libdhcp, which is how we are caching the netlink device information. More information when I have more to tell. Thanks.

David, I have set up another server with 26 Ethernet ports. All the ports are Intel, but some ports use the e1000e driver and some use the e1000 driver. eth25, eth22, eth19, eth17, eth13, eth12, eth11 and eth10 failed to come up. eth9, eth0 and eth4 would come up. eth0-7 are e1000e and eth8-25 are e1000.

The failing ports in the anaconda.log behave the same as what Doug reported:

17:25:49 INFO : going to pick interface
17:27:57 INFO : going to do getNetConfig
17:27:57 INFO : eth25 is not a wireless adapter
17:27:58 DEBUG : waiting for link eth25...
17:27:58 DEBUG : 0 seconds.
17:27:58 DEBUG : sleep (nicdelay) for 0 secs first
17:27:58 DEBUG : continuing...
requesting dhcp timeout 45
17:27:58 ERROR : nic_by_name: no interface named eth25 found
17:27:58 CRITICAL: dhcp_nic: net_get_by_name(eth25) failed
17:27:58 DEBUG : dhcp: DHCP configuration failed
17:28:20 DEBUG : waiting for link eth25...
17:28:20 DEBUG : 0 seconds.
17:28:20 DEBUG : sleep (nicdelay) for 0 secs first
17:28:20 DEBUG : continuing...
requesting dhcp timeout 45
17:28:20 ERROR : nic_by_name: no interface named eth25 found
17:28:20 CRITICAL: dhcp_nic: net_get_by_name(eth25) failed
17:28:20 DEBUG : dhcp: DHCP configuration failed

I will attach the anaconda.log file next.

Bill

eth0 - Intel Corporation 82571EB Gigabit Ethernet Controller (Copper)
eth1 - Intel Corporation 82571EB Gigabit Ethernet Controller (Copper)
eth2 - Intel Corporation 82571EB Gigabit Ethernet Controller (Copper)
eth3 - Intel Corporation 82571EB Gigabit Ethernet Controller (Copper)
eth4 - Intel Corporation 82571EB Gigabit Ethernet Controller (Copper)
eth5 - Intel Corporation 82571EB Gigabit Ethernet Controller (Copper)
eth6 - Intel Corporation 82571EB Gigabit Ethernet Controller (Copper)
eth7 - Intel Corporation 82571EB Gigabit Ethernet Controller (Copper)
eth8 - Intel Corporation 82546GB Gigabit Ethernet Controller
eth9 - Intel Corporation 82546GB Gigabit Ethernet Controller
eth10 - Intel Corporation 82546GB Gigabit Ethernet Controller
eth11 - Intel Corporation 82546GB Gigabit Ethernet Controller
eth12 - Intel Corporation 82546GB Gigabit Ethernet Controller
eth13 - Intel Corporation 82546GB Gigabit Ethernet Controller
eth14 - Intel Corporation 82546GB Gigabit Ethernet Controller
eth15 - Intel Corporation 82546GB Gigabit Ethernet Controller
eth16 - Intel Corporation 82546GB Gigabit Ethernet Controller
eth17 - Intel Corporation 82546GB Gigabit Ethernet Controller
eth18 - Intel Corporation 82546GB Gigabit Ethernet Controller
eth19 - Intel Corporation 82546GB Gigabit Ethernet Controller
eth20 - Intel Corporation 82546GB Gigabit Ethernet Controller
eth21 - Intel Corporation 82546GB Gigabit Ethernet Controller
eth22 - Intel Corporation 82546GB Gigabit Ethernet Controller
eth23 - Intel Corporation 82546GB Gigabit Ethernet Controller
eth24 - Intel Corporation 82546GB Gigabit Ethernet Controller
eth25 - Intel Corporation 82546GB Gigabit Ethernet Controller

Created attachment 295966 [details]
26 Ethernet port server - anaconda.log file
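
The ports named in the 26-port test line up with the "eth number of 10 or higher" pattern, which a few lines of arithmetic make explicit. This is only a sketch over the interface names listed in the comment above; ports that were not mentioned as tried are not covered, and the helper names `index_of` and `split_point` are hypothetical.

```python
import re


def index_of(name):
    """Return the numeric suffix of an interface name, e.g. 'eth25' -> 25."""
    match = re.search(r"(\d+)$", name)
    if match is None:
        raise ValueError(f"no numeric suffix in {name!r}")
    return int(match.group(1))


def split_point(failed, worked):
    """Return (lowest failing index, highest working index)."""
    return min(map(index_of, failed)), max(map(index_of, worked))


# Interface names exactly as reported for the 26-port server.
failed = ["eth25", "eth22", "eth19", "eth17", "eth13", "eth12", "eth11", "eth10"]
worked = ["eth9", "eth0", "eth4"]

print(split_point(failed, worked))  # (10, 9)
```

Every reported failure has an index of 10 or more and every reported success has an index of 9 or less, regardless of whether the port uses e1000e or e1000.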
This should be fixed in anaconda-11.1.2.105-1 and later.

I tested the first drop of RHEL 5.2. I had anaconda-11.1.2.105-1. I was able to bring up all 26 Ethernet ports on the same system that I used in comment #10. I will attach the anaconda.log file from this test.

billh@lart:~$ grep -i dhcprequest anaconda.log
21:54:44 INFO : DHCPREQUEST on eth0 to 255.255.255.255 port 67
21:55:47 INFO : DHCPREQUEST on eth1 to 255.255.255.255 port 67
21:56:22 INFO : DHCPREQUEST on eth2 to 255.255.255.255 port 67
21:57:04 INFO : DHCPREQUEST on eth3 to 255.255.255.255 port 67
21:57:36 INFO : DHCPREQUEST on eth4 to 255.255.255.255 port 67
21:58:11 INFO : DHCPREQUEST on eth5 to 255.255.255.255 port 67
21:58:48 INFO : DHCPREQUEST on eth6 to 255.255.255.255 port 67
21:59:27 INFO : DHCPREQUEST on eth7 to 255.255.255.255 port 67
22:00:02 INFO : DHCPREQUEST on eth8 to 255.255.255.255 port 67
22:00:36 INFO : DHCPREQUEST on eth9 to 255.255.255.255 port 67
22:01:10 INFO : DHCPREQUEST on eth10 to 255.255.255.255 port 67
22:01:40 INFO : DHCPREQUEST on eth11 to 255.255.255.255 port 67
22:02:23 INFO : DHCPREQUEST on eth12 to 255.255.255.255 port 67
22:02:59 INFO : DHCPREQUEST on eth13 to 255.255.255.255 port 67
22:03:34 INFO : DHCPREQUEST on eth14 to 255.255.255.255 port 67
22:04:08 INFO : DHCPREQUEST on eth15 to 255.255.255.255 port 67
22:04:45 INFO : DHCPREQUEST on eth16 to 255.255.255.255 port 67
22:05:40 INFO : DHCPREQUEST on eth17 to 255.255.255.255 port 67
22:06:15 INFO : DHCPREQUEST on eth18 to 255.255.255.255 port 67
22:06:58 INFO : DHCPREQUEST on eth19 to 255.255.255.255 port 67
22:07:31 INFO : DHCPREQUEST on eth20 to 255.255.255.255 port 67
22:08:14 INFO : DHCPREQUEST on eth21 to 255.255.255.255 port 67
22:08:52 INFO : DHCPREQUEST on eth22 to 255.255.255.255 port 67
22:09:28 INFO : DHCPREQUEST on eth23 to 255.255.255.255 port 67
22:10:03 INFO : DHCPREQUEST on eth24 to 255.255.255.255 port 67
22:10:39 INFO : DHCPREQUEST on eth25 to 255.255.255.255 port 67
billh@lart:~$

Created attachment 296953 [details]
anaconda.log
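
The same verification can be scripted instead of read by eye from the grep output. The sketch below assumes the ports are named eth0 through eth(N-1) and that a `DHCPREQUEST on ethX` line means the port was brought up far enough to request a lease; the function name `missing_dhcp_requests` is hypothetical.

```python
import re

# Pattern taken from the DHCPREQUEST lines in the grep output above.
DHCP_REQUEST = re.compile(r"DHCPREQUEST on (\S+) to")


def missing_dhcp_requests(log_text, port_count):
    """Return the ethN names (N < port_count) with no DHCPREQUEST line in the log."""
    requested = set(DHCP_REQUEST.findall(log_text))
    expected = [f"eth{i}" for i in range(port_count)]
    return [name for name in expected if name not in requested]


# Two lines from the output above; eth1 is deliberately left out
# here to show what a gap would look like.
sample = """21:54:44 INFO : DHCPREQUEST on eth0 to 255.255.255.255 port 67
21:56:22 INFO : DHCPREQUEST on eth2 to 255.255.255.255 port 67"""

print(missing_dhcp_requests(sample, 3))  # ['eth1']
```

Run against the full log from the 26-port test with `port_count=26`, an empty list would correspond to the result reported in this comment.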
In case my #13 update is unclear, the first drop of RHEL 5.2 fixes this problem. I was able to bring up all 26 Ethernet ports and get a DHCP address for each port. I will try this on other systems also.

I tried a smaller server with 14 Ethernet ports and all the ports came up just fine and got DHCP addresses.

eth0 - Intel Corporation 82571EB Gigabit Ethernet Controller (Copper)
eth1 - Intel Corporation 82571EB Gigabit Ethernet Controller (Copper)
eth2 - Intel Corporation 82571EB Gigabit Ethernet Controller (Copper)
eth3 - Intel Corporation 82571EB Gigabit Ethernet Controller (Copper)
eth4 - Intel Corporation 82571EB Gigabit Ethernet Controller (Copper)
eth5 - Intel Corporation 82571EB Gigabit Ethernet Controller (Copper)
eth6 - Intel Corporation 82571EB Gigabit Ethernet Controller (Copper)
eth7 - Intel Corporation 82571EB Gigabit Ethernet Controller (Copper)
eth8 - Digital Equipment Corporation DECchip 21142/43
eth9 - Digital Equipment Corporation DECchip 21142/43
eth10 - Digital Equipment Corporation DECchip 21142/43
eth11 - Digital Equipment Corporation DECchip 21142/43
eth12 - Broadcom Corporation NetXtreme BCM5704 Gigabit Ethernet
eth13 - Broadcom Corporation NetXtreme BCM5704 Gigabit Ethernet

David, I just tested a blade with 16 Ethernet ports and it was also fine. All of these hardware configurations would have failed on RHEL 5 or RHEL 5.1. Thanks for the fix.

Bill

eth0 - Intel Corporation 82571EB Quad Port Gigabit Mezzanine Adapter
eth1 - Intel Corporation 82571EB Quad Port Gigabit Mezzanine Adapter
eth2 - Intel Corporation 82571EB Quad Port Gigabit Mezzanine Adapter
eth3 - Intel Corporation 82571EB Quad Port Gigabit Mezzanine Adapter
eth4 - Intel Corporation 82571EB Quad Port Gigabit Mezzanine Adapter
eth5 - Intel Corporation 82571EB Quad Port Gigabit Mezzanine Adapter
eth6 - Intel Corporation 82571EB Quad Port Gigabit Mezzanine Adapter
eth7 - Intel Corporation 82571EB Quad Port Gigabit Mezzanine Adapter
eth8 - Intel Corporation 82571EB Quad Port Gigabit Mezzanine Adapter
eth9 - Intel Corporation 82571EB Quad Port Gigabit Mezzanine Adapter
eth10 - Intel Corporation 82571EB Quad Port Gigabit Mezzanine Adapter
eth11 - Intel Corporation 82571EB Quad Port Gigabit Mezzanine Adapter
eth12 - Broadcom Corporation NetXtreme BCM5704S Gigabit Ethernet
eth13 - Broadcom Corporation NetXtreme BCM5704S Gigabit Ethernet
eth14 - Broadcom Corporation NetXtreme BCM5704S Gigabit Ethernet
eth15 - Broadcom Corporation NetXtreme BCM5704S Gigabit Ethernet

Bill, thanks for all the feedback and thanks for testing this out. Glad to hear the fixes are working.

*** Bug 320841 has been marked as a duplicate of this bug. ***

An advisory has been issued which should help the problem described in this bug report. This report is therefore being closed with a resolution of ERRATA. For more information on the solution and/or where to find the updated files, please follow the link below. You may reopen this bug report if the solution does not work for you.

http://rhn.redhat.com/errata/RHBA-2008-0397.html