Bug 303681 - LTC31785-Installer only has hardware address for the first 10 ethernet cards
Summary: LTC31785-Installer only has hardware address for the first 10 ethernet cards
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Enterprise Linux 5
Classification: Red Hat
Component: anaconda
Version: 5.1
Hardware: All
OS: Linux
Priority: medium
Severity: medium
Target Milestone: ---
Target Release: ---
Assignee: David Cantrell
QA Contact: Alexander Todorov
URL:
Whiteboard:
Depends On: 230525
Blocks:
 
Reported: 2007-09-24 17:32 UTC by Ken Reilly
Modified: 2018-10-19 22:19 UTC (History)

Fixed In Version: RHBA-2008-0397
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2008-05-21 15:31:49 UTC
Target Upstream Version:
Embargoed:


Attachments
anaconda-RHEL-5-bz303681.patch (30.93 KB, patch)
2008-02-05 22:17 UTC, David Cantrell


Links
System ID Private Priority Status Summary Last Updated
Red Hat Product Errata RHBA-2008:0397 0 normal SHIPPED_LIVE anaconda bug fix and enhancement update 2008-05-19 23:11:23 UTC

Comment 1 RHEL Program Management 2007-10-16 03:38:30 UTC
This request was evaluated by Red Hat Product Management for inclusion in a Red
Hat Enterprise Linux maintenance release.  Product Management has requested
further review of this request by Red Hat Engineering, for potential
inclusion in a Red Hat Enterprise Linux Update release for currently deployed
products.  This request is not yet committed for inclusion in an Update
release.

Comment 2 Red Hat Bugzilla 2007-10-23 15:25:40 UTC
User pgraner's account has been closed

Comment 5 David Cantrell 2008-02-05 22:17:08 UTC
Created attachment 294052 [details]
anaconda-RHEL-5-bz303681.patch

This patch fixes the problem where we can't see all of the NICs in the system.
This should make all of them visible for use.

Comment 6 David Cantrell 2008-02-05 22:22:14 UTC
James,

Can I get a QA ack on this one?  Thought I had it fixed in 5.1, but apparently not.

Comment 8 David Cantrell 2008-02-06 03:54:05 UTC
This fix will be included in anaconda-11.1.2.97-1.

Comment 10 David Cantrell 2008-02-14 21:44:08 UTC
*** Bug 429968 has been marked as a duplicate of this bug. ***

Comment 11 Alexander Todorov 2008-02-19 16:54:22 UTC
Side note:
I can test with a Xen guest with up to 15 NICs. With 16 or more, something breaks.

Comment 12 Bill Hayes 2008-02-20 00:30:10 UTC
I tried the 20080212 nightly build and I was only able to bring up the first 12
Ethernet ports in stage 1 "Configure TCP/IP" screen.  The last 2 Ethernet ports
in this system would not come up despite repeated attempts to bring them up. 
The Ethernet port configuration was as follows:

  | eth0 - Intel Corporation 82571EB Gigabit Ethernet Controller (Copper)  ^ |  
  | eth1 - Intel Corporation 82571EB Gigabit Ethernet Controller (Copper)  # |  
  | eth2 - Intel Corporation 82571EB Gigabit Ethernet Controller (Copper)  : |  
  | eth3 - Intel Corporation 82571EB Gigabit Ethernet Controller (Copper)  : |  
  | eth4 - Intel Corporation 82571EB Gigabit Ethernet Controller (Copper)  : |  
  | eth5 - Intel Corporation 82571EB Gigabit Ethernet Controller (Copper)  v |  
  | eth6 - Intel Corporation 82571EB Gigabit Ethernet Controller (Copper)  ^ |  
  | eth7 - Intel Corporation 82571EB Gigabit Ethernet Controller (Copper)  : |  
  | eth8 - Digital Equipment Corporation DECchip 21142/43                  # |  
  | eth9 - Digital Equipment Corporation DECchip 21142/43                  : |  
  | eth10 - Digital Equipment Corporation DECchip 21142/43                 : |  
  | eth11 - Digital Equipment Corporation DECchip 21142/43                 v |  
  | eth12 - Broadcom Corporation NetXtreme BCM5704 Gigabit Ethernet        # |  
  | eth13 - Broadcom Corporation NetXtreme BCM5704 Gigabit Ethernet        v |  

It looks like this nightly drop had Anaconda 11.1.2.100-1
(./Server/anaconda-runtime-11.1.2.100-1.ia64.rpm).

Re: Bug 429968 



Comment 13 David Cantrell 2008-02-20 02:06:42 UTC
We're obviously seeing more than 10 NICs now, so the original reported problem
is fixed.  In the above example, eth12 and eth13 are BCM5704 devices.  Are these
supported by the RHEL 5.2 kernel?  And if so, have we enabled them in anaconda
with the correct driver?

Comment 14 Bill Hayes 2008-02-20 13:53:20 UTC
Once the installation is done, all of these ports will work fine.  So yes the
Broadcom 5704 ports on this system are fully supported by the kernel.  

If I reduce the number of Ethernet ports in this system, I can install on the
5704 ports just fine.  It seems to be a problem with the number of Ethernet
ports in the stage 1 anaconda installer.




Comment 15 Alexander Todorov 2008-02-20 14:22:50 UTC
Bill,
I've used up to 15 nics with Xen and I didn't hit any problems. I've configured
all of them to use DHCP and they were brought up once the system was installed
and booted for the first time.

Comment 16 Doug Chapman 2008-02-20 15:07:18 UTC
(In reply to comment #15)
> Bill,
> I've used up to 15 nics with Xen and I didn't hit any problems. I've configured
> all of them to use DHCP and they were brought up once the system was installed
> and booted for the first time.

The problem Bill is reporting is related to anaconda, which has nothing to do
with Xen.  Also, note that Bill says all network connections work just fine
_after_ install, which I believe is what you are saying here.  We still have the
problem that you cannot install over some of the network devices.


Comment 17 Alexander Todorov 2008-02-20 15:49:22 UTC
Let me clarify:
I've used Xen to simulate http installation on hardware with 15 nics. anaconda
loader sees all of them (as in comment #12) and they are configurable in stage2.
All of them have different HW addresses (as specified by the virtualization
layer). I always chose a random device in the installer to complete the
installation.  All of the devices are brought up when the system is booted after
reboot.
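
A minimal sketch of how the interface list for such a guest could be generated, assuming a classic Xen guest config file where `vif` is a list of strings. The helper name `xen_vif_entries`, the bridge name `xenbr0`, and the MAC numbering scheme are assumptions for illustration, not details of the test setup described here. 00:16:3e is the MAC prefix reserved for Xen guests.

```python
def xen_vif_entries(count, bridge="xenbr0"):
    """Build `count` vif strings, each with a distinct MAC address.

    The last MAC octet is the interface index, so this sketch supports
    at most 256 interfaces. The bridge name is an assumption.
    """
    if not 0 <= count <= 256:
        raise ValueError("count must be between 0 and 256")
    return ["mac=00:16:3e:00:00:%02x,bridge=%s" % (i, bridge)
            for i in range(count)]


if __name__ == "__main__":
    # Emit a line in the form used by a Xen guest config file.
    print("vif = %r" % xen_vif_entries(15))
```

Giving every interface its own MAC address is what lets the installer report a different hardware address for each device.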

I believe the original reported problem was that Anaconda could not see more
than 10 network devices which is solved now. There might be some other problems
related to this as well but it's not obvious (at least for me) from all these
comments.



Comment 18 Doug Chapman 2008-02-20 16:58:17 UTC
The original description of this bug is marked private so neither I nor the
others from HP can read it.  Our BZ:
https://bugzilla.redhat.com/show_bug.cgi?id=429968 was closed as a duplicate of
this BZ.

Perhaps we need to un-mark bug 429968 as a dup since it appears that bug is NOT
fixed.


Comment 19 David Cantrell 2008-02-20 18:43:31 UTC
Doug,

Bug #429968 is a problem with seeing more than 10 NICs.  That's been fixed now.
What you're hitting now is likely a problem with us missing a required driver
in the anaconda environment, which makes it appear that the original problem is
still unresolved.  This is a problem with encountering multiple bugs in a row when
testing.  As noted in comment #12, Bill is able to see all but the last two
Ethernet devices.  That means he can see through device 11, so the more than 10
problem is solved.  The last two devices are BCM5704, and I don't know if we
have that driver in the anaconda environment.  To me, this is now a hardware
enablement issue in anaconda, so it's a completely separate issue.

This is a new bug that should be filed separately because it's no longer the >10
NIC problem.

Thanks.

Comment 20 Bill Hayes 2008-02-20 19:29:37 UTC
David,

I am getting a little confused.

Bug #429968 is a problem with being able to use all NICs for installation within
stage 1 anaconda (like DHCP setup of the ports).  It is not a problem with
seeing more than 10 NICs; assuming your definition of "seeing" is that the stage
1 anaconda gives all ports a ethX number and lists it as something you can
network install from.

Bug #429968 is a problem with getting all the ports to bring up the link and
get an address from DHCP.  I have seen this problem on Ethernet ports supported
by the e1000, s2io and tg3 drivers.

If you have a low number of NIC ports (like 8 or fewer) you will never see the
problem.  If I am seeing the problem and then I remove several ports, the
problem goes away.  It does not seem to be a hardware/linux-driver problem at
all.  The problem may be related to memory pressure or something, but it does
not appear to be a driver problem at all.

I have seen a system with more than 10 Ethernet ports that does not show the
problem.  It almost seems like you need to have a particular configuration of
Ethernet ports.

We have reported this as an s2io and a tg3 driver problem, and that never
solved the problem.  See https://bugzilla.redhat.com/show_bug.cgi?id=320841 and
https://enterprise.redhat.com/issue-tracker/?module=issues&action=view&tid=108235&header_entry=1.

Are there any DEBUG output options that we can turn on within the stage 1
installer so that we can generate the error again?

Thanks.  

Comment 22 David Cantrell 2008-02-20 20:41:02 UTC
(In reply to comment #20)
> David,
> 
> I am getting a little confused.
> 
> Bug #429968 is a problem with being able to use all NICs for installation within
> stage 1 anaconda (like DHCP setup of the ports).  It is not a problem with
> seeing more than 10 NICs; assuming your definition of "seeing" is that the stage
> 1 anaconda gives all ports a ethX number and lists it as something you can
> network install from.
> 
> Bug #429968 is a problem with getting all the ports to bring up the link and
> get an address from DHCP.  I have seen this problem on Ethernet ports supported
> by the e1000, s2io and tg3 drivers.
> 
> If you have a low number of NIC ports (like 8 or fewer) you will never see the
> problem.  If I am seeing the problem and then I remove several ports, the
> problem goes away.  It does not seem to be a hardware/linux-driver problem at
> all.  The problem may be related to memory pressure or something, but it does
> not appear to be a driver problem at all.
> 
> I have seen a system with more than 10 Ethernet ports that does not show the
> problem.  It almost seems like you need to have a particular configuration of
> Ethernet ports.

If this is the case, the fixes present in the RHEL 5.2 trees should work for you
now.  In your kickstart file, add network lines for all the devices you want up
and running for installation.  Anaconda will bring them all up.  There is no way
to do this from the interactive installer, it's only available from kickstart.

The reason you were seeing this problem is because of the code we had in
anaconda to find those ethX devices.  We couldn't see above eth10, so that's why
anaconda could never bring them up--we couldn't find them.
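
The patch is attached to this bug rather than quoted in the comments, so the following is only an illustrative sketch of enumerating ethX devices without a fixed upper bound. It is not anaconda's code. It assumes Python and a hypothetical helper name, `ethernet_devices`; `/sys/class/net` is the standard Linux sysfs directory that lists network interfaces.

```python
import os
import re

# Match any ethN name, with one or more digits, so eth10 and above qualify.
ETH_NAME = re.compile(r"^eth(\d+)$")


def ethernet_devices(names):
    """Return the ethN entries from `names`, ordered by numeric index."""
    found = []
    for name in names:
        match = ETH_NAME.match(name)
        if match:
            found.append((int(match.group(1)), name))
    # Sort on the integer index so eth10 comes after eth9, not after eth1.
    return [name for _, name in sorted(found)]


if __name__ == "__main__":
    sysfs_net = "/sys/class/net"
    if os.path.isdir(sysfs_net):
        print(ethernet_devices(os.listdir(sysfs_net)))
```

Sorting on the parsed integer rather than the raw string avoids any dependence on single-digit device numbers.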

> We have reported this as an s2io and a tg3 driver problem, and that never
> solved the problem.  See https://bugzilla.redhat.com/show_bug.cgi?id=320841 and
> https://enterprise.redhat.com/issue-tracker/?module=issues&action=view&tid=108235&header_entry=1.

If anaconda is doing everything it should be, then it would be a kernel issue.
The bug you refer to is even a kernel bug.

> Are there any DEBUG output options that we can turn on within the stage 1
> installer so that we can generate the error again?

Boot with 'loglevel=debug' as a boot parameter.

Proceed through stage1 and advance to stage2 (stage2 starts after you see
"Running anaconda, the Red Hat Enterprise Linux system installer - please
wait...").  Collect the log files from /tmp for detailed debugging information.

Your kickstart file should have network lines for all devices you want brought
up during installation:

network --device eth0 --bootproto dhcp
network --device eth1 --bootproto dhcp
.... and so on

In stage1, you will see it bring up each interface.
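
For a machine with many ports, the network lines can be generated rather than typed by hand. This is a minimal sketch, assuming Python and a hypothetical helper name, `kickstart_network_lines`. Only the line format shown above comes from this comment.

```python
def kickstart_network_lines(count, bootproto="dhcp"):
    """Return one kickstart `network` line per device, eth0..eth(count-1)."""
    if count < 0:
        raise ValueError("count must not be negative")
    return ["network --device eth%d --bootproto %s" % (index, bootproto)
            for index in range(count)]


if __name__ == "__main__":
    # Example: a system with 14 Ethernet ports, eth0 through eth13.
    print("\n".join(kickstart_network_lines(14)))
```

The output can be pasted into the kickstart file as is.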


Comment 23 Doug Chapman 2008-02-20 21:53:07 UTC
(In reply to comment #22)

> Boot with 'loglevel=debug' as a boot parameter.
> 
> Proceed through stage1 and advance to stage2 (stage2 starts after you see
> "Running anaconda, the Red Hat Enterprise Linux system installer - please
> wait...").  Collect the log files from /tmp for detailed debugging information.

I think you are missing the point of the problem we reported (as the other BZ).
 We are trying to install over the network device.  This has nothing to do with
configuring the network device for the installed system.  Our problem is in
anaconda stage 1 so we cannot even get to stage 2 so the loglevel=debug won't
give any info.

> 
> Your kickstart file should have network lines for all devices you want brought
> up during installation:

Since it never is able to bring up the network device in stage1 it can't even
get a kickstart file since that needs to be grabbed over the network as well.



Comment 25 Alexander Todorov 2008-02-22 12:56:36 UTC
Folks,
can we move forward with this issue? What was the actual problem reported at
first, and has it been fixed? IMO (comment #24) it was fixed. I'll put this BZ in
VERIFIED on Monday if there's no reply.

HP, can you please open separate bugs for other issues (even ones connected with
this one) so we can track them more easily?

Thanks.


Comment 26 Doug Chapman 2008-02-22 14:14:27 UTC
We have determined that the issue we reported at HP was incorrectly dup'ed
to this bug.  We have un-dup'ed it, so yes, go ahead and set this one to verified.

- Doug


Comment 27 Alexander Todorov 2008-02-22 14:20:25 UTC
As per comment #26, moving to VERIFIED

Thanks Doug.

Comment 30 errata-xmlrpc 2008-05-21 15:31:49 UTC
An advisory has been issued which should help the problem
described in this bug report. This report is therefore being
closed with a resolution of ERRATA. For more information
on the solution and/or where to find the updated files,
please follow the link below. You may reopen this bug report
if the solution does not work for you.

http://rhn.redhat.com/errata/RHBA-2008-0397.html


