Bug 476977

Summary: Eth0 not set up first try while using pxeboot image to get kickstart running. Some network cards.
Product: [Fedora] Fedora
Component: anaconda
Version: 10
Hardware: i386
OS: Linux
Status: CLOSED WORKSFORME
Severity: medium
Priority: low
Reporter: Karl Magnus Kolstø <karl.kolsto>
Assignee: David Cantrell <dcantrell>
QA Contact: Fedora Extras Quality Assurance <extras-qa>
CC: anaconda-maint-list, mlb
Doc Type: Bug Fix
Last Closed: 2009-02-21 04:33:28 UTC

Description Karl Magnus Kolstø 2008-12-18 12:28:18 UTC
Description of problem:
Booting from the standard pxeboot vmlinuz and initrd.img
anaconda starts up and launches NetworkManager to download the kickstart script. The network is not set up properly, and an error message appears.
Click "Retry" and everything is set up and installation goes as expected.

Seen with Syskonnect SK-9871 (skge) and RTL-8169 NICs

Version-Release number of selected component (if applicable):
anaconda version 11.4.1.64
Seen on both 32 and 64 bit

How reproducible:
Every time.

Steps to Reproduce:
1. Have a kickstart server with appropriate kickstart file
2. Have client with Syskonnect SK-9871 (driver: skge) or RTL-8169 NIC
3. Boot with the pxeboot kernel and initrd.img, using ks= to point to the kickstart file
4. Wait for anaconda to start and the Error message to turn up.
  
Actual results:
NetworkManager fails to set up the network card. anaconda shows an error message saying that the network couldn't be set up and offers a "Retry" button.
Pressing "Retry" sets up the network properly, and the kickstart installation continues as expected.


Expected results:

Installation should run without any intervention needed
(subject to what the kickstart file is set up to do, of course)

Additional info:
This is seen on machines with the following NIC chipsets:
Syskonnect SK-9871 (driver: skge) and RTL-8169
The Syskonnect card has been tested in widely different machines, and the behaviour follows the NIC. (More than one NIC has been tested.)
------------------------
The anaconda error looks something like this;
<box>
"Waiting for NetworkManager to configure eth0..."
</box>

<box>
Network Error
"There was an error configuring your network interface."
<button>
Retry
</button>
</box>

-----------------------
The messages from vt4
"
NetworkManager: <info>  (eth0):
NetworkManager: <info>  starting...
NetworkManager: <info>  Waiting for HAL to start...
NetworkManager: <WARN>  nm_generic_enable_loopback(): error -17 returned from rtnl_addr_add():
Sucess
NetworkManager: <info>  Trying to start the supplicant...
NetworkManager: <info>  Trying to start the system settings daemon...
nm-system-settings: initial_add_devices_of_type: could not get device from HAL: The name org.freedesktop.Hal was not provided by any .service files (2)
nm-system-settings: initial_add_devices_of_type: could not get device from HAL: The name org.freedesktop.Hal was not provided by any .service files (2)
nm-system-settings: initial_add_devices_of_type: could not get device from HAL: The name org.freedesktop.Hal was not provided by any .service files (2)
nm-system-settings: Loaded plugin ifcfg-fedora: (c) 2007 - 2008 Red Hat, Inc. To report bugs please use the NetworkManager mailing list.
nm-system-settings:     ifcfg-fedora: parsing /etc/sysconfig/network-scripts/ifcfg-eth0 ...
nm-system-settings:     ifcfg-fedora:     error: File '/etc/sysconfig/network-scripts/ifcfg-eth0' specified device 'eth0', but the device's type could not be determined.
NetworkManager: <info>  HAL re-appeared
NetworkManager: <info>  eth0: driver is 'skge'.
NetworkManager: <info>  Found new Ethernet device 'eth0'.
NetworkManager: <info>  (eth0): exported as /org/freedesktop/Hal/devices/net_00_00_5a_9d_3e_2c
NetworkManager: <info>  (eth0): device state change: 1 -> 2
NetworkManager: <info>  (eth0): bringing up device.
skge eth0: enabling interface
NetworkManager: <info>  (eth0): preparing device.
NetworkManager: <info>  (eth0): deactivating device (reason: 2).
skge eth0: Link is up at 100 Mbps, full duplex, flow control both
NetworkManager: <info>  (eth0): carrier now ON (device state 2)
NetworkManager: <info>  (eth0): device state change: 2 -> 3
"
<waiting>
------------------
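
For anyone comparing their own vt4 output against the log above, here is a minimal Python sketch that pulls the "device state change" lines out of such a log. The state names are an assumption based on the NMDeviceState enum as I recall it from NetworkManager 0.7, not something stated in this report; the function and variable names are made up for illustration.

```python
import re

# ASSUMPTION: numeric states as in NetworkManager 0.7's NMDeviceState enum.
# Check against the NetworkManager version actually in the install image.
STATE_NAMES = {
    0: "unknown",
    1: "unmanaged",
    2: "unavailable",
    3: "disconnected",
    4: "prepare",
    5: "config",
    6: "need-auth",
    7: "ip-config",
    8: "activated",
    9: "failed",
}

STATE_RE = re.compile(
    r"\((?P<dev>\w+)\): device state change: (?P<old>\d+) -> (?P<new>\d+)"
)


def state_transitions(lines):
    """Return (device, old_state, new_state) for every state-change line."""
    transitions = []
    for line in lines:
        match = STATE_RE.search(line)
        if match:
            transitions.append(
                (match["dev"], int(match["old"]), int(match["new"]))
            )
    return transitions


def describe(transitions):
    """Render transitions as readable strings using the assumed state names."""
    return [
        f"{dev}: {STATE_NAMES.get(old, old)} -> {STATE_NAMES.get(new, new)}"
        for dev, old, new in transitions
    ]
```

Fed the log above, this yields two transitions for eth0 (1 -> 2, then 2 -> 3). Under the assumed mapping the device reaches "disconnected" and the log shows no later activation state before the error dialog appears.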

Comment 1 David Cantrell 2008-12-19 02:41:29 UTC
What install image are you booting from?

Comment 2 Karl Magnus Kolstø 2008-12-19 08:11:29 UTC
We use the standard pxeboot kernel and image.
I just verified that by downloading them from an official mirror and comparing md5sums. These are the sums, just for reference;
45f3f7500d67eee47de6a84b2e302d37  initrd.img
a0385d163b3b50fae17200d642266aec  vmlinuz
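
If others want to check that they are booting the same images, a small Python sketch like the following can compare local files against the sums above. It uses only the standard library; the function names and the idea of passing a directory are illustrative assumptions.

```python
import hashlib
from pathlib import Path

# Checksums quoted in this comment.
EXPECTED_MD5 = {
    "initrd.img": "45f3f7500d67eee47de6a84b2e302d37",
    "vmlinuz": "a0385d163b3b50fae17200d642266aec",
}


def md5_of(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, read in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify(directory, expected=EXPECTED_MD5):
    """Map each expected file name to True if present with a matching sum."""
    results = {}
    for name, wanted in expected.items():
        path = Path(directory) / name
        results[name] = path.is_file() and md5_of(path) == wanted
    return results
```

Calling `verify("/path/to/pxeboot")` (path hypothetical) returns a dict such as `{"initrd.img": True, "vmlinuz": True}` when both files match.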

Comment 3 Matthew Boyle 2009-01-13 14:55:20 UTC
we've got the same problem here.  the DHCP server log reports the following:

--------------

12:18:30 dhcpd: DHCPDISCOVER from 00:11:d8:b7:8f:04 via eth0
12:18:30 dhcpd: DHCPOFFER on 10.0.0.72 to 00:11:d8:b7:8f:04 via eth0
12:18:32 dhcpd: DHCPREQUEST for 10.0.0.72 (10.0.0.120) from 00:11:d8:b7:8f:04 via eth0
12:18:32 dhcpd: DHCPACK on 10.0.0.72 to 00:11:d8:b7:8f:04 via eth0

[snip]

12:19:42 dhcpd: DHCPDISCOVER from 00:11:d8:b7:8f:04 via eth0
12:19:42 dhcpd: DHCPOFFER on 10.0.0.72 to 00:11:d8:b7:8f:04 via eth0
12:19:42 dhcpd: DHCPREQUEST for 10.0.0.72 (10.0.0.120) from 00:11:d8:b7:8f:04 via eth0
12:19:42 dhcpd: DHCPACK on 10.0.0.72 to 00:11:d8:b7:8f:04 via eth0

---------------

the first four lines correspond to the initial PXE boot, and the second four to the successful attempt.  so it looks like the DHCP server never sees NetworkManager's first attempt at acquiring a lease.  in fact, sniffing on that stretch of the network shows no such traffic either.

this is with a Marvell Yukon 88E8001 GbE NIC using the skge driver.  initrd and vmlinuz md5sums match those in comment #2.

the box (nvidia MCP51 on-board GbE, forcedeth driver) plugged into the same switch has no such problem.
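
to make the same check on another DHCP server, here is a rough Python sketch that groups dhcpd log lines like the ones above into completed DISCOVER-to-ACK exchanges per MAC address. it assumes the log format shown in this comment (time, "dhcpd:", message type, MAC somewhere on the line); the names are illustrative only.

```python
import re
from collections import defaultdict

# Matches lines such as:
#   12:18:30 dhcpd: DHCPDISCOVER from 00:11:d8:b7:8f:04 via eth0
LINE_RE = re.compile(
    r"^(?P<time>\d{2}:\d{2}:\d{2}) dhcpd: (?P<msg>DHCP[A-Z]+) .*?"
    r"(?P<mac>(?:[0-9a-f]{2}:){5}[0-9a-f]{2})",
    re.IGNORECASE,
)


def completed_exchanges(lines):
    """Return {mac: [(discover_time, ack_time), ...]} for finished exchanges."""
    exchanges = defaultdict(list)
    started = {}  # mac -> time of the most recent DHCPDISCOVER
    for line in lines:
        match = LINE_RE.match(line.strip())
        if not match:
            continue  # skip separators, "[snip]" and unrelated lines
        mac = match["mac"].lower()
        msg = match["msg"].upper()
        if msg == "DHCPDISCOVER":
            started[mac] = match["time"]
        elif msg == "DHCPACK" and mac in started:
            exchanges[mac].append((started.pop(mac), match["time"]))
    return dict(exchanges)
```

run over the eight log lines above, this reports two exchanges for 00:11:d8:b7:8f:04, one starting at 12:18:30 and one at 12:19:42, with nothing in between. that matches the observation that the first NetworkManager attempt never reaches the server.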

Comment 4 David Cantrell 2009-02-21 04:33:28 UTC
Unable to reproduce here.  Using e1000 NICs.  It could possibly be a problem with hal and/or NetworkManager interacting with your particular network devices.