Bug 467427 - Kickstart hangs waiting for network
Summary: Kickstart hangs waiting for network
Keywords:
Status: CLOSED DUPLICATE of bug 466340
Alias: None
Product: Fedora
Classification: Fedora
Component: NetworkManager
Version: rawhide
Hardware: All
OS: Linux
Priority: medium
Severity: medium
Target Milestone: ---
Assignee: Dan Williams
QA Contact: Fedora Extras Quality Assurance
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2008-10-17 12:32 UTC by Joe Orton
Modified: 2008-10-30 05:57 UTC (History)

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2008-10-30 05:57:18 UTC
Type: ---
Embargoed:


Attachments
Screendump of tty4 (51.48 KB, image/jpeg)
2008-10-26 09:25 UTC, A.J. Werkman
Screen image of tty4 (86.28 KB, image/jpeg)
2008-10-27 10:08 UTC, Joe Orton

Description Joe Orton 2008-10-17 12:32:30 UTC
Description:
A kickstart is hanging waiting for the network to come up.

The last message on console 0 is "waiting for hardware to initialize..."

The last message on the syslog console is an NM state change 2 -> 3, following
eth0 reporting link up and various other bits.

This is a bog-standard (cobbler-driven) kickstart.
anaconda-11.4.1.49
Raw Hide compose from 2008-10-15/16-ish

Comment 1 David Cantrell 2008-10-17 20:34:32 UTC
How did you start the installer?  Can you provide the kickstart file used?

Comment 2 Joe Orton 2008-10-20 12:04:16 UTC
A pxeboot using the /images/pxeboot/{vmlinuz,initrd.img}

label linux
        kernel /images/store-rawhide-i386/vmlinuz
        append ksdevice=eth0 lang=  kssendmac syslog=monolith.manyfish.co.uk:25150 text  initrd=/images/store-rawhide-i386/initrd.img ks=http://monolith.manyfish.co.uk/cblr/svc/op/ks/system/ereskigal

Comment 3 Joe Orton 2008-10-20 12:05:05 UTC
It's not actually getting to the point where it fetches the kickstart file, AFAICT.

Comment 4 Joe Orton 2008-10-21 13:34:20 UTC
Same with anaconda .50 and the Raw Hide push from the 20th.  A non-kickstart install proceeds as expected and brings up the network for an NFS install fine.

Comment 5 A.J. Werkman 2008-10-23 07:22:04 UTC
I see the same problem with anaconda 11.4.1.50 in rawhide.

I start the installation from a second hard disk.

My grub params are:

Title Kickstart
     root (hd1,0)
     kernel /boot/vmlinuz.raw ksdevice=eth0 ks=http://<server>/ks.raw
     initrd /boot/vmlinuz.raw

When I do the same but skip the ksdevice part, the kickstart installation goes through without a problem.

Title Kickstart
     root (hd1,0)
     kernel /boot/vmlinuz.raw ks=http://<server>/ks.raw
     initrd /boot/vmlinuz.raw

Comment 6 Frank Arnold 2008-10-24 12:36:48 UTC
I'm seeing this too while trying to install F10 as a guest with kickstart on upstream Xen. When you wait for about 1 minute on the message 'waiting for hardware to initialize...' and then just press Return you will get a segfault:

loader received SIGSEGV!  Backtrace:
/sbin/loader(loaderSegvHandler+0x7e)[0x4095ae]
/lib64/libc.so.6[0x7fb1979ef100]
/lib64/libnewt.so.0.52(newtGetKey+0x40)[0x639690]
/lib64/libnewt.so.0.52(newtFormRun+0x47d)[0x63c23d]
/lib64/libnewt.so.0.52(newtRunForm+0x10)[0x63c470]
/lib64/libnewt.so.0.52[0x6425ac]
/lib64/libnewt.so.0.52(newtWinMessage+0x8a)[0x642a2a]
/sbin/loader(readNetConfig+0xcb)[0x41cc3b]
/sbin/loader(kickstartNetworkUp+0x72)[0x41d222]
/sbin/loader(main+0x705)[0x40a545]
/lib64/libc.so.6(__libc_start_main+0xe6)[0x7fb1979da546]
/sbin/loader[0x4078a9]
install exited abnormally [1/1]
disabling swap...
unmounting filesystems...
        /proc done
        /dev/pts done
        /sys done
sending termination signals...done
sending kill signals...done
you may safely reboot your system

I was able to reproduce it reliably. This is with Rawhide boot.iso from 2008-10-23.

Comment 7 Joe Orton 2008-10-24 12:57:25 UTC
I saw that too.   The code looks like:

        i = get_connection(iface);
        newtPopWindow();

        if (i > 0) {
            newtWinMessage(_("Network Error"), _("Retry"),
                           _("There was an error configuring your network "
                             "interface."));

so I guess that:

1) the get_connection is failing

2) the error path then hangs since it's calling into newt at a point where newt has not been initialized

Comment 8 David Cantrell 2008-10-25 01:36:12 UTC
(In reply to comment #7)
> I saw that too.   The code looks like:
> 
>         i = get_connection(iface);
>         newtPopWindow();
> 
>         if (i > 0) {
>             newtWinMessage(_("Network Error"), _("Retry"),
>                            _("There was an error configuring your network "
>                              "interface."));
> 
> so I guess that:
> 
> 1) the get_connection is failing

Is there anything on tty3 or tty4?

> 2) the error path then hangs since it's calling into newt at a point where newt
> has not been initialized

newt is still initialized at this point, so that likely isn't the cause of the SIGSEGV.

Comment 9 David Cantrell 2008-10-25 01:55:51 UTC
I have been unable to reproduce this problem with the latest development build of anaconda.

Can you try rawhide again once anaconda-11.4.1.51-1 appears in the tree?

Comment 10 A.J. Werkman 2008-10-25 13:24:00 UTC
In Anaconda-11.4.1.50 my tty3 shows:

INFO: getting kickstart file
INFO: doing kickstart ... setting it up

My tty4 shows:
skge eth0: Link is up at 1000 Mbps, full duplex, flow control both
NetworkManager: <info> (eth0): carrier now ON (device state 2)
NetworkManager: <info> (eth0): device state change: 2->3

I don't have anaconda 51 right now in my rawhide tree.

Comment 11 A.J. Werkman 2008-10-26 09:01:47 UTC
This morning I retried this with anaconda-11.4.1.51 and the problem is still there.

Comment 12 A.J. Werkman 2008-10-26 09:25:48 UTC
Created attachment 321544
Screendump of tty4

With anaconda-11.4.1.51, kickstarting hangs even if you do not use the ksdevice command-line parameter. This attachment shows the tty4 screen.

Comment 13 Joe Orton 2008-10-27 10:08:07 UTC
Created attachment 321594
Screen image of tty4

Comment 14 David Cantrell 2008-10-27 19:30:26 UTC
Based on the screen capture from comment #13, I think this is a NetworkManager problem with nm-system-settings.

Comment 15 Dan Williams 2008-10-27 20:10:42 UTC
This is probably a dupe of the one where nm-system-settings cannot determine the device's type if it's a qemu network adapter or something.  I uploaded a small script to that bug that should be run in the target install environment to see what the return error from SIOCGIWRANGE is.

Comment 16 David Cantrell 2008-10-27 20:23:37 UTC
Which bug # was that?

Comment 17 Joe Orton 2008-10-28 09:33:45 UTC
Looks like bug 466340.  I'll try running the program; this is a bare-metal install with an ethernet interface, not a virtualized one or a wireless one.

Comment 18 Mark Wielaard 2008-10-28 10:18:15 UTC
Seeing the same thing with a kvm/qemu setup that uses preupgrade to go to rawhide. The last message on console 0 is "waiting for hardware to initialize..." and when you hit a key it shows a backtrace similar to comment #6. The last message on the syslog console is a NetworkManager message: <info> (eth0): device state change: 2 -> 3. This is with anaconda 11.4.1.51. Preupgrade grub line is: kernel /boot/upgrade/vmlinuz preupgrade repo=hd::UID=a73876cc-24ed-4c2e-96fe-a1c7bfad0e95:/boot/upgrade/ks.cfg

Comment 19 Joe Orton 2008-10-28 15:00:09 UTC
With iwrange-test I had to:

1) change it to use errno rather than the return value from ioctl
2) use a hacked copy of linux/wireless.h which didn't include linux/if.h et al, which conflicted with the userspace versions

output:

[root@ereskigal ~]# ./iwrange-test eth0
error getting IWRANGE: 95 (Operation not supported)
[root@ereskigal ~]# ifdown eth0; ./iwrange-test eth0; ifup eth0
error getting IWRANGE: 22 (Invalid argument)
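
Change (1) above reflects how ioctl() reports failure: it returns -1 and stores the reason in errno, so the return value alone says nothing about why the call failed. A minimal sketch of that idiom follows. probe_ioctl and report are hypothetical helpers, not code from iwrange-test, and the request is left generic so the sketch does not need linux/wireless.h.

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>
#include <string.h>
#include <sys/ioctl.h>

/* Hypothetical helper (not from iwrange-test): run an ioctl and return
 * 0 on success or the errno value on failure. ioctl() itself only
 * returns -1 on error; the reason is stored in errno. */
int probe_ioctl(int fd, unsigned long request, void *arg)
{
    if (ioctl(fd, request, arg) == -1)
        return errno;  /* read errno right away, before other calls change it */
    return 0;
}

/* Print a result in the same shape as the output above. */
void report(const char *what, int err)
{
    assert(what != NULL);
    if (err != 0)
        printf("error getting %s: %d (%s)\n", what, err, strerror(err));
    else
        printf("%s: ok\n", what);
}
```

Calling probe_ioctl on an invalid descriptor, for instance with FIONREAD as the request, yields EBADF rather than a bare -1, which is the distinction that matters when comparing against EOPNOTSUPP or EINVAL.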

Comment 20 Joe Orton 2008-10-28 15:03:15 UTC
that's with a (manual, non-kickstart) Raw Hide install from this morning:

[root@ereskigal ~]# rpm -qf /usr/sbin/nm-system-settings 
NetworkManager-0.7.0-0.11.svn4201.fc10.i386

Notwithstanding the NetworkManager bug here, I'd still say that there are bug(s) in anaconda here too:

1) the fact that it hangs forever rather than reporting anything useful when NetworkManager fails to DTRT like this

2) the fact that it segfaults when you press a key after (1) occurs
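
As an illustration of point (1), a wait that gives up after a deadline lets the caller report an error instead of hanging. This is only a sketch under stated assumptions: wait_until_ready, ready_fn and the two sample callbacks are hypothetical names, and nothing here is taken from anaconda's loader.

```c
#include <assert.h>
#include <stdbool.h>
#include <time.h>
#include <unistd.h>

/* Hypothetical callback type: returns true once the network is usable. */
typedef bool (*ready_fn)(void *ctx);

/* Poll is_ready once per second until it succeeds or timeout_sec has
 * elapsed. Returns 0 on success and -1 on timeout, so the caller can
 * show an error instead of waiting forever. Uses wall-clock time(),
 * which is coarse but enough for a sketch. */
int wait_until_ready(ready_fn is_ready, void *ctx, int timeout_sec)
{
    assert(is_ready != NULL);
    time_t start = time(NULL);

    for (;;) {
        if (is_ready(ctx))
            return 0;
        if (difftime(time(NULL), start) >= (double)timeout_sec)
            return -1;
        sleep(1);
    }
}

/* Sample callback: never reports ready (stands in for a stuck daemon). */
bool never_ready(void *ctx)
{
    (void)ctx;
    return false;
}

/* Sample callback: reports ready once the int counter in ctx reaches 0. */
bool ready_after_countdown(void *ctx)
{
    int *remaining = ctx;
    if (*remaining <= 0)
        return true;
    (*remaining)--;
    return false;
}
```

A caller would check the return value and surface a message on timeout; the timeout length itself is a policy choice this sketch does not try to settle.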

Comment 21 Dan Williams 2008-10-28 16:21:43 UTC
Thanks Joe!  I'll fix the NM stuff to do the ioctl() correctly, and add EINVAL to the check.  I'd argue that EINVAL is the wrong value to return from SIOCGIWRANGE, because if the device isn't wireless it should be EOPNOTSUPP, and some wireless devices may legitimately return EINVAL from SIOCGIWRANGE, which could lead to wireless devices being mischaracterized too.

iwconfig uses SIOCGIWNAME as the check, which is wrong, but I'll change NM to reflect that behavior as well.
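
One reading of the EINVAL change described above, sketched as a small classifier. This is an assumption about the intent, not NetworkManager's actual code: the enum, classify_iwrange and dev_kind_name are hypothetical names, and treating EINVAL as "not wireless" carries the misclassification risk already noted.

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical result type for a wireless probe. */
enum dev_kind { DEV_WIRELESS, DEV_NOT_WIRELESS, DEV_UNKNOWN };

/* rc is the return value of ioctl(SIOCGIWRANGE), err is errno captured
 * immediately after the call. Assumption: both EOPNOTSUPP and EINVAL
 * are treated as "not a wireless device". */
enum dev_kind classify_iwrange(int rc, int err)
{
    if (rc == 0)
        return DEV_WIRELESS;
    if (err == EOPNOTSUPP || err == EINVAL)
        return DEV_NOT_WIRELESS;
    return DEV_UNKNOWN;  /* any other error: do not guess the type */
}

/* Human-readable name, useful for log messages. */
const char *dev_kind_name(enum dev_kind kind)
{
    switch (kind) {
    case DEV_WIRELESS:     return "wireless";
    case DEV_NOT_WIRELESS: return "not wireless";
    case DEV_UNKNOWN:      return "unknown";
    }
    assert(!"unhandled dev_kind");
    return "?";
}
```

With the two errno values reported in comment #19 (95 and 22), this sketch classifies eth0 as not wireless in both the up and down states.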

Comment 22 Dan Williams 2008-10-30 05:57:18 UTC

*** This bug has been marked as a duplicate of bug 466340 ***

