Bug 804846 - dracut fails to retrieve network kickstart file, possibly PXE-specific, timing issue
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: Fedora
Classification: Fedora
Component: anaconda
Version: 17
Hardware: All
OS: Linux
Priority: unspecified
Severity: urgent
Target Milestone: ---
Assignee: Will Woods
QA Contact: Fedora Extras Quality Assurance
URL:
Whiteboard:
Depends On:
Blocks: F17Blocker, F17FinalBlocker
Reported: 2012-03-19 23:03 UTC by Orion Poplawski
Modified: 2012-05-01 21:54 UTC

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2012-05-01 21:54:02 UTC
Type: ---
Embargoed:


Attachments
anaconda.log (1007 bytes, text/x-log)
2012-03-19 23:03 UTC, Orion Poplawski

Description Orion Poplawski 2012-03-19 23:03:11 UTC
Created attachment 571238
anaconda.log

Description of problem:

With Beta.TC2 I'm getting:

The following error was found while parsing the kickstart configuration file:

The following problem occurred on line 0 of the kickstart file:

Unable to open input file: Could not open/read file:///run/install/ks.cfg

Failed to start the anaconda installation program         [FAILED]


My boot line is:
method=http://fedora.cora.nwra.com/development/17/x86_64/os/ ksdevice=link lang= text ks=http://cobbler.cora.nwra.com/cblr/svc/op/ks/system/vmf17 root=live:http://fedstage.cora.nwra.com/17-Beta.TC2/Fedora/x86_64/os/LiveOS/squashfs.img kssendmac


I don't see any requests in the cobbler http server log for the ks file from the vm.

Version-Release number of selected component (if applicable):
17.13

How reproducible:
2 for 2

Comment 1 Adam Williamson 2012-03-20 19:36:25 UTC
Hum - Tao Wu and Hongqing both marked the QA:Testcase_Kickstart_Http_Server_Ks_Cfg test as a pass in the matrix:

https://fedoraproject.org/wiki/Test_Results:Current_Installation_Test

so it seems like it worked for them...



-- 
Fedora Bugzappers volunteer triage team
https://fedoraproject.org/wiki/BugZappers

Comment 2 Orion Poplawski 2012-03-20 19:51:18 UTC
Just tried on a physical machine with a koan --replace-self.  It's complaining that it could not resolve cobbler.cora.nwra.com.  Saw that because it took a while to download the 32-bit squashfs.img.  So maybe a network timing issue?  Seemed strange that that system tried to download the kickstart before the squashfs.img, which doesn't appear to be the case in the 64-bit VM.

Comment 3 Orion Poplawski 2012-03-20 20:08:40 UTC
Definitely a timing issue in the 32-bit phys case.  I could download the kickstart file with curl from the dracut debug shell, so perhaps the initial attempt is starting before /etc/resolv.conf is written.  Seems a very different case than what I'm seeing in the vm.
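
One way to check the resolv.conf theory from the dracut debug shell is sketched below. has_nameserver is a hypothetical helper, not part of dracut or anaconda; it only tests whether a resolv.conf-style file already lists a nameserver, and it does not prove that DNS lookups actually succeed.

```shell
# Hypothetical helper (not shipped by dracut): succeed if the given
# resolv.conf-style file contains at least one "nameserver" entry.
# $1 = file to inspect; defaults to /etc/resolv.conf.
has_nameserver() {
    conf=${1:-/etc/resolv.conf}
    # A missing or unreadable file counts as "no resolver yet".
    [ -r "$conf" ] && grep -q '^[[:space:]]*nameserver[[:space:]]' "$conf"
}
```

Running something like "has_nameserver || echo 'no resolver configured yet'" right before the kickstart fetch would show whether the download is attempted before the resolver configuration exists.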

Comment 4 Orion Poplawski 2012-03-20 20:15:57 UTC
I see a /run/install dir in the dracut shell but not from the anaconda VT2 shell.  Perhaps the ks file is getting downloaded too early and then /run/install is getting wiped out?

Comment 5 Orion Poplawski 2012-03-20 20:23:04 UTC
On a 3rd system I get curl: (7) Failed to connect to 10.10.10.1: Network is unreachable.  This is with the IP address in the URL, so yes, the attempt is happening before the network is truly up.
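
The number in parentheses in that message is curl's exit status. Status 7 means curl could not connect at all, while a name resolution failure like the one in comment 2 would normally be status 6. A small sketch for telling the two symptoms apart when re-running the fetch by hand follows; explain_curl_status is a made-up helper name and the hints in the messages are only guesses at the cause.

```shell
# Hypothetical helper: translate the curl exit statuses relevant to this
# report into a short description. $1 = exit status of a curl run.
explain_curl_status() {
    case $1 in
        0) echo "ok" ;;
        6) echo "could not resolve host (resolver not configured yet?)" ;;
        7) echo "failed to connect (no address or route yet?)" ;;
        *) echo "other curl error $1" ;;
    esac
}
```

From the debug shell this could be used as "curl -sS -o /tmp/ks.cfg <kickstart url>; explain_curl_status $?".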

Comment 6 Orion Poplawski 2012-03-20 20:25:09 UTC
(In reply to comment #1)
> Hum - Tao Wu and Hongqing both marked the
> QA:Testcase_Kickstart_Http_Server_Ks_Cfg test as a pass in the matrix:
> 
> https://fedoraproject.org/wiki/Test_Results:Current_Installation_Test
> 
> so it seems like it worked for them...

Perhaps it works with the iso or other media?  I'll give that a try next.

Comment 7 Orion Poplawski 2012-03-20 20:54:16 UTC
Yeah, it seems to work okay with the netinst.iso, so I guess this only applies to PXE boot or similar setups.

Comment 8 Adam Williamson 2012-03-20 23:29:58 UTC


Comment 9 Orion Poplawski 2012-04-10 20:06:33 UTC
Still seeing this on at least one machine with Beta RC4 failing to get the kickstart file.  Old Dell Optiplex 170L with Intel e100 NIC.

Comment 10 Orion Poplawski 2012-04-10 20:09:34 UTC
It goes on to fetch the root image but then crashes with:

dracut Warning: Unable to process initqueue

Which I'm hoping is caused by being unable to download the kickstart file.

Comment 11 k.wic 2012-04-19 21:16:20 UTC
I ran into this problem as well with the following lines in pxelinux.cfg:

kernel f17/vmlinuz
append initrd=f17/initrd.img root=live:http://dl.fedoraproject.org/pub/fedora/linux/releases/test/17-Beta/Fedora/x86_64/os/LiveOS/squashfs.img inst.repo=http://dl.fedoraproject.org/pub/fedora/linux/releases/test/17-Beta/Fedora/x86_64/os inst.ks=http://rhe.fedorapeople.org/install/ks.cfg

It fails with curl complaining about the network not being reachable. When I add the inst.update=http://... option the update fails with the same error. The squashfs.img however can be downloaded without problems.

I played around a little with my initrd.img and added a 'sleep 30' to /lib/dracut/hooks/initqueue/online/00fetch-kickstart-net.sh right before 'if fetch_url "$kickstart" /tmp/ks.cfg; then'. This however made no difference at all.
Next I tried renaming /lib/dracut/hooks/initqueue/online/anaconda-ifcfg.sh, which is generated by $hookdir/cmdline/25parse-anaconda-options.sh, to 00anaconda-ifcfg.sh so that it is executed before 00fetch-kickstart-net.sh. This also made no difference.
Then I looked more closely at why the squashfs.img download was working and stumbled over the following two commands: 'all_ifaces_up' and 'setup_net $netif' (I hardcoded $netif to be em1). If they are placed in 00fetch-kickstart-net.sh at the same spot where I tried the 'sleep 30', and '. /lib/net-lib.sh' is added to provide those functions, then the download of the kickstart file works without problems.
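
Roughly, the ordering that worked can be sketched as below. This is an illustration only, not the actual contents of 00fetch-kickstart-net.sh: the wrapper function name is made up, all_ifaces_up, setup_net and fetch_url are assumed to be provided by the hook environment (the first two via /lib/net-lib.sh), and their return values are deliberately not checked because the sketch makes no assumption about what they return.

```shell
# Sketch of the call order only. fetch_kickstart_after_setup is a
# hypothetical wrapper; the real hook is a flat script, not a function.
# $1 = network interface (e.g. em1), $2 = kickstart URL.
fetch_kickstart_after_setup() {
    netif=$1
    kickstart=$2
    # Load the dracut network helpers unless they are already defined.
    command -v setup_net >/dev/null 2>&1 || . /lib/net-lib.sh
    # The two calls that made the difference, run before the download.
    all_ifaces_up
    setup_net "$netif"
    # Only now try to fetch the kickstart, as the original hook does.
    fetch_url "$kickstart" /tmp/ks.cfg
}
```

The point of the sketch is just that the network setup calls come before fetch_url, which a plain 'sleep 30' in the same place does not achieve.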

Unfortunately the kickstart I used does not work with F17, but I will try to fix that tomorrow. Then I will also take a closer look at what exactly in 'all_ifaces_up' and 'setup_net $netif' causes curl to work.

Comment 12 Adam Williamson 2012-04-20 18:36:00 UTC
Discussed at 2012-04-20 blocker review meeting: http://meetbot.fedoraproject.org/fedora-bugzappers/2012-04-20/fedora-bugzappers.2012-04-20-17.01.log.txt . Agreed we need more information on exactly what the problem is here before we can determine if it's serious enough to be a blocker - multiple people seem to be hitting trouble, but then others are not. It would be good if all testers could try various adjustments to the config to see if we can isolate exactly what the problem is here. tflink notes that he's never seen this, but he always uses numerical IP addresses not hostnames; name resolution could be a factor?

Comment 13 Will Woods 2012-04-20 22:15:36 UTC
I'm fairly sure this is caused by the 'online' hook scripts running before 'setup_net' happens, which should be fixed by these two patches:
  
  http://git.kernel.org/?p=boot/dracut/dracut.git;a=commitdiff;h=1e4a880
  https://www.redhat.com/archives/anaconda-devel-list/2012-April/msg00248.html

You could approximate this fix by adding these two lines to the top of 00fetch-kickstart-net.sh:

  . /lib/net-lib.sh
  setup_net $netif

Could someone confirm that fixes the problem?
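
For anyone testing against an unpacked initrd tree, a small sketch of applying those two lines is below. prepend_setup_net is a hypothetical helper, and unpacking and repacking the initrd itself is not covered here. It keeps an existing '#!' first line in place and puts the two lines right after it, which is an assumption about what "top" should mean for a script that starts with a shebang.

```shell
# Hypothetical helper: insert the two workaround lines at the top of an
# unpacked hook script. $1 = path to 00fetch-kickstart-net.sh in the tree.
prepend_setup_net() {
    hook=$1
    tmp="$hook.tmp.$$"
    first=$(head -n 1 "$hook")
    {
        # Keep a shebang line, if there is one, as the very first line.
        case $first in
            '#!'*) printf '%s\n' "$first" ;;
        esac
        # The two lines from this comment, written literally.
        printf '%s\n' '. /lib/net-lib.sh' 'setup_net $netif'
        # Then the rest of the original script, unchanged.
        case $first in
            '#!'*) tail -n +2 "$hook" ;;
            *) cat "$hook" ;;
        esac
    } > "$tmp" || return 1
    # Write back in place so the file keeps its mode, then clean up.
    cat "$tmp" > "$hook" && rm -f "$tmp"
}
```

Usage would be along the lines of "prepend_setup_net <tree>/lib/dracut/hooks/initqueue/online/00fetch-kickstart-net.sh", followed by rebuilding the initrd from the modified tree.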

Comment 14 Orion Poplawski 2012-04-23 17:27:04 UTC
I'd be happy to test with a test image; otherwise I don't know how to set up a test of this.

Comment 15 Fedora Update System 2012-04-23 19:08:21 UTC
anaconda-17.22-1.fc17 has been submitted as an update for Fedora 17.
https://admin.fedoraproject.org/updates/anaconda-17.22-1.fc17

Comment 16 Fedora Update System 2012-04-24 03:18:49 UTC
Package anaconda-17.22-1.fc17:
* should fix your issue,
* was pushed to the Fedora 17 testing repository,
* should be available at your local mirror within two days.
Update it with:
# su -c 'yum update --enablerepo=updates-testing anaconda-17.22-1.fc17'
as soon as you are able to.
Please go to the following url:
https://admin.fedoraproject.org/updates/FEDORA-2012-6486/anaconda-17.22-1.fc17
then log in and leave karma (feedback).

Comment 17 Fedora Update System 2012-04-26 15:59:40 UTC
anaconda-17.23-1.fc17 has been submitted as an update for Fedora 17.
https://admin.fedoraproject.org/updates/anaconda-17.23-1.fc17

Comment 18 Tim Flink 2012-04-26 17:04:56 UTC
Was a fix for this in anaconda-17.22-1? If so, can someone test with F17 final TC1?

If not, we can wait for TC2 - I've never hit this so I don't think I would be the best person to test the fix.

Comment 19 Orion Poplawski 2012-04-26 17:26:39 UTC
TC1 with 17.22 appears to have fixed it for me.

Comment 20 Tim Flink 2012-05-01 21:54:02 UTC
Since there have been no new reports of this recently and one report of it being fixed with anaconda-17.22-1, I'm closing it. If someone hits this same issue again, please re-open the bug.

