Bug 804846
Summary: | dracut fails to retrieve network kickstart file, possibly PXE-specific, timing issue
---|---
Product: | [Fedora] Fedora
Component: | anaconda
Version: | 17
Hardware: | All
OS: | Linux
Status: | CLOSED CURRENTRELEASE
Severity: | urgent
Priority: | unspecified
Reporter: | Orion Poplawski <orion>
Assignee: | Will Woods <wwoods>
QA Contact: | Fedora Extras Quality Assurance <extras-qa>
CC: | atigro, awilliam, g.kaviyarasu, jonathan, k.wic, robatino, tflink, vanmeeuwen+fedora
Doc Type: | Bug Fix
Last Closed: | 2012-05-01 21:54:02 UTC
Bug Blocks: | 752650

Description
Orion Poplawski
2012-03-19 23:03:11 UTC
Hum - Tao Wu and Hongqing both marked the QA:Testcase_Kickstart_Http_Server_Ks_Cfg test as a pass in the matrix:

https://fedoraproject.org/wiki/Test_Results:Current_Installation_Test

so it seems like it worked for them...

-- Fedora Bugzappers volunteer triage team https://fedoraproject.org/wiki/BugZappers

Just tried on a physical machine with a koan --replace-self. It's complaining that it could not resolve cobbler.cora.nwra.com. Saw that because it took a while to download the 32-bit squashfs.img. So maybe a network timing issue? Seemed strange that that system tried to download the kickstart before the squashfs.img, which doesn't appear to be the case in the 64-bit VM.

Definitely a timing issue in the 32-bit phys case. I could download the kickstart file with curl from the dracut debug shell, so perhaps the initial attempt is starting before /etc/resolv.conf is written. Seems a very different case than what I'm seeing in the vm. I see a /run/install dir in the dracut shell but not from the anaconda VT2 shell. Perhaps the ks file is getting downloaded too early and then /run/install is getting wiped out?

3rd system - get curl: (7) Failed to connect to 10.10.10.1: Network is unreachable. This is with using the ip addr in the url, so yes the attempt is happening before the network is truly up.

(In reply to comment #1)
> Hum - Tao Wu and Hongqing both marked the
> QA:Testcase_Kickstart_Http_Server_Ks_Cfg test as a pass in the matrix:
>
> https://fedoraproject.org/wiki/Test_Results:Current_Installation_Test
>
> so it seems like it worked for them...

Perhaps it works with the iso or other media? I'll give that a try next.

Yeah, it seems to work okay with the netinst.iso, so I guess this only applies to PXE boot or similar setups.

Still seeing this on at least one machine with Beta RC4 failing to get the kickstart file. Old Dell Optiplex 170L with Intel e100 NIC.
It goes on to fetch the root image but then crashes with:

    dracut Warning: Unable to process initqueue

which I'm hoping is caused by being unable to download the kickstart file.

I ran into this problem as well with the following lines in pxelinux.cfg:

    kernel f17/vmlinuz
    append initrd=f17/initrd.img root=live:http://dl.fedoraproject.org/pub/fedora/linux/releases/test/17-Beta/Fedora/x86_64/os/LiveOS/squashfs.img inst.repo=http://dl.fedoraproject.org/pub/fedora/linux/releases/test/17-Beta/Fedora/x86_64/os inst.ks=http://rhe.fedorapeople.org/install/ks.cfg

It fails with curl complaining about the network not being reachable. When I add the inst.update=http://... option the update fails with the same error. The squashfs.img however can be downloaded without problems.

I played around a little with my initrd.img and added a 'sleep 30' to /lib/dracut/hooks/initqueue/online/00fetch-kickstart-net.sh right before 'if fetch_url "$kickstart" /tmp/ks.cfg; then'. This however made no difference at all.

Next I tried renaming /lib/dracut/hooks/initqueue/online/anaconda-ifcfg.sh, which is generated by $hookdir/cmdline/25parse-anaconda-options.sh, to 00anaconda-ifcfg.sh so that it is executed before 00fetch-kickstart-net.sh. This also made no difference.

Then I looked closer at why the squashfs.img download was working and I stumbled over the following two commands: 'all_ifaces_up' and 'setup_net $netif' (I hardcoded $netif to be em1). If they are placed in 00fetch-kickstart-net.sh at the same spot where I had tried the 'sleep 30', and '. /lib/net-lib.sh' is added to provide those functions, then the download of the kickstart file works without problems.

Unfortunately the kickstart I used does not work with F17, but I will try to fix that tomorrow. Then I will also take a closer look at what exactly in 'all_ifaces_up' and 'setup_net $netif' causes curl to work.
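The workaround described in the comment above (sourcing net-lib.sh and calling setup_net before the kickstart fetch) could be sketched roughly as follows. This is only an illustration, not the actual patch: the `ensure_net_up` wrapper, the `NET_LIB` variable and the readability guard are assumptions added so the snippet does nothing harmful outside the initramfs. The sketch also leaves out the 'all_ifaces_up' call that the commenter ran alongside setup_net.

```shell
#!/bin/sh
# Hypothetical helper, not part of dracut or anaconda.
# NET_LIB defaults to the path mentioned in the comment above.
NET_LIB=${NET_LIB:-/lib/net-lib.sh}

# ensure_net_up IFACE: source net-lib.sh and bring up IFACE via setup_net.
# Returns non-zero if the library cannot be read.
ensure_net_up() {
    _iface=$1
    if [ ! -r "$NET_LIB" ]; then
        echo "net-lib not found at $NET_LIB" >&2
        return 1
    fi
    # setup_net is expected to be provided by the sourced library.
    . "$NET_LIB"
    setup_net "$_iface"
}
```

In the hook this would be called as `ensure_net_up "$netif"` just before the `fetch_url "$kickstart" /tmp/ks.cfg` line quoted above.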
Discussed at 2012-04-20 blocker review meeting: http://meetbot.fedoraproject.org/fedora-bugzappers/2012-04-20/fedora-bugzappers.2012-04-20-17.01.log.txt . Agreed we need more information on exactly what the problem is here before we can determine if it's serious enough to be a blocker - multiple people seem to be hitting trouble, but then others are not. It would be good if all testers could try various adjustments to the config to see if we can isolate exactly what the problem is here. tflink notes that he's never seen this, but he always uses numerical IP addresses not hostnames; name resolution could be a factor?

I'm fairly sure this is caused by the 'online' hook scripts running before 'setup_net' happens, which should be fixed by these two patches:

http://git.kernel.org/?p=boot/dracut/dracut.git;a=commitdiff;h=1e4a880
https://www.redhat.com/archives/anaconda-devel-list/2012-April/msg00248.html

You could approximate this fix by adding these two lines to the top of 00fetch-kickstart-net.sh:

    . /lib/net-lib.sh
    setup_net $netif

Could someone confirm that fixes the problem?

I'd be happy to test with a test image, otherwise I don't know how to set up a test of this.

anaconda-17.22-1.fc17 has been submitted as an update for Fedora 17. https://admin.fedoraproject.org/updates/anaconda-17.22-1.fc17

Package anaconda-17.22-1.fc17:
* should fix your issue,
* was pushed to the Fedora 17 testing repository,
* should be available at your local mirror within two days.

Update it with:

    # su -c 'yum update --enablerepo=updates-testing anaconda-17.22-1.fc17'

as soon as you are able to. Please go to the following url: https://admin.fedoraproject.org/updates/FEDORA-2012-6486/anaconda-17.22-1.fc17 then log in and leave karma (feedback).

anaconda-17.23-1.fc17 has been submitted as an update for Fedora 17. https://admin.fedoraproject.org/updates/anaconda-17.23-1.fc17

Was a fix for this in anaconda-17.22-1? If so, can someone test with F17 final TC1?
If not, we can wait for TC2 - I've never hit this so I don't think I would be the best person to test the fix.

TC1 with 17.22 appears to have fixed it for me.

Since there have been no new reports of this recently and one report of it being fixed with anaconda-17.22-1, I'm closing it. If someone hits this same issue again, please re-open the bug.
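For anyone who does hit this again, a retry loop run from the dracut debug shell is one way to check whether the fetch merely started too early, as the timing discussion in this bug suggests. What follows is a minimal sketch, not something taken from dracut or anaconda: the `retry` helper, the attempt count, the delay and the `$KS_URL` variable are all illustrative assumptions.

```shell
#!/bin/sh
# Hypothetical diagnostic helper.
# retry ATTEMPTS DELAY CMD...: run CMD until it succeeds or ATTEMPTS run out.
# Returns 0 on the first success, 1 if every attempt failed.
retry() {
    _retry_max=$1
    _retry_delay=$2
    shift 2
    _retry_n=1
    while [ "$_retry_n" -le "$_retry_max" ]; do
        if "$@"; then
            return 0
        fi
        _retry_n=$((_retry_n + 1))
        # Sleep only when another attempt will follow.
        [ "$_retry_n" -le "$_retry_max" ] && sleep "$_retry_delay"
    done
    return 1
}

# Example (assumes KS_URL holds the kickstart URL from inst.ks=):
#   retry 10 2 curl -f -o /tmp/ks.cfg "$KS_URL"
```

If the first attempt fails with "Network is unreachable" and a later one succeeds, that points to the fetch running before the interface is fully configured rather than to a problem with the server or the URL.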