Created attachment 1668127
Journal with rd.debug on
Description of problem:
During a kickstart installation, Anaconda fails shortly after switch root because the kickstart file is not found.
Version-Release number of selected component (if applicable):
Steps to Reproduce:
1. Run a kickstart installation, e.g. via PXE:
append initrd=test/rv/scripted/development-rawhide/initrd.img repo=http://download.englab.brq.redhat.com/pub/fedora/development-rawhide/Everything/x86_64/os ks=http://10.43.136.2/ks/rv/ks.unattended.cfg
Installation fails with
anaconda 33.2-1.fc33 for Fedora Rawhide (pre-release) started.
* installation log files are stored in /tmp during the installation
* shell is available on TTY2
* if the graphical installation interface fails to start, try again with the
inst.text bootoption to start text installation
* when reporting a bug add logs from /tmp as separate text/plain attachments
14:25:21 Kickstart file /run/install/ks.cfg is missing.
Attaching journal with rd.debug.
Use stage2 from boot.iso, repo defined in kickstart.
append initrd=test/rv/scripted/development-rawhide/initrd.img stage2=hd:CDLABEL=Fedora-E-dvd-x86_64-rawh ks=http://10.43.136.2/ks/rv/ks.unattended.cfg
url --url "http://download.englab.brq.redhat.com/pub/fedora/development-rawhide/Everything/x86_64/os"
Based on Harald's debugging (thanks a lot for your help!), it seems that the issue is in Anaconda after all. The issue has probably been there for a long time, but it depends on timing, and the race has a different outcome with Dracut 050.
What happens is that Anaconda adds a job to the initqueue to download the KS file:
Mar 06 14:16:27 dhcp92.anaconda.englab.brq.redhat.com dracut-initqueue: ////lib/dracut/hooks/initqueue/online/11-fetch-kickstart-net.sh@72(source): newjob=/lib/dracut/hooks/initqueue/fetch-ks-ens11.sh
Mar 06 14:16:27 dhcp92.anaconda.englab.brq.redhat.com dracut-initqueue: ////lib/dracut/hooks/initqueue/online/11-fetch-kickstart-net.sh@76(source): cat
That is fine, but unfortunately it happens too late: Dracut has already exited. That would also explain why using stage2 from boot.iso works as a workaround: not downloading stage2 makes the whole process faster, so there is enough time to create and execute the kickstart download job.
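To illustrate the timing problem, here is a simplified model of how the initqueue decides that it is done. This is only a sketch and not dracut's actual code: `hookdir` is a temporary directory created just for the demo, and `check_finished` is written from memory after the dracut helper of the same name. The point is that when no script in finished/ fails, the queue is considered finished, so a job that is added afterwards never runs.

```shell
#!/bin/sh
# Simplified model of the initqueue "finished" check.
# Assumption: this mimics the idea only, it is not dracut's real code.
hookdir=$(mktemp -d)
mkdir -p "$hookdir/initqueue/finished"

# Return 0 when every script in finished/ succeeds (or none exist), 1 otherwise.
check_finished() {
    for f in "$hookdir"/initqueue/finished/*.sh; do
        [ -e "$f" ] || continue    # the glob matched nothing
        ( . "$f" ) || return 1     # one failing hook keeps the queue alive
    done
    return 0
}

# Case 1: no finished hook installed, so the queue may exit immediately.
if check_finished; then no_hook=exit; else no_hook=wait; fi

# Case 2: a hook waiting for the kickstart marker keeps the queue waiting.
marker="$hookdir/ks.cfg.done"
echo "[ -e \"$marker\" ]" > "$hookdir/initqueue/finished/kickstart.sh"
if check_finished; then with_hook=exit; else with_hook=wait; fi

# Case 3: once the marker exists, the queue may exit.
: > "$marker"
if check_finished; then after_marker=exit; else after_marker=wait; fi

echo "no hook: $no_hook, hook without marker: $with_hook, hook with marker: $after_marker"
rm -rf "$hookdir"
```

Case 1 corresponds to the failing boot: nothing tells the initqueue to wait, so it can finish before the fetch job is queued.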
To fix this, Harald has proposed adding an inhibitor to the initqueue to prevent it from exiting. The inhibitor would be just a script that returns 0 once all the conditions are met (KS file, stage2, updates.img and product.img are downloaded, if requested) or -1 while we are still waiting for something.
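A minimal sketch of what such an inhibitor could look like as a script dropped into initqueue/finished/. Only /tmp/ks.cfg.done appears in this report; the other marker names, the `INSTALLER_TMP` variable and the `*.requested` convention for detecting what was requested are hypothetical and chosen purely for illustration. Note that a shell script cannot literally return -1, so any non-zero status stands for "still waiting" here.

```shell
#!/bin/sh
# Hypothetical inhibitor for initqueue/finished/ (a sketch, not Anaconda's code).
# /tmp matches the ks.cfg.done marker location; the INSTALLER_TMP variable
# itself is an assumption that makes the sketch easy to try out elsewhere.
INSTALLER_TMP=${INSTALLER_TMP:-/tmp}

# Succeed only if every requested item has been downloaded.
# "<item>.requested" and "<item>.done" are illustrative naming conventions.
installer_downloads_finished() {
    for item in ks.cfg stage2 updates.img product.img; do
        # Items that were never requested do not block the initqueue.
        [ -e "$INSTALLER_TMP/$item.requested" ] || continue
        # Requested but not downloaded yet: keep the initqueue running.
        [ -e "$INSTALLER_TMP/$item.done" ] || return 1
    done
    return 0
}

# The status of the last command is what the initqueue would see.
installer_downloads_finished
```

Checking all items in one script keeps the decision in a single place; one small script per item would work as well, since every script in finished/ has to succeed before the queue exits.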
Created attachment 1668827
Journal with rd.debug on from F32
The case from the description is working on Fedora 32 (Fedora-32-20200309.n.0).
(In reply to Jiri Konecny from comment #2)
> To fix this, Harald has proposed a solution to add inhibitor to initqueue to
> avoid it's exit. This inhibitor will be just a script which would return 0
> if all the conditions are finished (KS file, stage2, updates.img and
> product.img are downloaded if requested) or -1 because we're still waiting
> for something.
I think we have such an inhibitor, from F32 (the run from comment#3):
pre-pivot:/# cat /usr/lib/dracut/hooks/initqueue/finished/kickstart.sh.
[ -e /tmp/ks.cfg.done ]
It is generated by:
But comparing the logs from F32 (comment #3) and Rawhide (the description), the generator is not run on Rawhide for some reason. Of the pre-trigger hooks:
only the 01-load-modsign-keys.sh and 03-lldpad.sh are run on rawhide, then dracut-pre-trigger finishes.
This seems like a dracut issue to me?
(In reply to Radek Vykydal from comment #4)
> only the 01-load-modsign-keys.sh and 03-lldpad.sh are run on rawhide, then
> dracut-pre-trigger finishes.
Perhaps caused by:
that makes dracut-pre-trigger exit.
(In reply to Radek Vykydal from comment #5)
> (In reply to Radek Vykydal from comment #4)
> > only the 01-load-modsign-keys.sh and 03-lldpad.sh are run on rawhide, then
> > dracut-pre-trigger finishes.
> Perhaps caused by:
> that makes the dracut-per-trigger exit.
> Introduced in:
PR with possible fix:
Better workaround: add rd.nofcoe=1 boot option.
Fix was merged upstream, I backported it: https://koji.fedoraproject.org/koji/taskinfo?taskID=42392356 . Should be in next Rawhide compose.
It seems to me like all Rawhide installer boots started failing when dracut 050 landed, not just kickstart installs. At least, openQA install tests are all failing, and George Goffe reported the Rawhide installer image failing to boot to test@ as well. I'm curious to see if this fixes those issues too. Will keep an eye on the openQA results tomorrow.
FEDORA-2020-529e8e2f53 has been submitted as an update to Fedora 32. https://bodhi.fedoraproject.org/updates/FEDORA-2020-529e8e2f53
It does indeed seem like this fixes non-kickstart installer boots as well. This bug was reported against Rawhide, so we don't really need to track the fix in the f32 update (the bug never made f32 stable anyway).
Thanks a lot for testing this, Adam. Yeah, it could also be affecting non-kickstart installations. Everything is started in parallel in Dracut, if I'm not mistaken, so it could have different outcomes depending on the environment.
Removing needinfo from Harald because this is already fixed.