Bug 1811070

Summary: Anaconda: kickstart not passed to installer environment (Kickstart file /run/install/ks.cfg is missing.)
Product: [Fedora] Fedora
Reporter: Radek Vykydal <rvykydal>
Component: dracut
Assignee: dracut-maint-list
Status: CLOSED RAWHIDE
QA Contact: Fedora Extras Quality Assurance <extras-qa>
Severity: unspecified
Docs Contact:
Priority: unspecified
Version: rawhide
CC: anaconda-maint-list, awilliam, dracut-maint-list, grgoffe, harald, jkonecny, jonathan, jstodola, kellin, mkolman, vanmeeuwen+fedora, vponcova, wwoods, zbyszek
Target Milestone: ---   
Target Release: ---   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard: openqa
Last Closed: 2020-03-11 16:46:35 UTC
Type: Bug
Attachments:
  Journal with rd.debug on (flags: none)
  Journal with rd.debug on from F32 (flags: none)

Description Radek Vykydal 2020-03-06 14:34:04 UTC
Created attachment 1668127 [details]
Journal with rd.debug on

Description of problem:

During a kickstart installation, anaconda fails shortly after switch root because the kickstart file is not found.

Version-Release number of selected component (if applicable):
Fedora-Rawhide-20200304.n.1
dracut-050-1.fc33

How reproducible:
 
Always

Steps to Reproduce:
1. Run a kickstart installation, e.g. via PXE:
  append initrd=test/rv/scripted/development-rawhide/initrd.img repo=http://download.englab.brq.redhat.com/pub/fedora/development-rawhide/Everything/x86_64/os ks=http://10.43.136.2/ks/rv/ks.unattended.cfg

Actual results:

Installation fails with

anaconda 33.2-1.fc33 for Fedora Rawhide (pre-release) started.
 * installation log files are stored in /tmp during the installation
 * shell is available on TTY2
 * if the graphical installation interface fails to start, try again with the
   inst.text bootoption to start text installation
 * when reporting a bug add logs from /tmp as separate text/plain attachments
14:25:21 Kickstart file /run/install/ks.cfg is missing.


Expected results:

Installation succeeds

Additional info:

Attaching journal with rd.debug.

Comment 1 Radek Vykydal 2020-03-09 08:51:55 UTC
A workaround:
Use stage2 from boot.iso, repo defined in kickstart.

pxe boot:
append initrd=test/rv/scripted/development-rawhide/initrd.img stage2=hd:CDLABEL=Fedora-E-dvd-x86_64-rawh ks=http://10.43.136.2/ks/rv/ks.unattended.cfg

ks:
url --url "http://download.englab.brq.redhat.com/pub/fedora/development-rawhide/Everything/x86_64/os"

Comment 2 Jiri Konecny 2020-03-09 12:46:24 UTC
Based on Harald's debugging (thanks a lot for your help!), it seems that the issue is ultimately in Anaconda. It has probably been there for a long time, but it depends on a timing condition that has a different outcome on Dracut 050.

What happens is that Anaconda adds a job to the initqueue to download the KS file:

Mar 06 14:16:27 dhcp92.anaconda.englab.brq.redhat.com dracut-initqueue[756]: ////lib/dracut/hooks/initqueue/online/11-fetch-kickstart-net.sh@72(source): newjob=/lib/dracut/hooks/initqueue/fetch-ks-ens11.sh
Mar 06 14:16:27 dhcp92.anaconda.englab.brq.redhat.com dracut-initqueue[756]: ////lib/dracut/hooks/initqueue/online/11-fetch-kickstart-net.sh@76(source): cat

That is fine, but unfortunately it happens too late: Dracut has already exited. That would also explain why using stage2 from boot.iso works as a workaround: not downloading stage2 makes the whole process faster, so there is time to create and execute the kickstart download job.

To fix this, Harald has proposed adding an inhibitor to the initqueue to prevent its exit. This inhibitor would be just a script which returns 0 if all the conditions are finished (KS file, stage2, updates.img and product.img are downloaded if requested) or -1 if we're still waiting for something.
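
A minimal sketch of what such an inhibitor could look like, written as a POSIX shell function. The function name `installer_downloads_finished`, the `MARKER_DIR` variable, and the `*.requested` / `*.done` marker-file convention are hypothetical and only illustrate the idea; this is not Anaconda's or Dracut's actual implementation. It returns 1 rather than -1, since shell exit statuses are limited to the range 0-255.

```shell
#!/bin/sh
# Hypothetical inhibitor sketch: succeed only when every requested
# download has a matching ".done" marker. All names are illustrative.

# Directory holding the marker files (assumed; can be overridden).
MARKER_DIR="${MARKER_DIR:-/tmp}"

installer_downloads_finished() {
    for item in ks.cfg stage2 updates.img product.img; do
        # Skip items that were never requested.
        [ -e "$MARKER_DIR/$item.requested" ] || continue
        # A requested item without a ".done" marker keeps us waiting.
        [ -e "$MARKER_DIR/$item.done" ] || return 1
    done
    return 0
}
```

A check like this would presumably be registered among the initqueue `finished` hooks, so that the initqueue keeps looping until it succeeds.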

Comment 3 Radek Vykydal 2020-03-10 08:30:51 UTC
Created attachment 1668827 [details]
Journal with rd.debug on from F32

The case from the description is working on Fedora 32 (Fedora-32-20200309.n.0).

Comment 4 Radek Vykydal 2020-03-10 08:44:20 UTC
(In reply to Jiri Konecny from comment #2)

> To fix this, Harald has proposed a solution to add inhibitor to initqueue to
> avoid its exit. This inhibitor will be just a script which would return 0
> if all the conditions are finished (KS file, stage2, updates.img and
> product.img are downloaded if requested) or -1 because we're still waiting
> for something.

I think we have such an inhibitor, from F32 (the run from comment#3):

pre-pivot:/# cat /usr/lib/dracut/hooks/initqueue/finished/kickstart.sh
[ -e /tmp/ks.cfg.done ]

It is generated by:
/usr/lib/dracut/hooks/pre-trigger/50-kickstart-genrules.sh
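
To illustrate how a generated check of this kind can hold the initqueue open, here is a self-contained sketch. It uses a temporary directory as a stand-in for the real hooks directory, and the `all_finished` function and `done_marker` variable are hypothetical names for illustration, not dracut's own code.

```shell
#!/bin/sh
# Sketch: a generator registers a "finished" check, and a wait step
# polls all registered checks. Names and paths are illustrative.

hookdir="$(mktemp -d)"              # stand-in for the real hooks directory
done_marker="$hookdir/ks.cfg.done"  # stand-in for /tmp/ks.cfg.done
mkdir -p "$hookdir/initqueue/finished"

# Generator step: write a one-line check, as a genrules script might.
echo "[ -e $done_marker ]" > "$hookdir/initqueue/finished/kickstart.sh"

# Return 0 only if every registered check succeeds.
all_finished() {
    for check in "$hookdir"/initqueue/finished/*.sh; do
        [ -e "$check" ] || continue
        ( . "$check" ) || return 1
    done
    return 0
}

all_finished && before=finished || before=waiting
touch "$done_marker"                # simulate the kickstart download completing
all_finished && after=finished || after=waiting
echo "$before -> $after"            # prints "waiting -> finished"
rm -rf "$hookdir"
```

Note that if the generator step never runs, no check is registered and `all_finished` succeeds immediately, so nothing waits for the kickstart file.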

But comparing the logs from F32 (comment #3) and rawhide (the description), the generator is not run on rawhide for some reason. Of the pre-trigger hooks:

/usr/lib/dracut/hooks/pre-trigger/55-driver-updates-genrules.sh
/usr/lib/dracut/hooks/pre-trigger/50-updates-genrules.sh
/usr/lib/dracut/hooks/pre-trigger/50-repo-genrules.sh
/usr/lib/dracut/hooks/pre-trigger/50-kickstart-genrules.sh
/usr/lib/dracut/hooks/pre-trigger/30-parse-md.sh
/usr/lib/dracut/hooks/pre-trigger/30-parse-dm.sh
/usr/lib/dracut/hooks/pre-trigger/03-lldpad.sh
/usr/lib/dracut/hooks/pre-trigger/01-load-modsign-keys.sh

only the 01-load-modsign-keys.sh and 03-lldpad.sh are run on rawhide, then dracut-pre-trigger finishes.

This seems like a dracut issue to me?

Comment 5 Radek Vykydal 2020-03-10 09:05:37 UTC
(In reply to Radek Vykydal from comment #4)
 
> only the 01-load-modsign-keys.sh and 03-lldpad.sh are run on rawhide, then
> dracut-pre-trigger finishes.
> 

Perhaps caused by:
https://github.com/dracutdevs/dracut/blob/a76aa8e39016a8564adb0f18f93bbf2e15d3243f/modules.d/95fcoe/lldpad.sh#L5
that makes dracut-pre-trigger exit.
Introduced in:
https://github.com/dracutdevs/dracut/commit/7c6d2ad916bd536dc2f082fd96ef837a5031e497
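
The suspected effect can be reproduced in isolation. The sketch below assumes that the hook runner sources each script in turn with `.` (the trace lines in comment 2 are tagged `(source)`) and that the linked line calls `exit`. The `run_hooks` function and the hook file names are hypothetical stand-ins, not dracut code.

```shell
#!/bin/sh
# Demo: `exit` inside a sourced hook terminates the whole runner,
# whereas `return` only ends that one hook.

# Hypothetical runner: source every *.sh in a directory, in glob order.
run_hooks() {
    for hook in "$1"/*.sh; do
        [ -e "$hook" ] && . "$hook"
    done
    echo "runner finished"
}

demo_dir="$(mktemp -d)"
mkdir "$demo_dir/with-exit" "$demo_dir/with-return"

# Variant 1: the second hook calls `exit`.
printf 'echo 01\n'           > "$demo_dir/with-exit/01-first.sh"
printf 'echo 03\nexit 0\n'   > "$demo_dir/with-exit/03-second.sh"
printf 'echo 50\n'           > "$demo_dir/with-exit/50-third.sh"

# Variant 2: identical, but the second hook uses `return`.
printf 'echo 01\n'           > "$demo_dir/with-return/01-first.sh"
printf 'echo 03\nreturn 0\n' > "$demo_dir/with-return/03-second.sh"
printf 'echo 50\n'           > "$demo_dir/with-return/50-third.sh"

# Run each variant in a subshell so `exit` cannot kill this script.
out_exit="$(run_hooks "$demo_dir/with-exit")"
out_return="$(run_hooks "$demo_dir/with-return")"
rm -rf "$demo_dir"

printf 'with exit:\n%s\n\nwith return:\n%s\n' "$out_exit" "$out_return"
```

With `exit`, the output stops after `03` and the later hooks never run, which matches the observation in comment 4 that only the 01 and 03 hooks ran. With `return`, all three hooks run and the runner prints `runner finished`.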

Comment 6 Radek Vykydal 2020-03-10 09:51:20 UTC
(In reply to Radek Vykydal from comment #5)
> (In reply to Radek Vykydal from comment #4)
>  
> > only the 01-load-modsign-keys.sh and 03-lldpad.sh are run on rawhide, then
> > dracut-pre-trigger finishes.
> > 
> 
> Perhaps caused by:
> https://github.com/dracutdevs/dracut/blob/a76aa8e39016a8564adb0f18f93bbf2e15d3243f/modules.d/95fcoe/lldpad.sh#L5
> that makes dracut-pre-trigger exit.
> Introduced in:
> https://github.com/dracutdevs/dracut/commit/7c6d2ad916bd536dc2f082fd96ef837a5031e497

PR with possible fix:
https://github.com/dracutdevs/dracut/pull/754

Comment 7 Radek Vykydal 2020-03-10 09:52:13 UTC
Better workaround: add rd.nofcoe=1 boot option.

Comment 8 Adam Williamson 2020-03-11 00:42:13 UTC
Fix was merged upstream, I backported it: https://koji.fedoraproject.org/koji/taskinfo?taskID=42392356 . Should be in next Rawhide compose.

It seems to me like all Rawhide installer boots started failing when dracut 050 landed, not just kickstart installs. At least, openQA install tests are all failing, and George Goffe reported the Rawhide installer image failing to boot to test@ as well. I'm curious to see if this fixes those issues too. Will keep an eye on the openQA results tomorrow.

Comment 9 Fedora Update System 2020-03-11 14:05:38 UTC
FEDORA-2020-529e8e2f53 has been submitted as an update to Fedora 32. https://bodhi.fedoraproject.org/updates/FEDORA-2020-529e8e2f53

Comment 10 Adam Williamson 2020-03-11 16:46:35 UTC
Does indeed seem like this fixes non-kickstart installer boots as well. This bug was reported against Rawhide, so we don't really need to track the fix in the f32 update (the bug never made f32 stable anyway).

Comment 11 Jiri Konecny 2020-03-12 12:52:07 UTC
Thanks a lot for testing this, Adam. Yeah, it could also be affecting non-kickstart installations. Everything is started in parallel in Dracut, if I'm not mistaken, so it could have different outcomes depending on the environment.

Removing needinfo from harald because this is already fixed.