Bug 2153361
| Summary: | Kickstart doesn't start network to download stage2 from remote source | ||
|---|---|---|---|
| Product: | Red Hat Enterprise Linux 9 | Reporter: | Jan Stodola <jstodola> |
| Component: | anaconda | Assignee: | Radek Vykydal <rvykydal> |
| Status: | CLOSED ERRATA | QA Contact: | Release Test Team <release-test-team-automation> |
| Severity: | unspecified | Docs Contact: | |
| Priority: | unspecified | ||
| Version: | 9.1 | CC: | chchiu, jkonecny, jmeneghi, lnykryn, lrintel, nilesh.javali, njavali, pvalena, rvykydal, vponcova |
| Target Milestone: | rc | Keywords: | Triaged |
| Target Release: | --- | Flags: | pm-rhel: mirror+ |
| Hardware: | Unspecified | ||
| OS: | Linux | ||
| Whiteboard: | FCOE_P1 | ||
| Fixed In Version: | anaconda-34.25.2.9-1.el9 | Doc Type: | If docs needed, set a value |
| Doc Text: | | Story Points: | --- |
| Clone Of: | | Environment: | |
| Last Closed: | 2023-05-09 07:35:42 UTC | Type: | Bug |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | | Category: | --- |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | |||
| Bug Depends On: | 2157082 | ||
| Bug Blocks: | 2129768, 2133053 | ||
Part of the problem seems to be this patch: https://github.com/redhat-plumbers/dracut-rhel9/commit/77630365aed201a729c73a9ffda0733a75f3fee4 but apparently there are more issues.

(In reply to Radek Vykydal from comment #2)
> Part of the problem seems to be this patch:
> https://github.com/redhat-plumbers/dracut-rhel9/commit/77630365aed201a729c73a9ffda0733a75f3fee4
> but apparently there are more issues.

So the revert of the ^^ dracut patch and

(In reply to Radek Vykydal from comment #3)
> I guess we need to update the code in
> https://github.com/rhinstaller/anaconda/blob/a3462fae15d22d0c3c2462f187bbcc1d2db13060/dracut/anaconda-lib.sh#L383.
> I'll ask Dracut for direction first.

this Anaconda update: https://github.com/rhinstaller/anaconda/pull/4477 fixes the problem for me.

We might also want to filter out "lo" here: https://github.com/redhat-plumbers/dracut-rhel9/blob/8aa62b8cb28a41d4739633aee9b02e40dc4a75d0/modules.d/35network-manager/nm-run.sh#L66.

(In reply to Radek Vykydal from comment #5)
> (In reply to Radek Vykydal from comment #2)
> > Part of the problem seems to be this patch:
> > https://github.com/redhat-plumbers/dracut-rhel9/commit/77630365aed201a729c73a9ffda0733a75f3fee4
> > but apparently there are more issues.
>
> So the revert of the ^^ dracut patch

or maybe keep it only for the case not using the systemd nm-initrd service case: https://github.com/redhat-plumbers/dracut-rhel9/compare/main...rvykydal:dracut-rhel9:allow-running-nm-run-repeatedly

> We might also want to filter out "lo" here:
> https://github.com/redhat-plumbers/dracut-rhel9/blob/8aa62b8cb28a41d4739633aee9b02e40dc4a75d0/modules.d/35network-manager/nm-run.sh#L66.

https://github.com/redhat-plumbers/dracut-rhel9/compare/main...rvykydal:dracut-rhel9:ignore-lo-when-running-online-hooks

I am going to create a kickstart test for the use case. Needs some updates of the tooling.
(In reply to Radek Vykydal from comment #8)
> I am going to create a kickstart test for the use case. Needs some updates
> of the tooling.

https://github.com/rhinstaller/kickstart-tests/pull/851

(In reply to Radek Vykydal from comment #6)
> (In reply to Radek Vykydal from comment #5)
> > (In reply to Radek Vykydal from comment #2)
> > > Part of the problem seems to be this patch:
> > > https://github.com/redhat-plumbers/dracut-rhel9/commit/77630365aed201a729c73a9ffda0733a75f3fee4
> > > but apparently there are more issues.
> >
> > So the revert of the ^^ dracut patch
>
> or maybe keep it only for the case not using the systemd nm-initrd service
> case:
> https://github.com/redhat-plumbers/dracut-rhel9/compare/main...rvykydal:dracut-rhel9:allow-running-nm-run-repeatedly
>
> > We might also want to filter out "lo" here:
> > https://github.com/redhat-plumbers/dracut-rhel9/blob/8aa62b8cb28a41d4739633aee9b02e40dc4a75d0/modules.d/35network-manager/nm-run.sh#L66.
>
> https://github.com/redhat-plumbers/dracut-rhel9/compare/main...rvykydal:dracut-rhel9:ignore-lo-when-running-online-hooks

Thanks! The changes make sense... I've filed the PRs here:

https://github.com/redhat-plumbers/dracut-rhel9/pull/58

https://github.com/redhat-plumbers/dracut-rhel9/pull/59

I will investigate further. Let me know if I should drop them for some reason.

Marking as depending on the dracut fix, which is tracked in bug 2134060.

The Anaconda part of the fix: https://github.com/rhinstaller/anaconda/pull/4477

Ok, so the condition https://github.com/redhat-plumbers/dracut-rhel9/blob/main/modules.d/35network-manager/nm-run.sh#L5 is wrong. It is a relic from the time when NM was run in oneshot mode. And on top of that, I think that the whole nm-run is wrong.

Back in RHEL8 the start of NM was tied to the settled initqueue. In other words, we waited for udev to finish processing all the devices, then we knew it was OK to start NM and then we called all the online hooks.
Now we start NM as a daemon. We don't have to wait for udev, since NM can handle the device when it appears. But we still have to call the online hooks for those devices. And right now, we are not doing that after NM sets up the device, but when udev processes all devices. If you think that sounds weird, you are entirely correct.

There is a race, because there is no synchronization between nm-run and NM actually setting up the interface. Well, there is this check https://github.com/redhat-plumbers/dracut-rhel9/blob/main/modules.d/35network-manager/nm-run.sh#L65 but its only effect is that nm-run does nothing if NM is slow.

IMHO the correct solution is to turn nm-run into a dispatcher script for NM.

Thanks a lot, Lukáš, for the explanation. In that case Network Manager needs to take a look at this. Hopefully they will be able to solve that for 9.2 (I know it's late). It's also an issue in bug 2134060.

Lubomír, could you please take a look at the issue described in comment 14?

There seems to be an e-mail thread about this, which contains some discussion about invoking an online hook with a dispatcher. However, this has worked before (be it by brute force) and regressed. Here's a revert of the regression: https://github.com/dracutdevs/dracut/pull/2134

(In reply to Radek Vykydal from comment #13)
> The Anaconda part of the fix:
> https://github.com/rhinstaller/anaconda/pull/4477

The ^ patch added via initrd overlay works for me with dracut-057-21.git20230214.el9 from the test compose artifacts.osci.redhat.com/comp/rhel-9.2.0/50756495-2374-dracut/.

(In reply to Radek Vykydal from comment #9)
> (In reply to Radek Vykydal from comment #8)
> > I am going to create a kickstart test for the use case. Needs some updates
> > of the tooling.
>
> https://github.com/rhinstaller/kickstart-tests/pull/851

Kickstart test for the issue: https://github.com/rhinstaller/kickstart-tests/blob/master/stage2-from-ks.ks.in

Reproduced on RHEL-9.2.0-20230220.9 with anaconda-34.25.2.8-1.el9.
Verified using anaconda-34.25.2.9-1.el9, the problem is fixed and the installation started as expected, with stage2 downloaded from the network source defined in the kickstart file.

Marking as Verified:Tested

Checked that anaconda-34.25.2.9-1.el9 is in nightly compose RHEL-9.2.0-20230223.23

Moving to VERIFIED

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA.

For information on the advisory (anaconda bug fix and enhancement update), and where to find the updated files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2023:2223
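The dispatcher idea raised in comment 14, combined with the "ignore lo" suggestion from comment 5, could look roughly like the sketch below. It is only an illustration and not the fix that shipped for this bug. NetworkManager passes the interface name as the first argument and the action as the second argument to dispatcher scripts; the hook directory, the function names, and the choice to execute hooks with `sh` (dracut itself sources them) are assumptions made for the example.

```sh
#!/bin/sh
# Hypothetical NetworkManager dispatcher script sketch (not from dracut).
# NetworkManager calls dispatcher scripts with:
#   $1 = interface name, $2 = action (for example "up" or "down").

# Assumption: directory holding the dracut initqueue "online" hooks.
HOOK_DIR="${HOOK_DIR:-/lib/dracut/hooks/initqueue/online}"

# Return 0 only for events that should trigger the online hooks.
should_run_online_hooks() {
    iface="$1"
    action="$2"
    [ -n "$iface" ] || return 1        # some events carry no interface
    [ "$iface" != "lo" ] || return 1   # ignore the loopback device
    [ "$action" = "up" ] || return 1   # react only once NM brought it up
    return 0
}

# Run every hook script in HOOK_DIR with the interface name as argument.
# Simplification: hooks are executed here, whereas dracut sources them.
run_online_hooks() {
    iface="$1"
    for hook in "$HOOK_DIR"/*.sh; do
        [ -e "$hook" ] || continue
        sh "$hook" "$iface"
    done
}

# Entry point: run the hooks only for qualifying events.
dispatch() {
    if should_run_online_hooks "$1" "$2"; then
        run_online_hooks "$1"
    fi
}

# Do nothing when invoked without arguments (for example when sourced).
[ "$#" -eq 0 ] || dispatch "$@"
```

Because the hooks run in response to NM's own "up" event, there is no need to guess whether NM has finished configuring the interface by the time udev settles, which is the race described in comment 14.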
Description of problem:
The installer doesn't configure network specified in the kickstart file and doesn't download stage2 from a remote location (http).

I have the following kickstart file:

    url --url http://<PATH_TO_RHEL>
    network --device=link --bootproto=dhcp --activate

The kickstart file is added to RHEL-9.1 boot.iso using mkksiso:

    mkksiso ks.cfg RHEL-9.1.0-20221027.3-BaseOS-x86_64-boot.iso iso_with_ks.iso

The new iso is booted in a VM, the kernel command line is modified ("inst.stage2=..." is removed):

    "vmlinuz initrd=initrd.img inst.ks=hd:LABEL=RHEL-9-1-0-BaseOS-x86_64:/ks.cfg console=ttyS0"

When the VM is booting, the following error/warning is visible:

    ...
    [ OK ] Reached target Local File Systems.
    [ OK ] Reached target System Initialization.
    [ OK ] Reached target Basic System.
    [ 5.526732] dracut-initqueue[1217]: parse-kickstart WARNING: No device with link found for --device=link
    Starting D-Bus System Message Bus...
    [ OK ] Started D-Bus System Message Bus.
    [ 140.656623] dracut-initqueue[976]: Warning: dracut-initqueue: timeout, still waiting for following initqueue hooks:
    [ 140.660736] dracut-initqueue[976]: Warning: /lib/dracut/hooks/initqueue/finished/devexists-\x2fdev\x2froot.sh: "[ -e "/dev/root" ]"
    [ 140.662510] dracut-initqueue[976]: Warning: /lib/dracut/hooks/initqueue/finished/kickstart.sh: "[ -e /tmp/ks.cfg.done ]"
    [ 140.663941] dracut-initqueue[976]: Warning: /lib/dracut/hooks/initqueue/finished/nm.sh: "[ -f /tmp/nm.done ]"
    [ 140.665289] dracut-initqueue[976]: Warning: /lib/dracut/hooks/initqueue/finished/wait_for_settle.sh: "[ -f /tmp/settle.done ]"
    [ 140.667147] dracut-initqueue[976]: Warning: dracut-initqueue: starting timeout scripts
    [ 140.667256] dracut-initqueue[976]: Warning: ############# Anaconda installer errors begin #############
    [ 140.669005] dracut-initqueue[976]: Warning: # #
    [ 140.669061] dracut-initqueue[976]: Warning: It seems that the boot has failed. Possible causes include
    [ 140.669092] dracut-initqueue[976]: Warning: missing inst.stage2 or inst.repo boot parameters on the
    [ 140.669116] dracut-initqueue[976]: Warning: kernel cmdline. Please verify that you have specified
    [ 140.669141] dracut-initqueue[976]: Warning: inst.stage2 or inst.repo.
    [ 140.669174] dracut-initqueue[976]: Warning: Please also note that the 'inst.' prefix is now mandatory.
    [ 140.669200] dracut-initqueue[976]: Warning: # #
    [ 140.669225] dracut-initqueue[976]: Warning: #### Installer errors encountered during boot: ####
    [ 140.669248] dracut-initqueue[976]: Warning: # #
    [ 140.669271] dracut-initqueue[976]: /lib/dracut/hooks/initqueue/timeout/50-anaconda-error-reporting.sh: line 19: /run/anaconda/initrd_errors.txt: No such file or directory
    ...
    Starting Dracut Emergency Shell...
    Warning: /dev/root does not exist
    Generating "/run/initramfs/rdsosreport.txt"
    Entering emergency mode. Exit the shell to continue.
    Type "journalctl" to view system logs.
    You might want to save "/run/initramfs/rdsosreport.txt" to a USB stick or /boot
    after mounting them and attach it to a bug report.
    dracut:/# ip a
    1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
        link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
        inet 127.0.0.1/8 scope host lo
           valid_lft forever preferred_lft forever
        inet6 ::1/128 scope host
           valid_lft forever preferred_lft forever
    2: enp1s0: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN group default qlen 1000
        link/ether 52:54:00:c1:64:a9 brd ff:ff:ff:ff:ff:ff
    dracut:/#

Version-Release number of selected component (if applicable):
RHEL-9.1 GA

How reproducible:
Always

Steps to Reproduce:
1. Have the following kickstart file:

    url --url http://<PATH_TO_RHEL>
    network --device=link --bootproto=dhcp --activate

2. Insert the kickstart to the boot.iso:

    mkksiso ks.cfg RHEL-9.1.0-20221027.3-BaseOS-x86_64-boot.iso iso_with_ks.iso

3. Boot the modified iso in a VM, remove "inst.stage2=..." from the kernel command line.

Actual results:
The installer doesn't fetch stage2, network is not configured.

Expected results:
Stage2 is downloaded successfully.

Additional info:
The same steps work fine on RHEL-8.7
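
For anyone re-running the reproducer, steps 1 and 2 can be scripted along these lines. This is a minimal sketch: the function names `write_kickstart` and `build_iso` are made up for the example, the `<PATH_TO_RHEL>` placeholder is kept exactly as in the report and has to be replaced with a real installation tree, and the mkksiso call uses the same positional form as the command in the description.

```sh
#!/bin/sh
# Step 1: write the two-line kickstart file from the report to path $1.
# The URL still contains the <PATH_TO_RHEL> placeholder from the report.
write_kickstart() {
    cat > "$1" <<'EOF'
url --url http://<PATH_TO_RHEL>
network --device=link --bootproto=dhcp --activate
EOF
}

# Step 2: embed the kickstart into a boot ISO.
# Arguments: kickstart file, input ISO, output ISO (as in the report).
build_iso() {
    ks="$1"
    input_iso="$2"
    output_iso="$3"
    if ! command -v mkksiso >/dev/null 2>&1; then
        echo "mkksiso not found in PATH" >&2
        return 1
    fi
    mkksiso "$ks" "$input_iso" "$output_iso"
}
```

A run matching the report would be `write_kickstart ks.cfg && build_iso ks.cfg RHEL-9.1.0-20221027.3-BaseOS-x86_64-boot.iso iso_with_ks.iso`. Step 3 (removing "inst.stage2=..." from the kernel command line) is still done by hand at the boot menu.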