Bug 1983773 - [4.8] coreos-installer fails to download Ignition (DNS error, failed to lookup address)
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: RHCOS
Version: 4.7
Hardware: x86_64
OS: Linux
Priority: low
Severity: medium
Target Milestone: ---
Target Release: 4.8.z
Assignee: Jonathan Lebon
QA Contact: Michael Nguyen
URL:
Whiteboard:
Duplicates: 1991712
Depends On: 1967483 1991712 2006965
Blocks: 1983774
Reported: 2021-07-19 18:36 UTC by Jonathan Lebon
Modified: 2022-06-30 16:35 UTC (History)

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Cause: NetworkManager-wait-online.service timed out too early, preventing a connection from being established before coreos-installer started.
Consequence: coreos-installer failed to fetch the Ignition config if the network took too long to come up.
Fix: The NetworkManager-wait-online.service timeout has been increased to its default upstream value.
Result: coreos-installer no longer fails to fetch the Ignition config, since it only runs after networking is up.
Clone Of: 1967483
Environment:
Last Closed: 2022-06-30 16:35:30 UTC
Target Upstream Version:
Embargoed:


Links
System ID Private Priority Status Summary Last Updated
Github openshift os pull 587 0 None open [release-4.8] Bug 1974411: Bump fedora-coreos-config submodule to latest 2021-07-19 18:36:54 UTC

Comment 1 Micah Abbott 2021-08-09 18:47:49 UTC
*** Bug 1991712 has been marked as a duplicate of this bug. ***

Comment 2 RHCOS Bug Bot 2021-09-02 16:36:08 UTC
This bug has been reported fixed in a new RHCOS build.  Do not move this bug to MODIFIED until the fix has landed in a new bootimage.

Comment 3 RHCOS Bug Bot 2021-09-28 14:05:25 UTC
The fix for this bug has landed in a bootimage bump, as tracked in bug 1982001 (now in status MODIFIED).  Moving this bug to MODIFIED.

Comment 6 Michael Nguyen 2021-10-01 14:39:28 UTC
The timeout is still being overridden on 48.84.202109241901-0.

[core@localhost 35coreos-live]$ ls | grep nm-wait-online
coreos-liveiso-reconfigure-nm-wait-online.service
[core@localhost 35coreos-live]$ grep -R nm-wait-online
live-generator:add_requires coreos-liveiso-reconfigure-nm-wait-online.service initrd.target
module-setup.sh:    inst_simple "$moddir/coreos-liveiso-reconfigure-nm-wait-online.service" \
module-setup.sh:        "$systemdsystemunitdir/coreos-liveiso-reconfigure-nm-wait-online.service"
[core@localhost 35coreos-live]$ rpm-ostree status
State: idle
Deployments:
* ostree://13c18da5e6fee09fade484c3903209730cbb73e9ebcab806b9e9000cf97fd719
                   Version: 48.84.202109241901-0 (2021-09-24T19:04:29Z)
[core@localhost 35coreos-live]$ cat coreos-liveiso-reconfigure-nm-wait-online.service | grep ExecStart
# Right now we are keeping the same ExecStart but we are making it
ExecStartPre=/usr/bin/mkdir -p /run/systemd/system/NetworkManager-wait-online.service.d
ExecStart=/bin/bash -c 'echo -e "[Service]\nExecStart=\nExecStart=-/usr/bin/nm-online -s -q --timeout=5" > /run/systemd/system/NetworkManager-wait-online.service.d/liveiso.conf'
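For reference, the ExecStart line above amounts to writing a three-line systemd drop-in for NetworkManager-wait-online.service. The following is a minimal sketch that reproduces the same content in a scratch directory so the result can be inspected without touching /run/systemd/system. The helper name write_liveiso_dropin and the temporary directory are illustrative assumptions and are not part of the RHCOS image.

```shell
#!/bin/bash
# Sketch: write the same drop-in content that the unit's ExecStart generates,
# but into a scratch directory so nothing on the host is changed.
# write_liveiso_dropin is a hypothetical helper name used for illustration.
write_liveiso_dropin() {
    local dropin_dir="$1"
    mkdir -p "$dropin_dir"
    # The empty ExecStart= clears the packaged command; the next line
    # replaces it with a 5-second nm-online wait. The leading "-" tells
    # systemd to ignore a non-zero exit status from nm-online.
    printf '%s\n' \
        '[Service]' \
        'ExecStart=' \
        'ExecStart=-/usr/bin/nm-online -s -q --timeout=5' \
        > "$dropin_dir/liveiso.conf"
}

scratch="$(mktemp -d)"
write_liveiso_dropin "$scratch/NetworkManager-wait-online.service.d"
cat "$scratch/NetworkManager-wait-online.service.d/liveiso.conf"
```

Because the drop-in resets ExecStart, the 5-second value replaces whatever timeout the packaged unit specifies, which is consistent with the override still being active on this build.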

Comment 8 HuijingHei 2021-10-11 13:56:14 UTC
Pretested with RHCOS 48.84.202110072059-0, which includes the fix.

[core@cosa-devsh 35coreos-live]$ ls | grep nm-wait-online
[core@cosa-devsh 35coreos-live]$ pwd
/usr/lib/dracut/modules.d/35coreos-live
[core@cosa-devsh 35coreos-live]$ cat live-generator | grep nm-wait-online
[core@cosa-devsh 35coreos-live]$ cat module-setup.sh | grep nm-wait-online

$ cd ../35coreos-multipath
$ grep -E "^After|OnFailure" coreos-propagate-multipath-conf.service
After=initrd-root-fs.target
OnFailure=emergency.target
OnFailureJobMode=isolate

[core@cosa-devsh 35coreos-live]$ rpm-ostree status
State: idle
Deployments:
* ostree://1eabb5b58514f98afc3a2b31970e66ac34a18109f8f219dc0499944b10753bf8
                   Version: 48.84.202110072059-0 (2021-10-07T21:02:47Z)

Comment 9 RHCOS Bug Bot 2022-06-17 15:26:01 UTC
The fix for this bug has landed in a bootimage bump, as tracked in bug 2006965 (now in status MODIFIED).  Moving this bug to MODIFIED.

Comment 11 HuijingHei 2022-06-21 12:00:33 UTC
Verification passed with rhcos-48.84.202206172122-0-qemu.x86_64.qcow2, following the steps in Comment #8.

[core@cosa-devsh ~]$ cd /usr/lib/dracut/modules.d/35coreos-live

[core@cosa-devsh 35coreos-live]$ ls | grep nm-wait-online
[core@cosa-devsh 35coreos-live]$ cat live-generator | grep nm-wait-online
[core@cosa-devsh 35coreos-live]$ cat module-setup.sh | grep nm-wait-online
[core@cosa-devsh 35coreos-live]$ cd ../35coreos-multipath
[core@cosa-devsh 35coreos-multipath]$ grep -E "^After|OnFailure" coreos-propagate-multipath-conf.service
After=initrd-root-fs.target
OnFailure=emergency.target
OnFailureJobMode=isolate
[core@cosa-devsh 35coreos-multipath]$ rpm-ostree status
State: idle
Deployments:
● ostree://170ace4e7eb28e850782ecb4532cab0c53dfbf33748dbfb18ad4ec69b19cc255
                   Version: 48.84.202206172122-0 (2022-06-17T21:25:43Z)
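
The manual ls and grep checks on 35coreos-live could be wrapped in a small helper, roughly as sketched below. This is only an illustration of the same check, not a tool shipped with RHCOS: the function name check_nm_wait_online and its return codes are assumptions, and it does not cover the 35coreos-multipath check.

```shell
#!/bin/bash
# Sketch: report whether a dracut module directory still references the
# nm-wait-online override. check_nm_wait_online is a hypothetical helper.
check_nm_wait_online() {
    local module_dir="${1:-/usr/lib/dracut/modules.d/35coreos-live}"
    if [ ! -d "$module_dir" ]; then
        echo "no such directory: $module_dir" >&2
        return 2
    fi
    # Match either a file name or file content mentioning nm-wait-online,
    # mirroring the ls and grep checks run by hand.
    if ls "$module_dir" | grep -q 'nm-wait-online' ||
       grep -R -q 'nm-wait-online' "$module_dir"; then
        echo "override still present in $module_dir"
        return 1
    fi
    echo "no nm-wait-online override in $module_dir"
    return 0
}
```

Calling check_nm_wait_online with no argument inspects /usr/lib/dracut/modules.d/35coreos-live; a return status of 0 corresponds to the empty grep output shown in this comment.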

Comment 14 errata-xmlrpc 2022-06-30 16:35:30 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (OpenShift Container Platform 4.8.45 bug fix update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2022:5167

