Bug 1983773
Summary: | [4.8] coreos-installer fails to download Ignition (DNS error, failed to lookup address) | ||
---|---|---|---|
Product: | OpenShift Container Platform | Reporter: | Jonathan Lebon <jlebon> |
Component: | RHCOS | Assignee: | Jonathan Lebon <jlebon> |
Status: | CLOSED ERRATA | QA Contact: | Michael Nguyen <mnguyen> |
Severity: | medium | Docs Contact: | |
Priority: | low | ||
Version: | 4.7 | CC: | aivaras.laimikis, bgalvani, bgilbert, chdeshpa, dornelas, dustymabe, hhei, jlebon, jligon, jnordell, lucab, miabbott, mnguyen, mrussell, nstielau |
Target Milestone: | --- | ||
Target Release: | 4.8.z | ||
Hardware: | x86_64 | ||
OS: | Linux | ||
Whiteboard: | |||
Fixed In Version: | Doc Type: | Bug Fix | |
Doc Text: |
Cause: NetworkManager-wait-online.service timed out too early, preventing a connection from being established before coreos-installer started.
Consequence: coreos-installer failed to fetch the Ignition config if the network took too long to come up.
Fix: The NetworkManager-wait-online.service timeout has been increased to its default upstream value.
Result: coreos-installer no longer fails to fetch the Ignition config, since it runs only after networking is up.
|
Story Points: | --- |
Clone Of: | 1967483 | Environment: | |
Last Closed: | 2022-06-30 16:35:30 UTC | Type: | --- |
Regression: | --- | Mount Type: | --- |
Documentation: | --- | CRM: | |
Verified Versions: | Category: | --- | |
oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
Cloudforms Team: | --- | Target Upstream Version: | |
Embargoed: | |||
Bug Depends On: | 1967483, 1991712, 2006965 | ||
Bug Blocks: | 1983774 |
Comment 1
Micah Abbott
2021-08-09 18:47:49 UTC
This bug has been reported fixed in a new RHCOS build. Do not move this bug to MODIFIED until the fix has landed in a new bootimage.

The fix for this bug has landed in a bootimage bump, as tracked in bug 1982001 (now in status MODIFIED). Moving this bug to MODIFIED.

The timeout is still being overridden on 48.84.202109241901-0.

    [core@localhost 35coreos-live]$ ls | grep nm-wait-online
    coreos-liveiso-reconfigure-nm-wait-online.service
    [core@localhost 35coreos-live]$ grep -R nm-wait-online
    live-generator:add_requires coreos-liveiso-reconfigure-nm-wait-online.service initrd.target
    module-setup.sh: inst_simple "$moddir/coreos-liveiso-reconfigure-nm-wait-online.service" \
    module-setup.sh: "$systemdsystemunitdir/coreos-liveiso-reconfigure-nm-wait-online.service"
    [core@localhost 35coreos-live]$ rpm-ostree status
    State: idle
    Deployments:
    * ostree://13c18da5e6fee09fade484c3903209730cbb73e9ebcab806b9e9000cf97fd719
      Version: 48.84.202109241901-0 (2021-09-24T19:04:29Z)
    [core@localhost 35coreos-live]$ cat coreos-liveiso-reconfigure-nm-wait-online.service | grep ExecStart
    # Right now we are keeping the same ExecStart but we are making it
    ExecStartPre=/usr/bin/mkdir -p /run/systemd/system/NetworkManager-wait-online.service.d
    ExecStart=/bin/bash -c 'echo -e "[Service]\nExecStart=\nExecStart=-/usr/bin/nm-online -s -q --timeout=5" > /run/systemd/system/NetworkManager-wait-online.service.d/liveiso.conf'

Pretest with RHCOS 48.84.202110072059-0, which includes the fix:

    [core@cosa-devsh 35coreos-live]$ ls | grep nm-wait-online
    [core@cosa-devsh 35coreos-live]$ pwd
    /usr/lib/dracut/modules.d/35coreos-live
    [core@cosa-devsh 35coreos-live]$ cat live-generator | grep nm-wait-online
    [core@cosa-devsh 35coreos-live]$ cat module-setup.sh | grep nm-wait-online
    $ cd ../35coreos-multipath
    $ grep -E "^After|OnFailure" coreos-propagate-multipath-conf.service
    After=initrd-root-fs.target
    OnFailure=emergency.target
    OnFailureJobMode=isolate
    [core@cosa-devsh 35coreos-live]$ rpm-ostree status
    State: idle
    Deployments:
    * ostree://1eabb5b58514f98afc3a2b31970e66ac34a18109f8f219dc0499944b10753bf8
      Version: 48.84.202110072059-0 (2021-10-07T21:02:47Z)

The fix for this bug has landed in a bootimage bump, as tracked in bug 2006965 (now in status MODIFIED). Moving this bug to MODIFIED.

Verification passed with rhcos-48.84.202206172122-0-qemu.x86_64.qcow2, following the steps in Comment #8:

    [core@cosa-devsh ~]$ cd /usr/lib/dracut/modules.d/35coreos-live
    [core@cosa-devsh 35coreos-live]$ ls | grep nm-wait-online
    [core@cosa-devsh 35coreos-live]$ cat live-generator | grep nm-wait-online
    [core@cosa-devsh 35coreos-live]$ cat module-setup.sh | grep nm-wait-online
    [core@cosa-devsh 35coreos-live]$ cd ../35coreos-multipath
    [core@cosa-devsh 35coreos-multipath]$ grep -E "^After|OnFailure" coreos-propagate-multipath-conf.service
    After=initrd-root-fs.target
    OnFailure=emergency.target
    OnFailureJobMode=isolate
    [core@cosa-devsh 35coreos-multipath]$ rpm-ostree status
    State: idle
    Deployments:
    ● ostree://170ace4e7eb28e850782ecb4532cab0c53dfbf33748dbfb18ad4ec69b19cc255
      Version: 48.84.202206172122-0 (2022-06-17T21:25:43Z)

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory (OpenShift Container Platform 4.8.45 bug fix update), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2022:5167
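
The manual checks in the transcripts above (listing the module directory and grepping live-generator and module-setup.sh for nm-wait-online) could be folded into one small helper. This is only a sketch: the function name check_no_nm_wait_online_override is hypothetical and not part of RHCOS or coreos-installer, and it assumes that the absence of any nm-wait-online reference under the directory is what indicates a fixed build, as in the verified runs above.

```shell
#!/bin/sh
# Hypothetical helper (an assumption, not shipped with RHCOS): succeeds only
# when no file under the given directory mentions the nm-wait-online override.
#
# Example call on a booted system, using the path from the transcripts:
#   check_no_nm_wait_online_override /usr/lib/dracut/modules.d/35coreos-live
check_no_nm_wait_online_override() {
    dir="$1"

    # Return a distinct status when the directory itself is missing.
    if [ ! -d "$dir" ]; then
        echo "no such directory: $dir" >&2
        return 2
    fi

    # grep -R searches file contents recursively; -q exits 0 on the first match.
    if grep -R -q -- 'nm-wait-online' "$dir"; then
        echo "override still referenced under $dir"
        return 1
    fi

    echo "no nm-wait-online override under $dir"
    return 0
}
```

A return status of 0 corresponds to the empty grep output seen on 48.84.202110072059-0 and 48.84.202206172122-0, while status 1 corresponds to the matches seen on 48.84.202109241901-0. The helper inspects file contents only, so a unit file whose name matches but which is referenced nowhere would not be reported.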