Bug 2104398

Summary: Network boot fails in dracut-initqueue because nm-online wrongly reports that the network is connected
Product: [Fedora] Fedora
Reporter: Francis.Montagnac
Component: dracut
Assignee: dracut-maint-list
Status: CLOSED EOL
QA Contact: Fedora Extras Quality Assurance <extras-qa>
Severity: high
Docs Contact:
Priority: unspecified
Version: 36
CC: dracut-maint-list, jamacku, jonathan, lnykryn, pvalena
Target Milestone: ---
Target Release: ---
Hardware: x86_64
OS: Linux
Whiteboard:
Fixed In Version:
Doc Type: ---
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2023-05-25 15:56:06 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:
Attachments:
  Override nm-wait-online-initrd.service (flags: none)

Description Francis.Montagnac 2022-07-06 07:25:16 UTC
Created attachment 1894842
Override nm-wait-online-initrd.service

Description of problem:

When doing a network boot (rescue or install) with:

  Server/x86_64/os/isolinux/vmlinuz
  Server/x86_64/os/isolinux/initrd.img

the boot sometimes fails in dracut-initqueue:

  Warning: dracut-initqueue: timeout, still waiting for following initqueue hooks:
  Warning: /lib/dracut/hooks/initqueue/finished/devexists-\x2fdev\x2froot.sh: "[ -e "/dev/root" ]"
  Warning: /lib/dracut/hooks/initqueue/finished/kickstart.sh: "[ -e /tmp/ks.cfg.done ]"
  Warning: /lib/dracut/hooks/initqueue/finished/nm.sh: "[ -f /tmp/nm.done ]"
  Warning: /lib/dracut/hooks/initqueue/finished/wait_for_settle.sh: "[ -f /tmp/settle.done ]"
  ...
  Warning: Could not boot.

and then enters the emergency shell:

  systemd[1]: Starting dracut-emergency.service - Dracut Emergency Shell..

Version-Release number of selected component (if applicable): dracut-056-1.fc36

How reproducible:
Not always; it depends on the machine and the Ethernet port.

Steps to Reproduce:
1. Attempt a network boot (rescue or install) using PXE and iPXE.

Actual results:
The boot fails and drops to the emergency shell.

Expected results:
Normal boot.

Additional info:

This happens because dracut-initqueue starts too early. It runs after
nm-wait-online-initrd.service, which uses "/usr/bin/nm-online -s -q -t 3600",
but that command returns before the network is ready.

Journal excerpt showing this:

Jul 05 11:38:06 fedora systemd[1]: Finished nm-wait-online-initrd.service.
Jul 05 11:38:06 fedora systemd[1]: Starting dracut-initqueue.service - dracut initqueue hook...
...
Jul 05 11:38:07 fedora kernel: IPv6: ADDRCONF(NETDEV_CHANGE): eth0: link becomes ready
...
Jul 05 11:38:09 xxxxxx NetworkManager[1011]: <info>  [1657021089.8530] device (eth0): Activation: successful, device activated.
Jul 05 11:38:09 xxxxxx NetworkManager[1011]: <info>  [1657021089.8542] manager: NetworkManager state is now CONNECTED_GLOBAL

I can provide the full journal if needed.
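
The ordering shown in the excerpt above (the wait service finishing before NetworkManager reports CONNECTED_GLOBAL) can be checked mechanically on a saved journal. The following is only a sketch: the function name finished_before_connected is hypothetical, and it assumes the journal text contains the same two message strings as the excerpt.

```shell
#!/bin/sh
# Sketch only. Reads journal text on stdin and exits 0 when the
# "Finished nm-wait-online-initrd.service" line appears before the first
# "CONNECTED_GLOBAL" line (or when CONNECTED_GLOBAL never appears at all).
# Exits 1 otherwise, including when the "Finished" line is missing.
finished_before_connected() {
    awk '
        /Finished nm-wait-online-initrd\.service/ && !f { f = NR }
        /NetworkManager state is now CONNECTED_GLOBAL/ && !c { c = NR }
        END { exit !(f && (!c || f < c)) }
    '
}

# Possible use on a running or rescued system (assumes journalctl is present):
#   journalctl -b | finished_before_connected && echo "wait service finished too early"
```

Feeding the four journal lines from the excerpt into this function makes it exit 0, which matches the race described in this report.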

As a workaround, I overrode nm-wait-online-initrd.service to replace
nm-online with a loop that waits until the network is actually ready.
See the attached file: nm-wait-online-routes-gw.conf
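
The attached override is not reproduced in this report. Purely as an illustration of the kind of loop described above, here is a minimal sketch. It assumes that "ready" means an IPv4 default route via a gateway exists (a guess based on the attachment's file name) and that the ip command is available in the initrd. The function names has_default_route and wait_for_gateway are hypothetical and are not taken from the attachment.

```shell
#!/bin/sh
# Sketch only; NOT the contents of the attached nm-wait-online-routes-gw.conf.

# Reads `ip route`-style output on stdin; succeeds if it contains a
# default route that goes through a gateway.
has_default_route() {
    grep -Eq '^default via [^ ]+ dev [^ ]+'
}

# Polls once per second until a default route exists or the limit
# (in seconds, default 3600 to mirror the nm-online timeout) is reached.
wait_for_gateway() {
    limit="${1:-3600}"
    elapsed=0
    while [ "$elapsed" -lt "$limit" ]; do
        if ip -4 route show default 2>/dev/null | has_default_route; then
            return 0
        fi
        sleep 1
        elapsed=$((elapsed + 1))
    done
    return 1
}
```

A systemd drop-in for nm-wait-online-initrd.service could then clear the existing ExecStart= and run a script that calls wait_for_gateway. The drop-in actually used here is the one in the attachment.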

Comment 1 Ben Cotton 2023-04-25 17:33:37 UTC
This message is a reminder that Fedora Linux 36 is nearing its end of life.
Fedora will stop maintaining and issuing updates for Fedora Linux 36 on 2023-05-16.
It is Fedora's policy to close all bug reports from releases that are no longer
maintained. At that time this bug will be closed as EOL if it remains open with a
'version' of '36'.

Package Maintainer: If you wish for this bug to remain open because you
plan to fix it in a currently maintained version, change the 'version' 
to a later Fedora Linux version. Note that the version field may be hidden.
Click the "Show advanced fields" button if you do not see it.

Thank you for reporting this issue and we are sorry that we were not
able to fix it before Fedora Linux 36 reaches end of life. If you would still like
to see this bug fixed and are able to reproduce it against a later version 
of Fedora Linux, you are encouraged to change the 'version' to a later version
prior to this bug being closed.

Comment 2 Ludek Smid 2023-05-25 15:56:06 UTC
Fedora Linux 36 entered end-of-life (EOL) status on 2023-05-16.

Fedora Linux 36 is no longer maintained, which means that it
will not receive any further security or bug fix updates. As a result we
are closing this bug.

If you can reproduce this bug against a currently maintained version of Fedora Linux
please feel free to reopen this bug against that version. Note that the version
field may be hidden. Click the "Show advanced fields" button if you do not see
the version field.

If you are unable to reopen this bug, please file a new report against an
active release.

Thank you for reporting this bug and we are sorry it could not be fixed.