Description of problem: When I have 2 network interfaces in my PC (with ONBOOT=yes) NetworkManager-wait-online fails at boot. Version-Release number of selected component (if applicable): Name : NetworkManager Arch : x86_64 Epoch : 1 Version : 1.0.6 Release : 7.fc23 How reproducible: Put 2 NICs into the PC. Then boot it. Steps to Reproduce: 1. Put 2 NIC into the PC. Boot. => NetworkManager-wait-online fails 2. Disable one one them (ONBOOT=no) => NetworkManager-wait-online is ok 3. Actual results: ● NetworkManager-wait-online.service - Network Manager Wait Online Loaded: loaded (/usr/lib/systemd/system/NetworkManager-wait-online.service; enabled; vendor preset: disabled) Active: failed (Result: exit-code) since Mon 2015-11-02 21:05:06 CET; 1min 4s ago Process: 1676 ExecStart=/usr/bin/nm-online -s -q --timeout=30 (code=exited, status=1/FAILURE) Main PID: 1676 (code=exited, status=1/FAILURE) Nov 02 21:04:36 bat.localdomain systemd[1]: Starting Network Manager Wait Online... Nov 02 21:05:06 bat.localdomain systemd[1]: NetworkManager-wait-online.service: Main process exited, code=exited, status=1/FAILURE Nov 02 21:05:06 bat.localdomain systemd[1]: Failed to start Network Manager Wait Online. Nov 02 21:05:06 bat.localdomain systemd[1]: NetworkManager-wait-online.service: Unit entered failed state. Nov 02 21:05:06 bat.localdomain systemd[1]: NetworkManager-wait-online.service: Failed with result 'exit-code'. Expected results: Service should start properly. Additional info:
To be more clear about this: 2. Disable one one them (ONBOOT=no). Boot. => NetworkManager-wait-online is ok
Could you paste the ifcfg files for each interface into a reply? Also, after boot when it fails, can you paste the result of "nmcli dev"?
After the boot is completed it looks like this: [fansari@bat ~]$ nmcli dev DEVICE TYPE STATE CONNECTION virbr0 bridge connected virbr0 enp4s0 ethernet connected enp4s0 p4p1 ethernet connected p4p1 virbr0-nic tap connected virbr0-nic lo loopback unmanaged --
Created attachment 1089164 [details] p4p1
Created attachment 1089165 [details] enp4s0
I there any progress here? I still have this issue.
So, comment #3 shows that both interfaces were connected successfully. Would you post NetworkManager logs when you have the problem (both files ONBOOT=yes). Which of the files do you disable ONBOOT in?
Yes - the problem is not that the connections are not coming up. The problem is that the NetworkManager-wait is always hanging and this costs time at every boot. Regarding the disabling: if I disable either of these interfaces the NetworkManager-wait service works normally and does not hang. After boot it looks like this: fansari@bat ~]$ systemctl status NetworkManager-wait-online ● NetworkManager-wait-online.service - Network Manager Wait Online Loaded: loaded (/usr/lib/systemd/system/NetworkManager-wait-online.service; enabled; vendor preset: disabled) Active: failed (Result: exit-code) since Sun 2015-11-22 16:31:21 CET; 8min ago Process: 1598 ExecStart=/usr/bin/nm-online -s -q --timeout=30 (code=exited, status=1/FAILURE) Main PID: 1598 (code=exited, status=1/FAILURE) Nov 22 16:30:51 bat.localdomain systemd[1]: Starting Network Manager Wait Online... Nov 22 16:31:21 bat.localdomain systemd[1]: NetworkManager-wait-online.service: Main process exited, code=exited, status=1/FAILURE Nov 22 16:31:21 bat.localdomain systemd[1]: Failed to start Network Manager Wait Online. Nov 22 16:31:21 bat.localdomain systemd[1]: NetworkManager-wait-online.service: Unit entered failed state. Nov 22 16:31:21 bat.localdomain systemd[1]: NetworkManager-wait-online.service: Failed with result 'exit-code'.
Created attachment 1097537 [details] journalctl -b -u NetworkManager
Created attachment 1097538 [details] journalctl -b -u NetworkManager-wait-online
Is there a fix for this? Simply disabling this service did not help because it seems that it is called from somewhere else. I guessed this might be NetworkManager.dispatcher so I disabled this also but this did not help either. Currently the only solution (which is not a solution of course) I found is to "shortcut" the service by placing a copy of NetworkManager-wait-online.service to /etc/systemd/system and to set ExecStart=/usr/bin/true
It appears that your interfaces are taking 31 seconds to come up, but NM-wait-online's timeout is 30s by default. That's why it's failing. The interfaces are taking 31s to come up because of: Nov 22 16:30:56 bat.localdomain NetworkManager[1441]: <info> (enp4s0): device state change: ip-config -> ip-check (reason 'none') [70 80 0] Nov 22 16:31:22 bat.localdomain NetworkManager[1441]: <info> (enp4s0): device state change: ip-check -> secondaries (reason 'none') [80 90 0] which is quite a long delay during the ip-check phase. Some things that happen during this stage are running dispatcher scripts, sending neighbor notifications (ARPs) and pinging the gateway. Could you modify /lib/systemd/system/NetworkManager-dispatcher.service so that the ExecStart= line looks like this: ExecStart=/usr/libexec/nm-dispatcher --debug then run 'systemctl daemon-reload', then restart and reproduce the problem. Then run 'journalctl -b -u NetworkManager-dispatcher' and attach the output here and we'll see which script is taking too long. You'll see output like: Dec 15 15:12:21 localhost.localdomain nm-dispatcher[18636]: ** Message: Dispatching action 'dhcp4-change' for wlp4s0 Dec 15 15:12:21 localhost.localdomain nm-dispatcher[18636]: ** Message: Running script '/etc/NetworkManager/dispatcher.d/00-netreport' Dec 15 15:12:21 localhost.localdomain nm-dispatcher[18636]: ** Message: Script '/etc/NetworkManager/dispatcher.d/00-netreport' complete Dec 15 15:12:21 localhost.localdomain nm-dispatcher[18636]: ** Message: Running script '/etc/NetworkManager/dispatcher.d/10-ifcfg-rh-routes.sh' Dec 15 15:12:21 localhost.localdomain nm-dispatcher[18636]: ** Message: Script '/etc/NetworkManager/dispatcher.d/10-ifcfg-rh-routes.sh' complete Dec 15 15:12:21 localhost.localdomain nm-dispatcher[18636]: ** Message: Running script '/etc/NetworkManager/dispatcher.d/11-dhclient' Dec 15 15:12:21 localhost.localdomain nm-dispatcher[18636]: ** Message: Script '/etc/NetworkManager/dispatcher.d/11-dhclient' complete Dec 15 15:12:21 localhost.localdomain nm-dispatcher[18636]: ** Message: Running script '/etc/NetworkManager/dispatcher.d/20-chrony' Dec 15 15:12:21 localhost.localdomain nm-dispatcher[18636]: ** Message: Script '/etc/NetworkManager/dispatcher.d/20-chrony' complete Dec 15 15:12:21 localhost.localdomain nm-dispatcher[18636]: ** Message: Dispatch 'dhcp4-change' on wlp4s0 complete And I'll bet one of your scripts is taking too long. If that's true, we need to file a bug against the package that provides that script.
Created attachment 1106830 [details] journalctl -b -u NetworkManager-dispatcher
From what I see from the output of this logs it is more clear to me where the problem is located: It is in /etc/NetworkManager/dispatcher.d/13-named. When I disable the first line like this if [ "$2" = 'up' -o "$2" = 'down' ]; then # /usr/bin/systemctl reload named.service > /dev/null 2>&1 || true /usr/bin/systemctl reload named-chroot.service > /dev/null 2>&1 || true /usr/bin/systemctl reload named-sdb.service > /dev/null 2>&1 || true /usr/bin/systemctl reload named-sdb-chroot.service > /dev/null 2>&1 || true fi then there is no startup delay anymore. But this again is only a workaround not a final solution. If named is meant to be reloaded at this point then I conclude that it is not expected to be a problem. In my scenario I use named. The other services mentioned here are not used by me. When I check the output of the NetworkManager-dispatcher logs and compare them with the named logs I find something interesting: Dec 17 20:20:21 bat.localdomain nm-dispatcher[1909]: ** Message: Script '/etc/NetworkManager/dispatcher.d/13-named' complete Dec 17 20:20:20 bat.localdomain systemd[1]: Starting Berkeley Internet Name Domain (DNS)... It seems that the 13-named dispatcher waits until named is started. But even this is true: what has this to do with the number of NICs?
I did some more tests. The problem is not the number of NICs. I can only reproduce this behaviour when the second interface is using IPv6 autoconf and also have named. There is another machine (which is up all the time) where raded is running and pushes the autoconf to my machines. When I change the config to static IPv6 there is no delay with the 13-named dispatcher. But autoconf is a useful thing for IPv6 and I don't want to remove it. So I can reproduce this behaviour in this scenario: - one NIC with IPv4 - seond NIC with IPv6 and autoconf via radvd on second machine - named locally running
Dec 17 20:19:55 bat.localdomain nm-dispatcher[1909]: ** Message: Running script '/etc/NetworkManager/dispatcher.d/13-named' Dec 17 20:20:21 bat.localdomain nm-dispatcher[1909]: ** Message: Script '/etc/NetworkManager/dispatcher.d/13-named' complete The named script is taking 26 seconds to complete, and that's running every time an interface comes up. This is a problem with the named dispatcher script, and that's where the problem needs to be fixed. Alternatively, you could remove named if you don't need its functionality. If you do, then the script should get fixed :( Or you can also increase the NetworkManager-wait-online timeout if you like by modifying the systemd until file.
*** Bug 1316687 has been marked as a duplicate of this bug. ***
Thanks all. I'm going to remove the dispatcher script completely, as it is a relic from bind 9.9.x times. BIND 9.10.x and newer now listens on NETLINK for network configuration changes and can react to them.
bind-9.10.3-13.P4.fc24 has been submitted as an update to Fedora 24. https://bodhi.fedoraproject.org/updates/FEDORA-2016-79cad60d9c
bind-9.10.3-13.P4.fc23 has been submitted as an update to Fedora 23. https://bodhi.fedoraproject.org/updates/FEDORA-2016-05d4c87988
bind-9.10.3-13.P4.fc24 has been pushed to the Fedora 24 testing repository. If problems still persist, please make note of it in this bug report. See https://fedoraproject.org/wiki/QA:Updates_Testing for instructions on how to install test updates. You can provide feedback for this update here: https://bodhi.fedoraproject.org/updates/FEDORA-2016-79cad60d9c
bind-9.10.3-13.P4.fc23 has been pushed to the Fedora 23 testing repository. If problems still persist, please make note of it in this bug report. See https://fedoraproject.org/wiki/QA:Updates_Testing for instructions on how to install test updates. You can provide feedback for this update here: https://bodhi.fedoraproject.org/updates/FEDORA-2016-05d4c87988
bind-9.10.3-13.P4.fc24 has been pushed to the Fedora 24 stable repository. If problems still persist, please make note of it in this bug report.
bind-9.10.3-13.P4.fc23 has been pushed to the Fedora 23 stable repository. If problems still persist, please make note of it in this bug report.
I met this issue again in Fedora 27 and Fedora 28 with my Laptop.
I have this issue in Fedora 32. Even if I manually start the service after boot, the service failes.
So, I troubleshooted this a bit more. For me the situation was always that I had at least one wireless adapter installed on my system. If I disabled the wireless adapter, the problem went away. However, I wanted to be able to use my wireless adapters. So, I found that if I install a package named wpa_supplicant and made sure that the wpa_supplicant service was started, then this NetworkManager-wait-online service would start and work just fine. I know that this may not be the solution for everyone but it worked out for me. Hoping this at least helps someone in some way. /Charlie Ps. if this works for you, do not forget to set the wpa_supplicant service to autostart.
This problem is due to use of IPv6 Automatic for me. It is the same for me using Fedora 32, 33, Silverblue 33. Enable of wpa-supplicant do not help for me. Once I turn off IPv6 from all of my virtual bridge interfaces, the wait-online service started without issue.