Bug 837793 - nm-online in NetworkManager-wait-online.service not waiting for complete IP address assignment
Keywords:
Status: CLOSED WONTFIX
Alias: None
Product: Fedora
Classification: Fedora
Component: NetworkManager
Version: 17
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: unspecified
Target Milestone: ---
Assignee: Dan Williams
QA Contact: Fedora Extras Quality Assurance
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2012-07-05 09:33 UTC by Erik Terwan
Modified: 2016-01-21 21:17 UTC (History)

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2013-07-31 23:35:47 UTC
Type: Bug
Embargoed:


Description Erik Terwan 2012-07-05 09:33:23 UTC
I'm using NetworkManager-wait-online.service as the recommended method of waiting for the network to become online, before starting network related services.

But lately I'm noticing that nmb.service doesn't start (and so my cifs mounts also fail) because the network is not completely up and running by the time nmb.service starts.

So I tested NetworkManager-wait-online.service to see whether the network was indeed up and running by the time the service completes (according to systemd-analyze it waits for about 10 seconds). I changed the ExecStart line in /usr/lib/systemd/system/NetworkManager-wait-online.service to: "ExecStart=/usr/bin/nm-online -q --timeout=30 ; /bin/nm-tool" so that the status of the network gets logged in /var/log/messages after the network is supposed to be online.

Much to my surprise the status was different each time I booted. Sometimes the status (output of nm-tool) was as follows:

Jul  4 13:40:35 huiskamer nm-tool[1031]: NetworkManager Tool
Jul  4 13:40:35 huiskamer nm-tool[1031]: State: connected (global)
Jul  4 13:40:35 huiskamer nm-tool[1031]: - Device: eth0  [System eth0] --------------------------------------------------
Jul  4 13:40:35 huiskamer nm-tool[1031]: Type:              Wired
Jul  4 13:40:35 huiskamer nm-tool[1031]: Driver:            r8169
Jul  4 13:40:35 huiskamer nm-tool[1031]: State:             connected
Jul  4 13:40:35 huiskamer nm-tool[1031]: Default:           yes
Jul  4 13:40:35 huiskamer nm-tool[1031]: HW Address:        6C:F0:49:17:7D:DA
Jul  4 13:40:35 huiskamer nm-tool[1031]: Capabilities:
Jul  4 13:40:35 huiskamer nm-tool[1031]: Carrier Detect:  yes
Jul  4 13:40:35 huiskamer nm-tool[1031]: Speed:           100 Mb/s
Jul  4 13:40:35 huiskamer nm-tool[1031]: Wired Properties
Jul  4 13:40:35 huiskamer nm-tool[1031]: Carrier:         on
Jul  4 13:40:35 huiskamer nm-tool[1031]: IPv4 Settings:
Jul  4 13:40:35 huiskamer nm-tool[1031]: Address:         192.168.1.130
Jul  4 13:40:35 huiskamer nm-tool[1031]: Prefix:          24 (255.255.255.0)
Jul  4 13:40:35 huiskamer nm-tool[1031]: Gateway:         192.168.1.2
Jul  4 13:40:35 huiskamer nm-tool[1031]: DNS:             192.168.1.240
Jul  4 13:40:35 huiskamer nm-tool[1031]: IPv6 Settings:
Jul  4 13:40:35 huiskamer nm-tool[1031]: Address:         ****:****:****:****:5170:a69b:95d0:11bb
Jul  4 13:40:35 huiskamer nm-tool[1031]: Prefix:          64
Jul  4 13:40:35 huiskamer nm-tool[1031]: Gateway:         fe80::da5d:4cff:fe81:87b5
Jul  4 13:40:35 huiskamer nm-tool[1031]: Address:         ****:****:****:****:6ef0:49ff:fe17:7dda
Jul  4 13:40:35 huiskamer nm-tool[1031]: Prefix:          64
Jul  4 13:40:35 huiskamer nm-tool[1031]: Gateway:         fe80::da5d:4cff:fe81:87b5
Jul  4 13:40:35 huiskamer nm-tool[1031]: Address:         fe80::6ef0:49ff:fe17:7dda
Jul  4 13:40:35 huiskamer nm-tool[1031]: Prefix:          64
Jul  4 13:40:35 huiskamer nm-tool[1031]: Gateway:         fe80::da5d:4cff:fe81:87b5
Jul  4 13:40:35 huiskamer nm-tool[1031]: DNS:             ****:****:****:****::

This is the network completely up and running with assigned IPv4 and IPv6 addresses. But the status of the network wasn't always like this. I've also seen this:

Jul  4 13:45:29 huiskamer nm-tool[1044]: NetworkManager Tool
Jul  4 13:45:29 huiskamer nm-tool[1044]: State: connected (global)
Jul  4 13:45:29 huiskamer nm-tool[1044]: - Device: eth0  [System eth0] --------------------------------------------------
Jul  4 13:45:29 huiskamer nm-tool[1044]: Type:              Wired
Jul  4 13:45:29 huiskamer nm-tool[1044]: Driver:            r8169
Jul  4 13:45:29 huiskamer nm-tool[1044]: State:             connected
Jul  4 13:45:29 huiskamer nm-tool[1044]: Default:           yes
Jul  4 13:45:29 huiskamer nm-tool[1044]: HW Address:        6C:F0:49:17:7D:DA
Jul  4 13:45:29 huiskamer nm-tool[1044]: Capabilities:
Jul  4 13:45:29 huiskamer nm-tool[1044]: Carrier Detect:  yes
Jul  4 13:45:29 huiskamer nm-tool[1044]: Speed:           100 Mb/s
Jul  4 13:45:29 huiskamer nm-tool[1044]: Wired Properties
Jul  4 13:45:29 huiskamer nm-tool[1044]: Carrier:         on
Jul  4 13:45:29 huiskamer nm-tool[1044]: IPv4 Settings:
Jul  4 13:45:29 huiskamer nm-tool[1044]: Address:         192.168.1.130
Jul  4 13:45:29 huiskamer nm-tool[1044]: Prefix:          24 (255.255.255.0)
Jul  4 13:45:29 huiskamer nm-tool[1044]: Gateway:         192.168.1.2

(IPv4 almost complete, though the DNS server is still missing, and IPv6 completely lacking), and this:

Jul  4 13:50:25 huiskamer nm-tool[1053]: NetworkManager Tool
Jul  4 13:50:25 huiskamer nm-tool[1053]: State: connected (global)
Jul  4 13:50:25 huiskamer nm-tool[1053]: - Device: eth0  [System eth0] --------------------------------------------------
Jul  4 13:50:25 huiskamer nm-tool[1053]: Type:              Wired
Jul  4 13:50:25 huiskamer nm-tool[1053]: Driver:            r8169
Jul  4 13:50:25 huiskamer nm-tool[1053]: State:             connected
Jul  4 13:50:25 huiskamer nm-tool[1053]: Default:           no
Jul  4 13:50:25 huiskamer nm-tool[1053]: HW Address:        6C:F0:49:17:7D:DA
Jul  4 13:50:25 huiskamer nm-tool[1053]: Capabilities:
Jul  4 13:50:25 huiskamer nm-tool[1053]: Carrier Detect:  yes
Jul  4 13:50:25 huiskamer nm-tool[1053]: Speed:           100 Mb/s
Jul  4 13:50:25 huiskamer nm-tool[1053]: Wired Properties
Jul  4 13:50:25 huiskamer nm-tool[1053]: Carrier:         on
Jul  4 13:50:25 huiskamer nm-tool[1053]: IPv6 Settings:
Jul  4 13:50:25 huiskamer nm-tool[1053]: Address:         ****:****:****:****:878:92c0:cdd6:730f
Jul  4 13:50:25 huiskamer nm-tool[1053]: Prefix:          64
Jul  4 13:50:25 huiskamer nm-tool[1053]: Gateway:         fe80::da5d:4cff:fe81:87b5
Jul  4 13:50:25 huiskamer nm-tool[1053]: Address:         ****:****:****:****:6ef0:49ff:fe17:7dda
Jul  4 13:50:25 huiskamer nm-tool[1053]: Prefix:          64
Jul  4 13:50:25 huiskamer nm-tool[1053]: Gateway:         fe80::da5d:4cff:fe81:87b5
Jul  4 13:50:25 huiskamer nm-tool[1053]: Address:         fe80::6ef0:49ff:fe17:7dda
Jul  4 13:50:25 huiskamer nm-tool[1053]: Prefix:          64
Jul  4 13:50:25 huiskamer nm-tool[1053]: Gateway:         fe80::da5d:4cff:fe81:87b5
Jul  4 13:50:25 huiskamer nm-tool[1053]: DNS:             ****:****:****:****::

(IPv6 complete, but IPv4 completely missing).
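
To compare boots without reading each dump by eye, the nm-tool lines in /var/log/messages can be summarized mechanically. This is only a rough sketch: it assumes the log lines look exactly like the excerpts above (syslog prefix, then "IPv4 Settings:" / "IPv6 Settings:" blocks containing "Address:" and "DNS:" lines), and the function name summarize_nm_tool is made up for illustration.

```python
import re

# Strips the syslog prefix, e.g. "Jul  4 13:45:29 huiskamer nm-tool[1044]: "
SYSLOG_PREFIX = re.compile(r"^.*?nm-tool\[\d+\]:\s*")


def summarize_nm_tool(log_text):
    """Report which parts of the IP configuration appear in nm-tool output."""
    found = {"ipv4_address": False, "ipv4_dns": False,
             "ipv6_address": False, "ipv6_dns": False}
    section = None  # "ipv4", "ipv6", or None while outside a settings block
    for raw in log_text.splitlines():
        line = SYSLOG_PREFIX.sub("", raw).strip()
        if line.startswith("IPv4 Settings"):
            section = "ipv4"
        elif line.startswith("IPv6 Settings"):
            section = "ipv6"
        elif line.startswith("- Device:"):
            section = None  # a new device starts; leave the settings block
        elif section and line.startswith("Address:"):
            found[section + "_address"] = True
        elif section and line.startswith("DNS:"):
            found[section + "_dns"] = True
    return found


# Shortened copy of the second excerpt above (IPv4 without DNS, no IPv6).
SAMPLE = """\
Jul  4 13:45:29 huiskamer nm-tool[1044]: State: connected (global)
Jul  4 13:45:29 huiskamer nm-tool[1044]: IPv4 Settings:
Jul  4 13:45:29 huiskamer nm-tool[1044]: Address:         192.168.1.130
Jul  4 13:45:29 huiskamer nm-tool[1044]: Prefix:          24 (255.255.255.0)
Jul  4 13:45:29 huiskamer nm-tool[1044]: Gateway:         192.168.1.2
"""

print(summarize_nm_tool(SAMPLE))
```

For the shortened sample only the IPv4 address is reported as present, which matches the "State: connected (global)" but half-configured case.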

The nm-online man page says: "nm-online is a utility to find out whether we are online." My question is: "What does nm-online mean when it says 'online'?"

It evidently isn't waiting for a complete assignment of IPv4 and IPv6 addresses. So the services that depend on nm-online (NetworkManager-wait-online.service) can't be sure of the status of the network.

This is not how I understand it should work.

Comment 1 Mike Grant 2012-08-20 11:06:47 UTC
Confirmed on Fedora 17 with latest updates (NetworkManager-0.9.4.0-9.git20120521.fc17.x86_64, systemd-44-17.fc17.x86_64).  Here's a clear reproducer.

Steps to reproduce:
 1. boot up a networked Fedora 17 machine and log in
 2. observe the NM applet reports network up and working
 3. unplug the network, wait for NM to report it down
 4. Run "systemctl start NetworkManager-wait-online.service ; ping -c 1 www.bbc.co.uk", or "nm-online; ping -c 1 www.bbc.co.uk"
 5. Plug the network back in and wait

Expected results:
 The network should come up (applet icon should indicate this), the ping should work.

What actually happens:
 The network comes up, the icon changes, nm-online / NetworkManager-wait-online.service completes successfully, the ping fails with "ping: unknown host www.bbc.co.uk"

Additional info:
 This looks like nm-online returning too soon.  Adding a sleep command makes things work as expected (e.g. "systemctl start NetworkManager-wait-online.service ; sleep 5 ; ping -c 1 www.bbc.co.uk").

An unpleasant workaround is to alter /usr/lib/systemd/system/NetworkManager-wait-online.service to include a sleep:
...
[Service]
Type=oneshot
ExecStart=/usr/bin/nm-online -q --timeout=30
ExecStartPost=/usr/bin/sleep 5
...
With this sleep in the unit file, the ping works as expected.  As with any sleep, it may not be an entirely safe option.  5 seconds is more than enough for my system but may not be sufficient for others.
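
A slightly less arbitrary variant of the sleep is to poll for the thing that actually fails, here name resolution, and give up after a timeout. The following is a minimal sketch of that idea, not something shipped with NetworkManager; the helper names wait_until and can_resolve and the default timeout and interval are assumptions chosen for illustration.

```python
import socket
import time


def wait_until(check, timeout=30.0, interval=0.5,
               sleep=time.sleep, clock=time.monotonic):
    """Call check() repeatedly until it returns True or timeout seconds pass.

    Returns True as soon as check() succeeds, False if the deadline is hit.
    """
    deadline = clock() + timeout
    while True:
        if check():
            return True
        if clock() >= deadline:
            return False
        sleep(interval)


def can_resolve(hostname):
    """True if the system resolver returns at least one address for hostname."""
    try:
        return bool(socket.getaddrinfo(hostname, None))
    except socket.gaierror:
        return False
```

Called as wait_until(lambda: can_resolve("www.bbc.co.uk")), this returns as soon as the lookup succeeds instead of always waiting five seconds. It still only shows that one name resolves, so it is a workaround in the same spirit as the sleep rather than a fix for nm-online.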

Related bugs:
 autofs https://bugzilla.redhat.com/show_bug.cgi?id=448510
 (maybe) ypbind https://bugzilla.redhat.com/show_bug.cgi?id=756123

Comment 2 Dan Williams 2012-12-04 17:02:46 UTC
NetworkManager signals that it is "online" when it has either IPv4 or IPv6 connectivity.  The effort to get IPv6-by-default (if IPv6 is available) requires that we *not* block the "we're online!" signal if the network got an IPv4 address but is still waiting for an IPv6 address, since per standards we must wait a long time before deciding that IPv6 simply isn't available.  That can be two minutes or more, during which nothing thinks you're online even though you have IPv4 connectivity.

However: for the specific case of nm-online, given the target audience, I think it's reasonable to wait for both IPv4 and IPv6 to be fully configured before the tool returns, if IPv6 is enabled.

So we should modify nm-online to wait for IP addressing of the appropriate version if the configuration allows IP addressing for that version.
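
The difference between the two policies can be written down as plain predicates. This is only an illustration of the rule described in this comment, not NetworkManager code; the function and parameter names are invented, and treating "no IP version enabled" as not online is an assumption.

```python
def is_online_any(has_v4, has_v6):
    """Behaviour described above: either address family is enough."""
    return has_v4 or has_v6


def is_online_all_enabled(v4_enabled, v6_enabled, has_v4, has_v6):
    """Proposed nm-online behaviour: every enabled family must be configured."""
    if not (v4_enabled or v6_enabled):
        # Assumption: with no IP version enabled there is nothing to wait for,
        # and the connection is not reported as online.
        return False
    v4_ok = has_v4 or not v4_enabled
    v6_ok = has_v6 or not v6_enabled
    return v4_ok and v6_ok
```

With both versions enabled and only IPv4 configured, is_online_any returns True while is_online_all_enabled returns False, which is the case from the original report where nm-online returned before IPv6 was set up.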

Comment 3 Mike Grant 2012-12-05 01:44:32 UTC
Note this isn't just a v4/v6 issue - nm-online is signalling it's online when the setup isn't complete (my test case demonstrates that DNS wasn't set up and was 100% repeatable in my environment).  Perhaps it's returning "online" as soon as an address appears on the interface, but before dhclient has set up DNS, etc.

It's not very useful to have quick response if there's only a half working interface.  This is a problem for automated processes that fire off immediately after getting the online signal, and immediately fail because they didn't have, say, DNS.  I believe this is what happened in the ypbind and autofs linked bugs, which get fired off by systemd immediately after the nm-wait-online service completes.

I'd suggest that nm-online should ideally be altered to wait until the interface setup is complete (e.g. dhclient is finished for DHCP setups, ipv6 autoconf is complete, wait for both v4 and v6, or whatever is suitable for the configuration type).  Alternatively, at the very least a nasty and unreliable hack of waiting a few seconds should allow things to "settle" in most cases.  The latter can be implemented for the systemd startup case by the ExecStartPost sleep above, but this doesn't deal with other uses of nm-online.

Comment 4 SpuyMore 2013-01-16 15:23:20 UTC
I have a similar issue. At boot time nmbd and cupsd fail to start because the NetworkManager-wait-online service exits as soon as my WAN NIC is up, but both services need my LAN NIC to be up instead, and that one comes up just a little bit later. It would therefore be great if we could configure the NetworkManager-wait-online service to wait for a specific interface and/or addressing type to complete (e.g. eth1 + IPv4), for example in a file such as /etc/sysconfig/NetworkManager-wait-online.
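
The per-interface check requested here could look roughly like the sketch below. It is an assumption-laden illustration, not an existing NetworkManager option: it expects text in the one-line-per-address layout printed by "ip -o addr show", the function name has_address is invented, and the sample lines are made up from the interface names and addresses mentioned in this bug.

```python
def has_address(ip_output, interface, family):
    """True if interface has a non-link-local address of the given family.

    ip_output is text in the one-line-per-address layout of `ip -o addr show`;
    family is "inet" for IPv4 or "inet6" for IPv6.
    """
    for line in ip_output.splitlines():
        fields = line.split()
        # Expected layout: "<index>: <ifname> <family> <address>/<prefix> ..."
        if len(fields) < 4 or fields[1] != interface or fields[2] != family:
            continue
        if " scope link" in line:
            continue  # link-local addresses (e.g. fe80::/64) do not count
        return True
    return False


# Illustrative input only: eth0 has IPv4, eth1 has just a link-local IPv6 address.
SAMPLE = (
    "2: eth0    inet 192.168.1.130/24 brd 192.168.1.255 scope global eth0\n"
    "3: eth1    inet6 fe80::6ef0:49ff:fe17:7dda/64 scope link\n"
)

print(has_address(SAMPLE, "eth0", "inet"))
print(has_address(SAMPLE, "eth1", "inet"))
```

A wait-online helper could poll a check like this for the configured interface and family (say eth1 and "inet") before reporting success, instead of returning when any interface comes up.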

Comment 5 Fedora End Of Life 2013-07-03 22:00:46 UTC
This message is a reminder that Fedora 17 is nearing its end of life.
Approximately 4 (four) weeks from now Fedora will stop maintaining
and issuing updates for Fedora 17. It is Fedora's policy to close all
bug reports from releases that are no longer maintained. At that time
this bug will be closed as WONTFIX if it remains open with a Fedora 
'version' of '17'.

Package Maintainer: If you wish for this bug to remain open because you
plan to fix it in a currently maintained version, simply change the 'version' 
to a later Fedora version prior to Fedora 17's end of life.

Bug Reporter: Thank you for reporting this issue and we are sorry that
we may not be able to fix it before Fedora 17 is end of life. If you
would still like to see this bug fixed and are able to reproduce it
against a later version of Fedora, you are encouraged to change the
'version' to a later Fedora version prior to Fedora 17's end of life.

Although we aim to fix as many bugs as possible during every release's 
lifetime, sometimes those efforts are overtaken by events. Often a 
more recent Fedora release includes newer upstream software that fixes 
bugs or makes them obsolete.

Comment 6 Fedora End Of Life 2013-07-31 23:35:52 UTC
Fedora 17 changed to end-of-life (EOL) status on 2013-07-30. Fedora 17 is 
no longer maintained, which means that it will not receive any further 
security or bug fix updates. As a result we are closing this bug.

If you can reproduce this bug against a currently maintained version of 
Fedora please feel free to reopen this bug against that version.

Thank you for reporting this bug and we are sorry it could not be fixed.

