From Pavel Šimerda and Corinna Vinschen (combined messages, slightly paraphrased):

The underlying problem is how the network and network-online targets are defined. Basically (pardon the fuzzy choice of words, please), network.target makes sure the network subsystem is started, but it does not make sure that all boottime-enabled interfaces are up, while network-online.target does make sure that all boottime-enabled interfaces are up.

For this scenario we have to take into account that there are four types of network services:

1. Network services which always listen on 0.0.0.0 and/or ::.
2. Network services which potentially listen on explicit IP addresses, but which support address changes via Linux extensions like rtnetlink/IP_FREEBIND.
3. Network services which potentially listen on explicit IP addresses but which don't support these extensions.
4. Network services which *connect* to remote services (e.g. using the getaddrinfo and connect library calls). The ntp service is an example.

Our problem here is the services of type 3. Such services often provide configuration files that allow the listen addresses to be specified. If such a service is started before all the boottime network addresses are available, it will simply fail. This is especially worrisome for services like sshd, but in my experience more services are affected by this, in my case at least dovecot, named, postfix, radicale, sshd.

So, bottom line is, IMHO those services which allow explicit listen addresses to be specified, and which are not capable of dealing with the situation that certain network addresses are not up when the service starts, must depend on network-online.target rather than just network.target. If services can be improved/patched (ideally upstream) to use rtnetlink/IP_FREEBIND, that's even better.

Basically they should pull in network-online.target as a dependency and start after it. Neither of the two should be forgotten.

[Unit]
Wants=network-online.target
After=network-online.target
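To illustrate the difference between type 2 and type 3 services, here is a minimal Python sketch of a listener that sets IP_FREEBIND before binding. It is not taken from any of the services named above. The helper name `bind_freebind` is hypothetical, and the fallback value 15 is the Linux constant from `<linux/in.h>`, used in case the Python build does not export `socket.IP_FREEBIND`.

```python
import socket

# Linux-specific option; fall back to the numeric value from <linux/in.h>
# if this Python build does not export the name (assumption: Linux host).
IP_FREEBIND = getattr(socket, "IP_FREEBIND", 15)


def bind_freebind(address: str, port: int) -> socket.socket:
    """Return a listening TCP socket bound to `address`, even if that
    address is not (yet) configured on any local interface."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    try:
        sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
        # Without this option, bind() fails with EADDRNOTAVAIL while the
        # address is missing, which is the "type 3" failure at boot.
        sock.setsockopt(socket.IPPROTO_IP, IP_FREEBIND, 1)
        sock.bind((address, port))
        sock.listen(16)
    except OSError:
        sock.close()
        raise
    return sock
```

A service written this way could be started with only After=network.target, since `bind_freebind("192.0.2.10", 2222)` (a placeholder documentation address and port) would not depend on the address being present yet.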
It would be good to determine whether it's better to use Wants= or Requires= and maybe try to be consistent. Technically it doesn't seem to make a difference, as services that need to finish *before* network-online.target either use WantedBy= in the Install section, or create the symlink in the `.wants` directory directly. For example NetworkManager (both Fedora and upstream git) now seems to contain the following symlink among packaged files:

/usr/lib/systemd/system/network-online.target.wants/NetworkManager-wait-online.service

Therefore whether a service Requires or Wants network-online.target, the network-online.target is typically always available and always successful. Philosophically, I think Wants is the right one, as conceptually network-online.target *might* fail and the applications just want to delay the start, not start conditionally.
It might be still worth removing the network-online patch from the Fedora 20 branch of systemd to fix the regression until all units in Fedora are fixed.
(In reply to Pavel Šimerda (pavlix) from comment #2)
> It might be still worth removing the network-online patch from the Fedora 20
> branch of systemd to fix the regression until all units in Fedora are fixed.

And although it's hard to believe :) we are getting close to test releases for Fedora 21, and I'd really like to ship that in a functional state.
(In reply to Matthew Miller from comment #3)
> And although it's hard to believe :) we are getting close to test releases
> for Fedora 21, and I'd really like to ship that in a functional state.

I don't see this as an argument for reverting the change in F21, though, for the following reasons.

1) The services need to be fixed anyway in order to work properly with NetworkManager, which is the default network configuration solution in F21.

2) The positive part of the change allows custom software with initscripts to order itself after network-online.target, which is by default represented by NetworkManager.

Therefore, in my opinion, it's critical to fix the services and it's important to keep the change for F21. It would be nice to fix the bug introduced by the change, but with services fixed, it's not critical.

For the noncritical fix, I found another solution...

[Unit]
Description=good ol' network setup script
DefaultDependencies=no
After=local-fs.target
Before=sysinit.target

[Service]
Type=oneshot
RemainAfterExit=yes
ExecStart=/etc/rc.d/init.d/network start
ExecStop=/etc/rc.d/init.d/network stop

[Install]
WantedBy=sysinit.target

It was suggested by Radek Hladík in an internet discussion. It implements the network.service that runs the network initscript but adds the necessary ordering directives.
(In reply to Pavel Šimerda (pavlix) from comment #4)
> [Unit]
> Description=good ol' network setup script
> DefaultDependencies=no
> After=local-fs.target
> Before=sysinit.target
> [Service]
> Type=oneshot
> RemainAfterExit=yes
> ExecStart=/etc/rc.d/init.d/network start
> ExecStop=/etc/rc.d/init.d/network stop
> [Install]
> WantedBy=sysinit.target
>
> It was suggested by Radek Hladík in an internet discussion. It implements
> the network.service that runs the network initscript but adds the necessary
> ordering directives.

I posted the contents as they were in the discussion. It may need a bit of care, like ordering before network.target instead of sysinit.target and stuff like that.
I think that there is one thing broken. If you have After=network.target in a service, it should be guaranteed that init will terminate you before network is down during shutdown. But currently network.service is Before=network-online.target so the network could be put down before network.target is finished. I have posted a patch for that to systemd upstream mailing list. http://lists.freedesktop.org/archives/systemd-devel/2014-July/021495.html
(In reply to Pavel Šimerda (pavlix) from comment #4)
> (In reply to Matthew Miller from comment #3)
> > And although it's hard to believe :) we are getting close to test releases
> > for Fedora 21, and I'd really like to ship that in a functional state.
>
> I don't see this as an argument for reverting the change in F21, though, for
> the following reasons.
>
> 1) The services need to be fixed anyway in order to work properly with
> NetworkManager, which is the default network configuration solution in F21.
>
> 2) The positive part of the change allows custom software with initscripts
> to order itself after network-online.target which is by default represented
> with NetworkManager.
>
> Therefore, in my opinion, it's critical to fix the services and it's
> important to keep the change for F21. It would be nice to fix the bug
> introduced by the change but with services fixed, it's not critical.

Yeah, works for me as long as we can identify and fix all of the services (and clearly document in the packaging guidelines how this needs to be -- probably a ticket https://fedorahosted.org/fpc/newticket)
(In reply to Matthew Miller from comment #7)
> Yeah, works for me as long as we can identify and fix all of the services
> (and clearly document in the packaging guidelines how this needs to be --
> probably a ticket https://fedorahosted.org/fpc/newticket)

Not sure how much packaging guidelines should substitute for upstream documentation, which more or less exists in this case. I also believe that systemd service files should be part of upstream packages whenever possible. But feel free to start a ticket if you think it's the right way to handle it.

We indeed should identify services that still use network.target for ordering, as well as any other services that don't comply.
(In reply to Lukáš Nykrýn from comment #6)
> I think that there is one thing broken. If you have After=network.target in
> a service, it should be guaranteed that init will terminate you before
> network is down during shutdown.

The specific use case is confusing in itself, as it's never clear whether this ordering is done by mistake or is intentional. Are there examples of services that actually need it?

> But currently network.service is
> Before=network-online.target so the network could be put down before
> network.target is finished.

So a service ordered after network-online.target is still safe, correct? Only services that only need the shutdown ordering are broken.

> I have posted a patch for that to systemd upstream mailing list.
> http://lists.freedesktop.org/archives/systemd-devel/2014-July/021495.html

What exactly does the patch do? I'm not familiar with the code. You're saying we should continue to guarantee that /etc/init.d/network stops after network.target. Shouldn't we also continue to guarantee that /etc/init.d/network starts before network.target (which is in turn started before network-online.target) and thus maintain backwards compatibility?
> The specific use case is confusing itself, as it's never clear whether this
> ordering is done by mistake or is intentional. Are there examples of
> services that actually need it?

I don't know, but I can imagine that before the service is terminated it wants to do some finalization actions, and it would be better if the network is still up.

> What exactly does the patch do? I'm not familiar with the code. You're
> saying we should continue to guarantee that /etc/init.d/network stops after
> network.target. Shouldn't we also continue to guarantee that
> /etc/init.d/network starts before network.target (which is in turn started
> before network-online.target) and thus maintain backwards compatibility?

The patch simply adds Before=network.target for services that have Provides: $network. So yes, during startup, initscripts which provide $network will also be started before network.target.
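The rule described here can be sketched in a few lines of Python. This is only an illustration of the mapping, not the actual systemd code; the function names `lsb_provides` and `extra_ordering` are made up for the example.

```python
import re
from typing import List


def lsb_provides(script_text: str) -> List[str]:
    """Return the facilities named on the 'Provides:' line of an LSB header."""
    in_header = False
    for line in script_text.splitlines():
        if line.startswith("### BEGIN INIT INFO"):
            in_header = True
        elif line.startswith("### END INIT INFO"):
            break
        elif in_header:
            match = re.match(r"#\s*Provides:\s*(.*)", line)
            if match:
                return match.group(1).split()
    return []


def extra_ordering(script_text: str) -> List[str]:
    """Ordering lines a generated unit would get under the rule above."""
    if "$network" in lsb_provides(script_text):
        # An initscript that provides $network is ordered before network.target.
        return ["Before=network.target"]
    return []
```

Feeding it an initscript whose header contains `# Provides: $network` yields `["Before=network.target"]`; any other script gets no extra ordering.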
(In reply to Lukáš Nykrýn from comment #10)
> I don't know, but I can imagine that before the service is terminated it
> wants to do some finalization actions and it would be better if network is
> still up.

OK, let's say any sort of communication server may want to notify clients that it's going down with some specific message, not just sockets closed by the system.

> > What exactly does the patch do? I'm not familiar with the code. You're
> > saying we should continue to guarantee that /etc/init.d/network stops after
> > network.target. Shouldn't we also continue to guarantee that
> > /etc/init.d/network starts before network.target (which is in turn started
> > before network-online.target) and thus maintain backwards compatibility?
>
> The patch simply adds Before=network.target for services that have Provides:
> $network.

Aha, that's the mechanics. Can we have this patch added to Fedora 20+?

> So yes also during startup initscripts which provides network will
> be started before network.target.

Is there any other result of "Provides: $network"? Because I would think it is good enough to just order such a service before network.target, which is ordered before network-online.target, and thus provides the original boot order plus what we expected from the $network=network-online.target change. Basically $network means different things when provided and when depended upon, which is ok, as it's just a hack anyway.
> Aha, that's the mechanics. Can we have this patch added to Fedora 20+?

It is not upstream yet. I will wait a little bit longer and then push it myself.

> > So yes also during startup initscripts which provides network will
> > be started before network.target.
>
> Is there any other result of "Provides: $network"? Because I would think it
> is good enough to just order such a service before network.target which is
> ordered before network-online.target and thus provides the original boot
> order plus what we expected from the $network=network-online.target change.
> Basically $network means different things when provided and when depended
> upon, which is ok, as it's just a hack anyway.

Provides: $network also means before network-online.target, but that is redundant.
(In reply to Pavel Šimerda (pavlix) from comment #8)
> Not sure how much should packaging guidelines substitute upstream
> documentation which more or less exists in this case. I also believe that
> systemd service files should be part of upstream packages whenever possible.

Agreed. I think it'd be good to give guidance at https://fedoraproject.org/wiki/Packaging:Systemd, though: no big production, and it certainly can point to the upstream documentation. The goal is to make it as easy as possible for packagers who aren't necessarily familiar with systemd.

> But feel free to start a ticket if you think it's the right way to handle it.

Sure -- do you have some suggested wording?
(In reply to Matthew Miller from comment #13)
> Sure -- do you have some suggested wording?

I'll try to give a couple of points... but feel free to fix it. Or I can start a wiki page for staging the text if that's preferable.

1) Services that need to be started after the network is fully configured should pull in network-online.target and order themselves after it.

[Unit]
Wants=network-online.target
After=network-online.target

Note that pulling in network-online.target may extend the overall boot time.

2) Services that don't need to wait for network configuration but should be stopped before the network is taken down should order themselves after network.target.

[Unit]
After=network.target

3) Services that don't have any of the requirements above should not reference those targets.
Let's start with: https://fedoraproject.org/wiki/Networking/Ideas/ServiceOrdering
(In reply to Lukáš Nykrýn from comment #12)
> > Aha, that's the mechanics. Can we have this patch added to Fedora 20+?
> It is not in the upstream yet. I will wait a little bit longer and then
> push it myself.

Any updates?
It's in upstream http://cgit.freedesktop.org/systemd/systemd/commit/?id=805b573fad06b845502e76f3db3a0efa7583149d
(In reply to Lukáš Nykrýn from comment #17)
> It's in upstream
> http://cgit.freedesktop.org/systemd/systemd/commit/?id=805b573fad06b845502e76f3db3a0efa7583149d

Great, what about Fedora >= 20?
Does anyone have a list of services which are still broken? I'm having issues with autofs which I think stem from this, but I'm not sure what to do to confirm that.
(In reply to Jason Tibbitts from comment #19)
> Does anyone have a list of services which are still broken? I'm having
> issues with autofs which I think stem from this, but I'm not sure what to do
> to confirm that.

We do not have a list (apart from blocking bugs), and it is not trivial to detect wrong packages, because After=network.target is valid for services that do not need to start after the network is configured but still benefit from being stopped before the network is torn down.
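Since a correct unit and a broken one can look the same, a script can only produce candidates for manual review. The Python sketch below shows one way to do that. It is an illustration and not an existing tool: the names `unit_deps`, `classify` and `scan` are hypothetical, and the parsing is deliberately simplified (it ignores drop-ins and the systemd rule that an empty assignment resets a list).

```python
import os
from typing import Dict, List, Set


def unit_deps(text: str) -> Dict[str, Set[str]]:
    """Collect After=/Wants=/Requires= values from unit file text."""
    deps: Dict[str, Set[str]] = {"After": set(), "Wants": set(), "Requires": set()}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith(("#", ";")) or "=" not in line:
            continue
        key, _, value = line.partition("=")
        key = key.strip()
        if key in deps:
            deps[key].update(value.split())
    return deps


def classify(text: str) -> str:
    """Rough classification of how a unit relates to the network targets."""
    deps = unit_deps(text)
    pulled_in = deps["Wants"] | deps["Requires"]
    if "network-online.target" in deps["After"]:
        # Ordering alone is not enough; the target must also be pulled in.
        if "network-online.target" in pulled_in:
            return "online"
        return "online-not-pulled-in"
    if "network.target" in deps["After"]:
        # May be intentional (shutdown ordering only) or a leftover bug.
        return "review"
    return "unrelated"


def scan(directory: str) -> List[str]:
    """List *.service files in `directory` that need a manual look."""
    hits = []
    for name in sorted(os.listdir(directory)):
        path = os.path.join(directory, name)
        if not name.endswith(".service") or not os.path.isfile(path):
            continue
        with open(path, encoding="utf-8", errors="replace") as handle:
            if classify(handle.read()) == "review":
                hits.append(name)
    return hits
```

Running `scan("/usr/lib/systemd/system")` would give a list to check by hand. Whether each hit is actually wrong still depends on how the service binds its addresses.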
(In reply to Pavel Šimerda (pavlix) from comment #20)
> We do not have a list (apart from blocking bugs) and it is not trivial to
> detect wrong packages because After=network.target is valid for services
> that do not need to start after configured network but still benefit from
> being stopped before network is torn down.

Well, sendmail.service, httpd.service, and proftpd.service interpolate the value of hostname (or hostname -f) into their configuration... so on DHCP-based hosts you'd want to wait until hostname is set to something sane.

I've noticed empirically that apcupsd.service fails if it's configured to use SNMP to poll the UPS and there's no valid route (i.e. the network is down). That's probably a bug in apcupsd and it should be trying a lot harder, but currently it doesn't.
This bug has been open a while... can we get some forward momentum on it?
I just hit this with a fresh install of Fedora 25. I am running samba attached to eth1, an internal interface. Previously I was using network instead of NetworkManager. It is crazy that this has been open since 2014. Though I suspect it would require a manager or product manager to care enough. Since it would require changing the .service files in many different packages.
(In reply to Nathan G. Grennan from comment #23)
> I just hit this with a fresh install of Fedora 25. I am running samba
> attached to eth1, an internal interface. Previously I was using network
> instead of NetworkManager.
>
> It is crazy that this has been open since 2014. Though I suspect it would
> require a manager or product manager to care enough. Since it would require
> changing the .service files in many different packages.

This is a tracker bug for issues as they are found. There isn't any specific action on this bug itself -- if you find a problem, please file a new bug and mark that as blocking this one.

However:

a) we are using unpatched upstream systemd configuration for samba. We *can* make a fix locally, but we prefer not to.

b) it's not always, actually, a problem. Many services will start before network is online and work just fine; it's usually _better_ to fix that where possible than to change the targets.
(In reply to Jason Tibbitts from comment #19)
> Does anyone have a list of services which are still broken? I'm having
> issues with autofs which I think stem from this, but I'm not sure what to do
> to confirm that.

Services I have to restart following a change in network parameters are:

rsyslog (hostname)
spamassassin (hostname)
cyrus-imapd (hostname)
sendmail (hostname)
mimedefang (hostname)
apcupsd (hostname)
proftpd (hostname)
httpd (hostname)
Having to restart following a change in network parameters might, or might not, be related to use of network-online.target. At least one of the tickets you filed was against a package which already uses After=network-online.target. I think that at least in some cases you are seeing a different issue, where either network-online still isn't late enough or something is racing.

And note that whether a daemon can adapt to a change of IP is rather different from handling a change in hostname. If something is binding to 0.0.0.0 or already supports IP_FREEBIND, then just adding After=network-online.target isn't going to make any difference to your issue.
(In reply to Jason Tibbitts from comment #26)
> Having to restart following a change in network parameters might, or might
> not, be related to use of network-online.target. At least one of the
> tickets you filed was against a package which already uses
> After=network-online.target. I think that at least in some cases you are seeing a
> different issue, where either network-online still isn't late enough or
> something is racing.
>
> And note that whether a daemon can adapt to a change of IP is rather
> different than handling a change in hostname. If something is binding to
> 0.0.0.0 or already supports IP_FREEBIND then just adding
> After=network-online.target isn't going to make any difference to your issue.

Alas, there's no way to be notified of a change in hostname asynchronously: you'd have to call gethostname() each time through your loop, which might be a little painful (even though, as system calls go, it's relatively lightweight).

But a change in your IP address (which can be notified asynchronously) often portends a change in your hostname as well.
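For what it's worth, the polling approach can be sketched as below in Python. The sketch assumes that reading the kernel host name with `socket.gethostname()` is what is wanted here. The helper name `check_hostname` is hypothetical, and what a daemon does on a change (reload configuration, re-exec, and so on) is left out.

```python
import socket
from typing import Optional, Tuple


def check_hostname(last_seen: Optional[str]) -> Tuple[str, bool]:
    """Return (current host name, whether it differs from `last_seen`).

    Pass None on the first call; that is never reported as a change.
    """
    current = socket.gethostname()
    changed = last_seen is not None and current != last_seen
    return current, changed
```

A daemon would call this once per iteration of its main loop, keeping the returned name for the next call and reloading whatever it derived from the host name when the flag is set.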
Part of the problem is also that unit files such as /usr/lib/systemd/system/sendmail.service are usually not marked as configuration files, so even if you change one manually from network.target to network-online.target, it will be reverted by the next package update.
But there's never any need to edit the unit files under /lib/systemd/system...

Either copy the sendmail.service file to /etc/systemd/system and edit it as you wish, or create /etc/systemd/system/sendmail.service.d and add overrides there.

See man systemd.unit and search for "drop-in" for more info on the latter method.
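As an illustration of the drop-in method, the following Python sketch writes an override that pulls in network-online.target for a given unit. The function name `write_dropin` and the file name `network-online.conf` are arbitrary choices for the example; only the `<unit>.d` directory with a `*.conf` file inside comes from systemd.unit(5).

```python
import os


def write_dropin(unit: str, root: str = "/etc/systemd/system",
                 filename: str = "network-online.conf") -> str:
    """Create <root>/<unit>.d/<filename> and return its path."""
    dropin_dir = os.path.join(root, unit + ".d")
    os.makedirs(dropin_dir, exist_ok=True)
    path = os.path.join(dropin_dir, filename)
    with open(path, "w", encoding="utf-8") as handle:
        # Both lines are needed: Wants= pulls the target in,
        # After= orders the service behind it.
        handle.write(
            "[Unit]\n"
            "Wants=network-online.target\n"
            "After=network-online.target\n"
        )
    return path
```

For example, `write_dropin("sendmail.service")` run as root creates /etc/systemd/system/sendmail.service.d/network-online.conf. A `systemctl daemon-reload` is needed afterwards for systemd to pick it up, and the packaged unit file stays untouched across updates.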
All the bugs blocked on this tracker are closed. Can we close this now?
(In reply to Kevin Fenzi from comment #30)
> All the bugs blocked on this tracker are closed. Can we close this now?

Not clear. Some aged out without ever being fixed (I've verified that years later they're still a problem and reopened them in a couple of cases).

I think there's a certain amount of woolly-mindedness going on here: just what does network.target mean, and when is it actually meaningful?

Going straight to the source, https://www.freedesktop.org/wiki/Software/systemd/NetworkTarget/:

  Running Services After the Network is up

  So you have configured your service to run after network.target but it still gets run before your network is up? And now you are wondering why that is and what you can do about it?

  LSB init scripts know the $network facility. As this facility is defined only very unprecisely people tend to have different ideas what it is supposed to mean.

It's kind of shocking (to me, anyway) some of the comments out there about how "clearly" network.target is the correct setting.

  Concepts in systemd

  In systemd, three target units take the role of $network:
  ...
  * network-online.target is a target that actively waits until the network is "up", where the definition of "up" is defined by the network management software. Usually it indicates a configured, routable IP address of some kind. [...]

and then:

  Cut the crap! How do I make sure that my service starts after the network is really online?

  Well, that depends on your setup and the services you plan to run after it (see above). If you need to delay your service after the network is up, include

  After=network-online.target
  Wants=network-online.target

  in the .service file.

What I get out of this is a question: are we talking about services that are predominantly available intra-host, i.e. on loopback? Or are we talking about services that are network-facing (which I believe most are, and "loopback" is the exception, not the rule)?

Most of the services we're addressing are the latter: why, for instance, would you typically want to ssh into the loopback port? Why would you use SMTP to loopback if you can submit an email via the command line using "sendmail" or "postdrop"?