Bug 635081 - Broken NetworkManager/named service startup ordering
Summary: Broken NetworkManager/named service startup ordering
Keywords:
Status: CLOSED WORKSFORME
Alias: None
Product: Fedora
Classification: Fedora
Component: systemd
Version: rawhide
Hardware: All
OS: Linux
Priority: low
Severity: medium
Target Milestone: ---
Assignee: Lennart Poettering
QA Contact: Fedora Extras Quality Assurance
URL:
Whiteboard:
Depends On:
Blocks: F14Target
Reported: 2010-09-17 18:06 UTC by Nicolas Mailhot
Modified: 2011-02-18 17:05 UTC

Fixed In Version:
Clone Of:
Environment:
Last Closed: 2011-02-18 17:05:06 UTC
Type: ---
Embargoed:


Description Nicolas Mailhot 2010-09-17 18:06:08 UTC
This system runs a local named instance that provides naming for LAN services and overrides the IP of the external (internet) access FQDN. The ISP access box does not perform the same translations on external and internal accesses: if you access port 80 on its FQDN from the internet, the request is properly routed to the LAN HTTP server, but if you access the same port on the same FQDN from the LAN, it is routed to a useless private interface of the ISP box. Systems inside the LAN therefore need the external FQDN mapped to a LAN IP to access the same services.

Therefore, the system itself uses the local named instance for DNS resolution. resolv.conf contains:
nameserver 127.0.0.1
nameserver ::1

NetworkManager still queries the DHCP service on the ISP box to learn which IP to assign to the interface between the ISP box and this system (eth0).

systemd is not properly ordering the startup of the NetworkManager and named services. The named service needs to be restarted manually after boot for resolving to work in other network-using services such as apache:

systemctl restart named.service

httpd-2.2.16-1.1.fc14.x86_64
httpd-tools-2.2.16-1.1.fc14.x86_64
NetworkManager-0.8.1-6.git20100831.fc14.x86_64
NetworkManager-glib-0.8.1-6.git20100831.fc14.x86_64
NetworkManager-gnome-0.8.1-6.git20100831.fc14.x86_64
systemd-10-1.fc14.x86_64
systemd-gtk-10-1.fc14.x86_64
systemd-sysvinit-10-1.fc14.x86_64
systemd-units-10-1.fc14.x86_64

Comment 1 Matthias Clasen 2010-10-08 22:46:22 UTC
Moving systemd bugs to f15, since the systemd feature got delayed.

Comment 2 Lennart Poettering 2010-11-21 21:51:46 UTC
Sounds like a dependency ordering problem in NM. 

If I understand this correctly, then you are asking for NM to be started before bind, not after? Or what is this about?

Comment 3 Nicolas Mailhot 2010-11-22 10:49:25 UTC
NM needs to be started before bind because otherwise bind does not have any interface to listen on.

However, bind also needs to be started before any other network service, because otherwise those services cannot resolve names (since bind also provides local name resolution).

Comment 4 Lennart Poettering 2010-11-25 01:36:02 UTC
Uh? So you are saying that bind needs to be started both before and after the network interfaces came up? That sounds very, very broken.

Why isn't bind listening on localhost? I really don't understand this setup.

Comment 5 Nicolas Mailhot 2010-11-25 07:29:57 UTC
I'm saying the ordering needs to be

nm (till interfaces are up)
bind
other services that use network

bind cannot resolve external addresses without an external link.
Other services rely on bind for resolving, and they need to resolve more than just localhost.
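
A minimal sketch of the ordering constraint above, using the coreutils `tsort` utility. Each input pair "A B" means "A must start before B"; httpd is only an assumed stand-in for any DNS-dependent service, and this illustrates the constraint rather than how systemd computes its ordering.

```shell
# Each pair "A B" tells tsort that A must come before B.
# httpd stands in for any service that needs DNS (an assumption for illustration).
order=$(printf '%s\n' \
    'NetworkManager named' \
    'named httpd' | tsort)

# Print the resulting start order, one service per line.
echo "$order"
```

Because the two constraints form a single chain, the only valid order is NetworkManager, then named, then httpd.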

Comment 6 Nicolas Mailhot 2010-11-25 07:32:18 UTC
(And no, bind is not listening only on localhost, since it provides resolving for other systems on the LAN, not just its own system. So it needs to attach to the external interfaces and start once they are up.)

Comment 7 Lennart Poettering 2011-02-16 23:13:05 UTC
OK, so you request that bind should be ordered after NM? Then let's reassign this to bind, and ask for the following line to be added to the LSB header in its init script.

Should-Start: NetworkManager

That line should be enough to ensure that bind gets started after NM if both are enabled.

Comment 8 Nicolas Mailhot 2011-02-17 09:13:13 UTC
It's not that simple: bind should be ordered after NM, but *before* any other network service, because network services will try to resolve the addresses they bind to, and on some configurations (like mine) that depends on the local DNS service.

How does one express the second part in systemd?

Comment 9 Lennart Poettering 2011-02-18 12:38:00 UTC
Well, the LSB solution for this is that those services which rely on DNS add a dependency on $named:

http://refspecs.freestandards.org/LSB_3.1.1/LSB-Core-generic/LSB-Core-generic/facilname.html

If all services did that properly systemd would have no problem.

So I'd like to claim that this is already an LSB issue, nothing really new in systemd here, except that we rely more on the LSB header info than previous solutions did.

Comment 10 Lennart Poettering 2011-02-18 12:39:33 UTC
[ That said, the systemd solution for this is that bind would be socket activatable, and, if that's not in the cards, that people order their services properly after nss-lookup.target (which is the systemd name for $named). ]
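
One way the nss-lookup.target ordering could be expressed for a DNS-dependent service is a unit drop-in. The sketch below rests on several assumptions: httpd.service and the file name 10-after-dns.conf are examples only, drop-in directories require a systemd release much newer than the systemd-10 discussed in this bug, and the file is written to a scratch directory rather than /etc. The ordering only has an effect when the name server's own unit is ordered before nss-lookup.target and pulls it in.

```shell
# Write to a scratch directory so the sketch can be inspected without touching /etc.
# httpd.service.d and 10-after-dns.conf are hypothetical names for illustration.
dropin_dir=$(mktemp -d)/httpd.service.d
mkdir -p "$dropin_dir"

cat > "$dropin_dir/10-after-dns.conf" <<'EOF'
[Unit]
# Start this service only after name resolution has been set up.
After=nss-lookup.target
EOF

# Show what was written.
cat "$dropin_dir/10-after-dns.conf"
```

On a real system the file would go under /etc/systemd/system/httpd.service.d/, followed by `systemctl daemon-reload`.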

Comment 11 Adam Tkac 2011-02-18 12:54:24 UTC
(In reply to comment #7)
> OK, so you request that bind should be ordered after NM? Then let's reassign
> this to bind, and ask for such a header be added to their LSB header in the
> init script.
> 
> Should-Start: NetworkManager
> 
> That line should be enough to ensure that bind gets started after NM if both
> are enabled.

Hm, that's weird.

NetworkManager has this in its /etc/init.d/NetworkManager initscript:
...
### BEGIN INIT INFO
# Provides: network_manager $network
...

and BIND has this in its initscript (/etc/init.d/named):
...
### BEGIN INIT INFO
# Provides: $named
# Required-Start: $local_fs $network $syslog
...

So BIND currently requires $network, which is provided by NetworkManager. I don't understand why systemd orders named before NetworkManager. If I understand LSB headers correctly, nothing should need to be added to /etc/init.d/named, should it?

(Just FYI:
$ rpm -q NetworkManager bind
NetworkManager-0.8.2-8.git20101117.fc15.x86_64
bind-9.7.3-0.6.rc1.fc15.x86_64
)
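
To check the same header fields on another system, a small helper along these lines may be enough. lsb_header is a hypothetical name, and the scratch file contains only the named header lines quoted above plus the standard closing marker; on a real system the argument would be a script such as /etc/init.d/named.

```shell
# Print the LSB header block (between the BEGIN/END INIT INFO markers) of an init script.
lsb_header() {
    sed -n '/^### BEGIN INIT INFO/,/^### END INIT INFO/p' "$1"
}

# Scratch file standing in for a real init script (illustration only).
sample=$(mktemp)
cat > "$sample" <<'EOF'
#!/bin/sh
### BEGIN INIT INFO
# Provides: $named
# Required-Start: $local_fs $network $syslog
### END INIT INFO
echo "starting"
EOF

# Show only the facility lines that matter for ordering.
lsb_header "$sample" | grep -E '^# (Provides|Required-Start):'
```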

Comment 12 Lennart Poettering 2011-02-18 14:25:12 UTC
Nicolas, looking at this, the order between NM and bind is actually correct, and I have now verified that systemd parses it correctly.

So, what is this bug about? Just the ordering between some DNS-using services and bind? If so, the title of the bug should at least be changed. And the services in question should be fixed to order themselves correctly after $named or nss-lookup.target, respectively.

But I don't think there is anything wrong with bind here, so I am reassigning this back to systemd for now, until I understand what you actually are asking for.

Nicolas, please elaborate where exactly you see a bug here?

Comment 13 Nicolas Mailhot 2011-02-18 16:07:07 UTC
If you look at the bug, you'll see it was opened with systemd-10.

I agree that the starts-after-systemd bit seems to have been fixed in rawhide lately (didn't have time to check thoroughly but I haven't had any failure lately)

However, I did have a few start-after-other-network-services problems in January, and as far as I know, if it seems to work now, that's pure luck. I'm pretty sure every network-using service expects name resolution to work as soon as the network is declared ready, which means it must start after any resolving service that is present (bind, dnsmasq, etc.), but you've confirmed that nothing in systemd ensures this right now.

Comment 14 Lennart Poettering 2011-02-18 17:05:06 UTC
OK, then I'll close this bug for now, assuming that the issues in systemd are fixed. If there are services which lack the reference to $named or nss-lookup.target, respectively, then please file bugs against those packages.
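
A rough way to spot init scripts whose LSB header lacks a $named reference is sketched below. lacks_named is a hypothetical helper, and the two scratch files stand in for real scripts under /etc/init.d. A hit only matters for services that actually need DNS at startup.

```shell
# Succeeds (exit 0) when no Required-Start/Should-Start line mentions $named.
lacks_named() {
    ! grep -Eq '^# (Required|Should)-Start:.*\$named' "$1"
}

# Two scratch headers for illustration: one with the dependency, one without.
good=$(mktemp)
bad=$(mktemp)
printf '%s\n' '### BEGIN INIT INFO' '# Required-Start: $network $named' '### END INIT INFO' > "$good"
printf '%s\n' '### BEGIN INIT INFO' '# Required-Start: $network' '### END INIT INFO' > "$bad"

# Report every script that has no $named dependency.
for f in "$good" "$bad"; do
    if lacks_named "$f"; then
        echo "$f: no \$named dependency"
    fi
done
```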