Red Hat Bugzilla – Bug 632620
fails to start ypbind
Last modified: 2014-08-31 19:29:52 EDT
Created attachment 446532
dmesg from boot with systemd.log_level=debug systemd.log_target=kmsg
Description of problem:
After the system is booted, ypbind is not running, despite being configured to run
Version-Release number of selected component (if applicable):
Steps to Reproduce:
2. attempt to log in
3. discover that user accounts do not exist because ypbind is not running
Works for me in brief testing with:
init: Failed to load configuration for dbus-org.freedesktop.NetworkManager.service: No such file or directory
init: Trying to enqueue job dbus-org.freedesktop.NetworkManager.service/start
init: D-Bus activation failed for dbus-org.freedesktop.NetworkManager.service: Invalid argument
...looks like bug 624773. ypbind happens later and may be affected by these problems.
init: Got SIGCHLD for process 973 (network)
init: Child 973 died (code=killed, status=15/TERM)
init: Child 973 belongs to network.service
init: network.service: control process exited, code=killed status=15
init: network.service got final SIGCHLD for state final-sigterm
init: network.service changed final-sigterm -> failed
init: Job network.service/start finished, success=no
init: Unit network.service entered failed state.
init: Got SIGCHLD for process 1108 (dhclient)
init: Child 1108 died (code=killed, status=15/TERM)
init: Got SIGCHLD for process 1197 (ypbind)
init: Child 1197 died (code=killed, status=15/TERM)
init: Got SIGCHLD for process 1189 (service)
init: Child 1189 died (code=killed, status=15/TERM)
init: Got SIGCHLD for process 1135 (dhclient-script)
init: Child 1135 died (code=killed, status=15/TERM)
It's a combination of bug 624773 and bug 630225 (in that the network service's calls to nmcli were trying to bring up NM). Everything timed out and was therefore killed.
Which bug should we mark it as a dup of?
Created attachment 446539
dmesg with NetworkManager removed
You mean you uninstalled the NetworkManager package?
Hm, the network is still not starting right:
init: network.service operation timed out. Terminating.
init: network.service changed start -> final-sigterm
I noticed this:
init: Trying to enqueue job ypbind.service/reload
init: Added job ypbind.service/reload to transaction.
init: Enqueued job ypbind.service/reload as 99
This is coming from /etc/dhcp/dhclient.d/nis.sh, which does a 'condrestart' on ypbind if there is NIS info in the DHCP lease.
Moving this out of the way fixed it, according to the reporter. I wonder if:
a) it's hanging
b) it's confusing the accounting, and causing it not to start ypbind
- 'network' is started
- dhcp gets a lease
- NIS dhclient script runs 'service ypbind condrestart'
- this is mapped to 'systemctl condrestart ypbind.service'
- the systemctl command hangs, until it's killed by a timeout
(the ypbind sysv service is *not* invoked by systemd)
hmm, I think this is an ugly ordering issue... the restart of ypbind waits until the network is up, but bringing the network up synchronously waits until the ypbind restart finishes, which is hence a deadlock. humm. There are several conceivable fixes. Not sure which one is best though.
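The suspected deadlock can be sketched as a cycle in a wait-for graph. This is only an illustration of the hypothesis above, not a description of how systemd tracks jobs: the node labels are made up for readability, and find_cycle is a hypothetical helper.

```python
# Hypothetical wait-for graph: each key is blocked until its value finishes.
# The labels are illustrative only, not real systemd job identifiers.
WAITS_FOR = {
    "network.service/start": "dhclient-script (nis.sh)",
    "dhclient-script (nis.sh)": "systemctl condrestart ypbind.service",
    "systemctl condrestart ypbind.service": "ypbind.service/reload",
    # Assumption taken from the comment above: the ypbind job is ordered
    # after the network job and so cannot run until it completes.
    "ypbind.service/reload": "network.service/start",
}


def find_cycle(waits_for, start):
    """Follow wait-for edges from start; return the cycle as a list, or None."""
    path = []
    seen = {}  # node -> index of that node in path
    node = start
    while node is not None:
        if node in seen:
            # Reached a node already on the path: everything from its
            # first occurrence onward forms the cycle.
            return path[seen[node]:] + [node]
        seen[node] = len(path)
        path.append(node)
        node = waits_for.get(node)
    return None


cycle = find_cycle(WAITS_FOR, "network.service/start")
if cycle:
    print(" -> ".join(cycle))
```

If the last edge does not actually exist (that is, if the ypbind job does not wait for the network job), the chain ends and there is no deadlock, which is the point questioned in the following comments.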
Why does the restart of ypbind wait for network? It's not listed as a dependency.
Actually, this isn't even a 'restart' - it's a 'condrestart'.
My reading of how condrestart works implies that it should *never* block waiting on some ypbind dependencies or ordering; ypbind dependencies aren't met => ypbind isn't running => do nothing and exit.
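As a rough sketch of that reading, assuming the traditional init-script meaning of condrestart (restart only if the service is already running): the function and its two callable parameters are hypothetical, and this is not how systemd implements it.

```python
def condrestart(is_running, restart):
    """Restart the service only if it is already running.

    is_running and restart are hypothetical callables supplied by the
    caller; they stand in for whatever checks and restarts the service.
    """
    if not is_running():
        # Service is not running: nothing to wait for, nothing to do.
        return "skipped"
    restart()
    return "restarted"


# Example: ypbind is not running yet during early boot, so nothing happens.
actions = []
result = condrestart(lambda: False, lambda: actions.append("restart"))
print(result, actions)
```

Under this reading, the not-running branch returns immediately and never blocks on any ordering.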
Does this problem still exist on current F15?
I don't know. Many months ago I put "service ypbind start" in /etc/rc.local on my rawhide systems because it was clear that no effort was being made to fix this bug.
I am seeing this problem on F15 beta. Right after booting, ypbind does not work. If I restart it by hand, it runs fine. This is with a fresh F15 beta install (running network manager; did not try with traditional network start). I am amazed that ypbind gets broken in just about every release...
With today's rawhide, ypbind won't start at all. /var/log/messages shows
Apr 21 14:52:42 fenlason-lab4 systemd: Failed to load environment files: No such file or directory
Apr 21 14:52:42 fenlason-lab4 systemd: ypbind.service failed to run 'start-pre' task: No such file or directory
... that sounds like an issue with the ypbind service file, not with systemd itself.
Yupp, reassigning to ypbind.
(In reply to comment #13)
> I am seeing this problem on F15 beta. Right after booting, ypbind does not
> work. If I restart it by hand, it runs fine. This is with a fresh F15 beta
> install (running network manager; did not try with traditional network start).
> I am amazed that ypbind gets broken in just about every release...
There is a new update, ypbind-1.32-8.fc15, that fixes bug #693873, which was caused by changes in NetworkManager. I think it may solve your problem.
(In reply to comment #14)
> With today's rawhide, ypbind won't start at all. /var/log/messages shows
> Apr 21 14:52:42 fenlason-lab4 systemd: Failed to load environment files: No
> such file or directory
> Apr 21 14:52:42 fenlason-lab4 systemd: ypbind.service failed to run
> 'start-pre' task: No such file or directory
EnvironmentFile is optional in ypbind, so the following change should solve the problem.
The new ypbind with this change is already in rawhide; you can check whether it works as expected for you: http://koji.fedoraproject.org/koji/buildinfo?buildID=240958
AFAIK, there are no other unresolved ypbind issues in this bug, so I'm closing it. Feel free to reopen it if needed.