Bug 632620 - fails to start ypbind
Summary: fails to start ypbind
Keywords:
Status: CLOSED RAWHIDE
Alias: None
Product: Fedora
Classification: Fedora
Component: ypbind
Version: rawhide
Hardware: All
OS: Linux
Priority: low
Severity: medium
Target Milestone: ---
Assignee: Honza Horak
QA Contact: Fedora Extras Quality Assurance
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2010-09-10 15:13 UTC by Jay Fenlason
Modified: 2014-08-31 23:29 UTC

Fixed In Version: ypbind-1.32-10
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2011-04-27 10:15:58 UTC
Type: ---
Embargoed:


Attachments
dmesg from boot with systemd.log_level=debug systemd.log_target=kmsg (118.74 KB, text/plain)
2010-09-10 15:13 UTC, Jay Fenlason
dmesg with NetworkManager removed (113.60 KB, text/plain)
2010-09-10 15:48 UTC, Jay Fenlason

Description Jay Fenlason 2010-09-10 15:13:27 UTC
Created attachment 446532
dmesg from boot with systemd.log_level=debug systemd.log_target=kmsg

Description of problem:
After the system is booted, ypbind is not running, despite being configured to run

Version-Release number of selected component (if applicable):
systemd-9-3.fc15.x86_64
initscripts-9.20-1.fc15.x86_64
sysvinit-tools-2.87-5.dsf.fc15.x86_64

How reproducible:
Always

Steps to Reproduce:
1. Boot
2. Attempt to log in
3. Discover that user accounts do not exist because ypbind is not running
  
Actual results:
no ypbind

Expected results:
running ypbind

Additional info:

Comment 1 Bill Nottingham 2010-09-10 15:33:24 UTC
Works for me in brief testing with:

sysvinit-tools-2.87-5.dsf.fc14.x86_64
systemd-9-3.fc14.x86_64
initscripts-9.20-1.fc14.x86_64
ypbind-1.32-1.fc14.x86_64

Comment 2 Michal Schmidt 2010-09-10 15:34:10 UTC
init[1]: Failed to load configuration for dbus-org.freedesktop.NetworkManager.service: No such file or directory
init[1]: Trying to enqueue job dbus-org.freedesktop.NetworkManager.service/start
init[1]: D-Bus activation failed for dbus-org.freedesktop.NetworkManager.service: Invalid argument

...looks like bug 624773. ypbind happens later and may be affected by these problems.

Comment 3 Bill Nottingham 2010-09-10 15:40:28 UTC
init[1]: Got SIGCHLD for process 973 (network)
init[1]: Child 973 died (code=killed, status=15/TERM)
init[1]: Child 973 belongs to network.service
init[1]: network.service: control process exited, code=killed status=15
init[1]: network.service got final SIGCHLD for state final-sigterm
init[1]: network.service changed final-sigterm -> failed
init[1]: Job network.service/start finished, success=no
init[1]: Unit network.service entered failed state.
init[1]: Got SIGCHLD for process 1108 (dhclient)
init[1]: Child 1108 died (code=killed, status=15/TERM)
init[1]: Got SIGCHLD for process 1197 (ypbind)
init[1]: Child 1197 died (code=killed, status=15/TERM)
init[1]: Got SIGCHLD for process 1189 (service)
init[1]: Child 1189 died (code=killed, status=15/TERM)
init[1]: Got SIGCHLD for process 1135 (dhclient-script)
init[1]: Child 1135 died (code=killed, status=15/TERM)

It's a combination of bug 624773 and bug 630225 (in that the network service's calls to nmcli were trying to bring up NM). Everything timed out and was therefore killed.

Which bug should we mark it as a dup of?

Comment 4 Jay Fenlason 2010-09-10 15:48:41 UTC
Created attachment 446539
dmesg with NetworkManager removed

Comment 5 Michal Schmidt 2010-09-10 15:57:34 UTC
You mean you uninstalled the NetworkManager package?

Hm, the network is still not starting right:

init[1]: network.service operation timed out. Terminating.
init[1]: network.service changed start -> final-sigterm

Comment 6 Bill Nottingham 2010-09-10 16:12:08 UTC
I noticed this:

init[1]: Trying to enqueue job ypbind.service/reload
init[1]: Added job ypbind.service/reload to transaction.
init[1]: Enqueued job ypbind.service/reload as 99


This is coming from /etc/dhcp/dhclient.d/nis.sh, which does a 'condrestart' on ypbind if there is NIS info in the DHCP lease.

Moving this out of the way fixed it, according to the reporter. I wonder if:

a) it's hanging
b) it's confusing the accounting, and causing it not to start ypbind

Comment 7 Bill Nottingham 2010-09-10 17:15:19 UTC
Confirmed:

- 'network' is started
- dhcp gets a lease
- NIS dhclient script runs 'service ypbind condrestart'
- this is mapped to 'systemctl condrestart ypbind.service'
- the systemctl command hangs, until it's killed by a timeout
  (the ypbind sysv service is *not* invoked by systemd)

Comment 8 Lennart Poettering 2010-09-10 23:37:27 UTC
Hmm, I think this is an ugly ordering issue. The restart of ypbind waits until the network is up, but bringing the network up synchronously waits until the ypbind restart finishes, which is a deadlock. There are several possible fixes. I'm not sure which one is best, though.
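
The circular wait described in this comment can be illustrated with a small wait-for graph. This is only a sketch under the assumption that the reading in this comment is correct; the node names and the find_cycle helper are hypothetical and are not part of systemd.

```python
# Hypothetical model of who waits for whom during boot, following the
# sequence reported in this bug. Not systemd code; names are illustrative.
WAITS_FOR = {
    "network.service/start": ["dhclient-script"],
    "dhclient-script": ["systemctl condrestart ypbind.service"],
    "systemctl condrestart ypbind.service": ["network.service/start"],
}


def find_cycle(waits_for, start):
    """Follow wait-for edges depth-first from start.

    Returns the first cycle found as a list of nodes (the first node is
    repeated at the end), or None if no cycle is reachable from start.
    """
    path = []        # nodes on the current depth-first path, in order
    on_path = set()  # same nodes, for fast membership checks

    def visit(node):
        if node in on_path:
            # We came back to a node we are still waiting on: a deadlock.
            return path[path.index(node):] + [node]
        path.append(node)
        on_path.add(node)
        for dependency in waits_for.get(node, []):
            cycle = visit(dependency)
            if cycle:
                return cycle
        path.pop()
        on_path.discard(node)
        return None

    return visit(start)


if __name__ == "__main__":
    cycle = find_cycle(WAITS_FOR, "network.service/start")
    print(" -> ".join(cycle) if cycle else "no cycle")
```

In this model nothing in the cycle can finish on its own, which matches the observed behaviour: the jobs only end when a timeout kills them.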

Comment 9 Bill Nottingham 2010-09-13 18:11:16 UTC
Why does the restart of ypbind wait for network? It's not listed as a dependency.

Comment 10 Bill Nottingham 2010-09-14 15:25:44 UTC
Actually, this isn't even a 'restart' - it's a 'condrestart'.

My reading of how condrestart works implies that it should *never* block waiting on some ypbind dependencies or ordering; ypbind dependencies aren't met => ypbind isn't running => do nothing and exit.
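
The condrestart semantics argued for in this comment can be sketched as follows. This models only the expected behaviour (restart if running, otherwise do nothing and return immediately); the function and its two callable parameters are hypothetical and do not describe how systemd implements the verb.

```python
def condrestart(is_running, restart):
    """Restart a service only if it is already running.

    is_running: hypothetical callable returning True if the service is up.
    restart:    hypothetical callable that performs the actual restart.

    If the service is not running, nothing is started and nothing is
    waited on, so dependencies and ordering never come into play.
    """
    if not is_running():
        return "skipped"
    restart()
    return "restarted"


if __name__ == "__main__":
    # During early boot ypbind has not been started yet, so the call
    # from the dhclient hook should be a no-op.
    print(condrestart(lambda: False, lambda: None))
```

Under this reading, the call made from nis.sh during boot should return at once, because ypbind has not been started yet.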

Comment 11 Lennart Poettering 2011-04-12 12:24:47 UTC
Does this problem still exist on current F15?

Comment 12 Jay Fenlason 2011-04-12 15:23:25 UTC
I don't know.  Many months ago I put "service ypbind start" in my /etc/rc.local on my rawhide systems because it was clear that no effort was being made to fix this bug.

Comment 13 Jussi Eloranta 2011-04-21 17:55:11 UTC
I am seeing this problem on F15 beta. Right after booting, ypbind does not work. If I restart it by hand, it runs fine. This is with a fresh F15 beta install (running network manager; did not try with traditional network start). I am amazed that ypbind gets broken in just about every release...

Comment 14 Jay Fenlason 2011-04-21 18:58:53 UTC
With today's rawhide, ypbind won't start at all.  /var/log/messages shows
Apr 21 14:52:42 fenlason-lab4 systemd[1]: Failed to load environment files: No such file or directory
Apr 21 14:52:42 fenlason-lab4 systemd[1]: ypbind.service failed to run 'start-pre' task: No such file or directory

Comment 15 Bill Nottingham 2011-04-25 18:21:52 UTC
... that sounds like an issue with the ypbind service file, not with systemd itself.

Comment 16 Lennart Poettering 2011-04-27 02:15:15 UTC
Yupp, reassigning to ypbind.

Comment 17 Honza Horak 2011-04-27 08:56:34 UTC
(In reply to comment #13)
> I am seeing this problem on F15 beta. Right after booting, ypbind does not
> work. If I restart it by hand, it runs fine. This is with a fresh F15 beta
> install (running network manager; did not try with traditional network start).
> I am amazed that ypbind gets broken in just about every release...

There is a new update, ypbind-1.32-8.fc15, that fixes bug #693873, which was caused by changes in NetworkManager. I think it may solve your problem.

Comment 18 Honza Horak 2011-04-27 10:15:58 UTC
(In reply to comment #14)
> With today's rawhide, ypbind won't start at all.  /var/log/messages shows
> Apr 21 14:52:42 fenlason-lab4 systemd[1]: Failed to load environment files: No
> such file or directory
> Apr 21 14:52:42 fenlason-lab4 systemd[1]: ypbind.service failed to run
> 'start-pre' task: No such file or directory

EnvironmentFile is optional in ypbind, so the following change should solve the problem.

-EnvironmentFile=/etc/sysconfig/ypbind
+EnvironmentFile=-/etc/sysconfig/ypbind
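
In systemd unit files, a leading "-" on an EnvironmentFile= path means that a missing file is silently ignored instead of being treated as an error. The sketch below mimics that rule to show why the one-character change is enough. The load_environment_file function is hypothetical, and its KEY=VALUE parsing is deliberately simplified (no quoting or line continuation), so it is not a reimplementation of systemd's parser.

```python
def load_environment_file(spec):
    """Load KEY=VALUE pairs from the file named by spec.

    A leading "-" marks the file as optional: if it does not exist, an
    empty environment is returned. Without the prefix, a missing file
    raises FileNotFoundError, which corresponds to the "Failed to load
    environment files" error quoted above.
    """
    optional = spec.startswith("-")
    path = spec[1:] if optional else spec
    try:
        with open(path) as handle:
            lines = handle.read().splitlines()
    except FileNotFoundError:
        if optional:
            return {}
        raise

    env = {}
    for line in lines:
        line = line.strip()
        # Skip blank lines, comments, and anything that is not KEY=VALUE.
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        env[key.strip()] = value.strip()
    return env


if __name__ == "__main__":
    # Returns {} on a system without the file; parses it otherwise.
    print(load_environment_file("-/etc/sysconfig/ypbind"))
```

With the prefix, a system that has no /etc/sysconfig/ypbind gets an empty environment and the service start proceeds; without it, the start fails before ypbind is ever executed.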

The new ypbind with this change is already in rawhide; you can check whether it works as expected for you: http://koji.fedoraproject.org/koji/buildinfo?buildID=240958

AFAIK, there are no other unresolved ypbind issues related to this bug, so I'm closing it. Feel free to reopen it if needed.

