Description of problem:
When my system has booted and I try to log in on a getty with an NIS username, I get "login incorrect". The first NIS login attempt fails; then, after 1-3 seconds, I see on the getty console that something (probably systemd) is reloading sendmail. Why? Reloading the sendmail service shuts down and restarts the NIS service. Once that has happened, logging in with an NIS username succeeds. Not sure what's going on here.

Version-Release number of selected component (if applicable):
$ rpm -q ypbind
ypbind-1.32-7.fc16.x86_64
(basically the version of ypbind with LSB headers in init scripts)

How reproducible:
Always

Steps to Reproduce:
1. Boot system, make sure ypbind service is enabled and started on boot
2. Switch to a virtual terminal: CTRL+ALT+2
3. Attempt to log in using some NIS username
4. Login fails <--- Why?
5. Wait 3 seconds, and sendmail gets reloaded (maybe because the failed login triggered sending email to root?)
6. sendmail restart reloads ypbind service
7. When this is finished, logging in using some NIS username succeeds

Actual results:
Login with NIS username fails on first attempt immediately after reboot.

Expected results:
Can log in immediately using some NIS username after a reboot.

Additional info:
I can log in as root and see that the ypbind service is started. Maybe this isn't a ypbind problem but a systemd/login problem. If so, I'd appreciate it if you could change the component appropriately. Thanks!
Unfortunately, I'm not able to reproduce it, will keep trying. Maybe some additional info can help: What version of gdm do you have installed? How do you notice the sendmail reloading? Is there a message from systemd or syslog? Is ypbind really working properly before you try to log in (e.g. does ypcat passwd.byname return correct data)?
Created attachment 492394 [details] Test-case illustrating NIS login problem.
Created attachment 492395 [details] ypbind_status.log file as produced by debug_ypbind.sh
Created attachment 492397 [details]
sendmail_status.log as produced by debug_ypbind.sh

I'm pretty sure sendmail is unrelated. But for some reason, if I do nothing for 60 seconds after logging in as root on a virtual terminal immediately after boot, the sendmail service gets reloaded. This in turn reloads the NIS service, and all is well and shiny. Really not sure what's going on... Something seems iffy with regard to NIS domain binding, though.
I forgot to mention that I had the testcase script included in /root/.bashrc. So the sequence of events was:
1. boot system
2. log in as root as soon as gettys become available
3. debug_ypbind.sh ran
I see this behavior in Fedora 15 beta as well. When I bring the system up and try to login to gdm, it fails to authenticate. If I switch to a tty and run 'ypwhich', I get an 'Internal NIS Error'. If I do a 'service ypbind restart', things start working.
Severin and Jonathan, it would be great if you can test a new build ypbind-1.32-8.fc15, that has been pushed to updates recently, and see if the problem persists.
(In reply to comment #7) > Severin and Jonathan, it would be great if you can test a new build > ypbind-1.32-8.fc15, that has been pushed to updates recently, and see if the > problem persists. $ rpm -q ypbind ypbind-1.32-8.fc15.x86_64 I've been using it for a while now. Problem persists :( Can you reproduce, Jan?
Yes, I'm still seeing it with ypbind-1.32-8.fc15.x86_64.
I have the same problem. I am running FC15 with ypbind-1.32-8.fc15.x86_64. After a reboot, rpcinfo -p localhost does not show ypbind, only portmap. When I run "service ypbind start" manually it works OK.
(In reply to comment #8)
> I've been using it for a while now. Problem persists :( Can you reproduce, Jan?

Unfortunately not, ypbind still works as expected for me. Even the logs from your script don't show anything suspicious: both services are started correctly and nothing is restarted (according to their PIDs).

Severin, Jonathan and Kevin, can you provide output of the following commands, please?
rpm -q systemd
rpm -q dbus
rpm -q NetworkManager
chkconfig --list NetworkManager
chkconfig --list network
cat /etc/sysconfig/ypbind

Does /var/log/messages say something about ypbind?
(In reply to comment #11)
> (In reply to comment #8)
> > I've been using it for a while now. Problem persists :( Can you reproduce, Jan?
>
> Unfortunately not, ypbind still works as expected for me. Even logs from your
> script don't show anything suspicious, both services are started correctly and
> nothing is restarted (according their PID).

I know they are started. Yet they are not functional immediately :(

> Severin, Jonathan and Kevin, can you provide output of the following commands,
> please?
> rpm -q systemd

systemd-24-1.fc15.x86_64

> rpm -q dbus

dbus-1.4.6-3.fc15.x86_64

> rpm -q NetworkManager

NetworkManager-0.8.998-2.git20110406.fc15.x86_64

> chkconfig --list NetworkManager

$ chkconfig --list NetworkManager

Note: This output shows SysV services only and does not include native
systemd services. SysV configuration data might be overriden by native
systemd configuration.

NetworkManager 0:off 1:off 2:off 3:off 4:off 5:off 6:off

Note that I've enabled the NetworkManager service by:
$ systemctl enable NetworkManager.service

> chkconfig --list network

$ chkconfig --list network

Note: This output shows SysV services only and does not include native
systemd services. SysV configuration data might be overriden by native
systemd configuration.

network 0:off 1:off 2:off 3:off 4:off 5:off 6:off

Note that I've disabled the network service by:
$ systemctl disable network.service
Since this service doesn't support systemd natively yet, it's equivalent to
$ chkconfig network off

> cat /etc/sysconfig/ypbind

$ cat /etc/sysconfig/ypbind
cat: /etc/sysconfig/ypbind: No such file or directory

> Does /var/log/messages say something about ypbind?
# grep ypbind /var/log/messages
Apr 27 08:42:54 dhcp-10-15-16-134 setsebool: The allow_ypbind policy boolean was changed to 1 by root
Apr 27 08:43:05 dhcp-10-15-16-134 ypbind: NIS domain: example.com, NIS server:
Apr 27 08:43:47 dhcp-10-15-16-134 setsebool: The allow_ypbind policy boolean was changed to 1 by root
Apr 27 08:43:49 dhcp-10-15-16-134 ypbind: NIS domain: example.com, NIS server: nis.example.com

Note that the first ypbind log entry (Apr 27 08:43:05) does not show the NIS server, yet the second entry (Apr 27 08:43:49) does. I assume the first entry stems from booting up the system, and the second from when ypbind gets restarted for whatever reason (approx. 1 min after I attempt the first NIS login).

# cat /etc/sysconfig/network
NETWORKING=yes
HOSTNAME=dhcp-10-15-16-134.example.com
NTPSERVERARGS=iburst
NISDOMAIN=example.com

Hope that helps. Thanks!
It seems like there is no connection available when ypbind is starting. Can you, please, check if the network connection is ready before ypbind tries to start?

You can for example add the following two lines to your /etc/init.d/ypbind script (you can use your NIS master instead of google.com):

         fi
       fi
     fi
+    PINGRESULT=`ping -c 1 google.com 2>&1`
+    logger -t ypbind $"ping google.com: '$PINGRESULT'"
     echo -n $"Starting NIS service: "
     selinux_on
     daemon $exec $OTHER_YPBIND_OPTS

..and then check /var/log/messages. There may be a better way to check the connection to the server; this is just the quickest one I could come up with.
(In reply to comment #13)
> It seems like there is no connection available when ypbind is starting. Can
> you, please, check if the network connection is ready before ypbind tries to
> start?
>
> You can for example add the following two lines in your /etc/init.d/ypbind
> script (you can use your NIS master instead of google.com):
>
> fi
> fi
> fi
> + PINGRESULT=`ping -c 1 google.com 2>&1`
> + logger -t ypbind $"ping google.com: '$PINGRESULT'"
> echo -n $"Starting NIS service: "
> selinux_on
> daemon $exec $OTHER_YPBIND_OPTS
>
> ..and then check /var/log/messages. There can be a better way how to check the
> connection to the server, this is just the fastest what I've done.

Here are the relevant parts from /var/log/messages:

Apr 28 09:12:56 dhcp-10-15-16-134 NetworkManager[846]: <info> Activation (eth0) successful, device activated.
Apr 28 09:12:56 dhcp-10-15-16-134 NetworkManager[846]: <info> Activation (eth0) Stage 5 of 5 (IP Configure Commit) complete.
Apr 28 09:12:56 dhcp-10-15-16-134 dbus: [system] Activating service name='org.freedesktop.nm_dispatcher' argv0='/lib64/dbus-1/dbus-daemon-launch-helper'
Apr 28 09:12:56 dhcp-10-15-16-134 dbus: [system] Successfully activated service 'org.freedesktop.nm_dispatcher'
Apr 28 09:12:57 dhcp-10-15-16-134 ypbind: ping google.com: 'PING google.com (74.125.115.103) 56(84) bytes of data.#01264 bytes from vx-in-f103.1e100.net (74.125.115.103): icmp_req=1 ttl=41 time=56.8 ms#012#012--- google.com ping statistics ---#0121 packets transmitted, 1 received, 0% packet loss, time 0ms#012rtt min/avg/max/mdev = 56.872/56.872/56.872/0.000 ms'
Apr 28 09:12:57 dhcp-10-15-16-134 dbus: avc: received policyload notice (seqno=3)
Apr 28 09:12:57 dhcp-10-15-16-134 dbus: avc: received policyload notice (seqno=3)

To me that looks as if network connectivity is there when ypbind is started. Also note that NIS server info is set via DHCP.
For example, this is a message I see in /var/log/messages prior to the above things:

Apr 28 09:12:55 dhcp-10-15-16-134 NetworkManager[846]: <info> nis '10.11.255.156'

Thoughts?
Ok, scratch my comment above. You were right, the network link is not ready at the time when ypbind is started (see attached log). Not sure why this is happening, though. Looks like a bug in NetworkManager and/or systemd. What do you think?
Created attachment 495542 [details]
Correct snippet of /var/log/messages showing network is not up when ypbind is started.
On second thought, why does NIS claim it's successfully brought up if there is no network connectivity? I think it should print "[failed]" instead of "[ok]".
(In reply to comment #15)
> Ok, scratch my comment above. You were right, the network link is not ready at
> the time when ypbind is started (see attached log). Not sure why this is
> happening, though. Looks like a bug in NetworkManager and/or systemd. What do
> you think?

On the one hand, systemd should ensure that the network is started before ypbind (if that is defined in the LSB header). However, I can imagine a case where ypbind is started right after NetworkManager but before DHCP has assigned the address, so the master server is not accessible during ypbind start. On the other hand, ypbind has code in its SysV init script to handle this situation (it waits several seconds for the connection), so this shouldn't be the problem, afaik.

(In reply to comment #17)
> On second thought, why does NIS claim it's successfully brought up if there is
> no network connectivity? I think it should at print "[failed]" instead of
> "[ok]".

This is a feature related to my first paragraph. ypbind listens on DBus even if the machine is offline. If the connection status changes, it should "wake up" and continue binding. I think this could be the problem, but I'm not sure yet. I will continue investigating and let you know if I find something. Thanks for now, it helped quite a lot.
ypbind-1.32-8.fc15.1 has been submitted as an update for Fedora 15. https://admin.fedoraproject.org/updates/ypbind-1.32-8.fc15.1
I think I've found the problem. It was dbus-related again. You can try the newest update (see comment #19).
*** Bug 700160 has been marked as a duplicate of this bug. ***
Package ypbind-1.32-8.fc15.1: * should fix your issue, * was pushed to the Fedora 15 testing repository, * should be available at your local mirror within two days. Update it with: # su -c 'yum update --enablerepo=updates-testing ypbind-1.32-8.fc15.1' as soon as you are able to. Please go to the following url: https://admin.fedoraproject.org/updates/ypbind-1.32-8.fc15.1 then log in and leave karma (feedback).
(In reply to comment #20) > I think I've found the problem. It was dbus-related again. You can try the > newest update (see comment #19). Thanks. However, this update doesn't fix my problem :( I'm experiencing the same behaviour as prior ypbind-1.32-8. $ rpm -q ypbind ypbind-1.32-8.fc15.1.x86_64
That's not good news :( I still cannot reproduce it by myself, so I will try to find a way how to get some more info from you remotely. Thanks for your patience ;)
I need to find out what is going wrong during boot, so I've prepared a debug build of ypbind which logs more messages (to /tmp/ypbind-mt.log).

This build is available here:
http://koji.fedoraproject.org/koji/taskinfo?taskID=3047183

I'd like to ask everyone who can reproduce this bug to:
* install this version
* reboot your machine
* log in (do whatever you need to succeed)
* attach /tmp/ypbind-mt.log here

Thank you in advance for your collaboration.
(In reply to comment #27) > I'd need to discover what is happening wrong during boot, so I've prepared a > debug build of ypbind which has more messages (logged to /tmp/ypbind-mt.log). > > This build is available here: > http://koji.fedoraproject.org/koji/taskinfo?taskID=3047183 Installed http://koji.fedoraproject.org/koji/getfile?taskID=3047184&name=ypbind-1.32-8.fc15.1.debug.1.x86_64.rpm > I'd like to ask everyone who can reproduce this bug to: > * install this version > * reboot your machine After reboot ypbind service was not started. It also failed to start manually (see very bottom of attached log). > * log in (do whatever you need to success) Could not get it to bind to the NIS server at all :( > * attach /tmp/ypbind-mt.log here Will be attached in a minute. Thanks!
Created attachment 496526 [details] ypbind-mt.log of session where ypbind service fails to start Note that I've also added output of # service ypbind status and # grep ypbind /var/log/messages at the very bottom of the log file.
I'm back on ypbind-1.32-8.fc15.1.x86_64 for now.
(In reply to comment #28)
> After reboot ypbind service was not started. It also failed to start manually
> (see very bottom of attached log).

A segmentation fault in a debug message caused it to fail. I hope this one will work properly:
http://koji.fedoraproject.org/koji/taskinfo?taskID=3047814

If you can, please give it another try, thanks.
It works for me now - whenever I reboot, it seems that ypbind is connecting. I can run 'ypcat {map}' and get the correct output without having to restart ypbind. Thanks for the fix! (In reply to comment #22) > Package ypbind-1.32-8.fc15.1: > * should fix your issue, > * was pushed to the Fedora 15 testing repository, > * should be available at your local mirror within two days. > Update it with: > # su -c 'yum update --enablerepo=updates-testing ypbind-1.32-8.fc15.1' > as soon as you are able to. > Please go to the following url: > https://admin.fedoraproject.org/updates/ypbind-1.32-8.fc15.1 > then log in and leave karma (feedback).
(In reply to comment #31) > Some segmentation fault in debug message caused it failed. I hope this will > work properly: > http://koji.fedoraproject.org/koji/taskinfo?taskID=3047814 > > If you can, please, give it a try again, thanks. This debug build doesn't work either, don't waste your time by testing it.
Let's sum this issue up. On some machines, ypbind starts during boot without full functionality. According to comment #15, the network seems not to be available at the time ypbind is being started, even though the LSB header says it has to be.

Generally, if NetworkManager is enabled, ypbind listens on DBus and waits for the connection (it waits for a "network is ready" message). I can imagine a case where the "network is ready" DBus message from NetworkManager arrives before ypbind starts to listen on DBus. Then this message is lost for ypbind, it stays off-line and prints "NIS domain: domainname, NIS server: " (without any server) into /var/log/messages.

If this is really the problem, we can add e.g. "sleep 5" to the ypbind.init script before the ypbind daemon is started, and it should work.

Unfortunately, I still can't reproduce this failure, so if you want to test this hypothesis, add the following lines to /etc/init.d/ypbind (ping is used to test the connectivity):

@@ -91,6 +91,11 @@ start() {
     fi
     echo -n $"Starting NIS service: "
     selinux_on
+    PINGRESULT=`ping -c 1 google.com 2>&1`
+    logger -t ypbind $"ping #1 google.com: '$PINGRESULT'"
+    sleep 5
+    PINGRESULT=`ping -c 1 google.com 2>&1`
+    logger -t ypbind $"ping #2 google.com: '$PINGRESULT'"
     daemon $exec $OTHER_YPBIND_OPTS
     retval=$?
     echo
(In reply to comment #33) > (In reply to comment #31) > > Some segmentation fault in debug message caused it failed. I hope this will > > work properly: > > http://koji.fedoraproject.org/koji/taskinfo?taskID=3047814 > > > > If you can, please, give it a try again, thanks. > > This debug build doesn't work either, don't waste your time by testing it. Heh, I just figured :)
(In reply to comment #35) > Heh, I just figured :) Sorry for that ;)
(In reply to comment #34) > Let's sum this issue up. ypbind starts during boot without full functionality > on some machines. According comment #15, network seems not to be available at > the time when ypbind is being started, even if LSB header says it has to be. > > Generally, ypbind listens on DBus and waits for the connection (it waits for a > message "network is ready"), if NetworkManager is enabled. I can imagine such a > case, that DBus message "network is ready" from NetworkManager comes before > ypbind starts to listen on DBus. Then this message is lost for ypbind, it stays > off-line and prints "NIS domain: domainname, NIS server: " (without any server) > into /var/log/messages. > > If this is really the problem, we can add e.g. "sleep 5" before ypbind daemon > is started into ypbind.init script and it should work. > > Unfortunately, I still can't reproduce this failure, so if you want to test > this hypothesis, add the following lines into /etc/init.d/ypbind (ping is used > to test the connectivity): > @@ -91,6 +91,11 @@ start() { > fi > echo -n $"Starting NIS service: " > selinux_on > + PINGRESULT=`ping -c 1 google.com 2>&1` > + logger -t ypbind $"ping #1 google.com: '$PINGRESULT'" > + sleep 5 > + PINGRESULT=`ping -c 1 google.com 2>&1` > + logger -t ypbind $"ping #2 google.com: '$PINGRESULT'" > daemon $exec $OTHER_YPBIND_OPTS > retval=$? > echo Hmm, there might be some truth to this, but if you have a look at attachment 492394 [details], you'll realize that I have to sleep for pretty much a minute. I've tried with lower values, which didn't work. So I'm expecting the above will have network down for both pings. Correct me if I'm wrong, but there has got to be a better way than just sleeping. E.g. ask NetworkManager some other way if network is up or forcing DBus to resend the message... I'm really not familiar enough with services involved to be of any help here. 
Having said that, I'll report what I find using the above additions to the start-up script later today. Thanks!
(In reply to comment #37)
> Hmm, there might be some truth to this, but if you have a look at attachment
> 492394 [details], you'll realize that I have to sleep for pretty much a minute. I've
> tried with lower values, which didn't work.

This sleep runs a bit earlier, right before ypbind is started for the first time. In your test script the sleep ran after the whole init script had finished. What I want to find out is whether this failure is only a question of timing (and maybe a systemd failure).

> So I'm expecting the above will
> have network down for both pings. Correct me if I'm wrong, but there has got to
> be a better way than just sleeping. E.g. ask NetworkManager some other way if
> network is up or forcing DBus to resend the message...

I'm not planning to add any sleeping as a patch; this is only a test to find out where the problem is.
# grep ypbind /var/log/messages | tail

May 4 08:52:55 dhcp-10-15-16-134 setsebool: The allow_ypbind policy boolean was changed to 1 by root
May 4 08:52:58 dhcp-10-15-16-134 ypbind: NIS domain: example.com, NIS server: nis.example.com
May 4 09:39:53 dhcp-10-15-16-134 setsebool: The allow_ypbind policy boolean was changed to 1 by root
May 4 09:39:53 dhcp-10-15-16-134 ypbind: ping #1 google.com: 'ping: unknown host google.com'
May 4 09:39:58 dhcp-10-15-16-134 ypbind: ping #2 google.com: 'ping: unknown host google.com'
May 4 09:40:08 dhcp-10-15-16-134 ypbind: NIS domain: example.com, NIS server:
May 4 09:40:35 dhcp-10-15-16-134 setsebool: The allow_ypbind policy boolean was changed to 1 by root
May 4 09:40:36 dhcp-10-15-16-134 ypbind: ping #1 google.com: 'PING google.com (74.125.91.99) 56(84) bytes of data.#01264 bytes from qy-in-f99.1e100.net (74.125.91.99): icmp_req=1 ttl=41 time=90.2 ms#012#012--- google.com ping statistics ---#0121 packets transmitted, 1 received, 0% packet loss, time 0ms#012rtt min/avg/max/mdev = 90.284/90.284/90.284/0.000 ms'
May 4 09:40:41 dhcp-10-15-16-134 ypbind: ping #2 google.com: 'PING google.com (74.125.91.147) 56(84) bytes of data.#01264 bytes from qy-in-f147.1e100.net (74.125.91.147): icmp_req=1 ttl=41 time=82.6 ms#012#012--- google.com ping statistics ---#0121 packets transmitted, 1 received, 0% packet loss, time 0ms#012rtt min/avg/max/mdev = 82.616/82.616/82.616/0.000 ms'
May 4 09:40:43 dhcp-10-15-16-134 ypbind: NIS domain: example.com, NIS server: nis.example.com

Note that the time between the first ping (09:39:53) and the point where NIS becomes ready (09:40:35) is almost a minute.
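The gap quoted above can be checked by converting the syslog timestamps to seconds. The following is only a small sketch; `to_seconds` and `elapsed` are hypothetical helper names, and the code assumes two-digit HH:MM:SS fields on the same day.

```shell
#!/bin/sh
# Convert a two-digit HH:MM:SS timestamp to seconds since midnight.
to_seconds() {
    old_ifs=$IFS
    IFS=:
    set -- $1            # split the argument on ':' into $1 $2 $3
    IFS=$old_ifs
    # Strip one leading zero so that e.g. "09" is not parsed as octal.
    echo $(( ${1#0} * 3600 + ${2#0} * 60 + ${3#0} ))
}

# Seconds elapsed between two timestamps on the same day.
elapsed() {
    echo $(( $(to_seconds "$2") - $(to_seconds "$1") ))
}

elapsed 09:39:53 09:40:35   # first ping until the second ypbind start: prints 42
elapsed 09:39:53 09:40:43   # first ping until the NIS server is bound: prints 50
```

So the restart begins about 42 seconds after the first failed ping, and the NIS server is bound about 50 seconds after it.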
(In reply to comment #38) > (In reply to comment #37) > > Hmm, there might be some truth to this, but if you have a look at attachment > > 492394 [details], you'll realize that I have to sleep for pretty much a minute. I've > > tried with lower values, which didn't work. > > This sleep is run a bit earlier, exactly before ypbind is started even the > first time. In your test script sleep was run after the whole init script > finished. What I want to find out is if this failure is only a question of > timing (and maybe systemd failure). Ok. > > So I'm expecting the above will > > have network down for both pings. Correct me if I'm wrong, but there has got to > > be a better way than just sleeping. E.g. ask NetworkManager some other way if > > network is up or forcing DBus to resend the message... > > I'm not planning any additional sleeping as a patch, this should be only a test > to get known where the problems is. Sounds good.
(In reply to comment #39) > # grep ypbind /var/log/messages | tail > > May 4 08:52:55 dhcp-10-15-16-134 setsebool: The allow_ypbind policy boolean > was changed to 1 by root > ... > May 4 09:40:43 dhcp-10-15-16-134 ypbind: NIS domain: example.com, NIS server: > nis.example.com > > Note that the time between the first ping (09:39:53) and until NIS become ready > (09:40:35) is almost a minute. It seems like NetworkManager is running, but there is no connection for almost a minute. Why? Generally this isn't a problem for ypbind, it wakes up directly after network is ready. Another question is who restarts ypbind and why? I'm reassigning this to systemd, because there is definitely something odd during boot and ypbind seems not to be the reason imho, just a victim.
Check whether this helps:
systemctl enable NetworkManager-wait-online.service
(In reply to comment #34)
> Generally, ypbind listens on DBus and waits for the connection (it waits for a
> message "network is ready"), if NetworkManager is enabled. I can imagine such a
> case, that DBus message "network is ready" from NetworkManager comes before
> ypbind starts to listen on DBus. Then this message is lost for ypbind, it stays
> off-line

The usual way to avoid such race conditions is to:
1. start listening for the events
2. query the current state
in this order.
(In reply to comment #42)
> Try if this helps:
> systemctl enable NetworkManager-wait-online.service

This solves my problem indeed. I can log in (via NIS) as soon as the gettys or GDM are ready :) However, systemd still tells me that NetworkManager-wait-online.service failed:

# systemctl status NetworkManager-wait-online.service
NetworkManager-wait-online.service - Network Manager Wait Online
   Loaded: loaded (/lib/systemd/system/NetworkManager-wait-online.service)
   Active: failed since Wed, 04 May 2011 14:41:36 -0400; 5min ago
  Process: 849 ExecStart=/usr/bin/nm-online -q --timeout=30 (code=exited, status=1/FAILURE)
   CGroup: name=systemd:/system/NetworkManager-wait-online.service

Thoughts? Thanks!
ypbind-1.32-8.fc15.1 has been pushed to the Fedora 15 stable repository. If problems still persist, please make note of it in this bug report.
It looks like all NetworkManager-wait-online.service is doing is to wait for 30 seconds to come online. Although it fails to do so within that timeout limit (why?), NIS seems to be able to start ok. Is this really the best solution we have for this? BTW. changing back to MODIFIED, since ypbind-1.32-8.fc15.1 update didn't change anything for me.
Please attach a bigger piece of /var/log/messages encompassing the whole boot. Hopefully NetworkManager logs enough information to find out what's taking so long to go online.
Created attachment 497097 [details] A longer /var/log/messages /var/log/messages of today is attached.
(In reply to comment #46)
> It looks like all NetworkManager-wait-online.service is doing is to wait for 30
> seconds to come online. Although it fails to do so within that timeout limit
> (why?), NIS seems to be able to start ok. Is this really the best solution we
> have for this?

I have a couple of questions, too. I think the key to solving this issue is finding out who restarts ypbind (and reloads sendmail), and why. Can systemd do this? If so, why? According to comment #48, the connection seems to be delayed by slow DHCP; does the ypbind restart have anything to do with that, or is it irrelevant?

Note: According to comment #3, ypbind is correctly stopped (Process: 1638 ExecStop=/etc/rc.d/init.d/ypbind stop (code=exited, status=0/SUCCESS)), so it doesn't crash. However, you can see a segmentation fault of ypbind in attachment #497097 [details] from comment #48, but that was just a trial of an ad-hoc version of ypbind, so the crash is not relevant for now.

> BTW. changing back to MODIFIED, since ypbind-1.32-8.fc15.1 update didn't change
> anything for me.

It helped at least someone (comment #32).
(In reply to comment #49) > I have a couple of questions, too. I think a key to solve this issue is in > finding whoever and why restarts ypbind (and reloads sendmail). Can systemd do > this? If so, why? sendmail is reloaded by /etc/NetworkManager/dispatcher.d/10-sendmail I don't know about the ypbind restart. The output of dmesg after booting with "log_buf_len=1M systemd.log_level=debug systemd.log_target=kmsg" might give a hint.
(In reply to comment #49) > > BTW. changing back to MODIFIED, since ypbind-1.32-8.fc15.1 update didn't change anything for me. > > It helped at least someone (comment #32). I know, but bodhi closed the ticket which appeared to be wrong.
I suspect ypbind may be restarted by /etc/dhcp/dhclient.d/nis.sh.
(In reply to comment #43)
> The usual way to avoid such race conditions is to:
> 1. start listening for the events
> 2. query the current state
> in this order.

Just in case anyone wants to know, it is done this way.

(In reply to comment #52)
> I suspect ypbind may be restarted by /etc/dhcp/dhclient.d/nis.sh.

Thanks for the pointer, that's it. The script above is called by NetworkManager's /usr/libexec/nm-dhcp-client.action and there is a condrestart in it.

So if I recall everything I know, it seems to me that everything is working correctly. A small summary:
Severin's DHCP is slow, so the network connection is ready only approx. 50s after boot. It is not possible to log in using NIS until the network connection is ready, which makes sense. As soon as DHCP (and with it the network connection) is ready, ypbind is restarted and it is then possible to log in using NIS.

So nothing looks weird to me anymore. Am I missing something?
Created attachment 501108 [details]
/var/log/messages of failed start using network

I'm also getting this problem using ypbind-1.32-8.fc15.1.x86_64

I'm using network rather than NetworkManager, but ypbind still fails to start on boot (have tried both).

This is the state of the service just after boot:

# systemctl status ypbind.service
ypbind.service - LSB: Starts the ypbind daemon
   Loaded: loaded (/etc/rc.d/init.d/ypbind)
   Active: inactive (dead)
   CGroup: name=systemd:/system/ypbind.service

# ypcat passwd
No such map passwd.byname. Reason: Can't bind to server which serves this domain

Manually restarting ypbind works fine.
(In reply to comment #54)
> Created attachment 501108 [details]
> /var/log/messages of failed start using network
>
> I'm also getting this problem using ypbind-1.32-8.fc15.1.x86_64
>
> I'm using network rather than NetworkManager, but ypbind still fails to start
> on boot (have tried both).
>
> This is the state of the service just after boot;
>
> # systemctl status ypbind.service
> ypbind.service - LSB: Starts the ypbind daemon
> Loaded: loaded (/etc/rc.d/init.d/ypbind)
> Active: inactive (dead)
> CGroup: name=systemd:/system/ypbind.service
>
> Manually restarting ypbind works fine.

I don't see any attempts to run ypbind in the log, and there is no "Process:" line in the service status. It looks like systemd doesn't even try to start ypbind during boot. Is it configured that way? What does "chkconfig --list ypbind" say?
(In reply to comment #55)
> I don't see any attempts to run ypbind in the log and no "Process:" line in
> service status. It looks like systemd doesn't even try to start ypbind during
> start. Is it configured so? What "chkconfig --list ypbind" say?

# chkconfig --list ypbind
ypbind 0:off 1:off 2:on 3:on 4:on 5:on 6:off

Definitely enabled, though it is getting ignored for some reason.
(In reply to comment #56) > definately enabled, though it is getting ignored for some reason Boot with "log_buf_len=1M systemd.log_level=debug systemd.log_target=kmsg", do "dmesg > dmesg.txt" and attach the file please.
(In reply to comment #56) > definately enabled, though it is getting ignored for some reason Does it behave the same if NetworkManager is enabled?
Created attachment 501122 [details] dmesg log of systemd
(In reply to comment #58) > Does it behave the same if NetworkManager is enabled? When using NetworkManager I get the same error as Severin
(In reply to comment #59)
> Created attachment 501122 [details]
> dmesg log of systemd

[ 5.689155] systemd[1]: Installed new job ypbind.service/start as 114
...
[ 23.410621] systemd[1]: Trying to enqueue job ypbind.service/try-restart/replace
[ 23.410676] systemd[1]: Installed new job ypbind.service/try-restart as 238
[ 23.410686] systemd[1]: Enqueued job ypbind.service/try-restart as 238
[ 23.410726] systemd[1]: Job ypbind.service/try-restart finished, result=skipped

... and then ypbind.service/start is forgotten by systemd. Looks like bug 633774.
(In reply to comment #61)
> ... and then ypbind.service/start is forgotten by systemd. Looks like bug
> 633774.

I've played with that a bit, and it seems kwyjibo's failure can be reproduced with a service file similar to the one in bug #633774.

Then I noticed another weird behavior of systemd, described by the following scenario:
I have disabled both network and NetworkManager, so there is no network after boot at all (note: the ypbind service has the $network target in its LSB header). But systemd tries to start ypbind after boot even though no network is there. Of course, ypbind fails as expected, but AFAIK systemd shouldn't even try to start it unless a network is there.
(In reply to comment #62) > I have disabled both network and NetworkManager, so there is no network after > boot at all (note: ypbind service has $network target in its LSB header). > But systemd tries to start ypbind after boot even if no network is there. Yes, systemd interprets the LSB headers "Required-Start:" and "Should-Start:" merely as ordering dependencies (like After=... in native units). > Of course, ypbind fails as expected, but AFAIK systemd shouldn't even try to > start it unless a network is there. This could be achieved by changing systemd to interpret "Required-Start:" as both After=... and Requisite=... In theory it would better match the intention of the LSB standard, but I do not see a big advantage in practice and I am afraid the change might cause more breakage. File the request as a separate bug to have it considered.
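To illustrate the distinction drawn above, here is a sketch of an LSB header next to the ordering-only interpretation described in the comment. The header belongs to a hypothetical `example-daemon`, not to the real ypbind init script, and the mapping of `$network` to `network.target` is stated here as an assumption about systemd of that era.

```shell
#!/bin/sh
# Illustrative LSB header of a hypothetical init script (not copied from ypbind):
### BEGIN INIT INFO
# Provides:          example-daemon
# Required-Start:    $network
# Short-Description: Hypothetical daemon that needs the network
### END INIT INFO
#
# As described above, systemd reads "Required-Start:" as ordering only,
# roughly as if a native unit contained (assumed mapping):
#
#   [Unit]
#   After=network.target
#
# Ordering alone does not make the start conditional, so the service is
# still started when no network service is enabled. The stricter reading
# mentioned above would additionally correspond to:
#
#   Requisite=network.target
```

The block is comments only; it is meant to be read, not executed.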
I'd like to add that we see a similar issue of ypbind failing during the boot sequence due to the network not being ready. We do not use NetworkManager and use network since it's more suitable for our situation of hard-wired desktop machines with fixed IP addresses. The problem is the network switch takes some time after the interface is brought up before it begins passing packets. Our solution is to run a small script in between network and ypbind that pings the gateway till it gets a ping back. (This is more reliable than waiting some empirical amount of time.) It would be nice if either network or ypbind would take care of making sure the net is actually passing packets before ypbind tries to connect to the NIS server.
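A minimal sketch of the kind of wait-for-gateway helper described above, for anyone who wants to try the same workaround. This is not the poster's actual script: the function name, the retry limit, and the sleep interval are assumptions, and it relies on the iputils `ping` options `-c` and `-W`.

```shell
#!/bin/sh
# Hypothetical helper: wait_for_gateway GATEWAY [MAX_TRIES] [SLEEP_SECONDS]
# Returns 0 as soon as one ping is answered, 1 if every attempt fails.
wait_for_gateway() {
    gateway=$1
    max_tries=${2:-60}   # assumed default: give up after 60 attempts
    pause=${3:-1}        # assumed default: one second between attempts
    try=0
    while [ "$try" -lt "$max_tries" ]; do
        # -c 1: send a single echo request; -W 1: wait at most 1s for the reply
        if ping -c 1 -W 1 "$gateway" >/dev/null 2>&1; then
            return 0
        fi
        try=$((try + 1))
        sleep "$pause"
    done
    return 1
}
```

It would be called with the local gateway address from a script ordered between network and ypbind, for example `wait_for_gateway 192.0.2.1 || echo "network still not passing packets"` (the address here is a placeholder).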
We face the same issue. We have NIS enabled for authentication and NFS shares mounted at boot. We use NetworkManager. On some machines running Fedora 15, the boot is very quick, and when we get the login prompt ypbind hasn't started yet. Neither have the NFS shares been mounted. We must wait at least 15 seconds. On other machines, some recent Dell models, everything works fine: ypbind is always started before we get a login prompt. Enabling NetworkManager-wait-online.service makes ypbind start in time on the faulty machines, but then the NFS shares are never mounted. I don't really understand why it works on some machines (all of the same model) and not on others (another model, all the same as each other).
I don't know exactly when this happened, but this has ceased being a problem in my environment. There have been a lot of updates lately to startup-related packages (selinux policy, kernel, etc). My NIS clients consistently boot correctly now (where NIS usernames can successfully login at the initial GDM screen without first manually bouncing ypbind).
(In reply to comment #66) > I don't know exactly when this happened, but this has ceased being a problem in > my environment. Just re-verified that this bug still occurs on a fully up-to-date system.
ypbind-1.32-8.fc15.1.i686
systemd-26-8.fc15.i686
NetworkManager-0.8.9997-6.git20110721.fc15.i686
I'm seeing something very similar to this. ypbind never starts, though all services are started and up. ifconfig shows interface em1 up, with an address via DHCP, and DNS works. External NFS directories are mounted.
ypbind 0:off 1:off 2:off 3:on 4:on 5:on 6:off
network 0:off 1:off 2:on 3:on 4:on 5:on 6:off
NetworkManager.service - Network Manager
Loaded: loaded (/lib/systemd/system/NetworkManager.service)
Active: active (running) since Thu, 18 Aug 2011 10:34:40 -0700; 23h ago
/etc/sysconfig/network-scripts/ifcfg-em1 has the line:
NM_CONTROLLED="no"
When I run "ypbind -debug -verbose" on the command line, it never sees em1 marked as up, and it seems to be waiting for word via dbus that the interface is up. But the interface *is* up, just not under the control of NetworkManager. Adding OTHER_YPBIND_OPTS="-no-dbus" to /etc/sysconfig/network bypasses this problem. After a reboot (or a ypbind restart) everything works as expected. I hope this helps diagnose the issue or helps others work around it.
(In reply to comment #68)
> /etc/sysconfig/network-scripts/ifcfg-em1 has the line:
> NM_CONTROLLED="no"
>
> Running "ypbind -debug -verbose" on the command line, it never sees em1 marked
> as up, and seems to be waiting for info via dbus that it's up. But the
> interface *is* up, just not under the control of NetworkManager.
I actually don't understand why you are running NetworkManager if it handles no connections. Or do you have one connection handled by NetworkManager and another that is not?
(In reply to comment #69)
> I actually don't understand why you are running NetworkManager, that handles no
> connections. Or you have one connection handled by NetworkManager and another
> not?
You're correct, I really don't need NetworkManager, but I had left it running (lazy me). "chkconfig --list NetworkManager" gives me a blank, so I'd previously ignored it. In fact, if I disable NetworkManager (and recycle network):
systemctl stop NetworkManager.service
systemctl disable NetworkManager.service
/etc/init.d/network restart
Now ypbind operates correctly without adding the -no-dbus option. It appears that ypbind assumes that if NetworkManager is up that it controls *all* interfaces. That doesn't allow for an interface that may have opted out with the NM_CONTROLLED="no" option. Perhaps that isn't a good assumption?
(In reply to comment #70)
> Now ypbind operates correctly without adding the -no-dbus option. It appears
> that ypbind assumes that if NetworkManager is up that it controls *all*
> interfaces. That doesn't allow for an interface that may have opted out with
> the NM_CONTROLLED="no" option. Perhaps that isn't a good assumption?
When ypbind starts, it checks whether NetworkManager is running, and if so, it waits for a connection to be reported on dbus. If NM is not running, ypbind assumes the system is connected. So if we want to use NM together with the NM_CONTROLLED="no" option, the -no-dbus option seems to be the only correct solution.
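To check whether a machine is in this situation, something like the following sketch could list the interface configs that opt out of NetworkManager. The function name is hypothetical, and the pattern only handles the unquoted and double-quoted forms of the setting.

```shell
#!/bin/sh
# Hypothetical helper: list_unmanaged_ifcfg DIR
# Prints the names of ifcfg-* files in DIR that set NM_CONTROLLED to "no".
list_unmanaged_ifcfg() {
    dir=$1
    for f in "$dir"/ifcfg-*; do
        [ -f "$f" ] || continue
        # Accept NM_CONTROLLED=no and NM_CONTROLLED="no", ignoring case
        if grep -Eiq '^[[:space:]]*NM_CONTROLLED="?no"?[[:space:]]*$' "$f"; then
            basename "$f"
        fi
    done
}
```

If `list_unmanaged_ifcfg /etc/sysconfig/network-scripts` prints anything while NetworkManager is running, that is the case where OTHER_YPBIND_OPTS="-no-dbus" in /etc/sysconfig/network (as in comment 68) applies.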
*** Bug 738000 has been marked as a duplicate of this bug. ***
This package has changed ownership in the Fedora Package Database. Reassigning to the new owner of this component.
Is this still a problem or can this bug be closed?
This is still a problem for us, but the time we have to wait between GDM appearing and being able to log in through NIS is now acceptable (between 5 and 20 seconds). ypbind now keeps trying to restart until NM has started. But in our case GDM appears too soon, so the user believes he can log in although ypbind is not ready.
Can anybody who encounters this issue test it again with a new update, available in [1]? Or just try adding "Before=systemd-user-sessions.service" to your ypbind.service file (for how to customize a unit file, see [2]). It doesn't fix the problem itself, but it may make it possible to log in immediately after boot.
[1] https://admin.fedoraproject.org/updates/ypbind-1.33-11.fc16
[2] https://fedoraproject.org/wiki/Systemd#How_do_I_customize_a_unit_file.2F_add_a_custom_unit_file.3F
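One possible way to apply the "Before=" change without editing the packaged file in place is to write a modified copy of the unit, sketched below. The function name is hypothetical, and the source and destination paths in the usage note are assumptions about where the packaged and the local copies live on a given system.

```shell
#!/bin/sh
# Hypothetical helper: add_before_user_sessions SRC DST
# Copies unit file SRC to DST, adding the ordering line right after [Unit].
add_before_user_sessions() {
    src=$1
    dst=$2
    # If SRC already orders itself before user sessions, copy it unchanged
    if grep -q '^Before=.*systemd-user-sessions\.service' "$src"; then
        cp "$src" "$dst"
    else
        awk '{ print } /^\[Unit\]/ { print "Before=systemd-user-sessions.service" }' "$src" > "$dst"
    fi
}
```

For example, `add_before_user_sessions /lib/systemd/system/ypbind.service /etc/systemd/system/ypbind.service` followed by `systemctl daemon-reload`, assuming those are the right locations.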
I am on the CC list since we have a related but subsidiary problem: We make extensive use of automounted nfs filesystem sharing. The automount maps are shared via NIS, so startup of autofs must wait until NIS is running. The new update does not fix this problem, which occurs during the init sequence not after. Our machines are connected via Cisco Catalyst 2960 switches, which take about 30 seconds to negotiate the link after the network is started on a given machine. The present ypbind init script usually does not succeed in waiting for the network to be ready, and ends up in a failed condition. Then autofs runs but does not have any maps, so it does not see the desired remote filesystems. We have been dealing with this problem by use of an rc.local script that waits till the network is ready (by pinging the local router till it succeeds), then restarts ypbind and autofs.
(In reply to comment #77) > Our machines are connected via Cisco Catalyst 2960 switches, which take about > 30 seconds to negotiate the link after the network is started on a given > machine. Sounds like the kind of problem that NETWORKDELAY=<delay in seconds> in /etc/sysconfig/network was designed to solve.
Adding NETWORKDELAY=30 to /etc/sysconfig/network had no discernable effect.
Is this bug not a dupe of bug 709637?
Can you guys have a look at bug 709637, or rather bug 756123, to see if it matches your issue (so we can close this as a dupe)? Ian mentions a workaround for the autofs init script in comment 1 of bug 756123, which you can test to see if it solves your issue, and/or you can rebuild and test the latest autofs package in koji, which has native systemd units. Thanks.
Sorry, I'm not using NIS at the moment so can't test.
Referring to my comments 77 and 79, I should add that we have been using /etc/init.d/network not NetworkManager. Ian's workaround in comment 1 of 756123 cannot be applied. I do not find the diff context (there is no mention of ypwhich) in the current autofs init script (autofs-5.0.6-5.fc16). However, the good news is that the problem is solved if you use the right services. I disabled network and enabled NetworkManager and NetworkManager-wait-online.service. I added NETWORKWAIT=yes to /etc/sysconfig/network. The ypbind and autofs services are enabled. With these changes, the problem goes away: ypbind and autofs successfully start up during the init sequence and autofs uses the NIS-provided maps. To answer the question in comment 81: Yes, I think this bug is essentially a duplicate of bug 709637 and bug 756123. All three bugs are rooted in ypbind trying to start and failing before the network is ready, and autofs in turn starting when ypbind has failed.
Just some thoughts before I close this as a duplicate. Since ypbind in F15 still uses the legacy SysV init script, I've been wondering whether the real underlying problem might be that ypbind lacks $network in the Required-Start: section of that script's LSB headers. Robert, could you revert to your previous network setup without NetworkManager and add "$network" to the Required-Start: section of the ypbind legacy SysV init script to see if that fixes your issue? Honza could then just fix it for F15 if that turns out to be the case.
Sorry, there is no longer a legacy ypbind init script there. But I know that this does not fix it since I see from backups that when we were running F15, the dependency was already there. Excerpt from /etc/init.d/ypbind of 11 Aug 2011:
### BEGIN INIT INFO
# Provides: ypbind
# Required-Start: $local_fs $remote_fs $network $rpcbind waitfornet
# Required-Stop: $local_fs $remote_fs $network $rpcbind
(The waitfornet dependency is our workaround addition.) The reason the $network dependency does not solve it is that the network service startup runs and exits with success, so the Required-Start is satisfied, but the network is not yet actually functioning due to external factors, namely the switch to which the machine is connected is not yet passing packets.
Given that F15 is reaching its EOL, can someone confirm whether this is still an issue on F16/F17/rawhide so I can move it to that $release, or is it safe to close this bug because the problem has been addressed in $release?
This is still an issue for me in F16 and F17
There are SO MANY comments here that I read about half of them and then gave up... scrolled to the bottom, saw the comment immediately preceding this one... and decided to leave this comment. I ran into this issue on Fedora 17 but I found a workaround. My host would boot but yes, ypbind would try to start before the network was present and fail. It would fail to fix itself after the network became available. Running systemctl status ypbind.service showed an error mentioning a hostname lookup failure. Of course it can't look up the host over the network if the network isn't available, but if I add the NIS host to /etc/hosts it CAN do a lookup. So, I added an entry to /etc/hosts for my NIS server and rebooted. I have not had a problem since. With the NIS host added to /etc/hosts, ypbind seems to be able to recover after the network becomes available. I know that isn't the best solution because the IP address of the NIS host might change and it isn't fun having to manually update /etc/hosts on the hosts, but at least it works.
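For anyone rolling this /etc/hosts workaround out to several machines, a rough sketch of an idempotent way to add the entry follows. The function name is hypothetical, and the IP address and host name in the usage note are placeholders, not values from this report.

```shell
#!/bin/sh
# Hypothetical helper: ensure_hosts_entry FILE IP NAME
# Appends "IP<TAB>NAME" to FILE unless NAME is already listed there.
ensure_hosts_entry() {
    file=$1
    ip=$2
    name=$3
    # Look for NAME as a host field (2nd column onward) on non-comment lines
    if awk -v n="$name" '
        !/^[ \t]*#/ { for (i = 2; i <= NF; i++) if ($i == n) found = 1 }
        END { exit !found }' "$file"; then
        return 0
    fi
    printf '%s\t%s\n' "$ip" "$name" >> "$file"
}
```

Usage would look like `ensure_hosts_entry /etc/hosts 192.0.2.10 nis.example.com`. As noted in the comment, the entry has to be updated by hand if the NIS server's address ever changes.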
This message is a notice that Fedora 15 is now at end of life. Fedora has stopped maintaining and issuing updates for Fedora 15. It is Fedora's policy to close all bug reports from releases that are no longer maintained. At this time, all open bugs with a Fedora 'version' of '15' have been closed as WONTFIX. (Please note: Our normal process is to give advanced warning of this occurring, but we forgot to do that. A thousand apologies.) Package Maintainer: If you wish for this bug to remain open because you plan to fix it in a currently maintained version, feel free to reopen this bug and simply change the 'version' to a later Fedora version. Bug Reporter: Thank you for reporting this issue and we are sorry that we were unable to fix it before Fedora 15 reached end of life. If you would still like to see this bug fixed and are able to reproduce it against a later version of Fedora, you are encouraged to click on "Clone This Bug" (top right of this page) and open it against that version of Fedora. Although we aim to fix as many bugs as possible during every release's lifetime, sometimes those efforts are overtaken by events. Often a more recent Fedora release includes newer upstream software that fixes bugs or makes them obsolete. The process we are following is described here: http://fedoraproject.org/wiki/BugZappers/HouseKeeping