Description of problem:
The ypbind service fails to start because the ypbind init.d script checks whether ypbind has registered with rpcbind too soon after starting it.

Here is the offending section of the script. The bracketed numbers at the start of some lines refer to the analysis below:

----------------------------
        echo -n $"Starting NIS service: "
        selinux_on
[1]     daemon ypbind $OTHER_YPBIND_OPTS
        RETVAL=$?
        echo
        if [ $RETVAL -ne 0 ]; then
            selinux_off
            logger -t ypbind "failed to start!"
            return $RETVAL
        fi
        echo -n $"Binding NIS service: "
        # the following fixes problems with the init scripts continuing
        # even when we are really not bound yet to a server, and then things
        # that need NIS fail.
        timeout=$NISTIMEOUT
        while [ $timeout -gt 0 ]; do
[2]         /usr/sbin/rpcinfo -p | LC_ALL=C fgrep -q ypbind && \
                /usr/bin/ypwhich > /dev/null 2>&1
            RETVAL=$?
            if [ $RETVAL -eq 0 ]; then break; fi
            echo -n "..."
            # ypwhich has a hardcode 15sec timeout
            # so subtract that from NISTIMEOUT to
            # to see of we should continue to wait
[3]         timeout=`expr $timeout - 15`
        done
-------------

[1] ypbind is started as a daemon here, so the script continues immediately. At that instant, ypbind is not yet registered with rpcbind.

[2] If ypbind is NOT yet registered with rpcbind, the first part of the "&&" condition fails, and ypwhich is NOT executed.

[3] The script *assumes* that the ypwhich call delays for 15 seconds, but since it was not executed, no delay occurs and the loop continues immediately.

As a result, the (default 3) iterations of the loop, which are supposed to take 45 seconds, happen pretty much instantaneously and the loop terminates. The script then assumes that the ypbind daemon isn't running properly and kills it.

The simple and stupid solution is to put a "sleep 1" before the loop, to give ypbind plenty of time to start and register with rpcbind.
The better solution would be to split the rpcbind check and the ypwhich call into 2 lines and, if rpcinfo fails to show ypbind, delay and try again. However, this will *still* delay start-up by 1 second in the failure case. Even better would be to check the real elapsed time rather than assuming the 15 second timeout, and actually wait the specified time. This might be done by starting a "sleep 45" as a backgrounded command and looking for it to terminate.

Version-Release number of selected component (if applicable):
ypbind-1.20.4-2.fc8

How reproducible:
Every time

Steps to Reproduce:
1. service ypbind restart
   or
   service ypbind start

Actual results:
ypbind starts and is then killed by the startup script (see discussion).

Expected results:
ypbind runs

Additional info:
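A rough sketch of the "even better" variant described above, for illustration only. It splits the rpcbind check from the ypwhich call and measures real elapsed time. Note that it compares timestamps from `date +%s` instead of using the backgrounded "sleep 45" idea. The function names (wait_for_ypbind, ypbind_registered, ypbind_bound) are made up for this sketch and are not in the shipped script.

```shell
# Sketch only. wait_for_ypbind, ypbind_registered and ypbind_bound are
# hypothetical helper names, not part of the shipped init script.

# True once ypbind shows up in rpcbind's list of registered programs.
ypbind_registered() {
    /usr/sbin/rpcinfo -p 2>/dev/null | LC_ALL=C fgrep -q ypbind
}

# True once ypbind is actually bound to a NIS server.
ypbind_bound() {
    /usr/bin/ypwhich > /dev/null 2>&1
}

# Wait up to $1 seconds of real elapsed time for ypbind to bind.
# Returns 0 as soon as ypbind is bound, 1 if the time limit is reached.
wait_for_ypbind() {
    wfy_limit=$1
    wfy_start=$(date +%s)
    while [ $(( $(date +%s) - wfy_start )) -lt "$wfy_limit" ]; do
        # Only ask ypwhich once rpcbind knows about ypbind.
        if ypbind_registered; then
            ypbind_bound && return 0
        fi
        # Short pause before retrying, whichever check failed.
        sleep 1
    done
    return 1
}
```

In the init script this could stand in for the timeout loop, for example `wait_for_ypbind "$NISTIMEOUT"` followed by the existing RETVAL handling. Because the limit is measured rather than assumed, it does not matter how long ypwhich itself blocks.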
I ran into this same issue after installing Fedora 8. My workaround was to add a "sleep 1" statement. The failure only occurs (at least for me) during boot. Once the system was up, I could log in as root, run '/etc/rc.d/init.d/ypbind start', and it would come right up. That made it hard to tell whether the startup script was actually broken. Anyway, adding "sleep 1" to the script (I put it right after the echo -n "...") fixed my problem.
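For anyone wanting to try the same workaround, here is roughly where the sleep goes. This is a sketch, not the shipped script: the loop is wrapped in a function called wait_loop and the rpcinfo/ypwhich test is pulled out into check_bound, both names invented here to keep the excerpt self-contained.

```shell
# Sketch of the wait loop with the "sleep 1" workaround applied.
# check_bound and wait_loop are hypothetical names used for illustration.

# Same test as in the original script: registered with rpcbind AND bound.
check_bound() {
    /usr/sbin/rpcinfo -p | LC_ALL=C fgrep -q ypbind && \
        /usr/bin/ypwhich > /dev/null 2>&1
}

wait_loop() {
    timeout=${NISTIMEOUT:-45}
    while [ "$timeout" -gt 0 ]; do
        check_bound
        RETVAL=$?
        if [ $RETVAL -eq 0 ]; then break; fi
        echo -n "..."
        # Workaround: pause so ypbind has time to register with rpcbind
        # before the next attempt.
        sleep 1
        timeout=`expr $timeout - 15`
    done
    return $RETVAL
}
```

With a timeout of 45 and a step of 15, the loop makes at most 3 attempts, so when ypbind is not registered this gives it about 3 seconds in total instead of almost none.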
I did not run into this problem when I first installed Fedora 8 on November 12. Since then a number of updates have been installed. Now when I reboot, the boot-time script that starts ypbind reports failure. When I start ypbind manually, it works. From watching the boot, I can see that the script checking whether ypbind started correctly is not waiting anything like 45 seconds. I think it didn't wait at all, but I can't say for sure. This is running on a computer with 2 CPUs.

ypbind: ypbind-1.20.4-2.fc8
ypwhich: yp-tools-2.9-2
rpcinfo: rpcbind-0.1.4-11.fc8
The problem persists.
I have the same problem as comment #2
Fixed in ypbind-1.20.4-3.fc9
Will this be coming to Fedora 8?