Bug 38395
Summary: | rpc.statd dies shortly after boot for DHCP+NIS clients | | |
---|---|---|---|
Product: | [Retired] Red Hat Linux | Reporter: | Rex Dieter <rdieter> |
Component: | nfs-utils | Assignee: | Pete Zaitcev <zaitcev> |
Status: | CLOSED NOTABUG | QA Contact: | David Lawrence <dkl> |
Severity: | medium | Priority: | medium |
Version: | 6.2 | Hardware: | i386 |
OS: | Linux | Doc Type: | Bug Fix |
Last Closed: | 2001-05-01 20:37:19 UTC | | |
Description
Rex Dieter
2001-04-30 14:32:00 UTC
Which kernel version are you running?

Ah, sorry I didn't mention kernel versions... This happened all the way back to kernel-2.2.16-3 and up to kernel-2.2.19-6.2.1.

Just FYI, I watched the boot process carefully this morning, and it turns out that the nfs (and nfslock) services do indeed start AFTER the eth0 interface is brought up... so my theory that DHCP changing addresses was messing up rpc.statd is most likely not correct.

As of 2.2.19+, lockd is started automatically by the kernel as needed, so you shouldn't be seeing the traditional "Starting NFS lockd" messages at boot time. Can you verify the versions of the kernel and nfs-utils with rpm -q?

With the latest kernel, no, I did not see "Starting NFS lockd", but I DO see this:

    Starting NFS file locking services:
    Starting NFS statd: OK

Here are the specific versions of the rpms you requested:

    [root@... RPMS]# rpm -q nfs-utils
    nfs-utils-0.3.1-0.6.x.1
    [root@...RPMS]# rpm -q kernel
    kernel-2.2.17-14
    kernel-2.2.19-6.2.1

I might try adjusting the priority of the service so that it starts a bit later in the boot process. These services do get started immediately after eth0 is brought up, and maybe it's not quite completely initialized yet or something...

I wonder if you might have gotten a bad RPM for nfs-utils. You shouldn't be seeing the message "Starting NFS file locking services" at all with nfs-utils-0.3.1-0.6.x.1. Can you completely remove the nfs-utils rpm and then reinstall it? Also, I don't think this is a problem with nfs services starting too soon after the networking is brought up. After you see "Starting network services...[ OK ]", networking should be completely up and ready to run.

This is the relevant portion of the init script /etc/rc.d/init.d/nfslock, and from what I can tell, rpc.statd DOES get started regardless of kernel version. rpc.lockd is the part that doesn't get displayed with recent kernels (as it should be):

    start() {
        # Start daemons.
        echo "Starting NFS file locking services: "
        if [ "$KERNVER" -lt 24 -a "$KERNREL" -lt 18 ]; then
            echo -n "Starting NFS lockd: "
            daemon rpc.lockd
            echo
        fi
        echo -n "Starting NFS statd: "
        daemon rpc.statd
        RETVAL=$?
        echo
        [ $RETVAL -eq 0 ] && touch /var/lock/subsys/nfslock
        return $RETVAL
    }

I verified my current nfs-utils package:

    rpm --checksig nfs-utils-0.3.1-0.6.x.1.i386.rpm
    nfs-utils-0.3.1-0.6.x.1.i386.rpm: md5 gpg OK

I redownloaded nfs-utils from ftp.redhat.com and diffed it against my rpm; no differences were found.

Oh, I forgot to mention: yes, I've tried removing and reinstalling nfs-utils without a change in behavior. As a matter of fact, I've tried it on a bunch of machines we have here (I'd say 5 or 6 of them), and they all still exhibit this same bad behavior of rpc.statd dying.

I've been able to narrow down the circumstances of rpc.statd's death. All the machines in question are also NIS clients. If the nfslock service is started before ypbind is up and running (which is what a normal boot does), rpc.statd dies. If rpc.statd is started after ypbind, then all is well. I repeated this several times after booting up:

0. service nfslock stop: OK (or failed if dead already).
1. service ypbind stop: OK
2. service nfslock start: OK
3. service nfslock status: rpc.statd not running
4. service ypbind start: OK
5. service nfslock start: OK
6. service nfslock status: rpc.statd (PID:xxx) running.

Now I'm even more confused. How/why does rpc.statd depend upon NIS? And why only for DHCP clients (as I said before, static hosts are fine)?

(most likely) FINAL UPDATE: I think I can conclude that this problem was due in large part to actions of my own. I had lowered the MINUID for our NIS server (modifying /var/yp/Makefile) to 20 to accommodate our very old users with low UIDs. As a result, the rpcuser account created on the NIS server by the new nfs-utils package was distributed via NIS.
I have an update script that runs on our client machines (in /etc/rc.d/rc.local), which I had used to update nfs-utils. The %pre script portion of the install was supposed to create a local user named rpcuser, but it failed because this user "already exists" at this point. This explains why the rpc.statd service failed when ypbind was not yet running and why it worked when ypbind WAS running.

So, at this point, my only complaint is that the installation of the nfs-utils rpm gave me no error or indication of a mis-installation (namely, the failure to create the necessary rpcuser account). Perhaps the %pre script in nfs-utils that creates the rpcuser ought to be modified to squawk a little if it fails?
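
For anyone hitting the same symptom, here is a minimal sketch of the kind of check suggested above: it looks for rpcuser in the local passwd file only, so an account that exists solely via NIS still triggers a warning. The function names has_local_user and check_rpcuser are hypothetical and are not part of nfs-utils; the actual %pre scriptlet is not shown in this report, so this is only an illustration of what a louder check might look like.

```shell
#!/bin/sh
# Sketch only: these helpers are hypothetical, not taken from nfs-utils.

# has_local_user NAME [PASSWD_FILE]
# Succeeds only if NAME appears as the first field of the passwd file.
# Reading the file directly bypasses NIS, unlike a normal user lookup.
has_local_user() {
    user=$1
    file=${2:-/etc/passwd}
    awk -F: -v u="$user" '$1 == u { found = 1 } END { exit !found }' "$file"
}

# check_rpcuser [PASSWD_FILE]
# Reports whether rpcuser is a local account and complains on stderr if not.
check_rpcuser() {
    file=${1:-/etc/passwd}
    if has_local_user rpcuser "$file"; then
        echo "rpcuser: local account present"
        return 0
    fi
    echo "WARNING: rpcuser not found in $file;" \
         "rpc.statd may die if started before ypbind" >&2
    return 1
}
```

Running check_rpcuser with no arguments on an affected client, for example after the package update in rc.local, would print the warning whenever the account is only visible through NIS.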