Upgrading to glibc-2.2-9 and nscd-2.2-9 causes NFS lockd to fail with a report of the form:
    lockd: cannot monitor 10.10.10.10
    lockd: failed to monitor 10.10.10.10
Running Red Hat 7.0 with all patches, talking to a Solaris server. Downgrading to glibc/nscd 2.2-5 solves the problem.
Did you get any other related messages in /var/log? The lockd messages come from the kernel, so they do not tell much about what's going on. Also, can you try glibc-2.2-9 with nscd 2.2-5, and glibc-2.2-9 with no nscd running?
Sorry, yes. I also get:
    Dec 19 18:36:10 chiron rpc.statd[1681]: gethostbyname error for chiron
    Dec 19 18:36:10 chiron rpc.statd[1681]: STAT_FAIL to chiron for SM_MON of 38.245.76.2
where 38.245.76.2 is the host I'm NFS mounting. I'll try the other suggestions later today.
Can you try something like:
    #include <stdio.h>
    #include <netdb.h>
    #include <netinet/in.h>
    #include <arpa/inet.h>   /* declares inet_ntoa() */
    int main(void)
    {
        struct hostent *h;
        /* Same lookup that rpc.statd reports as failing. */
        h = gethostbyname("chiron");
        if (h == NULL)
            printf("gethostbyname failed\n");
        else
            printf("%s %s\n", h->h_name,
                   inet_ntoa(**(struct in_addr **)h->h_addr_list));
        return 0;
    }
? It is gethostbyname("chiron") which fails in rpc.statd, so I wonder if it fails in this proglet as well...
Running glibc-2.2-9 with no nscd, or with nscd-2.2-5 has no effect: I get the same error messages. The test program you sent correctly prints my (fully-qualified) hostname and IP address.
Ok, so can you please kill nscd, kill rpc.statd, and then start
    strace -o /tmp/statd.log /sbin/rpc.statd -F
and see if it prints the error as well and, if yes, what it writes in the strace log? I was unable to reproduce it so far...
rpc.statd (with the -2.2-9 packages running) dies immediately once started with -F. The output of strace is at http://www.east.isi.edu/~csp/misc/statd.log The last thing I see before it dies is "No such file or directory" for /etc/localtime, but that file exists and has the correct permissions. rpc.statd works fine when run without -F, but gives no useful output (the strace just shows it forking the daemon process).
I think this is related to the fact that rpc.statd runs chrooted. I wonder why name resolution actually worked for it before; I will do some debugging.
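For illustration, the chroot theory can be checked outside rpc.statd with a small helper along these lines. This is only a sketch under assumptions: lookup_ipv4 is a hypothetical name, not anything from nfs-utils, and the explanation it probes (glibc loads its NSS modules such as libnss_dns.so.2 at the first lookup, so they must be reachable from inside the new root) is the likely cause suggested by the workaround reported later in this thread, not a confirmed diagnosis.

```c
#define _DEFAULT_SOURCE          /* for chroot() and h_errno on glibc */
#include <arpa/inet.h>
#include <netdb.h>
#include <stddef.h>
#include <sys/socket.h>
#include <unistd.h>

/*
 * Hypothetical helper (not part of nfs-utils): optionally chroot into
 * `jail`, then resolve `name` with gethostbyname() and write the first
 * IPv4 address into `out`. Pass jail == NULL to skip the chroot step.
 *
 * Returns 0 on success. If the lookup itself fails, returns h_errno
 * (e.g. HOST_NOT_FOUND). Returns -1 for chroot, address-family or
 * formatting failures.
 */
int lookup_ipv4(const char *jail, const char *name, char *out, size_t outlen)
{
    struct hostent *h;

    /* chroot() needs root privileges; chdir("/") moves the working
     * directory inside the new root. */
    if (jail != NULL && (chroot(jail) != 0 || chdir("/") != 0))
        return -1;

    /* Assumption being tested: inside a bare chroot this first lookup
     * fails because the NSS modules cannot be loaded. */
    h = gethostbyname(name);
    if (h == NULL)
        return h_errno;

    if (h->h_addrtype != AF_INET || h->h_addr_list[0] == NULL)
        return -1;
    if (inet_ntop(AF_INET, h->h_addr_list[0], out, (socklen_t)outlen) == NULL)
        return -1;
    return 0;
}
```

Called as root with jail set to /var/lib/nfs/statd and name set to the local hostname, a nonzero return would match the "gethostbyname error" logged by rpc.statd; with jail set to NULL the same call should behave like the earlier test program.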
glibc-2.2-12 still has the same problem. In my case I'm serving NFS from a 6.2 Linux box, and the client which has locking problems is a fully updated 7.0.
I have 2 NFS servers: the first has 2 Ethernet interfaces, the second only one. Apparently the locking fails only on the first server, and only after the glibc upgrade on it. The other server and the client were already upgraded without problems. Reading the rpc.statd sources (nfs-utils 0.2.1), it appears that gethostbyname(myname) fails on the server. Hope this helps. Gabriele Turchi P.S.: My English is alpha version...
Workaround found.
    # mkdir /var/lib/nfs/statd/lib
    # cp /lib/libnss_dns.so.2 /var/lib/nfs/statd/lib
    # cp /lib/libresolv.so.2 /var/lib/nfs/statd/lib
    # /etc/init.d/nfslock stop
    # /etc/init.d/nfslock start
Now "It Works For Me". Gabriele Turchi
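To check whether a given machine already has these files in place, something like the following sketch could be used. The helper name count_missing is hypothetical; the directory and library paths come from the commands above, and other setups may need different files.

```c
#include <stddef.h>
#include <stdio.h>
#include <unistd.h>

/*
 * Hypothetical helper: count how many of the `n` entries in `paths`
 * do not exist under the directory `root`, printing each missing one
 * to stderr. A path that would overflow the buffer counts as missing.
 */
size_t count_missing(const char *root, const char *const paths[], size_t n)
{
    char full[4096];
    size_t missing = 0;

    for (size_t i = 0; i < n; i++) {
        int len = snprintf(full, sizeof full, "%s%s", root, paths[i]);
        if (len < 0 || (size_t)len >= sizeof full || access(full, F_OK) != 0) {
            fprintf(stderr, "missing: %s%s\n", root, paths[i]);
            missing++;
        }
    }
    return missing;
}
```

With root set to "/var/lib/nfs/statd" and paths set to "/lib/libnss_dns.so.2" and "/lib/libresolv.so.2", a result of 0 would mean both copies from the workaround are present.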
This should be fixed in the rawhide nfs-utils package; rpc.statd no longer runs chrooted.
Please push this update to rpc.statd out on RHN. The glibc update that broke it came in via RHN.