nfs serving dies on one of my servers, which runs Red Hat 6.2 with all the latest errata applied. At the moment I cannot tell why or how nfs died, because of a lack of information (including from syslog and dmesg). However, this bug report is about what happened afterwards: once nfs had died, it was impossible to kill the nfsd processes in order to restart nfs, or to reboot cleanly. The steps I tried were:

1) "service nfs stop": this simply hung, and I killed it after waiting ten minutes.
2) I found the nfsd PIDs and ran "kill -9" on them. This did nothing, and the processes continued to appear in the process table.
3) I tried to start a new set of nfs servers, but they would not start (presumably because the old ones were still running).
4) I removed all exports from /etc/exports and ran "exportfs -av", hoping the server would notice there were no exports and kill the nfs processes. The nfsd processes did not die.
5) I noticed that nfs is provided in the kernel shipped with Red Hat 6.2 by a kernel module. I tried "rmmod" to remove all nfs modules, but this just returned an error very much like: module X in use
6) Running out of options, I rebooted the server. However, this hung going into runlevel 6 when the SysV init script tried to stop the nfs services, just as it had when I tried manually in step 1. I waited ten minutes again and nothing happened.

In the end I had to do a hard reset on the server, with all the possibilities for data loss that this created. Upon the next boot, nfs started perfectly OK.

Although nfsd is a front-end to the kernel nfs server, would it be possible to patch it in some way such that, even when it had died, you could kill it with "kill -9" so that it could be restarted without hard-resetting the server? Or is kernel nfs serving such that if it dies there is no way out? In that case, will Red Hat ship both kernel and user-space nfs servers which customers can choose from out of the box (like vixie-cron vs. anacron)? Thanks.
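One way to narrow down why "kill -9" had no effect in step 2 is to look at the state of the stuck nfsd processes. The report does not say what state they were in, so the following is only a minimal sketch, assuming a Linux /proc filesystem where /proc/<pid>/stat begins with "pid (comm) state". The helper names parse_stat and find_processes are hypothetical. A process shown in state "D" (uninterruptible sleep) does not act on signals, including SIGKILL, until it leaves that state.

```python
import os


def parse_stat(stat_text):
    """Return (pid, comm, state) from the contents of /proc/<pid>/stat."""
    # comm is wrapped in parentheses and may itself contain spaces or
    # parentheses, so split on the LAST ')' rather than on whitespace.
    left, right = stat_text.rsplit(")", 1)
    pid_text, comm = left.split("(", 1)
    state = right.split()[0]
    return int(pid_text), comm, state


def find_processes(name, proc_root="/proc"):
    """Return a list of (pid, state) for processes whose comm equals name."""
    results = []
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue
        try:
            with open(os.path.join(proc_root, entry, "stat")) as f:
                pid, comm, state = parse_stat(f.read())
        except (OSError, ValueError):
            # The process exited while we were scanning, or the file
            # did not have the expected layout; skip it.
            continue
        if comm == name:
            results.append((pid, state))
    return results


if __name__ == "__main__":
    # "nfsd" is the process name used in the report above.
    for pid, state in find_processes("nfsd"):
        note = "uninterruptible sleep" if state == "D" else ""
        print(pid, state, note)
```

If the nfsd entries do show "D", that would be consistent with the symptoms in steps 1, 2 and 6, though it would not by itself explain what they were blocked on.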
I'm seeing more severe behavior. The machine will come up okay. However, as soon as a mount request is made, everything goes down the drain. Any open shells stop accepting input. login processes running on console ttys stop bringing up password prompts. The upside is that the interfaces and firewall continue routing and filtering correctly, but named stops responding to queries. sshd stops accepting logins. Ctrl-Alt-Del fails to reboot the machine. Bewm... hit the reset switch.

I forced a replacement of my nfs-utils package... same behavior. I went ahead and made all my logging synchronous, and this is all that shows up:

Aug 25 00:51:37 router nfs: rpc.mountd shutdown failed
Aug 25 00:51:37 router nfs: rpc.mountd startup succeeded
Aug 25 00:51:37 router kernel: Installing knfsd (copyright (C) 1996 okir.de).
Aug 25 00:51:54 router mountd[721]: export request from 127.0.0.1
Aug 25 00:53:17 router mountd[721]: authenticated mount request from host52.haus.nebcorp.com:947 for /home (/home)

The next log line is from after the reboot:

Aug 25 00:57:44 router syslogd 1.3-3: restart.

This behavior started suddenly and is not on the heels of any configuration change.
I am happy for the bug report to be closed as "not a bug", as I have evaluated the alternatives independently and am satisfied that Red Hat 6.2 actually ships with the best stable NFS support available. Cheers, and sorry for opening this originally when it was not really required. Paul
Thankfully, NFS has improved a lot since then. Closing.