Description of problem:
I am trying to configure the NFS service on Red Hat Enterprise Linux ES release 3 (Taroon Update 8). The service has to be able to serve a large number of NFS requests, so I set RPCNFSDCOUNT=240 in /etc/sysconfig/nfs. When I start the NFS service it reports "Starting NFS daemon: [FAILED]". In fact it starts 78 nfsd server threads but cannot start the rest. It logs "nfsd[PID]: nfssvc: Cannot allocate memory" in /var/log/messages. If I try to start 78 or fewer nfsd server threads, everything works and the thread count is accurate. The machine has 4GB of RAM and has no other servers/services running on it.

Version-Release number of selected component (if applicable):
nfs-utils-1.0.6-44EL

How reproducible:
Each time the nfs service is started.

Steps to Reproduce:
1. Configure RPCNFSDCOUNT=240 in /etc/sysconfig/nfs
2. Execute `service nfs start'
3. Count the number of `nfsd' server threads and check /var/log/messages for errors from the nfsd server.

Actual results:
Fewer nfsd server threads are started than requested (78 instead of 240), which is not enough to serve a large number of NFS requests.

Expected results:
The number of nfsd server threads started should be 240, the value of RPCNFSDCOUNT.

Additional info:
The Linux kernel running on the machine is the following:

# uname -a
Linux host.tld 2.4.21-47.ELsmp #1 SMP Wed Jul 5 20:38:41 EDT 2006 i686 i686 i386 GNU/Linux

Just for comparison: we have another RHEL 3 ES server, running the 2.4.21-15.0.4.ELsmp kernel, with 4GB of RAM, and it has no problem starting 240 nfsd server threads while running a lot of other servers/services.
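Step 3 of the reproduction (counting the nfsd threads and comparing against RPCNFSDCOUNT) could be scripted roughly as follows. This is only a sketch: the function names are made up for illustration and are not part of nfs-utils, and it assumes that `ps -e -o comm=` lists each nfsd kernel thread as a line containing exactly `nfsd`.

```shell
#!/bin/sh
# Sketch: compare the configured nfsd thread count with the number running.
# The function names below are illustrative, not part of nfs-utils.

# Print the value of the last RPCNFSDCOUNT= assignment in a sysconfig-style
# file, with any surrounding quotes removed.
configured_count() {
    sed -n 's/^RPCNFSDCOUNT=//p' "$1" | tr -d "\"'" | tail -n 1
}

# Count lines on stdin that are exactly "nfsd" (one command name per line).
count_nfsd() {
    grep -c -x 'nfsd'
}

# Report both numbers; succeed only when they match.
# Assumption: nfsd threads appear as "nfsd" in `ps -e -o comm=` output.
check_nfsd() {
    want=$(configured_count "${1:-/etc/sysconfig/nfs}")
    have=$(ps -e -o comm= | count_nfsd)
    echo "configured=$want running=$have"
    [ "$want" = "$have" ]
}
```

After sourcing the file, running `check_nfsd` right after `service nfs start' prints both numbers and returns a non-zero status when they differ, as they do on the affected machine.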
Are both servers using nfs-utils-1.0.6-44EL?
The server that is able to start 240 nfsd server threads runs Red Hat Enterprise Linux ES release 3 (Taroon Update 3) and has nfs-utils-1.0.6-33EL installed.
A bit more info, also just for comparison: we have a third server with 4GB of RAM, also acting as an NFS server. It runs Red Hat Enterprise Linux ES release 3 (Taroon Update 3) with the 2.4.21-27.ELsmp kernel and the nfs-utils-1.0.6-31EL package installed. This server is unable to start more than 188 nfsd server threads. The error is the same: "nfsd[PID]: nfssvc: Cannot allocate memory"
I have managed to install the old 2.4.21-15.0.4.ELsmp kernel on the machine I was trying to configure as an NFS server. After rebooting with this kernel, the machine is able to start any number of nfsd server threads, even more than 240. My conclusion is that all kernels after 2.4.21-15.0.4 are BUGGY with regard to memory management and the interaction between the kernel and the NFS server (nfsd).
Is anybody working on this bug?
We have to set up a new NFS server that will use external storage connected over Fibre Channel. The FC HBA will be a QLE2460. As far as I was able to find out, support for this card is present in the qla driver version that is part of the 2.4.21-47 kernel. Since I cannot start as many NFS server threads as I want with kernels newer than 2.4.21-15.0.4, I need an urgent fix for the problem described in this bug.
A few comments -- On a two processor Dell system, with 1GB memory, I have been able to start 512 server threads. The kernel that I am using is kernel-2.4.21-47.3.EL and the nfs-utils is nfs-utils-1.0.6-44EL. Not that I think that not starting the number of threads which was specified is right, but do you really think that you need that many threads? Even the largest servers that I have ever seen, running under an extreme load like SpecSFS use about 100 threads at peak load. These are systems with 32 or more processors and are running at loads of tens of thousands of operations per second. I think that we are going to need some more information in order to try to help to diagnose this situation.
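The comment above does not say which additional information is wanted. As an assumption, the facts already quoted in this report (kernel release, nfs-utils version, the RPCNFSDCOUNT setting, the logged allocation failures) plus the current memory state could be gathered with a sketch like the one below. The function names are hypothetical, and the log path is the one mentioned in the original description.

```shell
#!/bin/sh
# Sketch: gather the facts discussed in this report in one place.
# Function names are illustrative; nothing here is part of nfs-utils.

# Count log lines that contain the allocation failure reported in this bug.
count_alloc_failures() {
    grep -c 'nfssvc: Cannot allocate memory' "$1"
}

# Print kernel release, nfs-utils version, configured thread count,
# number of logged failures, and the kernel's memory statistics.
collect_nfsd_info() {
    log=${1:-/var/log/messages}
    echo "kernel: $(uname -r)"
    if command -v rpm >/dev/null 2>&1; then
        echo "nfs-utils: $(rpm -q nfs-utils)"
    fi
    if [ -r /etc/sysconfig/nfs ]; then
        grep '^RPCNFSDCOUNT=' /etc/sysconfig/nfs
    fi
    if [ -r "$log" ]; then
        echo "allocation failures logged: $(count_alloc_failures "$log")"
    fi
    if [ -r /proc/meminfo ]; then
        cat /proc/meminfo
    fi
}
```

Running `collect_nfsd_info` as root on both the failing and the working server, right after `service nfs start', would give two outputs that can be compared side by side.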
We have no reproducer, and our internal tests indicate that it works correctly with larger numbers of threads. Closing as INSUFFICIENT DATA until Support provides a reproducer.