Description of problem:
Kernel printk: "lockd: too many open TCP sockets, consider increasing number of nfsd threads." I have increased the thread count with rpc.nfsd 256 and higher, testing locking a single NFS file on ~200+ servers at the same time, while watching the number of unique socket connections to the nlockmgr port on the NFS server. Once the number of connections reaches 80, the above printk is logged and additional connections are denied until some of the first 80 established connections start closing. Changing the number of nfsd threads higher or lower makes no difference in the number of RPC connections: 80 it is. Eventually all clients do finish locking the file, but not in the expected time, as this causes a lot of timeo client retries and very slow lock responses.

net/sunrpc/svcsock.c reads:

    if (serv->sv_tmpcnt > (serv->sv_nrthreads+3)*20) { ...

If sv_nrthreads is 1, the number of socket connections is limited to 80, which is exactly what is happening. I have rebuilt a test kernel changing the 20 to 40 in the above code, and this does in fact raise the socket limit to 160. It seems to me that sv_nrthreads should equal the actual number of rpc.nfsd threads running; otherwise this limitation makes no sense.

Version-Release number of selected component (if applicable):
Tested on both ES4.4 and ES4.6 kernel baselines.

How reproducible:
Every time RPC sockets reach 80 on the NFS server.

Steps to Reproduce:
See above.

Actual results:
Socket connections limited to 80 established connections.

Expected results:
Number of RPC socket connections decreases/increases with the number of nfsd threads running.
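The arithmetic behind the reported cap can be checked in isolation. This is a minimal user-space model of the check quoted above, not the actual kernel code; the helper name conn_limit is made up for illustration:

    #include <assert.h>
    #include <stdio.h>

    /* Model of the check in net/sunrpc/svcsock.c:
     *   if (serv->sv_tmpcnt > (serv->sv_nrthreads + 3) * 20)
     * sv_tmpcnt counts temporary (per-connection) sockets;
     * sv_nrthreads is the service's own thread count. */
    static int conn_limit(int sv_nrthreads)
    {
            return (sv_nrthreads + 3) * 20;
    }

    int main(void)
    {
            /* lockd is single-threaded, so its sv_nrthreads is 1 */
            assert(conn_limit(1) == 80);
            /* doubling the multiplier (20 -> 40), as in the rebuilt
             * test kernel, doubles the cap to 160 */
            assert((1 + 3) * 40 == 160);
            printf("lockd connection cap: %d\n", conn_limit(1));
            return 0;
    }

This matches the observed behavior: the cap tracks the thread count of the service owning the socket (here lockd, always 1), not the number of nfsd threads.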
Actually... it looks like the problem is that lockd is running out of sockets, so increasing the number of nfsd threads won't have any effect here. The warning message is bogus: it comes from generic RPC code but warns about "nfsd" sockets (that, at the very least, should be fixed). Unfortunately, you can't increase the number of lockd threads -- it's necessarily single-threaded. From a glance at the upstream code, it looks like the same problem exists there. Are you also able to reproduce this with recent Fedora or something closer to current mainline kernels?
Fixing this will probably require changing the check you mention, but I need to first understand the purpose of that check in the first place (i.e. are there other hard limits that we'll hit if we remove it).
I have just tested with 2.6.18-53.1.4 (ES5.1) and the same limit/printk exists. Looking at the code for the 5.2 kernel, it will also exist there. We originally thought this was a lockd thread limitation, but glancing at the referencing code it seemed to increment sv_nrthreads when more nfsd threads were started, which would be in line with the message; I guess this is not the case. I haven't seen any degraded I/O performance since increasing the count.
Increasing the number of nfsd threads increases sv_nrthreads for nfsd only; it has no effect on lockd. I think this BZ points out the need for a couple of things:

1) Fix this printk to be more generic. It shouldn't explicitly mention nfsd threads, since the number of nfsd sockets isn't the problem in this case.

2) Check why this limit on the number of sockets exists in the first place. It's probably there to limit DoS attacks on an RPC service, but it seems like this limit ought to be tunable (or maybe just go away entirely for services that are single-threaded).

I'll probably need to toss this question out to the upstream linux-nfs mailing list, since the reason for setting the limit where it is isn't exactly clear...
Sent a patch upstream to make the warning message more generic. I also asked for clarification about why the hardcoded check uses (sv_nrthreads + 3) * 20 as a formula.
Created attachment 320451 [details]
patch -- remove svc_check_conn_limits

RFC patch that I've sent upstream. This just removes the check altogether (along with some other code that won't be needed with it gone). It may get shot down or need some modification, but it's at least a starting point for discussion. Awaiting comment there now.
Created attachment 320884 [details]
patchset -- add sv_maxconn field to svc_serv

After some upstream discussion, this patchset seems to be pretty close to being accepted. We'll probably also be able to do something similar for RHEL, but it's likely to look different since we'll have to fix up kABI.
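For reference, here is a rough user-space sketch of the idea the attachment title describes: an sv_maxconn override that takes precedence over the old heuristic when set. The struct and field names follow the patchset title, but this is a model for discussion, not the actual patch text, and the final upstream code may differ:

    #include <assert.h>
    #include <stdio.h>

    /* Illustrative model only. A nonzero sv_maxconn acts as an
     * explicit, admin-settable connection cap; zero falls back to
     * the old (nrthreads + 3) * 20 heuristic. */
    struct svc_serv_model {
            int sv_nrthreads;
            int sv_maxconn;   /* 0 = no explicit cap configured */
    };

    static int effective_limit(const struct svc_serv_model *serv)
    {
            if (serv->sv_maxconn)
                    return serv->sv_maxconn;
            return (serv->sv_nrthreads + 3) * 20;
    }

    int main(void)
    {
            struct svc_serv_model lockd = { .sv_nrthreads = 1,
                                            .sv_maxconn   = 0 };
            assert(effective_limit(&lockd) == 80);   /* old behavior */

            lockd.sv_maxconn = 1024;                 /* admin raises cap */
            assert(effective_limit(&lockd) == 1024);

            printf("sv_maxconn model ok\n");
            return 0;
    }

The design point is that single-threaded services like lockd get a way out of the thread-count-based formula without changing the default behavior for everyone else.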
Bruce Fields took the latest patchset into his tree so it seems likely to go upstream. I don't think this will be appropriate for RHEL4 though. It's a kABI-breaker, for one thing. It's also too late for 4.8 and I don't think it meets the threshold of criticality that 4.9 will have. For this reason, I'm going to go ahead and close this WONTFIX and clone the bug for RHEL5. We can evaluate it for inclusion there.
RHEL5 equivalent BZ opened as bug 468092
OK, thanks Jeff. What you've arrived at seems to explain a lot. I'm still running my modified kernel in my test environment, manually raising the lockd connection cap, which will work for me for now; we'll just have to live with the longer timeout to drop unused ports on our production systems. I don't think it's causing other system degradation at this point. Thanks again; I'll follow the BZ and look for a potential resolution under 5.3 with kernel tuning at that point.
Ok, note that this isn't going to go into 5.3; we're looking at 5.4 at the earliest.