Description of problem:

From https://bugzilla.suse.com/show_bug.cgi?id=901628:

A few weeks ago we had some trouble at a customer site with an NFS server. Most of the time the clients could not mount any shares, but in rare cases they succeeded. We found that while mounts were failing, rpc.mountd hung on a write() to a TCP socket. netstat showed that Send-Q was full and Recv-Q was counting up slowly. After a long time the write ended with an error ("TCP timeout" IIRC) and rpc.mountd worked normally for a short while until it hung on write() again for the same reason. The problem was caused by an incorrectly configured MTU size.

So one single bad client (or as many clients as the number of threads used by rpc.mountd) can block rpc.mountd entirely.

But what happens if someone intentionally sends RPC requests but doesn't read() the answers? I wrote a small tool to test this situation. It fires DUMP requests at rpc.mountd as fast as possible but does not read from the socket. The result is the same as with the problem above: rpc.mountd hangs in write() and no longer responds to other requests, and this time no TCP timeout ends the hang. So it's quite easy to block rpc.mountd remotely, even intentionally.

I've done some further investigation. I tested rpcbind to see whether it has the same weakness. But rpcbind uses rpc_control(SVCSET_CONNMAXREC) to switch libtirpc to its nonblocking mode. That nonblocking mode has two positive effects:

- An attacker sending requests as fast as possible to rpcbind will have no success. As soon as rpcbind/libtirpc finds more than one request readable at the socket, it closes the connection.
- If the socket buffer is full, the write() fails with EAGAIN. libtirpc uses a loop to retry the write for max. 2 seconds. Then it closes the connection.

Unfortunately the write retry loop in libtirpc has a bug. It increments the length of and decrements the pointer to the retry buffer on each failed write().
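The retry-loop bug described above can be illustrated with a small stand-alone sketch. This is not the libtirpc source: `fake_sink`, `fake_write`, `send_all`, and `run_demo` are hypothetical names, the 2-second timeout is left out, and the stub simply fails with EAGAIN a fixed number of times before accepting the data. The sketch assumes the loop advances its pointer and remaining length by the return value of write(), which is -1 on failure.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <sys/types.h>

/* Hypothetical stand-in for a socket: fails with EAGAIN `fail_left` times,
 * then accepts everything and records the pointer and length it was given. */
struct fake_sink {
    int fail_left;
    const char *last_buf;
    size_t last_len;
};

static ssize_t fake_write(struct fake_sink *s, const char *buf, size_t len)
{
    if (s->fail_left > 0) {
        s->fail_left--;
        errno = EAGAIN;
        return -1;
    }
    s->last_buf = buf;
    s->last_len = len;
    return (ssize_t)len;
}

/* Write loop whose step expression uses the return value i directly. */
static int send_all(struct fake_sink *s, const char *buf, size_t len, int fixed)
{
    ssize_t i, cnt;

    for (cnt = (ssize_t)len; cnt > 0; cnt -= i, buf += i) {
        i = fake_write(s, buf, (size_t)cnt);
        if (i < 0) {
            if (errno != EAGAIN)
                return -1;      /* real error: give up */
            if (fixed)
                i = 0;          /* nothing was written: do not move */
            /* Without the reset, i == -1 feeds the loop step:
             * cnt grows by one and buf moves back by one byte. */
        }
    }
    return 0;
}

/* Sends an 8-byte message that sits in the middle of a larger array, so the
 * drifting pointer of the buggy variant stays inside valid memory. Reports
 * how far the pointer drifted and which length the final write received. */
int run_demo(int failures, int fixed, long *drift, long *len_seen)
{
    static const char backing[64] = {0};
    const char *msg = backing + 16;
    struct fake_sink s = { failures, NULL, 0 };
    int rc;

    assert(failures >= 0 && failures <= 16);
    rc = send_all(&s, msg, 8, fixed);
    *drift = (long)(s.last_buf - msg);
    *len_seen = (long)s.last_len;
    return rc;
}
```

With three simulated failures, the variant without the reset ends up passing a pointer three bytes before the message and a length of 11 instead of 8, while the variant with the reset passes the original pointer and length.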
I sent a patch to libtirpc-devel about 3 weeks ago but haven't received a response yet. (I'll attach the patch.)

Regarding rpc.mountd, I've found that using multiple processes (e.g. -t 4) doesn't work well. When using libtirpc, or when not using libtirpc but setting the -p xxxx option, the listening sockets (TCP listener and UDP socket) are not in non-blocking mode. Thus, if a single connection request comes in, all threads wake up from the select(), but only one accept() succeeds. All other threads will wait in accept() for further connection requests. If an RPC request comes in via UDP, what happens is very similar: all threads wake up, one thread handles the request, and all others wait in read() for further UDP requests. As TCP connections are assigned to specific threads, all connections handled by one thread will be blocked as long as that thread waits in accept() or read().

Thus, I've written two patches (attached) that set all listeners to non-blocking in support/nfs/*. A version of the patches for 1.3.1 was sent to linux-nfs, but I have had no reply yet.

A further patch (attached) inserts rpc_control(SVCSET_CONNMAXREC) into nfs_svc_create() in support/nfs/svc_create.c for the libtirpc case. That patch hardens rpc.mountd against DoS attacks (and probably also statd, as it also uses nfs_svc_create()). Please regard this patch as experimental only. I'm not sure whether setting MAXREC might have negative side effects, as I'm not an RPC expert.

Bodo
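The effect of putting a shared socket into non-blocking mode, as described in the comment above, can be sketched roughly as follows. This is not taken from the attached patches: `set_nonblocking`, `try_recv`, and `demo_two_workers` are hypothetical names, and an AF_UNIX datagram socketpair stands in for the UDP socket that several rpc.mountd workers would share. The point is only that a worker which loses the race gets EAGAIN back immediately instead of sleeping in the read.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stddef.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/* Switch a descriptor to non-blocking mode, keeping its other flags. */
int set_nonblocking(int fd)
{
    int flags = fcntl(fd, F_GETFL, 0);

    if (flags < 0)
        return -1;
    return fcntl(fd, F_SETFL, flags | O_NONBLOCK);
}

/* Try to read one datagram.
 * Returns 1 if one was read, 0 if nothing was pending (EAGAIN), -1 on error. */
int try_recv(int fd, void *buf, size_t len, ssize_t *n_out)
{
    ssize_t n;

    assert(n_out != NULL);
    n = recv(fd, buf, len, 0);
    if (n >= 0) {
        *n_out = n;
        return 1;
    }
    if (errno == EAGAIN || errno == EWOULDBLOCK)
        return 0;               /* another worker already took the request */
    return -1;
}

/* Two "workers" woken for a single datagram: the first gets it, the second
 * returns at once and could go back to select(). Returns 0 if that is what
 * happened, -1 otherwise. */
int demo_two_workers(void)
{
    int sv[2];
    char buf[16];
    ssize_t n = 0;
    int first, second;

    if (socketpair(AF_UNIX, SOCK_DGRAM, 0, sv) < 0)
        return -1;
    if (set_nonblocking(sv[0]) < 0 || send(sv[1], "ping", 4, 0) != 4) {
        close(sv[0]);
        close(sv[1]);
        return -1;
    }
    first = try_recv(sv[0], buf, sizeof buf, &n);   /* reads the datagram */
    second = try_recv(sv[0], buf, sizeof buf, &n);  /* finds nothing pending */
    close(sv[0]);
    close(sv[1]);
    return (first == 1 && second == 0) ? 0 : -1;
}
```

Without the call to `set_nonblocking`, the second `try_recv` would block until another datagram arrived, which mirrors the waiting threads described in the report.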
Created attachment 957207 [details] rpc.mountd: set nonblocking mode if no libtirpc
Created attachment 957208 [details] rpc.mountd: set nonblocking mode with libtirpc
Created attachment 957209 [details] rpc.mountd: set libtirpc nonblocking mode to avoid DOS
Created attachment 957210 [details] reproducer
nfs-utils-1.3.1-2.1.fc21 has been submitted as an update for Fedora 21. https://admin.fedoraproject.org/updates/nfs-utils-1.3.1-2.1.fc21
Package nfs-utils-1.3.1-2.1.fc21:
* should fix your issue,
* was pushed to the Fedora 21 testing repository,
* should be available at your local mirror within two days.
Update it with:
# su -c 'yum update --enablerepo=updates-testing nfs-utils-1.3.1-2.1.fc21'
as soon as you are able to.
Please go to the following url:
https://admin.fedoraproject.org/updates/FEDORA-2014-15038/nfs-utils-1.3.1-2.1.fc21
then log in and leave karma (feedback).
nfs-utils-1.3.1-2.2.fc21 has been submitted as an update for Fedora 21. https://admin.fedoraproject.org/updates/nfs-utils-1.3.1-2.2.fc21
nfs-utils-1.3.1-4.1.fc21 has been pushed to the Fedora 21 stable repository. If problems still persist, please make note of it in this bug report.