Bug 1163886 - rpc.mountd can be blocked by a bad client
Summary: rpc.mountd can be blocked by a bad client
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Fedora
Classification: Fedora
Component: nfs-utils
Version: 21
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: unspecified
Target Milestone: ---
Assignee: Steve Dickson
QA Contact: Fedora Extras Quality Assurance
URL:
Whiteboard:
Depends On:
Blocks: 1163889
Reported: 2014-11-13 15:58 UTC by Steve Dickson
Modified: 2015-01-14 07:28 UTC (History)

Fixed In Version: nfs-utils-1.3.1-4.1.fc21
Doc Type: Bug Fix
Doc Text:
Clone Of:
Clones: 1163889
Environment:
Last Closed: 2015-01-14 07:28:19 UTC
Type: Bug
Embargoed:


Attachments
rpc.mountd: set nonblocking mode if no libtirpc (4.18 KB, application/mbox)
2014-11-13 16:02 UTC, Steve Dickson
rpc.mountd: set nonblocking mode with libtirpc (1.59 KB, application/mbox)
2014-11-13 16:03 UTC, Steve Dickson
rpc.mountd: set libtirpc nonblocking mode to avoid DOS (2.57 KB, application/mbox)
2014-11-13 16:04 UTC, Steve Dickson
reproducer (2.42 KB, text/plain)
2014-11-13 16:09 UTC, Steve Dickson

Description Steve Dickson 2014-11-13 15:58:00 UTC
Description of problem:
From https://bugzilla.suse.com/show_bug.cgi?id=901628:

A few weeks ago a customer had trouble with an NFS server. Most of the time the clients could not mount any shares, although in rare cases a mount succeeded.

We found that, whenever mounts failed, rpc.mountd was hung in a write() to a TCP socket. netstat showed that the Send-Q was full and the Recv-Q was growing slowly. After a long time the write() ended with an error ("TCP timeout" IIRC) and rpc.mountd worked normally for a short while, until it hung in write() again for the same reason. The problem was caused by a misconfigured MTU size. So a single bad client (or as many bad clients as there are threads in use by rpc.mountd) can block rpc.mountd entirely.

But what happens if someone intentionally sends RPC requests but doesn't read() the replies? I wrote a small tool to test this situation. It fires DUMP requests at rpc.mountd as fast as possible but never reads from the socket. The result is the same as with the problem above: rpc.mountd hangs in write() and no longer responds to other requests, and this time no TCP timeout ends the hang. So it is quite easy to block rpc.mountd remotely on purpose.

I investigated further. I tested rpcbind to see whether it has the same weakness, but rpcbind uses rpc_control(SVCSET_CONNMAXREC) to switch libtirpc to nonblocking mode. That nonblocking mode has two positive effects:
- an attacker sending requests as fast as possible to rpcbind will have no
  success. As soon as rpcbind/libtirpc finds more than one request readable
  at the socket, it closes the connection.
- if the socket buffer is full, write() fails with -EAGAIN. libtirpc
  retries the write in a loop for at most 2 seconds, then closes the
  connection.
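
The second effect can be reproduced in isolation. The following is a minimal sketch in C, not code from libtirpc or nfs-utils: a local socket pair stands in for a client connection whose peer never reads, and the helper names set_nonblocking() and fill_until_refused(), as well as the 64 MiB safety cap, are made up for illustration.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/* Put fd into non-blocking mode; returns 0 on success, -1 on error. */
static int set_nonblocking(int fd)
{
    int flags = fcntl(fd, F_GETFL, 0);
    if (flags < 0)
        return -1;
    return fcntl(fd, F_SETFL, flags | O_NONBLOCK);
}

/*
 * Keep writing to a peer that never reads until the kernel refuses
 * more data. Returns the errno of the first failed write(), or 0 if
 * the (arbitrary) 64 MiB cap was reached without any failure.
 */
static int fill_until_refused(int fd)
{
    char chunk[4096];
    size_t total = 0;

    assert(fd >= 0);
    memset(chunk, 'x', sizeof(chunk));
    while (total < ((size_t)64 << 20)) {
        ssize_t n = write(fd, chunk, sizeof(chunk));
        if (n < 0)
            return errno;   /* EAGAIN here means "send buffer full" */
        total += (size_t)n;
    }
    return 0;
}
```

With one end of a socketpair(AF_UNIX, SOCK_STREAM, 0, sv) switched to non-blocking mode, fill_until_refused() should come back with EAGAIN once the buffer is full. Without O_NONBLOCK the same loop would simply hang in write(), which is the behaviour described for rpc.mountd.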

Unfortunately the write retry loop in libtirpc has a bug. It increments
the length of and decrements the pointer to the retry buffer on each failed
write(). I sent a patch to libtirpc-devel about 3 weeks ago but haven't received a
response yet. (I'll attach the patch.)

Regarding rpc.mountd, I found that using multiple processes (e.g. -t 4) doesn't work well. When using libtirpc, or when not using libtirpc but setting the -p xxxx option, the listening sockets (TCP listener and UDP socket) are not in non-blocking mode. Thus, if a single connection request comes in, all threads wake up from select(), but only one accept() succeeds. All other threads then block in accept() waiting for further connection requests.
If an RPC request comes in via UDP, something very similar happens: all threads wake up, one thread handles the request, and all others block in read() waiting for further UDP requests.
As TCP connections are assigned to specific threads, all connections handled by one thread are blocked for as long as that thread waits in accept() or read(). I have therefore written two patches (attached) that set all listeners to non-blocking in support/nfs/*. A version of the patches for 1.3.1 was sent to linux-nfs, but I have not received a reply yet.
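
The effect of a non-blocking listener can be sketched as follows. This is an illustration only, not the attached patches: make_nonblocking_listener(), try_accept() and the NO_PENDING_CONN return value are hypothetical names, and the listener is bound to the loopback address with a kernel-chosen port purely to keep the example self-contained. A worker that wakes up from select() but finds no connection left gets EAGAIN from accept() and can go straight back to serving its own connections.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

#define NO_PENDING_CONN (-2)   /* hypothetical "lost the race" result */

/*
 * Create a TCP listener on 127.0.0.1 (kernel-chosen port) and switch
 * it to non-blocking mode. Returns the fd, or -1 on error.
 */
static int make_nonblocking_listener(void)
{
    struct sockaddr_in addr;
    int flags;
    int fd = socket(AF_INET, SOCK_STREAM, 0);

    if (fd < 0)
        return -1;
    memset(&addr, 0, sizeof(addr));
    addr.sin_family = AF_INET;
    addr.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
    addr.sin_port = 0;

    flags = fcntl(fd, F_GETFL, 0);
    if (flags < 0 ||
        fcntl(fd, F_SETFL, flags | O_NONBLOCK) < 0 ||
        bind(fd, (struct sockaddr *)&addr, sizeof(addr)) < 0 ||
        listen(fd, 16) < 0) {
        close(fd);
        return -1;
    }
    return fd;
}

/*
 * Try to accept one connection without blocking. Returns the new fd,
 * NO_PENDING_CONN if another worker already took the connection (or
 * none arrived), or -1 on a real error.
 */
static int try_accept(int listener)
{
    int conn;

    assert(listener >= 0);
    conn = accept(listener, NULL, NULL);
    if (conn < 0 && (errno == EAGAIN || errno == EWOULDBLOCK))
        return NO_PENDING_CONN;
    return conn;
}
```

With a blocking listener the same accept() call would not return until the next client connected, which is how one idle wakeup can stall every connection owned by that worker.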

A further patch (attached) inserts rpc_control(SVCSET_CONNMAXREC) into nfs_svc_create() in support/nfs/svc_create.c for the libtirpc case.
That patch hardens rpc.mountd against DoS attacks (and probably also statd,
as it also uses nfs_svc_create()). Please treat this patch as experimental only. I'm not sure whether setting MAXREC might have negative side effects, as I'm not an RPC expert.

Bodo

Comment 1 Steve Dickson 2014-11-13 16:02:41 UTC
Created attachment 957207
rpc.mountd: set nonblocking mode if no libtirpc

Comment 2 Steve Dickson 2014-11-13 16:03:17 UTC
Created attachment 957208
rpc.mountd: set nonblocking mode with libtirpc

Comment 3 Steve Dickson 2014-11-13 16:04:14 UTC
Created attachment 957209
rpc.mountd: set libtirpc nonblocking mode to avoid DOS

Comment 4 Steve Dickson 2014-11-13 16:09:23 UTC
Created attachment 957210
reproducer

Comment 5 Fedora Update System 2014-11-13 20:13:36 UTC
nfs-utils-1.3.1-2.1.fc21 has been submitted as an update for Fedora 21.
https://admin.fedoraproject.org/updates/nfs-utils-1.3.1-2.1.fc21

Comment 6 Fedora Update System 2014-11-14 12:06:50 UTC
Package nfs-utils-1.3.1-2.1.fc21:
* should fix your issue,
* was pushed to the Fedora 21 testing repository,
* should be available at your local mirror within two days.
Update it with:
# su -c 'yum update --enablerepo=updates-testing nfs-utils-1.3.1-2.1.fc21'
as soon as you are able to.
Please go to the following url:
https://admin.fedoraproject.org/updates/FEDORA-2014-15038/nfs-utils-1.3.1-2.1.fc21
then log in and leave karma (feedback).

Comment 7 Fedora Update System 2014-11-17 22:53:20 UTC
nfs-utils-1.3.1-2.2.fc21 has been submitted as an update for Fedora 21.
https://admin.fedoraproject.org/updates/nfs-utils-1.3.1-2.2.fc21

Comment 8 Fedora Update System 2015-01-14 07:28:19 UTC
nfs-utils-1.3.1-4.1.fc21 has been pushed to the Fedora 21 stable repository.  If problems still persist, please make note of it in this bug report.

