Description of Problem:
rpc.statd, when started, only creates sockets on ports >1024.
When a client rpc.statd receives a SM_NOTIFY after a server crash, it tries to
notify lockd so that it can reclaim its held locks on the newly
restarted server. However, the kernel lockd checks that such notices
come from privileged ports on localhost, and issues an error:
Oct 16 15:26:02 pc20 kernel: lockd: rejected NSM callback from 7f000001:1027
Oct 16 15:26:02 pc20 rpc.statd: recv_rply: [127.0.0.1] RPC status 5
So lockd does not reclaim its locks.
The same problem occurs when a client crashes: on reboot it sends a notify to
the server rpc.statd, which tries to notify lockd that the client locks must be
released. lockd refuses, and stale locks remain.
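For illustration, the rule described above (loopback address and source port < 1024) can be pictured in user-space C as below. This is only a sketch of the rule, not the kernel's lockd code; nsm_callback_allowed and make_addr are hypothetical names introduced here.

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper, not the kernel's code: accept an NSM callback only
 * if it comes from 127.0.0.1 and from a reserved port (below 1024). */
bool nsm_callback_allowed(const struct sockaddr_in *from)
{
    bool from_loopback = ntohl(from->sin_addr.s_addr) == INADDR_LOOPBACK;
    bool from_reserved_port = ntohs(from->sin_port) < IPPORT_RESERVED;
    return from_loopback && from_reserved_port;
}

/* Build an address from the form the log line prints, e.g. 7f000001:1027.
 * The IP is given in host byte order, as in the log message. */
struct sockaddr_in make_addr(uint32_t host_order_ip, uint16_t port)
{
    struct sockaddr_in sa = {0};
    sa.sin_family = AF_INET;
    sa.sin_addr.s_addr = htonl(host_order_ip);
    sa.sin_port = htons(port);
    return sa;
}
```

With the address from the log above, make_addr(0x7f000001, 1027), the check fails because the port is not reserved; the same address with port 1023 would pass.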
Version-Release number of selected component (if applicable):
Tested on RH7.3 with nfs-utils-0.3.3-5, and RH8.0 with nfs-utils-1.0.1-2.
On RH6.2 with nfs-utils-0.3.1-0.6.x.1 there is no problem: rpc.statd
has an additional port < 1024 from which it sends its notices.
Steps to Reproduce:
1. Needs an NFS server and an NFS client machine.
2. After /etc/init.d/nfslock is started, or rpc.statd is started,
type (as root, on both client and server)
# netstat -ap --ip | grep rpc.statd
You will see 2 open ports > 1024.
3. Mount an NFS partition on the client from the server.
4. Lock a file from the client, do some "sync" to be sure the non-volatile
nfs states are saved, and crash the client or the server.
5. After the reboot, you should see in the other machine's logs or on its console:
lockd: rejected NSM callback from 7f000001:[some port >1024]
4b. Alternatively, simulate a crash of the server:
On the server:
# /etc/init.d/nfslock stop
# touch /var/lib/nfs/statd/sm/[IP of the Client]
rpc.statd does nothing, and locks are forgotten or stale.
Locks should be reclaimed when a server reboots, or cleared when a client reboots.
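For step 4, any program that holds a POSIX fcntl() lock on a file under the NFS mount will do. A minimal sketch, assuming byte-range locks taken with fcntl() are the ones being tested; lock_whole_file is a hypothetical helper name, and the path passed to it is whatever file you choose on the mount.

```c
#include <fcntl.h>
#include <unistd.h>

/* Take an exclusive (write) lock on the whole file.
 * Returns the open fd, or -1 on failure. The lock is held until the
 * fd is closed or the process exits. */
int lock_whole_file(const char *path)
{
    int fd = open(path, O_RDWR | O_CREAT, 0644);
    if (fd < 0)
        return -1;

    struct flock fl = {0};
    fl.l_type = F_WRLCK;      /* exclusive lock */
    fl.l_whence = SEEK_SET;
    fl.l_start = 0;
    fl.l_len = 0;             /* 0 means "up to end of file" */

    if (fcntl(fd, F_SETLK, &fl) != 0) {   /* non-blocking attempt */
        close(fd);
        return -1;
    }
    return fd;
}
```

Call it on a file on the NFS mount and keep the process alive (for example with pause()) while the other machine is crashed or restarted.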
I examined the patch that RedHat applies to the base nfs-utils sources to
create the RPMs.
The "drop-privs.patch" changed between 6.2-Update and 7.3 (I didn't look
at other 7.x releases).
In the 6.2 patch, the privileged socket is opened before dropping the
root uid; in the 7.3 patch it is opened after (and thus, it fails). However, the comment
" we're going to drop root privs, but before we do that,
* make sure to get our port <1024 socket"
is still at the same place.
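The ordering that matters here can be sketched as follows. This is not the code from nfs-utils or from drop-privs.patch, only an illustration under the assumption that the socket is a UDP socket bound to some free port below 1024; open_reserved_udp_socket and start_with_reserved_socket are hypothetical names.

```c
#include <arpa/inet.h>
#include <errno.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/* Try to bind a UDP socket to a free reserved port (below 1024).
 * This normally needs root. Returns the fd, or -1 on failure. */
int open_reserved_udp_socket(void)
{
    int fd = socket(AF_INET, SOCK_DGRAM, 0);
    if (fd < 0)
        return -1;

    for (int port = IPPORT_RESERVED - 1; port >= IPPORT_RESERVED / 2; port--) {
        struct sockaddr_in sa;
        memset(&sa, 0, sizeof sa);
        sa.sin_family = AF_INET;
        sa.sin_addr.s_addr = htonl(INADDR_ANY);
        sa.sin_port = htons((unsigned short)port);
        if (bind(fd, (struct sockaddr *)&sa, sizeof sa) == 0)
            return fd;
        if (errno != EADDRINUSE)
            break;            /* e.g. EACCES: we are not privileged */
    }
    close(fd);
    return -1;
}

/* Hypothetical startup order: get the privileged socket first, and only
 * then give up root. Done the other way round, the bind fails. */
int start_with_reserved_socket(uid_t uid, gid_t gid)
{
    int fd = open_reserved_udp_socket();   /* must happen while still root */
    if (fd < 0)
        return -1;
    /* Simplified drop; a real daemon would also clear supplementary groups. */
    if (setgid(gid) != 0 || setuid(uid) != 0) {
        close(fd);
        return -1;
    }
    return fd;   /* the socket stays bound to its reserved port */
}
```

An already-bound socket keeps its port after setuid(), which is why doing the bind before the drop is enough.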
I suspect bugs #59245 and #64757 are caused by the same problem.
I attach a patch to apply on top of drop-privs.patch to do the socket
opening at the right time; it made things work for me.
Created attachment 80645
patch to open socket before dropping privileges
I have noticed this bug as well, and I do not think the patch solves all our
problems, as statd_get_socket() gets called later as well. You will either have
to wait until that piece of the code has run, or move all the calls to
statd_get_socket() to a place before you drop_privs().
Juan Gomez/IBM (email@example.com)
The patch will work OK. "statd_get_socket" caches its socket fd, so if you
continue to call it after dropping privs you'll still get the old privileged
socket back. (Tested locally.)
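The caching behaviour relied on here follows a common pattern, roughly like the sketch below. It is not the actual statd_get_socket() source; get_cached_socket is a hypothetical stand-in, and the bind to a reserved port is left out so the example runs without root.

```c
#include <netinet/in.h>
#include <sys/socket.h>

/* fd opened by the first call; -1 means "not opened yet". */
static int cached_fd = -1;

/* Hypothetical stand-in for a caching socket getter: the first call
 * creates the socket, later calls hand back the same fd. */
int get_cached_socket(void)
{
    if (cached_fd >= 0)
        return cached_fd;     /* reuse the socket opened earlier */

    /* In the statd case, this first call is where the socket would also
     * be bound to a port below 1024, while still running as root. */
    cached_fd = socket(AF_INET, SOCK_DGRAM, 0);
    return cached_fd;
}
```

As long as the first call happens before privileges are dropped, every later caller gets the fd that was set up with root rights.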
I'll update the bugzilla entry again once I've got a fixed test nfs-utils
I've pushed 7.3 (nfs-utils-0.3.3-5.1) and 8.0 (nfs-utils-1.0.1-2.2) rpms to
for testing. These are unsigned private builds but should fix the problem, and
if you can confirm that I'll queue them for an errata.
*** Bug 59245 has been marked as a duplicate of this bug. ***
This package makes the messages disappear on my test system.
(to other testers: Care should be taken when deploying the test packages since
they do not contain the mountd security fix).
Red Hat Linux and Red Hat Powertools are currently no longer supported by Red
Hat, Inc. In an effort to clean up bugzilla, we are closing all bugs in MODIFIED
state for these products.
However, we do want to make sure that nothing important slips through the
cracks. If, in fact, these issues are not resolved in a current Fedora Core
Release (such as Fedora Core 5), please open a new issue stating so. Thanks.