Bug 76065

Summary: rpc.statd can not open privileged socket; impact on lock reclaiming/releasing
Product: [Retired] Red Hat Linux
Reporter: Emmanuel Preveraud <emmanuel.preveraud>
Component: nfs-utils
Assignee: Steve Dickson <steved>
Status: CLOSED NEXTRELEASE
QA Contact: Ben Levenson <benl>
Severity: medium
Priority: medium
Version: 7.3
CC: ekanter, juang, nicku, pawsa, per.starback, shishz
Target Milestone: ---   
Target Release: ---   
Hardware: i386   
OS: Linux   
Doc Type: Bug Fix
Last Closed: 2006-08-04 20:32:54 UTC
Attachments:
  patch to open socket before dropping privileges (flags: none)

Description Emmanuel Preveraud 2002-10-16 13:58:14 UTC
Description of Problem:
rpc.statd, when started, only creates sockets on ports >1024.
When a client rpc.statd receives an SM_NOTIFY after a server crash, it tries to
notify lockd so that lockd can reclaim the locks it held on the newly
restarted server. However, the kernel lockd checks that such notices
come from privileged ports on localhost, and issues an error:

Oct 16 15:26:02 pc20 kernel: lockd: rejected NSM callback from 7f000001:1027
Oct 16 15:26:02 pc20 rpc.statd[557]: recv_rply: [127.0.0.1] RPC status 5

So lockd does not reclaim its locks.

The same problem occurs when a client crashes: on reboot it sends a notify to
the server rpc.statd, which tries to notify lockd that the client's locks must be
released. lockd refuses, and stale locks remain.
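
The rejection in the log above can be illustrated with a short sketch. The peer in the message is a hex-encoded IPv4 address (7f000001 is 127.0.0.1) followed by the source port. The function names below are hypothetical, and the acceptance rule (loopback source, port below 1024) is taken from the description in this report, not from the kernel source.

```python
import ipaddress

PRIVILEGED_PORT_LIMIT = 1024  # ports below this can normally be bound only by root


def parse_lockd_peer(peer):
    """Split a 'hexaddr:port' string such as '7f000001:1027' from the lockd log."""
    hex_addr, port = peer.split(":")
    return ipaddress.IPv4Address(int(hex_addr, 16)), int(port)


def would_accept_callback(peer):
    """Model of the rule described in this report: loopback source, privileged port."""
    addr, port = parse_lockd_peer(peer)
    return addr.is_loopback and port < PRIVILEGED_PORT_LIMIT


print(parse_lockd_peer("7f000001:1027"))       # (IPv4Address('127.0.0.1'), 1027)
print(would_accept_callback("7f000001:1027"))  # False: port 1027 is not privileged
```

Under this model the callback from port 1027 is rejected even though it comes from localhost, which matches the logged message.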



Version-Release number of selected component (if applicable):
Tested on RH 7.3 with nfs-utils-0.3.3-5, and on RH 8.0 with nfs-utils-1.0.1-2.


On RH 6.2 with nfs-utils-0.3.1-0.6.x.1, there is no problem: rpc.statd
has an additional port < 1024 from which to send its notices.

Steps to Reproduce:
1. You need an NFS server and an NFS client machine.
2. After /etc/init.d/nfslock or rpc.statd is started,
run as root on both client and server:
# netstat -ap --ip | grep rpc.statd
You will see 2 open ports > 1024.
3. Mount an NFS partition on the client from the server.
4. Lock a file from the client, run "sync" a few times to be sure the non-volatile
   NFS state is saved, and crash the client or the server.
5. On reboot, you should see in the other machine's logs or on its console:
   lockd: rejected NSM callback from 7f000001:[some port >1024]

4b. Alternatively, simulate a crash of the server:
     On the server:
      # /etc/init.d/nfslock stop
      # touch /var/lib/nfs/statd/sm/[IP of the Client]
      # /sbin/rpc.statd


Actual Results:
rpc.statd does nothing, and locks are forgotten or stale.

Expected Results:
Locks should be reclaimed when a server reboots, or cleared when a client reboots.


Additional Information:

I examined the patch that Red Hat applies to the base nfs-utils sources to
create the RPMs.

The "drop-privs.patch" has changed between 6.2-Update and 7.3 (I didn't look
at other 7.x releases).
In the 6.2 patch, the privileged socket is opened before dropping the
root uid; in the 7.3 patch it is opened after (and thus fails). However, the comment
" we're going to drop root privs, but before we do that,
* make sure to get our port <1024 socket"
is still in the same place.

I suspect bugs #59245 and #64757 are caused by the same problem.

I am attaching a patch, to be applied on top of drop-privs.patch, that opens the
socket at the right time; it made things work for me.
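
The ordering problem can be modelled without root access. The sketch below is a toy simulation, not nfs-utils code: `FakeProcess`, `bind`, and `drop_privs` are hypothetical stand-ins, port 700 and uid 29 are arbitrary illustrative values, and the only rule modelled is that binding a port below 1024 requires uid 0.

```python
class FakeProcess:
    """Toy model of a daemon: only uid 0 may bind ports below 1024."""

    def __init__(self):
        self.uid = 0            # the daemon starts as root
        self.bound_ports = []

    def bind(self, port):
        if port < 1024 and self.uid != 0:
            raise PermissionError(f"uid {self.uid} cannot bind port {port}")
        self.bound_ports.append(port)

    def drop_privs(self, uid):
        self.uid = uid


def start_bind_then_drop(port=700, uid=29):
    """Ordering described for the 6.2 patch: open the socket, then drop root."""
    proc = FakeProcess()
    proc.bind(port)
    proc.drop_privs(uid)
    return proc


def start_drop_then_bind(port=700, uid=29):
    """Ordering described for the 7.3 patch: drop root first, so the bind fails."""
    proc = FakeProcess()
    proc.drop_privs(uid)
    proc.bind(port)  # raises PermissionError in this model
    return proc
```

In this model `start_bind_then_drop()` ends up unprivileged but still holding port 700, while `start_drop_then_bind()` raises `PermissionError`.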

Comment 1 Emmanuel Preveraud 2002-10-16 14:00:40 UTC
Created attachment 80645 [details]
patch to open socket before dropping privileges

Comment 2 Need Real Name 2002-10-29 22:59:47 UTC
I have noticed this bug as well, and I do not think the patch solves all our
problems, since statd_get_socket() is also called later. You would have to wait
until that piece of the code runs as well, or move all the calls to
statd_get_socket() to a place before drop_privs().

Juan Gomez/IBM (juang.com)

Comment 3 Stephen Tweedie 2002-11-11 15:48:24 UTC
The patch will work OK.  "statd_get_socket" caches its socket fd, so if you
continue to call it after dropping privs you'll still get the old privileged
socket back.  (Tested locally.)

I'll update the bugzilla entry again once I've got a fixed test nfs-utils
package built.
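
The caching behaviour described in Comment 3 can be sketched as follows. This is a minimal illustration of the pattern, not the actual statd_get_socket() implementation: `get_socket`, `open_as_root`, and the module-level cache are hypothetical names, and a plain object stands in for the socket.

```python
_cached_socket = None  # module-level cache, playing the role of a cached fd


def get_socket(open_privileged):
    """Return the cached socket, opening it only on the first call."""
    global _cached_socket
    if _cached_socket is None:
        _cached_socket = open_privileged()
    return _cached_socket


calls = []


def open_as_root():
    """Stand-in for opening a socket on a port < 1024 while still root."""
    calls.append("open")
    return object()


first = get_socket(open_as_root)   # first call, made before dropping privileges
second = get_socket(open_as_root)  # later call: served from the cache
print(first is second, len(calls))  # True 1
```

Because the opener runs only once, later callers receive the socket that was opened while the process was still privileged, provided the first call happens before privileges are dropped.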

Comment 4 Stephen Tweedie 2002-11-11 16:08:48 UTC
I've pushed 7.3 (nfs-utils-0.3.3-5.1) and 8.0 (nfs-utils-1.0.1-2.2) rpms to 

  http://people.redhat.com/sct/packages/nfs-utils/

for testing.  These are unsigned private builds but should fix the problem, and
if you can confirm that, I'll queue them for an errata.

Comment 5 Stephen Tweedie 2002-11-11 22:19:49 UTC
*** Bug 59245 has been marked as a duplicate of this bug. ***

Comment 6 Pawel Salek 2003-08-02 10:00:35 UTC
This package makes the messages disappear on my test system.

(To other testers: care should be taken when deploying the test packages, since
they do not contain the mountd security fix.)

Comment 7 Bill Nottingham 2006-08-04 20:32:54 UTC
Red Hat Linux and Red Hat Powertools are currently no longer supported by Red
Hat, Inc. In an effort to clean up bugzilla, we are closing all bugs in MODIFIED
state for these products.

However, we do want to make sure that nothing important slips through the
cracks. If, in fact, these issues are not resolved in a current Fedora Core
Release (such as Fedora Core 5), please open a new issue stating so. Thanks.