434795 – lockd not using settings in sysconfig/nfs

Summary: lockd not using settings in sysconfig/nfs

Keywords:
Status:	CLOSED ERRATA
Alias:	None
Product:	Red Hat Enterprise Linux 5
Classification:	Red Hat
Component:	nfs-utils
Sub Component:
Version:	5.4
Hardware:	All
OS:	Linux
Priority:	high
Severity:	medium
Target Milestone:	rc
Target Release:	---
Assignee:	Steve Dickson
QA Contact:
Docs Contact:
URL:
Whiteboard:
Duplicates (1):	474449
Depends On:
Blocks:	963033

Reported:	2008-02-25 15:43 UTC by Mike McGrath
Modified:	2018-10-20 03:51 UTC
CC List:	18 users
Fixed In Version:
Doc Type:	Bug Fix
Doc Text:
Clone Of:
Clones:	963033
Environment:
Last Closed:	2009-09-02 10:02:56 UTC
Target Upstream Version:
Embargoed:
Dependent Products:

Attachments
Patch for /etc/init.d/nfs - fixes lockd port assignment (385 bytes, patch) 2008-07-15 14:39 UTC, Msquared

Links
System	ID	Private	Priority	Status	Summary	Last Updated
Red Hat Product Errata	RHSA-2009:1321	0	normal	SHIPPED_LIVE	Low: nfs-utils security and bug fix update	2009-09-01 10:27:56 UTC

Description Mike McGrath 2008-02-25 15:43:04 UTC

Description of problem:

It seems lockd isn't reading the config file options in /etc/sysconfig/nfs
correctly.  I'm presently using version nfs-utils-1.0.9-26.el5

How reproducible:

So far just once: after upgrading, we rebooted the machine.  Upon reboot, none of
our clients could lock files on their NFS shares.


Steps to Reproduce:
1. /etc/sysconfig/nfs contains:
LOCKD_TCPPORT=48624
LOCKD_UDPPORT=48624

rpcinfo -p contains:

    100021    1   udp  32768  nlockmgr
    100021    3   udp  32768  nlockmgr
    100021    4   udp  32768  nlockmgr
    100021    1   tcp  39476  nlockmgr
    100021    3   tcp  39476  nlockmgr
    100021    4   tcp  39476  nlockmgr

and sysctl -a | grep nlm contains:

fs.nfs.nlm_tcpport = 48624
fs.nfs.nlm_udpport = 48624
  
Actual results:

file locking over nfs not working.

Expected results:

I'd expect rpcinfo to agree with what's in sysctl -a.  It appears that's not the case.
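
The comparison above can be scripted. This is only a minimal sketch: the function names nlm_ports and check_nlm_ports are made up for illustration, and it assumes the usual five-column `rpcinfo -p` layout shown in this report (program, version, protocol, port, name).

```shell
#!/bin/sh
# nlm_ports: read `rpcinfo -p` output on stdin and print the distinct
# "proto port" pairs registered for nlockmgr (RPC program 100021).
nlm_ports() {
    awk '$1 == 100021 { print $3, $4 }' | sort -u
}

# check_nlm_ports WANT_TCP WANT_UDP: read `rpcinfo -p` output on stdin and
# report every nlockmgr registration whose port differs from the wanted one.
# Returns 0 if all registrations match, 1 otherwise.
check_nlm_ports() {
    want_tcp=$1
    want_udp=$2
    nlm_ports | (
        status=0
        while read -r proto port; do
            case $proto in
                tcp) want=$want_tcp ;;
                udp) want=$want_udp ;;
                *)   continue ;;
            esac
            if [ "$port" != "$want" ]; then
                echo "MISMATCH: nlockmgr $proto is on port $port, expected $want"
                status=1
            fi
        done
        exit "$status"
    )
}
```

On an affected server it could be run as `. /etc/sysconfig/nfs; rpcinfo -p | check_nlm_ports "$LOCKD_TCPPORT" "$LOCKD_UDPPORT"`; with the rpcinfo output above it would be expected to report a mismatch for both protocols.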

Additional info: This is the share that Fedora uses for its build system.  I can
help with any testing that might be useful; just let me know or stop by
irc.freenode.net in #fedora-admin

Comment 1 Mike McGrath 2008-02-25 15:43:37 UTC

I forgot to mention this is a regression from the previous version which didn't
seem to have this issue.

Comment 2 Colin.Simpson 2008-06-23 11:24:32 UTC

This problem still seems present in 5.2.

Comment 3 Colin.Simpson 2008-06-25 13:38:50 UTC

This is caused by a change to the startup file. It used to say:

LOCKDARG=""
[ -n "$LOCKD_TCPPORT" ] && LOCKDARG="nlm_tcpport=$LOCKD_TCPPORT"
[ -n "$LOCKD_UDPPORT" ] && \
    LOCKDARG="$LOCKDARG nlm_udpport=$LOCKD_UDPPORT"
[ -n "$LOCKDARG" ] && \
    modprobe lockd $LOCKDARG

So the startup script read LOCKD_TCPPORT and LOCKD_UDPPORT from
/etc/sysconfig/nfs and applied these as parameters to the module as it loads lockd.

The script now reads:

[ -n "$LOCKD_TCPPORT" ] && \
   /sbin/sysctl -w fs.nfs.nlm_tcpport=$LOCKD_TCPPORT >/dev/null 2>&1
[ -n "$LOCKD_UDPPORT" ] && \
   /sbin/sysctl -w fs.nfs.nlm_udpport=$LOCKD_UDPPORT >/dev/null 2>&1

Sadly, these sysctl entries under /proc do not exist until the lockd module is
actually loaded, so the sysctl commands have nothing to write to. The settings
are therefore not applied when the module finally gets loaded, presumably as a
dependency of the nfs modules.

The solution is presumably either to revert to the original way or to load the
lockd module with modprobe before the sysctl commands run.
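
The second option could look roughly like the sketch below. It is not the shipped init script: the function name set_lockd_ports and the MODPROBE/SYSCTL override variables are made up here so the logic can be dry-run without root. It also only helps if lockd has not already bound its ports by the time it runs.

```shell
#!/bin/sh
# Hypothetical overrides so the sketch can be dry-run without root.
MODPROBE=${MODPROBE:-/sbin/modprobe}
SYSCTL=${SYSCTL:-/sbin/sysctl}

# set_lockd_ports: apply LOCKD_TCPPORT / LOCKD_UDPPORT (as read from
# /etc/sysconfig/nfs) after making sure the lockd module is loaded.
set_lockd_ports() {
    # Nothing to do unless at least one port is configured.
    [ -n "$LOCKD_TCPPORT" ] || [ -n "$LOCKD_UDPPORT" ] || return 0

    # Load lockd first so that the fs.nfs.nlm_* sysctl keys exist.
    $MODPROBE lockd || return 1

    if [ -n "$LOCKD_TCPPORT" ]; then
        $SYSCTL -w "fs.nfs.nlm_tcpport=$LOCKD_TCPPORT" || return 1
    fi
    if [ -n "$LOCKD_UDPPORT" ]; then
        $SYSCTL -w "fs.nfs.nlm_udpport=$LOCKD_UDPPORT" || return 1
    fi
}
```

Unlike the current script, this does not send the sysctl output to /dev/null, so a failure to set a port would at least be visible at boot.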

This is clearly a regression, so I would have thought it would be urgent. This
breaks all 5.2 systems with NFS and iptables firewalls in place.

Comment 4 Colin.Simpson 2008-06-25 15:21:02 UTC

Interestingly, Fedora 9 has the original form of this startup script.

The workaround for the moment is to put

options lockd nlm_udpport=4002 nlm_tcpport=4002

into /etc/modprobe.conf
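
For anyone rolling the workaround out to several machines, here is a small sketch that appends the line only when no lockd options line is present yet. The helper name add_lockd_options is made up for illustration, and the options only take effect the next time the lockd module is loaded.

```shell
#!/bin/sh
# add_lockd_options CONF PORT: append the lockd options line to CONF unless
# an "options lockd ..." line already exists there.
# Returns 1 (and changes nothing) if such a line is already present.
add_lockd_options() {
    conf=$1
    port=$2
    if grep -q '^options[[:space:]][[:space:]]*lockd[[:space:]]' "$conf" 2>/dev/null; then
        echo "lockd options already present in $conf; edit them by hand" >&2
        return 1
    fi
    printf 'options lockd nlm_udpport=%s nlm_tcpport=%s\n' "$port" "$port" >> "$conf"
}
```

Example, run as root: `add_lockd_options /etc/modprobe.conf 4002`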

Comment 5 Colin.Simpson 2008-06-26 10:24:34 UTC

I'm still replying to my own ticket here. Maybe there is more going on here than
meets the eye. Even with the "options" line I described set in /etc/modprobe.conf,
we are still seeing problems on one system where its locking hasn't properly
shifted to the new ports.

rpcinfo -t srv17ux01 100021
rpcinfo: RPC: Timed out
program 100021 version 0 is not available

It may have worked initially but it seems to have stopped.
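
To pin down when it stops, the probe above can be polled. This is a rough sketch: the names nlm_alive and watch_nlm and the RPCINFO override variable are made up, and it assumes rpcinfo exits non-zero when the probe fails.

```shell
#!/bin/sh
# RPCINFO is a hypothetical override so the sketch can be dry-run.
RPCINFO=${RPCINFO:-rpcinfo}

# nlm_alive HOST: succeed if the NLM service (RPC program 100021) on HOST
# answers a TCP probe.
nlm_alive() {
    $RPCINFO -t "$1" 100021 >/dev/null 2>&1
}

# watch_nlm HOST [INTERVAL]: poll every INTERVAL seconds (default 60) until
# the first failed probe, then print a UTC timestamp for it.
watch_nlm() {
    host=$1
    interval=${2:-60}
    while nlm_alive "$host"; do
        sleep "$interval"
    done
    echo "$(date -u '+%Y-%m-%d %H:%M:%S UTC') nlockmgr on $host stopped responding"
}
```

Example: `watch_nlm srv17ux01 30`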

Comment 6 Colin.Simpson 2008-06-27 12:04:19 UTC

Looks like there are two bugs here: the bug in the startup script described above,
and lockd also stops responding after a while. Someone has logged the latter as
bug #453094.

lockd dies whether or not nlm_udpport and nlm_tcpport are set.

Comment 7 Msquared 2008-07-15 14:39:41 UTC

Created attachment 311840 [details]
Patch for /etc/init.d/nfs - fixes lockd port assignment

Update your nfs-utils to 1.0.9-33.el5 then apply this patch to /etc/init.d/nfs
to fix the port assignment bug.

Should make your iptables firewall happy again.  :o)

Comment 8 Matthew Kent 2008-10-27 17:56:26 UTC

Still an issue in 5.2 with nfs-utils-1.0.9-35z.el5_2

Comment 9 Franco M. Bladilo 2008-11-03 20:58:46 UTC

Seeing the same issue here. Any ETA on having this solved upstream?

Comment 10 Leon Flaks 2008-12-08 18:30:20 UTC

Just noticed a variation of this bug on a newly installed F10 system.
I have port 4001 assigned for lockd in /etc/sysconfig/nfs:

LOCKD_TCPPORT=4001
LOCKD_UDPPORT=4001

tcp protocol follows this directive, but udp does not:

relevant 'rpcinfo -p localhost' output:

100021    1   udp  56418  nlockmgr
100021    3   udp  56418  nlockmgr
100021    4   udp  56418  nlockmgr

100021    1   tcp   4001  nlockmgr
100021    3   tcp   4001  nlockmgr
100021    4   tcp   4001  nlockmgr

Just did the update to nfs-utils-1.1.4-2.fc10.i386 with the same result. I also see version 1.1.4-4 in koji, and its changelog makes no mention of this issue.
This is on the i386 architecture. I did not test on 64-bit.
I also tried the patch from comment #7, which made no difference.

Comment 11 Tethys 2008-12-14 18:17:19 UTC

This is still a problem for me with nfs-utils-1.0.9-35z.el5_2

The workaround in comment #4 seems to work for me, though.

Comment 14 Steve Dickson 2009-04-29 17:53:14 UTC

Fixed in nfs-utils-1.0.9-42.el5

Comment 15 Steve Dickson 2009-05-19 13:42:57 UTC

*** Bug 474449 has been marked as a duplicate of this bug. ***

Comment 18 Colin.Simpson 2009-06-26 16:48:56 UTC

This bug hasn't reached ON_QA, so is the release of nfs-utils-1.0.9-42.el5 in the pipeline?

Comment 20 Dan Astoorian 2009-08-10 14:34:54 UTC

Is the patch attached to this bug the only fix that was applied to nfs-utils-1.0.9-42.el5 (which has not yet been released)?

The patch does not seem to be working for me under EL5.  Although fs.nfs.nlm_tcpport and fs.nfs.nlm_udpport are being set, it's apparently happening after the NLM service has already bound to its ports.

Adding "modprobe lockd" to the corresponding place in /etc/init.d/nfslock (rather than /etc/init.d/nfs) seems to work better; if this hasn't been done in nfs-utils-1.0.9-42.el5, it probably should be added.

Note that on one of my servers I saw the symptom described in comment #10; this server NFS-mounts other filesystems using UDP via /etc/fstab, so I suspect that /etc/init.d/netfs (which runs after /etc/init.d/nfslock but before /etc/init.d/nfs) was causing the NLM service to start for UDP, but since there were no filesystems mounted with TCP, the NLM TCP port didn't get allocated until sometime after /etc/init.d/nfs had run and set fs.nfs.nlm_tcpport.

This observation is significant because the patch from comment #7 may appear to be effective if tested only on systems with no NFS client mounts.
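
A quick way to spot the "too late" case is to look at what the portmapper already has registered before the sysctl values are set. This is only a sketch: the helper names nlm_bound_protos and warn_if_nlm_bound are made up, and it assumes the usual `rpcinfo -p` column layout.

```shell
#!/bin/sh
# nlm_bound_protos: read `rpcinfo -p` output on stdin and print each protocol
# (tcp/udp) for which nlockmgr (RPC program 100021) is already registered.
nlm_bound_protos() {
    awk '$1 == 100021 { print $3 }' | sort -u
}

# warn_if_nlm_bound: read `rpcinfo -p` output on stdin; if nlockmgr is already
# registered, warn on stderr and return 1, since ports set from now on may not
# apply to the protocols listed.
warn_if_nlm_bound() {
    protos=$(nlm_bound_protos | tr '\n' ' ')
    if [ -z "$protos" ]; then
        return 0
    fi
    echo "warning: nlockmgr already registered for: ${protos% }" >&2
    return 1
}
```

Run as `rpcinfo -p | warn_if_nlm_bound` just before the sysctl lines in the init script. On a server with UDP client mounts like the one described above, it would be expected to list udp only.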

Comment 21 errata-xmlrpc 2009-09-02 10:02:56 UTC

An advisory has been issued which should help the problem
described in this bug report. This report is therefore being
closed with a resolution of ERRATA. For more information
on the solution and/or where to find the updated files,
please follow the link below. You may reopen this bug report
if the solution does not work for you.

http://rhn.redhat.com/errata/RHSA-2009-1321.html
