Bug 199620 - nfslock script starts statd before lockd is up so lock recovery fails
Alias: None
Product: Fedora
Classification: Fedora
Component: nfs-utils
Version: 9
Hardware: All Linux
Target Milestone: ---
Assignee: Jeff Layton
QA Contact: Ben Levenson
Whiteboard: bzcl34nup
Keywords: Reopened
Depends On:
Reported: 2006-07-20 19:36 UTC by Jeff Layton
Modified: 2008-05-20 15:22 UTC (History)

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of:
Last Closed: 2008-05-20 15:22:15 UTC

Attachments
trivial patch to init script (460 bytes, patch)
2006-07-20 19:40 UTC, Jeff Layton

Description Jeff Layton 2006-07-20 19:36:08 UTC
+++ This bug was initially created as a clone of Bug #199586 +++

Description of problem:

The default chkconfig line in the nfslock init script starts statd long before
lockd is up. This causes clients to try to recover their locks too early. Here's
a sample network trace that shows PROGRAM_NOT_AVAILABLE after the client
attempted to recover locks on a reboot:

163.999543 -> Portmap V2 GETPORT Call STAT(100024)
164.000380 -> Portmap V2 GETPORT Reply (Call In 73)
164.000832 -> STAT V1 NOTIFY Call
164.003574 -> STAT V1 NOTIFY Reply (Call In 75)
164.004416 -> TCP 32866 > sunrpc [SYN] Seq=0 Len=0 MSS=1460 TSV=407443 TSER=0 WS=0
164.004700 -> TCP sunrpc > 32866 [SYN, ACK] Seq=0 Ack=1 Win=5792 Len=0 MSS=1460 TSV=4294700213 TSER=407443 WS=2
164.004765 -> TCP 32866 > sunrpc [ACK] Seq=1 Ack=1 Win=5840 Len=0 TSV=407443 TSER=4294700213
164.004947 -> Portmap V2 GETPORT Call NLM(100021) V:1 TCP
164.005222 -> TCP sunrpc > 32866 [ACK] Seq=1 Ack=61 Win=5792 Len=0 TSV=4294700214 TSER=407443
164.005832 -> Portmap V2 GETPORT Reply (Call In 80)

Changing nfslock.init chkconfig line to this:

# chkconfig: 345 61 19

seems to fix the problem. Opening this for RHEL4, since that's where I
originally noticed the problem, but it looks like FC has the same issue.

-- Additional comment from jlayton@redhat.com on 2006-07-20 12:27 EST --
Going ahead and adding this to the 4.5 proposed list. Should be a pretty trivial
fix and bad lock recovery can cause data corruption. I've not seen any customer
complaints about this particular problem yet, but with the work happening on
lock recovery, it's probably just a matter of time.

Comment 1 Jeff Layton 2006-07-20 19:40:09 UTC
Created attachment 132769
trivial patch to init script

A solution is to make the nfslock script run after the nfs script. This trivial
patch changes the chkconfig line so that this ordering happens by default.
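
For anyone who wants to check the ordering on their own box, here is a rough sketch that reads the start priority out of the "# chkconfig:" header of two init scripts and compares them. The helper names start_priority and starts_after are made up for this example, and the script paths in the usage line are an assumption about where the init scripts live.

```shell
# Print the start priority (the second number) from the
# "# chkconfig: <runlevels> <start> <stop>" header of an init script.
start_priority() {
    awk '/^# chkconfig:/ { print $4; exit }' "$1"
}

# Succeed only if script $1 has a strictly higher start priority than
# script $2, i.e. $1 is started later at boot.
starts_after() {
    [ "$(start_priority "$1")" -gt "$(start_priority "$2")" ]
}
```

Something like `starts_after /etc/init.d/nfslock /etc/init.d/nfs && echo "nfslock starts after nfs"` would then confirm the ordering (paths assumed). This only inspects the headers in the scripts; the existing rc symlinks are not updated until the service is re-registered with chkconfig.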

Comment 2 Steve Dickson 2007-03-09 13:33:42 UTC
The client should continue to retry when trying to reclaim a lock,
regardless of the error that was returned. If the client does
not continue to retry, it's a bug in the client, imho.
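
To illustrate the behavior being described (not how the kernel's lock client is actually implemented), here is a minimal sketch of a retry loop that keeps trying a command no matter which error it returns. The name retry_until_ok is invented for this example, and the attempt cap and fixed delay are assumptions added so the sketch terminates.

```shell
# Retry a command until it succeeds or the attempt budget is spent.
# Any non-zero exit status is treated as "try again".
# usage: retry_until_ok MAX_ATTEMPTS DELAY_SECONDS COMMAND [ARGS...]
retry_until_ok() {
    max=$1; delay=$2; shift 2
    attempt=1
    while ! "$@"; do
        # Give up once the last allowed attempt has failed.
        [ "$attempt" -ge "$max" ] && return 1
        attempt=$((attempt + 1))
        sleep "$delay"
    done
    return 0
}
```

The point is only that a "program not available" style failure is handled the same as any other failure: wait a bit and try again, so a server whose lockd comes up late still gets its locks reclaimed.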

Comment 3 Bug Zapper 2008-04-03 17:49:41 UTC
Based on the date this bug was created, it appears to have been reported
against rawhide during the development of a Fedora release that is no
longer maintained. In order to refocus our efforts as a project we are
flagging all of the open bugs for releases which are no longer
maintained. If this bug remains in NEEDINFO thirty (30) days from now,
we will automatically close it.

If you can reproduce this bug in a maintained Fedora version (7, 8, or
rawhide), please change this bug to the respective version and change
the status to ASSIGNED. (If you're unable to change the bug's version
or status, add a comment to the bug and someone will change it for you.)

Thanks for your help, and we apologize again that we haven't handled
these issues to this point.

We will be following the process here:
http://fedoraproject.org/wiki/BugZappers/HouseKeeping to ensure this
doesn't happen again.

Comment 4 Bug Zapper 2008-05-07 00:41:21 UTC
This bug has been in NEEDINFO for more than 30 days since feedback was
first requested. As a result we are closing it.

If you can reproduce this bug in the future against a maintained Fedora
version please feel free to reopen it against that version.

Comment 5 Jeff Layton 2008-05-07 00:55:21 UTC
This one slipped through the cracks. I'll have a look at it again when I get the
chance and see if this is a bug in the client like Steve suggests...

Comment 6 Bug Zapper 2008-05-14 02:14:54 UTC
Changing version to '9' as part of upcoming Fedora 9 GA.
More information and reason for this action is here:

Comment 7 Jeff Layton 2008-05-20 15:22:15 UTC
Ok, looks like Steve was right on this, and this seems to work properly on
rawhide (at least). Closing as NOTABUG.
