Bug 492671 - NFS lock recovery on server reboot fails
Product: Red Hat Enterprise Linux 4
Classification: Red Hat
Component: nfs-utils
Hardware: All
OS: Linux
Priority: high
Severity: high
Target Milestone: rc
Target Release: ---
Assigned To: Steve Dickson
Depends On:
Reported: 2009-03-27 18:44 EDT by Fabio Olive Leite
Modified: 2010-10-27 12:05 EDT

Doc Type: Bug Fix
Clone Of: 492669
Last Closed: 2010-10-27 12:05:29 EDT

Attachments: None
Description Fabio Olive Leite 2009-03-27 18:44:56 EDT
+++ This bug was initially created as a clone of Bug #492669 +++

Description of problem:

When an NFS server boots, it starts the rpc.statd service before the kernel lockd service is up and running. rpc.statd notifies clients of the reboot as soon as it starts, so clients that held locks when the server rebooted try to recover them while the server's lockd service is not yet available, and the recovery fails.

The problem is in fact on both the client and the server:
- Client should keep retrying until service is available;
- Server should not advertise its status change before NLM service is up.
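The server-side condition above can be checked from the portmapper's registration table. The following is a minimal sketch, assuming the standard RPC program numbers 100021 (nlockmgr, served by lockd) and 100024 (status, served by rpc.statd) and the usual `rpcinfo -p` listing format; the function names are made up for illustration and are not part of nfs-utils.

```python
import subprocess

NLM_PROGRAM = 100021  # nlockmgr, registered by the kernel lockd
NSM_PROGRAM = 100024  # status, registered by rpc.statd


def registered_programs(rpcinfo_output):
    """Return the set of RPC program numbers listed in `rpcinfo -p` output."""
    programs = set()
    for line in rpcinfo_output.splitlines():
        fields = line.split()
        # The header line starts with the word "program"; data lines start
        # with the numeric program number.
        if fields and fields[0].isdigit():
            programs.add(int(fields[0]))
    return programs


def nlm_ready(rpcinfo_output):
    """True if the lock manager is registered, i.e. lock reclaims can be served."""
    return NLM_PROGRAM in registered_programs(rpcinfo_output)


def query_portmapper(host="localhost"):
    """Run `rpcinfo -p HOST` and return its text output (hypothetical helper)."""
    result = subprocess.run(
        ["rpcinfo", "-p", host], capture_output=True, text=True, check=True
    )
    return result.stdout
```

Calling `nlm_ready(query_portmapper())` on the server just before rpc.statd is started would show whether clients reacting to the notification can reach the lock manager yet.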

This bugzilla is about the second part. The problem can easily be worked around by starting the nfslock service from rc.local. This bugzilla is a request for a change in the service start ordering at boot so that nfs (which loads the lockd module) starts before nfslock (which starts rpc.statd).
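To make the ordering request concrete, here is a small sketch that reads SysV-style start links (names of the form `SNNservice`, as found in a runlevel directory) and reports which service starts first. The link names and priority numbers used in the usage note are illustrative assumptions, not values taken from the RHEL 4 packages, and the helper names are hypothetical.

```python
import re

# Start links look like "S14nfslock": "S", a two-digit priority, the service name.
_START_LINK = re.compile(r"^S(\d{2})(.+)$")


def start_order(link_names):
    """Return service names in the order init would start them."""
    entries = []
    for name in link_names:
        match = _START_LINK.match(name)
        if match:  # ignore kill links ("K..") and anything else
            entries.append((int(match.group(1)), match.group(2)))
    # Links run in sorted order: by priority, then by name.
    return [service for _, service in sorted(entries)]


def starts_before(link_names, first, second):
    """True if `first` is started before `second`; ValueError if either is absent."""
    order = start_order(link_names)
    return order.index(first) < order.index(second)
```

For example, `starts_before(["S14nfslock", "S60nfs"], "nfs", "nfslock")` is false, which is the ordering this bug complains about. On a real system the list could come from `os.listdir("/etc/rc3.d")`, assuming runlevel 3.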

Version-Release number of selected component (if applicable):

4.8 packages.

How reproducible:


Steps to Reproduce:
1. Set up NFS client and server
2. Start a tcpdump capture on the client
3. Run a program that locks a file
4. Reboot the server
5. After the server comes up, stop the capture and check that, after the NSM NOTIFY call, the client tried to re-issue the lock and received a PROGRAM NOT AVAILABLE error from rpcbind at the server.
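For step 3, any program that takes a POSIX (fcntl-style) lock on a file in the NFS mount and keeps it across the server reboot will do. The original test program is not shown in this report; the following is only a minimal stand-in, and the path in the usage note is hypothetical.

```python
import fcntl
import time


def hold_lock(path, seconds):
    """Take an exclusive POSIX lock on `path` and hold it for `seconds`."""
    with open(path, "a+") as handle:
        # lockf() takes an fcntl-style byte-range lock; on an NFSv2/v3 mount
        # the client forwards it to the server's lock manager via NLM.
        fcntl.lockf(handle, fcntl.LOCK_EX)
        try:
            time.sleep(seconds)
        finally:
            fcntl.lockf(handle, fcntl.LOCK_UN)
    return True
```

On the client, something like `hold_lock("/mnt/nfs/testfile", 600)` keeps the lock long enough for the server to be rebooted while the capture is running.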

Actual results:

Client can't recover lock.

Expected results:

Client recovers lock.

Additional info:

NSM and NLM specifications are vague, and yes, NFS clients SHOULD retry the lock operation a few times with some wait period in between retries (vague eh?), but still, ensuring NSM is started _after_ NLM at the server leads to faster and more reliable lock recovery, regardless of what the clients do.
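The client-side behaviour described above (retry a few times with a wait in between) can be pictured with the sketch below. It only models the idea; it is not the kernel lockd code, and the attempt count and delay are arbitrary assumptions, since the specifications do not fix them.

```python
import time


def retry_lock(try_lock, attempts=5, delay=2.0, sleep=time.sleep):
    """Call try_lock() until it succeeds or `attempts` is exhausted.

    Returns the 1-based attempt number that succeeded, or None on failure.
    `try_lock` is a caller-supplied function returning True on success.
    """
    for attempt in range(1, attempts + 1):
        if try_lock():
            return attempt
        if attempt < attempts:
            sleep(delay)  # wait before the next reclaim attempt
    return None
```

Even with such retries in place on the client, starting NSM after NLM on the server means the first attempt already succeeds, so recovery does not depend on how patient a given client is.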
