Bug 461085 - lockd: return NLM_LCK_DENIED_GRACE_PERIOD after long periods
Status: CLOSED ERRATA
Product: Red Hat Enterprise Linux 4
Classification: Red Hat
Component: kernel
Version: 4.4
Hardware: All
OS: Linux
Priority: high
Severity: urgent
Target Milestone: rc
Target Release: ---
Assigned To: Peter Staubach
QA Contact: Martin Jenner
Depends On:
Blocks:
Reported: 2008-09-03 22:50 EDT by Hiroaki Nakano
Modified: 2010-10-23 00:17 EDT
CC List: 4 users

See Also:
Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2009-05-18 15:24:19 EDT


Attachments
Fix patch for kernel-2.6.9-42.EL (1.46 KB, patch)
2008-09-29 06:06 EDT, Hiroaki Nakano
Proposed patch (1.78 KB, patch)
2008-12-04 09:51 EST, Peter Staubach

Description Hiroaki Nakano 2008-09-03 22:50:12 EDT
Description of problem:

On an NFS client, an application that acquires a file lock with fcntl hangs.
We analyzed the NLM protocol traffic and found that the NFS server returns NLM_LCK_DENIED_GRACE_PERIOD.
The grace period is set to 30 seconds, but the server still returns this status when a lock is
requested several months after startup.
We found this problem in production.

We traced the cause of this problem to lockd.
In lockd's time_before comparison, the sign of the difference reverses
once the elapsed period reaches LONG_MAX/2.
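
The sign reversal can be reproduced in userspace. The following is a minimal sketch, assuming a 32-bit jiffies counter (as on i386) and HZ=1000; time_before32, jiffies_after_days and in_grace_period are illustrative names that model the kernel's time_before macro and are not the lockd code itself.

```c
#include <stdint.h>

#define HZ 1000u  /* ticks per second, as assumed in this report */

/* Userspace model of time_before(a, b) with a 32-bit jiffies counter:
 * true when the signed difference a - b is negative.
 * Assumes the usual two's-complement conversion to int32_t. */
static int time_before32(uint32_t a, uint32_t b)
{
    return (int32_t)(a - b) < 0;
}

/* Hypothetical helper: jiffies value `days` days after boot,
 * wrapped to 32 bits the way the counter itself wraps. */
static uint32_t jiffies_after_days(uint32_t days)
{
    return (uint32_t)((uint64_t)days * 86400u * HZ);
}

/* Grace-period check by comparison: is the current jiffies value
 * still before an expiry time that was computed once at startup? */
static int in_grace_period(uint32_t now, uint32_t expire)
{
    return time_before32(now, expire);
}
```

With expire set to 30 * HZ, in_grace_period(jiffies_after_days(1), expire) is 0 as expected, but in_grace_period(jiffies_after_days(25), expire) is 1 again, because the difference no longer fits in a signed 32-bit value.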

We reported this problem and a bug fix patch to J. Bruce Fields, the NFS maintainer,
and he merged an improved version of the patch into his git repository.

We hope that you will backport the bug fix.


Version-Release number of selected component (if applicable):

Red Hat AS4 Update4
kernel-2.6.9-43.EL (i386)


How reproducible:

After a machine reboot or an NFS server service restart,
the problem occurs if the first fcntl lock request is made 25 to 50 days later
(assuming HZ=1000).
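
The 25 to 50 day window follows from the tick arithmetic. A small sketch, assuming a 32-bit jiffies counter; the function names are illustrative only:

```c
#include <stdint.h>

/* Convert a tick count to days at a given tick rate (HZ). */
static double ticks_to_days(uint64_t ticks, unsigned int hz)
{
    return (double)ticks / hz / 86400.0;
}

/* Start of the window: a signed 32-bit difference turns negative
 * once 2^31 ticks have elapsed. */
static double window_start_days(unsigned int hz)
{
    return ticks_to_days(UINT64_C(1) << 31, hz);
}

/* End of the window: the 32-bit counter has wrapped completely
 * after 2^32 ticks. */
static double window_end_days(unsigned int hz)
{
    return ticks_to_days(UINT64_C(1) << 32, hz);
}
```

For hz = 1000 these evaluate to roughly 24.9 and 49.7 days, which matches the window given above.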

Steps to Reproduce:
1. Boot the NFS server machine.
2. Wait 25 to 50 days.
3. Call fcntl to request a lock.
  
Actual results:

The application that requests the fcntl lock hangs.

Expected results:

The application can get the fcntl lock after 30 seconds.

Additional info:

- First mail of the LKML thread
http://lkml.org/lkml/2008/8/14/115

- Bug fix patch in J. Bruce Fields's tree
http://git.linux-nfs.org/?p=bfields/linux-topics.git;a=commitdiff;h=3ff893a7683f2a011ebcc4043604249ad610acb0
Comment 1 Hiroaki Nakano 2008-09-29 06:06:35 EDT
Created attachment 317946
Fix patch for kernel-2.6.9-42.EL

This patch fixes bug 461085 on Red Hat EL 4 Update 4. The bug is caused by using time_before to compare jiffies with jiffies + grace_period_expire. I use timer functions to address the root cause of the lockd bug, which is that the code does not account for a long time having passed since lockd started.
Comment 4 Peter Staubach 2008-12-02 13:51:37 EST
Yes, the problem is that jiffies wrap fairly quickly.  The solution is
not to make comparisons against the jiffies value, but to schedule a
timeout to turn off the grace period once it has started.

This solution doesn't match that from upstream, but seems good enough.
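
The timeout approach can be illustrated with a userspace model. This is only a sketch under stated assumptions: it is not the attached patch, the struct and function names are hypothetical, and a one-shot countdown stands in for a kernel timer. The point is that a lock request only reads a flag, so no comparison against an absolute jiffies value remains.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical userspace model; the names are illustrative, not lockd's. */
struct grace_state {
    bool in_grace;        /* the only thing a lock request checks */
    uint32_t ticks_left;  /* one-shot countdown standing in for a kernel timer */
};

/* Start the grace period and arm the one-shot timeout. */
static void grace_start(struct grace_state *g, uint32_t period_ticks)
{
    g->in_grace = period_ticks > 0;
    g->ticks_left = period_ticks;
}

/* Advance the model by `elapsed` ticks; the timeout fires at most once. */
static void grace_advance(struct grace_state *g, uint32_t elapsed)
{
    if (!g->in_grace)
        return;                /* already expired: nothing left to compare */
    if (elapsed >= g->ticks_left) {
        g->ticks_left = 0;
        g->in_grace = false;   /* timeout handler: grace period ends for good */
    } else {
        g->ticks_left -= elapsed;
    }
}
```

Once the flag has been cleared, no later amount of elapsed time can set it again, so the state cannot flip back after the counter wraps.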
Comment 5 Peter Staubach 2008-12-04 09:51:15 EST
Created attachment 325689
Proposed patch
Comment 6 RHEL Product and Program Management 2008-12-04 10:08:39 EST
This request was evaluated by Red Hat Product Management for inclusion in a Red
Hat Enterprise Linux maintenance release.  Product Management has requested
further review of this request by Red Hat Engineering, for potential
inclusion in a Red Hat Enterprise Linux Update release for currently deployed
products.  This request is not yet committed for inclusion in an Update
release.
Comment 7 Hiroaki Nakano 2008-12-09 20:09:12 EST
(In reply to comment #5)
> Created an attachment (id=325689) [details]
> Proposed patch

It looks like a good patch.
The kernel code of Red Hat Server 5 has the same problem; it can be fixed with a patch like this one or with Bruce's patch.
Comment 8 Peter Staubach 2008-12-10 09:52:48 EST
Thanx for the feedback.

Regarding RHEL-5 -- I'm ahead of you there already.  Please see bz474590. :-)
Comment 9 Vivek Goyal 2008-12-17 11:08:22 EST
Committed in 78.22.EL. RPMs are available at http://people.redhat.com/vgoyal/rhel4/
Comment 15 errata-xmlrpc 2009-05-18 15:24:19 EDT
An advisory has been issued which should help the problem
described in this bug report. This report is therefore being
closed with a resolution of ERRATA. For more information
on the solution and/or where to find the updated files,
please follow the link below. You may reopen this bug report
if the solution does not work for you.

http://rhn.redhat.com/errata/RHSA-2009-1024.html
