Red Hat Bugzilla – Bug 461085
lockd: return NLM_LCK_DENIED_GRACE_PERIOD after long periods
Last modified: 2010-10-23 00:17:46 EDT
Description of problem:
On an NFS client, an application hangs when it requests a file lock with fcntl.
We analyzed the NLM protocol traffic and found that the NFS server returns NLM_LCK_DENIED_GRACE_PERIOD.
The grace period is set to 30 seconds, but the server still returns this status when we request a lock several months after startup.
We found this problem in production.
We pinpointed the cause of this problem in lockd.
The time_before comparison in lockd gives a reversed result, because
the sign of the difference flips once the elapsed time reaches LONG_MAX/2.
We reported this problem and a bug fix patch to J. Bruce Fields, the NFS maintainer,
and he merged an improved bug fix patch into his git repository.
We hope that you will backport the bug fix patch.
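The sign reversal described above can be reproduced in user space. The following is only a minimal sketch, not the lockd code: it assumes a 32-bit jiffies counter and a tick rate of 1000 per second (both are assumptions, chosen because they match the 25 to 50 day window reported here), and the names time_before32, in_grace_by_comparison, ticks_to_days and check_wrap_window are made up for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define TICK_HZ 1000u                 /* assumed tick rate */
#define GRACE_TICKS (30u * TICK_HZ)   /* 30 second grace period */

/* Models time_before(a, b) for a 32-bit counter: the answer is the sign of
 * the wrapped difference. The cast relies on two's-complement conversion. */
bool time_before32(uint32_t a, uint32_t b)
{
    return (int32_t)(a - b) < 0;
}

/* Hypothetical grace check that compares "now" with a fixed expiry tick. */
bool in_grace_by_comparison(uint32_t now, uint32_t expire)
{
    return time_before32(now, expire);
}

/* Convert a tick count to days at the assumed tick rate. */
double ticks_to_days(uint64_t ticks)
{
    return (double)ticks / TICK_HZ / 86400.0;
}

/* Walks through the interesting points of the counter range. */
void check_wrap_window(void)
{
    uint32_t expire = GRACE_TICKS;   /* grace period started at tick 0 */

    assert(in_grace_by_comparison(0u, expire));                     /* just started */
    assert(!in_grace_by_comparison(expire, expire));                /* expired on time */
    assert(!in_grace_by_comparison(expire + 0x7FFFFFFFu, expire));  /* still correct */
    assert(in_grace_by_comparison(expire + 0x80000000u, expire));   /* sign flipped */
    assert(in_grace_by_comparison(expire + 0xFFFFFFFFu, expire));   /* last wrong tick */
}
```

Under these assumptions the check starts to claim "grace period" again 2^31 ticks after the expiry (about 24.9 days) and keeps doing so until 2^32 ticks (about 49.7 days), which is consistent with the window we observed.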
Version-Release number of selected component (if applicable):
Red Hat AS4 Update4
After a machine reboot or an NFS server service restart,
the problem occurs on the first fcntl lock request made 25 to 50 days later.
Steps to Reproduce:
1. Boot the NFS server machine.
2. Wait 25 to 50 days.
3. Call fcntl to request a lock.
The application that requests the fcntl lock freezes.
The application can get the fcntl lock after 30 seconds.
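For step 3, a small client-side helper like the one below may be enough to see the stall. It is a sketch under assumptions: the function name timed_whole_file_lock is made up, the caller is expected to open a file that lives on the NFS mount, and the wait is measured only to whole seconds.

```c
#include <assert.h>
#include <fcntl.h>
#include <time.h>
#include <unistd.h>

/* Takes a blocking write lock on the whole file, releases it again, and
 * returns how many seconds the lock request waited (-1.0 on error).
 * fd must be an open, writable descriptor. */
double timed_whole_file_lock(int fd)
{
    struct flock fl = {0};
    time_t start, end;

    assert(fd >= 0);

    fl.l_type = F_WRLCK;      /* exclusive lock */
    fl.l_whence = SEEK_SET;
    fl.l_start = 0;
    fl.l_len = 0;             /* 0 means "to end of file" */

    start = time(NULL);
    if (fcntl(fd, F_SETLKW, &fl) == -1)   /* blocks until granted */
        return -1.0;
    end = time(NULL);

    fl.l_type = F_UNLCK;      /* release the lock again */
    if (fcntl(fd, F_SETLK, &fl) == -1)
        return -1.0;

    return difftime(end, start);
}
```

On a healthy server the returned wait should be close to zero. A wait of roughly the grace period would match the behavior reported here.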
- First mail of the LKML thread
- Bug fix patch in J. Bruce Fields' tree
Created attachment 317946 [details]
Fix patch for kernel-2.6.9-42.EL
This patch fixes bug 461085 on Red Hat EL 4 Update 4. The bug occurs because time_before is used to compare jiffies with jiffies + grace_period_expire. I use timer functions to fix the root cause, which is that lockd does not account for a long time passing after it starts.
Yes, the problem is that jiffies wrap fairly quickly. The solution is
not to make comparisons against the jiffies value, but to schedule a
timeout to turn off the grace period once it has started.
This solution doesn't match that from upstream, but seems good enough.
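The idea of scheduling a timeout instead of comparing against jiffies on every request can be modeled in user space roughly as follows. This is a sketch only and is not taken from either patch: struct grace_state and the grace_* functions are hypothetical names, the counter is assumed to be 32 bits wide, and grace_tick stands in for the kernel's timer processing, which is assumed to run at least once before the counter has advanced 2^31 ticks past the expiry.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical state: lock requests only ever read in_grace. */
struct grace_state {
    bool in_grace;       /* true while the grace period is running */
    bool timer_armed;    /* one-shot timer still pending */
    uint32_t expires;    /* tick at which the timer should fire */
};

/* Start the grace period and arm a one-shot timer for its end. */
void grace_start(struct grace_state *g, uint32_t now, uint32_t period)
{
    assert(g != NULL);
    g->in_grace = true;
    g->timer_armed = true;
    g->expires = now + period;   /* may wrap, which is fine */
}

/* Timer processing: fire the timer once when it is due, then disarm it.
 * After that, the value of the tick counter no longer matters. */
void grace_tick(struct grace_state *g, uint32_t now)
{
    assert(g != NULL);
    if (g->timer_armed && (int32_t)(now - g->expires) >= 0) {
        g->in_grace = false;
        g->timer_armed = false;
    }
}

/* What a lock request would check. */
bool grace_active(const struct grace_state *g)
{
    assert(g != NULL);
    return g->in_grace;
}
```

Because the flag is cleared once and the timer is disarmed, a request arriving weeks later reads a plain boolean and cannot be affected by the counter wrapping.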
Created attachment 325689 [details]
This request was evaluated by Red Hat Product Management for inclusion in a Red
Hat Enterprise Linux maintenance release. Product Management has requested
further review of this request by Red Hat Engineering, for potential
inclusion in a Red Hat Enterprise Linux Update release for currently deployed
products. This request is not yet committed for inclusion in an Update
(In reply to comment #5)
> Created an attachment (id=325689) [details]
> Proposed patch
It seems to be a good patch.
The Red Hat Server 5 kernel code has the same problem, which can be fixed by a patch like this one or by Bruce's patch.
Thanx for the feedback.
Regarding RHEL-5 -- I'm ahead of you there already. Please see bz474590. :-)
Committed in 78.22.EL. RPMS are available at http://people.redhat.com/vgoyal/rhel4/
An advisory has been issued which should help the problem
described in this bug report. This report is therefore being
closed with a resolution of ERRATA. For more information
on the solution and/or where to find the updated files,
please follow the link below. You may reopen this bug report
if the solution does not work for you.