Bug 1115545 - NFS4: remove incorrect "Lock reclaim failed!" warning when delegations are used
Summary: NFS4: remove incorrect "Lock reclaim failed!" warning when delegations are used
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Enterprise Linux 6
Classification: Red Hat
Component: kernel
Version: 6.7
Hardware: Unspecified
OS: Unspecified
Priority: medium
Severity: medium
Target Milestone: rc
Target Release: ---
Assignee: Dave Wysochanski
QA Contact: JianHong Yin
URL:
Whiteboard:
Depends On: 1025441 1156428
Blocks: 1075802 1159933
Reported: 2014-07-02 15:00 UTC by Dave Wysochanski
Modified: 2018-12-06 17:08 UTC

Fixed In Version: kernel-2.6.32-527.el6
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2015-07-22 08:09:40 UTC
Target Upstream Version:
Embargoed:


Attachments
WIP testcase for this bug - does not currently work (4.74 KB, application/octet-stream)
2014-10-23 16:49 UTC, Dave Wysochanski
Supporting files and WIP testcase for this bug - does not currently work (6.74 KB, application/x-gzip)
2014-10-23 16:51 UTC, Dave Wysochanski
testcase for this bug that now works - requires a kernel which fixes local ratelimiting printk bug https://bugzilla.redhat.com/show_bug.cgi?id=1156428 (295.46 KB, application/x-gzip)
2014-10-24 16:53 UTC, Dave Wysochanski
testcase for this bug that now works - requires a kernel which fixes local ratelimiting printk bug https://bugzilla.redhat.com/show_bug.cgi?id=1156428 (296.68 KB, application/x-gzip)
2014-10-24 18:35 UTC, Dave Wysochanski
test log showing test failure on kernel with patch for printk ratelimiting bug (4.17 KB, application/octet-stream)
2014-10-24 18:36 UTC, Dave Wysochanski
test log showing test pass on kernel with patch for printk ratelimiting bug plus patches to fix this bug (3.38 KB, application/octet-stream)
2014-10-24 18:36 UTC, Dave Wysochanski
test log showing test pass on kernel with patch for printk ratelimiting bug plus patches to fix this bug (4.19 KB, application/octet-stream)
2014-10-24 18:45 UTC, Dave Wysochanski


Links
System | ID | Private | Priority | Status | Summary | Last Updated
Red Hat Knowledge Base (Solution) | 1118233 | 0 | None | None | None | Never
Red Hat Product Errata | RHSA-2015:1272 | 0 | normal | SHIPPED_LIVE | Moderate: kernel security, bug fix, and enhancement update | 2015-07-22 11:56:25 UTC

Description Dave Wysochanski 2014-07-02 15:00:51 UTC
Description of problem:
We need the following upstream patch, which should be an easy backport.
commit 6686390bab6a0e049fa7040631aee08b35a55293
Author: NeilBrown <neilb>
Date:   Mon Aug 12 16:52:47 2013 +1000

    NFS: remove incorrect "Lock reclaim failed!" warning.
    
    After reclaiming state that was lost, the NFS client tries to reclaim
    any locks, and then checks that each one has NFS_LOCK_INITIALIZED set
    (which means that the server has confirmed the lock).
    However if the client holds a delegation, nfs_reclaim_locks() simply aborts
    (or more accurately it called nfs_lock_reclaim() and that returns without
    doing anything).
    
    This is because when a delegation is held, the server doesn't need to
    know about locks.
    
    So if a delegation is held, NFS_LOCK_INITIALIZED is not expected, and
    its absence is certainly not an error.
    
    So don't print the warnings if NFS_DELEGATED_STATE is set.


Version-Release number of selected component (if applicable):
2.6.32-488.el6


How reproducible:
Should be relatively easy to reproduce.  Requires:
1. NFS4 server with delegations
2. NFS4 client doing locks
3. Some method for triggering lock reclaim


Steps to Reproduce:
TBD

Actual results:
"Lock reclaim failed" printed in /var/log/messages


Expected results:
No message should be printed since the condition being checked for is not relevant to NFS4 delegations.

Additional info:

Comment 1 Dave Wysochanski 2014-07-02 15:22:21 UTC
Looking at that commit, the test it adds is inverted, so a second commit is needed.

commit 1acd1c301f4faae80f4d2c7bbd9a4553b131c0e3
Author: Jeff Layton <jlayton>
Date:   Thu Oct 31 13:03:04 2013 -0400

    nfs: fix inverted test for delegation in nfs4_reclaim_open_state
...
-                               if (test_bit(NFS_DELEGATED_STATE, &state->flags) != 0) {
+                               if (!test_bit(NFS_DELEGATED_STATE, &state->flags)) {
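
Taken together, the two commits mean the warning should fire only for a lock the server has not confirmed, on an open state that holds no delegation. Below is a minimal userspace sketch of that predicate. It assumes simplified stand-in flags: the `MODEL_*` names, their bit values, and `model_should_warn()` are hypothetical and are not the kernel's definitions.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for the kernel flag bits; the numeric values
 * are illustrative only and do not match the kernel headers. */
enum {
    MODEL_DELEGATED_STATE  = 1 << 0, /* models NFS_DELEGATED_STATE */
    MODEL_LOCK_INITIALIZED = 1 << 1  /* models NFS_LOCK_INITIALIZED */
};

/* Returns true when "Lock reclaim failed!" should be reported for one lock
 * after reclaim: the server did not confirm the lock, and the open state
 * holds no delegation (so a confirmation was actually expected). */
bool model_should_warn(unsigned int state_flags, unsigned int lock_flags)
{
    bool delegated   = (state_flags & MODEL_DELEGATED_STATE) != 0;
    bool initialized = (lock_flags & MODEL_LOCK_INITIALIZED) != 0;

    return !delegated && !initialized;
}
```

With the inverted test from the first commit, the outer condition would be `delegated` rather than `!delegated`, so the warning could only fire in exactly the case where it is meaningless.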

Comment 3 Dave Wysochanski 2014-10-23 16:49:33 UTC
Created attachment 950025 [details]
WIP testcase for this bug - does not currently work

Comment 4 Dave Wysochanski 2014-10-23 16:51:31 UTC
Created attachment 950026 [details]
Supporting files and WIP testcase for this bug - does not currently work

Comment 5 Dave Wysochanski 2014-10-24 11:56:50 UTC
Well, I have a testcase for this bug that should cause the "Lock reclaim failed!" warning to fire from inside nfs4_reclaim_open_state():
pr_warn_ratelimited("NFS: "
		"%s: Lock reclaim "
		"failed!\n", __func__);

But for some reason, the "Lock reclaim failed" message was still not printed.  I wrote a lot of stap and finally ended up patching an nfs module with printks.  Then I discovered that, amazingly, printk_ratelimited has been broken in RHEL6, apparently since its introduction in 6.1, because this patch is missing:

commit bb1dc0bacb8ddd7ba6a5906c678a5a5a110cf695
Author: Yong Zhang <yong.zhang>
Date:   Tue Apr 6 14:35:02 2010 -0700

    kernel.h: fix wrong usage of __ratelimit()
    
    When __ratelimit() returns 1 this means that we can go ahead.
    
    Signed-off-by: Yong Zhang <yong.zhang>
    Cc: Ingo Molnar <mingo>
    Cc: Joe Perches <joe>
    Signed-off-by: Andrew Morton <akpm>
    Signed-off-by: Linus Torvalds <torvalds>

diff --git a/include/linux/kernel.h b/include/linux/kernel.h
index 7f07074..9365227 100644
--- a/include/linux/kernel.h
+++ b/include/linux/kernel.h
@@ -426,7 +426,7 @@ static inline char *pack_hex_byte(char *buf, u8 byte)
                .burst = DEFAULT_RATELIMIT_BURST,       \
        };                                              \
                                                        \
-       if (!__ratelimit(&_rs))                         \
+       if (__ratelimit(&_rs))                          \
                printk(fmt, ##__VA_ARGS__);             \


So what this means is that anything underneath a pr_*_ratelimited macro would only be printed when you get a burst of messages that should be suppressed, which is the opposite of the intent of ratelimiting!
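
To make the effect concrete, here is a small userspace model of the two macro forms. It is a sketch under stated assumptions, not the kernel implementation: time is left out entirely, and `struct model_ratelimit`, `model_ratelimit()`, and `model_count_emitted()` are hypothetical names. The only thing carried over from the patch is the convention that a return value of 1 means "go ahead and print".

```c
#include <stdbool.h>

/* Simplified model of a ratelimit state: at most `burst` messages are
 * allowed per interval. Time is not modelled; one struct is one interval. */
struct model_ratelimit {
    int burst;   /* messages allowed in this interval */
    int printed; /* messages allowed so far in this interval */
};

/* Same convention as described in the commit message for __ratelimit():
 * returns 1 when the caller may go ahead and print, 0 when suppressed. */
int model_ratelimit(struct model_ratelimit *rs)
{
    if (rs->printed < rs->burst) {
        rs->printed++;
        return 1;
    }
    return 0;
}

/* Counts how many of `attempts` messages would be emitted.
 * inverted == true  models the broken form: if (!__ratelimit(&_rs)) printk(...)
 * inverted == false models the fixed form:  if (__ratelimit(&_rs)) printk(...) */
int model_count_emitted(int burst, int attempts, bool inverted)
{
    struct model_ratelimit rs = { .burst = burst, .printed = 0 };
    int emitted = 0;

    for (int i = 0; i < attempts; i++) {
        int allowed = model_ratelimit(&rs);

        if (inverted ? !allowed : allowed)
            emitted++;
    }
    return emitted;
}
```

In this model a single warning is emitted by the fixed form and dropped by the inverted one, which is consistent with the lone "Lock reclaim failed" message never showing up in the testcase.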

I think the reason no one has noticed is the low usage of ratelimiting: from what I counted, there were only a handful of pr_warn_ratelimited calls, and most were in nfs.

I'll have to open a separate bz for the above patch.

The patch which introduced the ratelimiting went in along with a group of patches to rhel6.1 for nfs:
commit cf2a1c571fe2b88a3954f2f4a2cd35641c4b8977
Author: Steve Dickson <SteveD>
Date:   Mon Nov 15 12:18:33 2010 -0500

    [kernel] kernel.h: add printk_ratelimited and pr_<level>_rl
    
    Message-id: <1289823513-15346-71-git-send-email-steved>
    Patchwork-id: 29304
    O-Subject: [RHEL6.1 PATCH 70/70] kernel.h: add printk_ratelimited and
        pr_<level>_rl
    Bugzilla: 653066
    RH-Acked-by: Prarit Bhargava <prarit>
    RH-Acked-by: J. Bruce Fields <bfields>

Comment 7 Dave Wysochanski 2014-10-24 16:53:09 UTC
Created attachment 950431 [details]
testcase for this bug that now works - requires a kernel which fixes local ratelimiting printk bug https://bugzilla.redhat.com/show_bug.cgi?id=1156428

Comment 8 Dave Wysochanski 2014-10-24 18:35:48 UTC
Created attachment 950476 [details]
testcase for this bug that now works - requires a kernel which fixes local ratelimiting printk bug https://bugzilla.redhat.com/show_bug.cgi?id=1156428

Comment 9 Dave Wysochanski 2014-10-24 18:36:19 UTC
Created attachment 950477 [details]
test log showing test failure on kernel with patch for printk ratelimiting bug

Comment 10 Dave Wysochanski 2014-10-24 18:36:49 UTC
Created attachment 950478 [details]
test log showing test pass on kernel with patch for printk ratelimiting bug plus patches to fix this bug

Comment 12 Dave Wysochanski 2014-10-24 18:45:21 UTC
Created attachment 950479 [details]
test log showing test pass on kernel with patch for printk ratelimiting bug plus patches to fix this bug

Comment 15 RHEL Program Management 2014-11-10 23:11:53 UTC
This request was evaluated by Red Hat Product Management for
inclusion in a Red Hat Enterprise Linux release.  Product
Management has requested further review of this request by
Red Hat Engineering, for potential inclusion in a Red Hat
Enterprise Linux release for currently deployed products.
This request is not yet committed for inclusion in a release.

Comment 16 Rafael Aquini 2015-01-30 17:11:25 UTC
Patch(es) available on kernel-2.6.32-527.el6

Comment 29 JianHong Yin 2015-05-27 07:42:51 UTC
Reproduced on RHEL-6.5, RHEL-6.7-20150506.0 (kernel-2.6.32-558):
  https://beaker.engineering.redhat.com/jobs/965878
Verified on RHEL-6.5, RHEL-6.7-20150527.n.0 (kernel-2.6.32-563):
  https://beaker.engineering.redhat.com/jobs/965899

Comment 31 errata-xmlrpc 2015-07-22 08:09:40 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://rhn.redhat.com/errata/RHSA-2015-1272.html

