Bug 585935 - Bug in RHEL-5.4/5.5 nfs_access_cache_shrinker
Status: CLOSED ERRATA
Product: Red Hat Enterprise Linux 5
Classification: Red Hat
Component: kernel
Version: 5.6
Hardware: All
OS: Linux
Priority: urgent
Severity: high
Target Milestone: rc
Target Release: ---
Assigned To: nfs-maint
QA Contact: Jian Li
Keywords: ZStream
Duplicates: 560688 742537
Depends On:
Blocks: 749459
Reported: 2010-04-26 09:38 EDT by Steve Dickson
Modified: 2014-03-03 19:06 EST

Fixed In Version: kernel-2.6.18-290.el5
Doc Type: Bug Fix
Doc Text:
Previously, calling the iput() function while holding the nfs_access_lru_lock spinlock could cause problems, because iput() can sleep and can also attempt to allocate memory. This update removes an optimisation that is not present in the mainline kernel series. Now, iput() is never called while holding a spinlock in the nfs_access_cache_shrinker() function, thus preventing this bug.
Last Closed: 2012-02-20 22:27:49 EST


Attachments
Don't call iput while holding a spinlock in nfs_access_cache_shrinker (1.41 KB, patch)
2010-04-26 13:21 EDT, Trond Myklebust
Bugcheck 10 Oct 2011 (2.82 KB, application/octet-stream)
2011-10-11 04:26 EDT, Alastair Munro


External Trackers
Tracker: Red Hat Product Errata
ID: RHSA-2012:0150
Priority: normal
Status: SHIPPED_LIVE
Summary: Moderate: Red Hat Enterprise Linux 5.8 kernel update
Last Updated: 2012-02-21 02:35:24 EST

Description Steve Dickson 2010-04-26 09:38:02 EDT
Description of problem:
From the Linux NFS maintainer:

"It appears to me that the fix for Bz 433249 that is present in the
RHEL-5.4 and RHEL-5.5 kernels contains a typo.

AFAICS, both these kernels apply the same fix as I applied to mainline,
but with one exception: in the original, I make sure to drop the
nfs_access_lru_lock before calling iput() on the inode whereas the RHEL
kernels appear to keep that lock.

Keeping the spinlock looks like a bug, since iput() can definitely sleep
(see for instance the calls to truncate_inode_pages())."
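
The locking pattern described above can be illustrated outside the kernel. The following is a minimal userspace sketch, not the attached patch: a pthread mutex stands in for the nfs_access_lru_lock spinlock, and every demo_* name is hypothetical. It pins one entry while holding the lock, drops the lock, and only then calls the release function that is assumed to be able to sleep, restarting the scan afterwards.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for an inode on an LRU list (not a kernel type). */
struct demo_inode {
    struct demo_inode *next;   /* singly linked LRU chain */
    int refcount;              /* pins held on this entry */
};

/* A mutex stands in for the nfs_access_lru_lock spinlock. */
static pthread_mutex_t demo_lru_lock = PTHREAD_MUTEX_INITIALIZER;
static struct demo_inode *demo_lru_head;
static bool demo_lru_lock_held;   /* bookkeeping so the demo can check itself */
static int demo_iput_calls;

static void demo_lock(void)
{
    pthread_mutex_lock(&demo_lru_lock);
    demo_lru_lock_held = true;
}

static void demo_unlock(void)
{
    demo_lru_lock_held = false;
    pthread_mutex_unlock(&demo_lru_lock);
}

/* Push an entry onto the head of the LRU list. */
static void demo_lru_add(struct demo_inode *inode)
{
    demo_lock();
    inode->next = demo_lru_head;
    demo_lru_head = inode;
    demo_unlock();
}

/*
 * Stand-in for iput(): assumed to be able to sleep, so it must never
 * run with the LRU lock held. The assert enforces that rule in the demo.
 */
static void demo_iput(struct demo_inode *inode)
{
    assert(!demo_lru_lock_held);
    inode->refcount--;
    demo_iput_calls++;
}

/* Scan up to nr_to_scan entries; returns how many were released. */
static int demo_shrink(int nr_to_scan)
{
    int released = 0;

    for (;;) {
        demo_lock();
        struct demo_inode *inode = demo_lru_head;
        if (inode == NULL || nr_to_scan-- == 0) {
            demo_unlock();
            return released;
        }
        inode->refcount++;             /* pin the entry while still locked */
        demo_lru_head = inode->next;   /* unlink it from the LRU */
        inode->next = NULL;
        demo_unlock();                 /* drop the lock first ... */
        demo_iput(inode);              /* ... then call what may sleep */
        released++;                    /* loop restarts the scan from the head */
    }
}
```

With three entries on the list, demo_shrink(2) releases two of them and leaves one queued, and the assert inside demo_iput() never fires because the release call always follows the unlock.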
Comment 1 Trond Myklebust 2010-04-26 13:21:27 EDT
Created attachment 409239
Don't call iput while holding a spinlock in nfs_access_cache_shrinker

This patch syncs nfs_access_cache_shrinker() to the current mainline. I'm not sure that it eliminates all possible deadlocks here, since I'm getting worried that iput() and put_rpccred() can under certain circumstances trigger calls to more allocators.

We may therefore need to check gfp_mask in addition to what is contained in this patch.
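
One way to read the suggestion about gfp_mask is that the shrinker could refuse to do any work when the allocation context that triggered it cannot tolerate sleeping or filesystem re-entry. The sketch below shows that idea in userspace C. The DEMO_GFP_* flag values and the demo_* functions are hypothetical and do not match the kernel's definitions, and this report does not state whether the shipped fix includes such a check.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical flag bits; the kernel's real GFP flag values differ. */
#define DEMO_GFP_WAIT   0x1u   /* caller may sleep */
#define DEMO_GFP_IO     0x2u   /* caller may start I/O */
#define DEMO_GFP_FS     0x4u   /* caller may re-enter the filesystem */
#define DEMO_GFP_KERNEL (DEMO_GFP_WAIT | DEMO_GFP_IO | DEMO_GFP_FS)

/* The shrinker runs only if every bit of DEMO_GFP_KERNEL is allowed. */
static bool demo_shrinker_may_run(unsigned int gfp_mask)
{
    return (gfp_mask & DEMO_GFP_KERNEL) == DEMO_GFP_KERNEL;
}

/*
 * Returns the number of entries dropped from the cache counter,
 * or -1 when the calling context is too restrictive to proceed.
 */
static int demo_shrink_checked(int nr_to_scan, unsigned int gfp_mask,
                               int *nr_cached)
{
    assert(nr_cached != NULL);

    if (!demo_shrinker_may_run(gfp_mask))
        return -1;   /* bail out before touching any lock or entry */

    if (nr_to_scan < 0)
        nr_to_scan = 0;

    int freed = nr_to_scan < *nr_cached ? nr_to_scan : *nr_cached;
    *nr_cached -= freed;
    return freed;
}
```

The check sits at the very top so that a restricted caller returns before any lock is taken or any release function is reached.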
Comment 2 Steve Dickson 2010-10-21 14:55:00 EDT
*** Bug 560688 has been marked as a duplicate of this bug. ***
Comment 3 Alastair Munro 2011-09-30 09:59:02 EDT
I have checked a number of Red Hat kernel sources, including the latest RHEL 5 ones (2.6.18-238, 2.6.18-238.19.1, 2.6.18-274, 2.6.18-274.3.1), and this patch has not yet been applied in RHEL 5. Can someone advise when it will be available?
Comment 4 Jeff Layton 2011-10-05 06:06:00 EDT
*** Bug 742537 has been marked as a duplicate of this bug. ***
Comment 5 Alastair Munro 2011-10-06 04:04:27 EDT
OK, that's great. When can we expect a fix for this?
Comment 6 Alastair Munro 2011-10-06 04:08:35 EDT
I see that Trond, who proposed the fix, has an email address at netapp.com. Maybe he fixed the issue for the Linux they use in NetApp appliances? FYI, NetApp makes NAS filers.
Comment 7 Alastair Munro 2011-10-11 04:26:34 EDT
Created attachment 527390
Bugcheck 10 Oct 2011
Comment 8 Alastair Munro 2011-10-11 04:27:38 EDT
We suffered a machine hang again last night. I have attached the second stack trace (thlxpgas01-2nd-bugcheck.lis). Since the first hang reported in Bug 742537 we have applied a BIOS upgrade to the server. This has not worked. Can we escalate this please?
Comment 9 RHEL Product and Program Management 2011-10-12 11:11:37 EDT
This request was evaluated by Red Hat Product Management for inclusion in a Red
Hat Enterprise Linux maintenance release.  Product Management has requested
further review of this request by Red Hat Engineering, for potential
inclusion in a Red Hat Enterprise Linux Update release for currently deployed
products.  This request is not yet committed for inclusion in an Update
release.
Comment 11 Alastair Munro 2011-10-14 06:51:26 EDT
This is being proposed as a fix in RHEL 5.8.
Comment 13 Jarod Wilson 2011-10-18 09:52:08 EDT
Patch(es) available in kernel-2.6.18-290.el5
You can download this test kernel (or newer) from http://people.redhat.com/jwilson/el5
Detailed testing feedback is always welcomed.
Comment 17 Tomas Capek 2011-11-29 12:42:00 EST
    Technical note added. If any revisions are required, please edit the "Technical Notes" field
    accordingly. All revisions will be proofread by the Engineering Content Services team.
    
    New Contents:
Previously, calling the iput() function while holding the nfs_access_lru_lock spinlock could cause problems, because iput() can sleep and can also attempt to allocate memory. This update removes an optimisation that is not present in the mainline kernel series. Now, iput() is never called while holding a spinlock in the nfs_access_cache_shrinker() function, thus preventing this bug.
Comment 18 errata-xmlrpc 2012-02-20 22:27:49 EST
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

http://rhn.redhat.com/errata/RHSA-2012-0150.html
