Bug 585935
Summary: | Bug in RHEL-5.4/5.5 nfs_access_cache_shrinker | |
---|---|---|---
Product: | Red Hat Enterprise Linux 5 | Reporter: | Steve Dickson <steved>
Component: | kernel | Assignee: | nfs-maint
Status: | CLOSED ERRATA | QA Contact: | Jian Li <jiali>
Severity: | high | Docs Contact: |
Priority: | urgent | |
Version: | 5.6 | CC: | alastair, albert.fluegel, ccui, cww, dhoward, eguan, jiali, jlayton, nmurray, plougher, rwheeler, sforsber, sjmudd, trond.myklebust
Target Milestone: | rc | Keywords: | ZStream
Target Release: | --- | |
Hardware: | All | |
OS: | Linux | |
Whiteboard: | | |
Fixed In Version: | kernel-2.6.18-290.el5 | Doc Type: | Bug Fix
Doc Text: | Previously, calling the iput() function while holding the nfs_access_lru lock could result in problems, because iput() can sleep and can also attempt to allocate memory. This update removes an optimisation that is not present in the mainline kernel series. Now, iput() is never called while holding a spinlock in the nfs_access_cache_shrinker() function, thus preventing this bug.
Story Points: | --- | |
Clone Of: | | Environment: |
Last Closed: | 2012-02-21 03:27:49 UTC | Type: | ---
Regression: | --- | Mount Type: | ---
Documentation: | --- | CRM: |
Verified Versions: | | Category: | ---
oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: |
Cloudforms Team: | --- | Target Upstream Version: |
Embargoed: | | |
Bug Depends On: | | |
Bug Blocks: | 749459 | |
Attachments: | | |
Description
Steve Dickson
2010-04-26 13:38:02 UTC
Created attachment 409239 [details]
Don't call iput while holding a spinlock in nfs_access_cache_shrinker
This patch syncs nfs_access_cache_shrinker() to the current mainline. I'm not sure that it eliminates all possible deadlocks here, since I'm getting worried that iput() and put_rpccred() can under certain circumstances trigger calls to more allocators.
We may therefore need to check gfp_mask in addition to what is contained in this patch.
*** Bug 560688 has been marked as a duplicate of this bug. ***
I have checked a number of RH kernel sources, including the latest RH5 (2.6.18-238, 2.6.18-238.19.1, 2.6.18-274, 2.6.18-274.3.1), and this patch has not been applied yet in RHEL5. Can someone advise when it will be available?
*** Bug 742537 has been marked as a duplicate of this bug. ***
Ok, that's great. When can we expect a fix for this? I see that Trond, who proposed the fix, has his email address at netapp.com. Maybe he fixed the issue for the Linux they use in NetApp appliances? FYI, NetApp makes NAS filers.
Created attachment 527390 [details]
Bugcheck 10 Oct 2011
We suffered a machine hang again last night. I have attached the second stack trace (thlxpgas01-2nd-bugcheck.lis). Since the first hang reported in Bug 742537 we have applied a BIOS upgrade to the server. This has not worked. Can we escalate this please?
This request was evaluated by Red Hat Product Management for inclusion in a Red Hat Enterprise Linux maintenance release. Product Management has requested further review of this request by Red Hat Engineering, for potential inclusion in a Red Hat Enterprise Linux Update release for currently deployed products. This request is not yet committed for inclusion in an Update release.
This is being proposed as a fix in rhel5.8.
Patch(es) available in kernel-2.6.18-290.el5
You can download this test kernel (or newer) from http://people.redhat.com/jwilson/el5
Detailed testing feedback is always welcomed.
Technical note added. If any revisions are required, please edit the "Technical Notes" field accordingly. All revisions will be proofread by the Engineering Content Services team.
New Contents: Previously, calling the iput() function while holding the nfs_access_lru lock could result in problems, because iput() can sleep and can also attempt to allocate memory. This update removes an optimisation that is not present in the mainline kernel series. Now, iput() is never called while holding a spinlock in the nfs_access_cache_shrinker() function, thus preventing this bug.
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.
http://rhn.redhat.com/errata/RHSA-2012-0150.html