Bug 165993
| Summary: | NFS deadlock when multiple processes creating/deleting a file | | |
|---|---|---|---|
| Product: | Red Hat Enterprise Linux 3 | Reporter: | Neil Horman <nhorman> |
| Component: | kernel | Assignee: | Steve Dickson <steved> |
| Status: | CLOSED ERRATA | QA Contact: | Brian Brock <bbrock> |
| Severity: | medium | Docs Contact: | |
| Priority: | medium | | |
| Version: | 3.0 | CC: | dff, jturner, kanderso, lwang, mjenner, peterm, petrides, rajeev, tao, tburke, tkincaid |
| Target Milestone: | --- | | |
| Target Release: | --- | | |
| Hardware: | All | | |
| OS: | Linux | | |
| Whiteboard: | | | |
| Fixed In Version: | RHSA-2005-663 | Doc Type: | Bug Fix |
| Doc Text: | | Story Points: | --- |
| Clone Of: | | Environment: | |
| Last Closed: | 2005-09-28 15:33:18 UTC | Type: | --- |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | | Category: | --- |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | | | |
| Bug Depends On: | | | |
| Bug Blocks: | 156320 | | |
Description
Neil Horman
2005-08-15 15:42:50 UTC
Created attachment 117759
script to reproduce deadlock problem
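The attached script is not reproduced in this report. As a rough illustration only, the workload named in the summary (several processes creating and deleting the same file) might be sketched as below. This is not the attached reproducer: the names `churn`, `run_workers`, and `contended-file` are hypothetical, the worker and iteration counts are arbitrary, and the sketch assumes a Unix host where the `fork` start method is available.

```python
import multiprocessing
import os
import tempfile


def churn(path, iterations):
    """Repeatedly create and delete `path`, tolerating races with other workers."""
    for _ in range(iterations):
        try:
            # Create (or truncate) the shared file.
            with open(path, "w") as handle:
                handle.write("x")
            # Delete it; another worker may already have removed it.
            os.unlink(path)
        except FileNotFoundError:
            pass


def run_workers(directory, workers=4, iterations=1000):
    """Start `workers` processes that all churn one file under `directory`.

    Returns the list of worker exit codes (0 means the worker finished cleanly).
    """
    # Assumption: a Unix-like host, so the "fork" start method exists.
    ctx = multiprocessing.get_context("fork")
    path = os.path.join(directory, "contended-file")  # hypothetical file name
    procs = [
        ctx.Process(target=churn, args=(path, iterations))
        for _ in range(workers)
    ]
    for proc in procs:
        proc.start()
    for proc in procs:
        proc.join()
    return [proc.exitcode for proc in procs]


if __name__ == "__main__":
    # A local temporary directory is used only so the sketch runs anywhere;
    # to exercise an NFS client, pass a directory on an NFS mount instead.
    with tempfile.TemporaryDirectory() as scratch:
        print(run_workers(scratch))
```

On a local filesystem this simply finishes. Whether it triggers the hang described here depends on the NFS client and server involved, which this sketch does not model.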
A quick status... I am able to reproduce this and it appears I'm seeing the same thing Neil was seeing...
Created attachment 118104
Proposed Patch
Please give this patch a try. It stops an inode from being unhashed when
an ESTALE is returned on a getattr. This in turn stops the sync from
going into an infinite loop, which causes the machine to hang.
I was able to continuously run the above reproducer for
a 12-hour period without either RHEL3 client hanging.
TomK/JayT, has Q/A approved of a fix for this bug being taken into the final RHEL3 U6 kernel respin? If so, could we please get the Q/A management ack and the bug moved to the CanFix list?
SteveD, should committing your fix be gated on successful testing by the bug reporter? (This bug is still in NEEDINFO_REPORTER.)
Removing block against RHEL4 bug 166772.
yes
Created attachment 118308
Crash dump with the nfs hang patch applied
Created attachment 118346
crash log on test kernel IT 75445
Neil, we're waiting for you to confirm that SteveD's patch in comment #7 resolves the problem at your customer's site. We need this answer by the end of the day today! Thanks.
Steve, Ernie, Sorry for the delay. I can confirm that the attached patch fixes the reported problem. Now I think you said we just need a QA ACK to move this along.
Thanks, Neil. TomK/JayT, could you please do the final honors (QA ack and list move)? Thanks.
A fix for this problem has just been committed to the RHEL3 U6 patch pool this afternoon (in kernel version 2.4.21-36.EL).
An advisory has been issued which should help the problem described in this bug report. This report is therefore being closed with a resolution of ERRATA. For more information on the solution and/or where to find the updated files, please follow the link below. You may reopen this bug report if the solution does not work for you. http://rhn.redhat.com/errata/RHSA-2005-663.html