Bug 716345

Summary: [ext3/4] deadlock with quota+nfs+fsstress
Product: Red Hat Enterprise Linux 5
Reporter: Eryu Guan <eguan>
Component: kernel
Assignee: Eric Sandeen <esandeen>
Status: CLOSED WONTFIX
QA Contact: Red Hat Kernel QE team <kernel-qe>
Severity: medium
Docs Contact:
Priority: medium
Version: 5.7
CC: eguan, esandeen, rwheeler, uchiyama.yusaku
Target Milestone: rc
Target Release: ---
Hardware: All
OS: Linux
Doc Type: Bug Fix
Last Closed: 2014-06-02 13:23:17 UTC
Bug Depends On: 809207    
Bug Blocks: 743405    

Description Eryu Guan 2011-06-24 04:09:32 UTC
Description of problem:
While testing Bug 702197, a deadlock showed up; see the attachments for logs.

Version-Release number of selected component (if applicable):
kernel-2.6.18-268.el5

How reproducible:
Always

Steps to Reproduce:
From https://bugzilla.redhat.com/show_bug.cgi?id=702197#c12
My reproducer:

Export an ext3/ext4 filesystem via NFS, with quota enabled on the server.

Mount it on the nfs client, run fsstress in a loop on the nfs-mounted
filesystem, and at the same time do large multi-gig buffered IOs in a loop to
that same nfs filesystem.
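
A minimal sketch of the client-side load described above, assuming Python is available on the NFS client. The function names, mount point, and file names are hypothetical and not part of the original reproducer; the fsstress options used (-d, -n, -p) are the directory, operation count, and process count options of the xfstests fsstress tool, and the exact values are guesses.

```python
import subprocess
from typing import List


def fsstress_cmd(fsstress_bin: str, target_dir: str,
                 nops: int = 10000, nproc: int = 8) -> List[str]:
    """Build an fsstress command line (values are illustrative only)."""
    return [fsstress_bin, "-d", target_dir, "-n", str(nops), "-p", str(nproc)]


def buffered_write_cmd(target_file: str, size_gib: int = 4) -> List[str]:
    """Build a dd command that writes size_gib GiB of zeros with buffered IO."""
    return ["dd", "if=/dev/zero", f"of={target_file}",
            "bs=1M", f"count={size_gib * 1024}"]


def run_client_load(mount_point: str, fsstress_bin: str, iterations: int) -> None:
    """Run fsstress and a large buffered write concurrently, repeatedly."""
    for _ in range(iterations):
        # Start fsstress in the background on the NFS mount.
        stress = subprocess.Popen(fsstress_cmd(fsstress_bin, f"{mount_point}/fsstress"))
        # While fsstress runs, do one multi-gigabyte buffered write.
        subprocess.run(buffered_write_cmd(f"{mount_point}/bigfile"), check=False)
        stress.wait()
```

Something like run_client_load("/mnt/nfs", "/path/to/fsstress", 100) would start the load; both arguments are placeholders for the local setup.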

On the server, I use the "usemem" memory locking test from xfstests to
periodically lock 1G and free it.  This helps push kswapd along.
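
For readers without the xfstests "usemem" tool at hand, the following is a rough stand-in, not the actual tool: it maps anonymous memory, dirties every page so the kernel must back it, holds it briefly, and frees it. Unlike the real test it does not call mlock, and the function names, sizes, and timings are assumptions.

```python
import mmap
import time


def touch_and_release(size_bytes: int, hold_seconds: float = 0.0) -> int:
    """Map anonymous memory, write to every page, hold, then free it.

    Returns the number of pages touched.
    """
    region = mmap.mmap(-1, size_bytes)  # anonymous mapping, not file-backed
    try:
        touched = 0
        for offset in range(0, size_bytes, mmap.PAGESIZE):
            region[offset] = 1  # one write per page forces a real allocation
            touched += 1
        time.sleep(hold_seconds)
        return touched
    finally:
        region.close()  # unmap, returning the memory to the system


def pressure_loop(size_bytes: int = 1 << 30, hold_seconds: float = 5.0,
                  pause_seconds: float = 5.0, rounds: int = 10) -> None:
    """Periodically consume and free size_bytes of memory (1 GiB by default)."""
    for _ in range(rounds):
        touch_and_release(size_bytes, hold_seconds)
        time.sleep(pause_seconds)
```

Because the pages are not locked, the kernel may swap them out instead of reclaiming page cache, so this may generate less reclaim pressure than the original tool.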

This seems to lock things up pretty well with stock kernels.
  
Actual results:
deadlock

Expected results:
fsstress runs to completion without hanging

Additional info:
There is a separate bug for an ext3/4 + quota deadlock without NFS; see bug 650813.
If the two share the same root cause, this one can be marked as a duplicate.

Comment 1 Eric Sandeen 2011-06-24 15:50:22 UTC
This seems to be a generic issue for ext3 & ext4 with the quota infrastructure in RHEL5.

This upstream commit, and several that follow it in the same patchset, reworked this locking and will probably resolve the issue:

http://git.kernel.org/gitweb.cgi?p=linux/kernel/git/torvalds/linux-2.6.git;a=commitdiff;h=cc33412fb1f11613e20f9dfc2919a77ecd63fbc4

However, this is a fairly large rewrite, and we would need to evaluate the KABI impact and other risks.

Comment 2 Yusaku Uchiyama 2011-11-22 06:35:50 UTC
This bug has already been fixed in RHEL 5.7, by linux-2.6-fs-ext4-fix-quota-deadlock.patch.

Bug 650813 is a duplicate of this bug.

Comment 3 RHEL Program Management 2012-01-09 14:32:15 UTC
This request was evaluated by Red Hat Product Management for inclusion in Red Hat Enterprise Linux 5.8, and Red Hat does not plan to fix this issue in the currently developed update.

Contact your manager or support representative in case you need to escalate this bug.

Comment 5 RHEL Program Management 2012-10-30 05:58:13 UTC
This request was not resolved in time for the current release.
Red Hat invites you to ask your support representative to
propose this request, if still desired, for consideration in
the next release of Red Hat Enterprise Linux.

Comment 6 RHEL Program Management 2014-03-07 12:18:22 UTC
Thank you for submitting this request for inclusion in Red Hat Enterprise Linux 5. We've carefully evaluated the request, but are unable to include it in the last planned RHEL5 minor release. This Bugzilla will soon be CLOSED as WONTFIX. To request that Red Hat re-consider this request, please re-open the bugzilla via appropriate support channels and provide additional business and/or technical details about its importance to you.

Comment 7 RHEL Program Management 2014-06-02 13:23:17 UTC
Thank you for submitting this request for inclusion in Red Hat Enterprise Linux 5. We've carefully evaluated the request, but are unable to include it in the RHEL5 stream. If the issue is critical for your business, please provide additional business justification through the appropriate support channels (https://access.redhat.com/site/support).