Bug 122252 - ext3/quota deadlock condition consistently hangs systems
Keywords:
Status: CLOSED WONTFIX
Alias: None
Product: Red Hat Enterprise Linux 3
Classification: Red Hat
Component: kernel
Version: 3.0
Hardware: i686
OS: Linux
Priority: medium
Severity: high
Target Milestone: ---
Assignee: Stephen Tweedie
QA Contact:
URL: http://www.kernel.org/pub/linux/kerne...
Whiteboard:
Duplicates: 173135
Depends On:
Blocks:
 
Reported: 2004-05-02 14:34 UTC by Marc Wallman
Modified: 2007-11-30 22:07 UTC

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2007-10-19 19:26:56 UTC
Target Upstream Version:
Embargoed:


Attachments
Backport of fix for quota/ext3 deadlock from kernel-2.4.25 (2.83 KB, patch)
2004-10-28 23:00 UTC, David Lehman

Description Marc Wallman 2004-05-02 14:34:00 UTC
From Bugzilla Helper:
User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.6)
Gecko/20040207 Firefox/0.8

Description of problem:
We have been having trouble with 3ES hosts locking up when running
with quotas enabled on an ext3 filesystem. The problem happens at
random times, under both heavy and light loads. We are unable to run
more than a few days, regardless of the load, without our systems
locking up.

The bug was identified and fixed in the mainline 2.4.25 kernel, but as
far as I can tell, this fix has not yet been backported to the v3ES
kernel. I have checked both the changelog for the 3ES kernel and the
source code for 2.4.21-9.0.3.EL.

The fix was submitted in v2.4.25-pre5 by jack:ucw.cz. The 2.4.25
changelog is linked in the URL field. Can someone backport this patch?

Version-Release number of selected component (if applicable):


How reproducible:
Always

Steps to Reproduce:
1. Enable quotas on an ext3 filesystem.
2. Have disk activity on it (for us, uw-imapd is the kind
   of disk activity that generates the lockup).
3. Wait, probably not more than a few days.
    

Actual Results:  Our hosts will consistently hang after a few days. We
are unable to keep them stable enough with quotas enabled to run them
as production servers.

Expected Results:  The host should not lock up.

Additional info:

Comment 1 Rik van Riel 2004-05-02 16:03:18 UTC
Reassigned to ext3 author.

Comment 2 Kevin Fenzi 2004-09-08 18:15:01 UTC
Is any progress being made to track this issue down?
It seems to have been around for quite a while, and it means you
basically can't use quotas in a production env. 
I see it here on a server, usually less than a day after enabling quotas. 

I have sysrq output when it's in the deadlock state. 
Anything else we can do to help solve this issue?


Comment 3 David Lehman 2004-10-28 23:00:15 UTC
Created attachment 105927
Backport of fix for quota/ext3 deadlock from kernel-2.4.25

Comment 5 Michael Simms 2004-11-04 01:39:15 UTC
Does this mean we'll see an official EL kernel with this fix sometime
soon?

Comment 7 Ernie Petrides 2004-11-04 20:09:59 UTC
No fix for this problem has yet been committed to a RHEL3 patch pool,
and specifically U4 is already closed (and in beta now).

Comment 11 fkass 2005-01-27 15:52:04 UTC
This really should be increased in priority!  We are seeing this same
problem and it is creating major issues for us.  Do we apply this
outdated patch onto 2.4.21-27.0.2.ELsmp?  Do we ignore RH kernels and
just put in 2.6.10 which is supposed to have fixed the problem?  Do we
step our filesystem back down to ext2?  I'd like to know how RH
suggests we fix the problem...

Comment 12 strovato 2005-12-01 18:06:26 UTC
Is this patch going to be added to the official Red Hat kernel at some point?  I
was bitten by this bug, but compiling a custom kernel with the attached patch has
fixed the problem.

Comment 14 strovato 2006-01-23 14:46:58 UTC
*** Bug 173135 has been marked as a duplicate of this bug. ***

Comment 15 strovato 2007-07-25 07:35:46 UTC
Please add this patch to the official Red Hat kernel.  Thank you.

Comment 16 RHEL Program Management 2007-10-19 19:26:56 UTC
This bug is filed against RHEL 3, which is in maintenance phase.
During the maintenance phase, only security errata and select mission
critical bug fixes will be released for enterprise products. Since
this bug does not meet those criteria, it is now being closed.
 
For more information on the RHEL errata support policy, please visit:
http://www.redhat.com/security/updates/errata/
 
If you feel this bug is indeed mission critical, please contact your
support representative. You may be asked to provide detailed
information on how this bug is affecting you.

