Bug 122252 - ext3/quota deadlock condition consistently hangs systems
Product: Red Hat Enterprise Linux 3
Classification: Red Hat
Component: kernel
Hardware: i686 Linux
Priority: medium
Severity: high
Assigned To: Stephen Tweedie
Duplicates: 173135
Reported: 2004-05-02 10:34 EDT by Marc Wallman
Modified: 2007-11-30 17:07 EST
Doc Type: Bug Fix
Last Closed: 2007-10-19 15:26:56 EDT

Attachments
Backport of fix for quota/ext3 deadlock from kernel-2.4.25 (2.83 KB, patch)
2004-10-28 19:00 EDT, David Lehman

Description Marc Wallman 2004-05-02 10:34:00 EDT
Description of problem:
We have been having trouble with 3ES hosts locking up when running
with quotas enabled on an ext3 filesystem. The problem happens at
random times, under both heavy and light loads. We are unable to run
more than a few days, regardless of the load, without our systems
locking up.

The bug was identified and fixed in the mainline 2.4.25 kernel, but as
far as I can tell, this fix has not been backported yet to the v3ES
kernel. I have examined the changelog for the 3ES kernel and
looked at the source code for 2.4.21-9.0.3.EL.

The fix was submitted in v2.4.25-pre5 by jack:ucw.cz. See the URL to
the 2.4.25 changelog in the URL field. Can someone backport this patch?

Version-Release number of selected component (if applicable):

How reproducible:

Steps to Reproduce:
1. Enable quotas on an ext3 filesystem.
2. Have disk activity on it (for us, uw-imapd is the kind
   of disk activity that generates the lockup)
3. Wait, probably not more than a few days.

Actual Results:  Our hosts will consistently hang after a few days. We
are unable to keep them stable enough with quotas enabled to run them
as production servers.

Expected Results:  The host should not lock up.

Additional info:
Comment 1 Rik van Riel 2004-05-02 12:03:18 EDT
Reassigned to ext3 author.
Comment 2 Kevin Fenzi 2004-09-08 14:15:01 EDT
Is any progress being made to track this issue down?
It seems to have been around for quite a while, and it means you
basically can't use quotas in a production env. 
I see it here on a server, usually less than a day after enabling quotas. 

I have sysrq output when it's in the deadlock state. 
Anything else we can do to help solve this issue?
Comment 3 David Lehman 2004-10-28 19:00:15 EDT
Created attachment 105927 [details]
Backport of fix for quota/ext3 deadlock from kernel-2.4.25
Comment 5 Michael Simms 2004-11-03 20:39:15 EST
Does this mean we'll see an official EL kernel with this fix sometime
Comment 7 Ernie Petrides 2004-11-04 15:09:59 EST
No fix for this problem has yet been committed to a RHEL3 patch pool,
and specifically U4 is already closed (and in beta now).
Comment 11 fkass 2005-01-27 10:52:04 EST
This really should be increased in priority!  We are seeing this same
problem and it is creating major issues for us.  Do we apply this
outdated patch onto 2.4.21-27.0.2.ELsmp?  Do we ignore RH kernels and
just put in 2.6.10 which is supposed to have fixed the problem?  Do we
step our filesystem back down to ext2?  I'd like to know how RH
suggests we fix the problem...
Comment 12 strovato 2005-12-01 13:06:26 EST
Is this patch going to be added to the official Red Hat kernel at some point?  I
was bitten by this bug, but compiling a custom kernel with the attached patch has
fixed the problem.
Comment 14 strovato 2006-01-23 09:46:58 EST
*** Bug 173135 has been marked as a duplicate of this bug. ***
Comment 15 strovato 2007-07-25 03:35:46 EDT
Please add this patch to the official Red Hat kernel.  Thank you.
Comment 16 RHEL Product and Program Management 2007-10-19 15:26:56 EDT
This bug is filed against RHEL 3, which is in maintenance phase.
During the maintenance phase, only security errata and select mission
critical bug fixes will be released for enterprise products. Since
this bug does not meet that criteria, it is now being closed.
For more information of the RHEL errata support policy, please visit:
If you feel this bug is indeed mission critical, please contact your
support representative. You may be asked to provide detailed
information on how this bug is affecting you.
