
Bug 162814

Summary: Assertion failure in log_do_checkpoint
Product: Red Hat Enterprise Linux 4
Component: kernel
Version: 4.0
Hardware: i686
OS: Linux
Status: CLOSED ERRATA
Severity: high
Priority: medium
Reporter: Stephen Tweedie <sct>
Assignee: Stephen Tweedie <sct>
QA Contact: Brian Brock <bbrock>
CC: davej, hhd405131, jwelden, mike, redhat.com
Fixed In Version: RHSA-2006-0132
Doc Type: Bug Fix
Last Closed: 2006-03-07 19:17:12 UTC
Bug Depends On: 123137
Bug Blocks: 168429

Description Stephen Tweedie 2005-07-08 21:23:13 UTC
+++ This bug was initially created as a clone of Bug #123137 +++

Description of problem:
Assertion failure in log_do_checkpoint() at fs/jbd/checkpoint.c:361:
"drop_count != 0 || cleanup_ret != 0"
------------[ cut here ]------------
kernel BUG at fs/jbd/checkpoint.c:361!

Version-Release number of selected component (if applicable):
2.6.5-1.327

How reproducible:
Rare

System was a dual Xeon with AMI Megaraid RAID controller.  File
systems are Ext3.

I'll attach the oops output in a second.

Comment 3 Dave Jones 2005-09-05 03:53:35 UTC
*** Bug 167343 has been marked as a duplicate of this bug. ***

Comment 4 Jeff Welden 2005-09-12 23:29:40 UTC
There is a one-line fix for this by Jan Kara in the vanilla Linux kernel as of
2.6.11.12:
    http://www.kernel.org/pub/linux/kernel/v2.6/ChangeLog-2.6.11.12

Additional discussion:
    http://lkml.org/lkml/2005/6/1/34
    http://marc.theaimsgroup.com/?l=linux-kernel&m=111761151011571&w=2

Is it possible for you to create a patch for this for the 2.6.9-11 EL smp kernel?

Comment 5 Need Real Name 2005-09-14 22:52:55 UTC
I've tried this patch, and it DOES seem to fix this problem! Well done!
Hopefully Red Hat will create a kernel update ASAP.


Comment 6 Need Real Name 2005-10-03 20:07:40 UTC
This patch has been in production for 3 weeks now without a single problem. 
These machines would PANIC almost daily before, mostly at night when we were
running backups.  

Maybe this problem is mostly associated with high-end hardware, like DL380s, but
I would think that Red Hat would be interested in fixing such a serious problem,
especially one that affects their target hardware.

So far, I've heard nothing to show that Red Hat is interested in fixing this.

Will this patch be included in a future kernel?

Comment 7 Stephen Tweedie 2005-10-05 20:24:57 UTC
Yes, this fix looks good, and it matches the upstream fix.  It will be queued
for the U3 kernel, subject to the usual internal review.

I have a kernel built based on U2 plus 3 filesystem fixes:
* readahead fixes for random >4k read performance
* ext3 performance fix for very slow performance when writing large files on
huge filesystems
* this log_do_checkpoint fix.

i686 and x86_64 kernels are available from:

http://people.redhat.com/sct/.private/test-kernels/kernel-2.6.9-22.EL.sct.4/

Comment 9 Stephen Tweedie 2005-11-07 19:12:33 UTC
Fix committed for inclusion in U3.

Comment 12 Red Hat Bugzilla 2006-03-07 19:17:12 UTC
An advisory has been issued which should resolve the problem
described in this bug report. This report is therefore being
closed with a resolution of ERRATA. For more information
on the solution and/or where to find the updated files,
please follow the link below. You may reopen this bug report
if the solution does not work for you.

http://rhn.redhat.com/errata/RHSA-2006-0132.html


Comment 15 Jason Baron 2006-07-27 19:37:41 UTC
*** Bug 200434 has been marked as a duplicate of this bug. ***