Red Hat Bugzilla – Bug 85379
Panic caused by bug in ext3 transaction handling.
Last modified: 2007-11-30 17:06:52 EST
From Bugzilla Helper:
User-Agent: Mozilla/4.76 [en] (X11; U; Linux 2.4.2-2smp i686)
Description of problem:
The panic is due to a failed J_ASSERT assertion at line 227 of
fs/jbd/transaction.c in function journal_start().
J_ASSERT(handle->h_transaction->t_journal == journal);
The gist of the problem is that while a new ext3 inode was being created
(transaction #1, inode #1), an attempt to allocate kernel heap memory for the
new in-memory inode from the inode kmem cache triggered an attempt to expand
this cache with new slab pages. The expansion attempt, finding no free pages,
initiated memory reap/shrink/prune actions, which caused some other ext3 inode
to be deleted as a result of trying to prune the dcache. The attempt to delete
the second inode initiated the creation of a second transaction before the
first one had completed. It is suspected that ext3 is not designed to handle
embedded (nested) transactions within the same process.
Since this is all occurring within the context of a single "cp" process, the
J_ASSERT fails when it tries to assert equivalence between the address of the
data structures for the first and second transactions.
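The failing check can be modelled in user space. The sketch below is not the
jbd code: the struct layouts are cut down to the fields named in the assertion,
and current_handle, nested_start_is_safe() and model_journal_start() are
hypothetical names used only for illustration. It assumes that journal_start()
looks up a handle already held by the running task and, if one exists, asserts
that it belongs to the journal now being started.

```c
#include <assert.h>
#include <stddef.h>

#define J_ASSERT(expr) assert(expr)   /* user-space stand-in for the kernel macro */

/* Cut-down stand-ins: only the fields the assertion reads. */
struct journal     { int id; };
struct transaction { struct journal *t_journal; };
struct handle      { struct transaction *h_transaction; int h_ref; };

/* Hypothetical model of the per-task "handle already open" pointer. */
static struct handle *current_handle = NULL;

/* Returns 1 if a start on `journal` would satisfy the assertion, 0 if not. */
static int nested_start_is_safe(const struct journal *journal)
{
    if (current_handle == NULL)
        return 1;   /* nothing open, nothing to compare against */
    return current_handle->h_transaction->t_journal == journal;
}

/* Model of the nested path: reuse the open handle, or trip the assertion. */
static struct handle *model_journal_start(struct journal *journal)
{
    if (current_handle != NULL) {
        J_ASSERT(current_handle->h_transaction->t_journal == journal);
        current_handle->h_ref++;   /* nested start on the same journal */
        return current_handle;
    }
    return NULL;   /* the real code would create a new handle here */
}
```

In this model a nested start on the same journal only bumps the reference
count. A nested start on a different journal, which is what the delete of the
second inode would amount to if that inode lives on another ext3 filesystem,
trips J_ASSERT.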
A possible fix is to change the call to kmem_cache_alloc() in the inode_alloc()
macro called from get_empty_inode() in fs/inode.c to use GFP flags of GFP_NOFS
instead of GFP_KERNEL. The lack of the __GFP_FS flag in the GFP flags parameter
to shrink_dcache_memory() will force shrink_dcache_memory() to return without
trying to prune the dcache and possibly free (and delete) inodes.
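The effect of the proposed flag change can be sketched as a small user-space
model. The MODEL_* constants use made-up bit values (the real kernel values
differ), and model_shrink_dcache_memory() and model_demo() are hypothetical
stand-ins. The only behaviour taken from the description above is the early
return when the filesystem flag is absent.

```c
#include <assert.h>

/* Illustrative bit values only, NOT the real kernel constants. */
#define MODEL_GFP_WAIT   0x01u
#define MODEL_GFP_IO     0x02u
#define MODEL_GFP_FS     0x04u   /* "caller may re-enter filesystem code" */
#define MODEL_GFP_KERNEL (MODEL_GFP_WAIT | MODEL_GFP_IO | MODEL_GFP_FS)
#define MODEL_GFP_NOFS   (MODEL_GFP_WAIT | MODEL_GFP_IO)

/* Counts how often the model would have pruned the dcache. */
static int dentries_pruned = 0;

/*
 * Model of the gate described above: without the FS flag the shrinker
 * returns immediately, so no dentry is pruned and no inode is deleted.
 */
static int model_shrink_dcache_memory(unsigned int gfp_mask)
{
    if (!(gfp_mask & MODEL_GFP_FS))
        return 0;          /* caller cannot tolerate filesystem re-entry */
    dentries_pruned++;     /* stands in for pruning, which may delete inodes */
    return 1;
}

/* Usage: the same memory pressure, seen through the two flag sets. */
static void model_demo(void)
{
    assert(model_shrink_dcache_memory(MODEL_GFP_KERNEL) == 1);
    assert(model_shrink_dcache_memory(MODEL_GFP_NOFS) == 0);
}
```

With the NOFS-style mask, the allocation made while the first transaction is
open can no longer reach the prune path in this model, so the second
transaction is never started.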
The kernel call stack is below:
Version-Release number of selected component (if applicable):
Steps to Reproduce:
Run heavy I/O against ext3 filesystems.
What modules are in use?
Which kernel version exactly was this, and can you please post the full OOPS?
The kernel being used is 2.4.9-e.12smp with the debugger enabled. (Ed, Jimmy,
please correct me if that is incorrect.)
The kdb lsmod listing of modules loaded at the time of the panic is as follows:
emcppn (EMC PowerPath module)
emcpmpc (EMC PowerPath module)
emcpmp (EMC PowerPath module)
emcp (EMC PowerPath module)
Ed is actually doing the debugging on this system and has been in kdb. Is
there a way to grab the oops data from kdb? If so, what is it?
Also, Ed just noticed that the i_dev field of the 2nd inode (the one being
deleted at the top of the call stack) indicates that it belongs to a file
object on one of our (EMC PowerPath) managed disks. This is the first
indication he has seen that PowerPath may be involved in the problem.
This bug is filed against RHEL2.1, which is in maintenance phase.
During the maintenance phase, only security errata and select mission
critical bug fixes will be released for enterprise products. Since
this bug does not meet those criteria, it is now being closed.
For more information on the RHEL errata support policy, please visit:
If you feel this bug is indeed mission critical, please contact your
support representative. You may be asked to provide detailed
information on how this bug is affecting you.