Red Hat Bugzilla – Bug 49578
ext3 kernel messages when writing large files / file size irregularity
Last modified: 2007-04-18 12:34:57 EDT
Description of problem:
While writing large files (512MB - 1024MB) with a utility called Bonnie
(www.textuality.com/bonnie, source attached) on ext3 partitions located on
a RAID 1 drive array, the following kernel messages were observed:
(hostname) kernel: JBD: out of memory for journal head.
(hostname) kernel: ext3_write_inode: inside transaction!
The machine is a P3-667MHz with 256MB RAM and an IBM ServeRAID 4M controller
(this also happens with a ServeRAID 4L).
It looks like the file ends up smaller than requested. Bonnie writes a file
of the size you give it and then reads it back in. It reports an error
because it hits EOF before it has read the number of bytes it was supposed
to.
Steps to Reproduce:
1. Use bonnie to benchmark a RAID 1 drive array using a 1024MB file:
   bonnie -s 1024 -d <dir on RAID 1>
2. The kernel messages described above should appear, and bonnie should
fail while "Reading with getc()..."
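The size irregularity described above can also be checked without Bonnie. The following is only a rough sketch of that write-then-read-back check, not Bonnie's actual code: the function name write_then_verify, the 1 MiB chunk size, and the zero-filled data are all assumptions made for illustration. It writes a file of the requested size into a directory, then compares the requested size, the size reported by stat, and the number of bytes actually read before EOF.

```python
import os
import tempfile

CHUNK = 1 << 20  # 1 MiB per write; an arbitrary choice for this sketch


def write_then_verify(directory, size_mb):
    """Write size_mb MiB to a scratch file in directory, then read it back.

    Returns (expected_bytes, stat_bytes, read_bytes). On a healthy
    filesystem all three values are equal.
    """
    expected = size_mb * CHUNK
    block = b"\0" * CHUNK
    fd, path = tempfile.mkstemp(dir=directory)
    try:
        with os.fdopen(fd, "wb") as f:
            for _ in range(size_mb):
                f.write(block)
            f.flush()
            os.fsync(f.fileno())  # ask the kernel to push the data to disk
        stat_bytes = os.stat(path).st_size
        read_bytes = 0
        with open(path, "rb") as f:
            while True:
                data = f.read(CHUNK)
                if not data:  # EOF reached
                    break
                read_bytes += len(data)
        return expected, stat_bytes, read_bytes
    finally:
        os.unlink(path)  # always remove the scratch file
```

Calling write_then_verify("<dir on RAID 1>", 1024) would roughly correspond to the 1024MB run above; a read_bytes value below expected would match the early EOF that Bonnie reports.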
Created attachment 24367
source for Bonnie drive benchmarking utility
The first message is a warning only, but the file should be ok, so this is
indeed a bug.
Both messages are warnings. The second message is benign: it is simply a result
of a debugging message which escaped into the wild and has since been
eliminated. The out-of-memory error means that the kernel ran out of memory at
a critical point for the filesystem. The filesystem will retry in that case
until the allocation succeeds, so it is not the root cause of any file
corruption, but it is entirely probable that, once we got into such a
low-memory state, the kernel would start failing other filesystem write
operations with -ENOMEM for other reasons which might not have been logged.
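The retry behaviour described above can be illustrated with a small user-space sketch. This is an analogy only, not the kernel's JBD code: alloc_with_retry, try_alloc, and warn are hypothetical names, and None stands in for a failed allocation. The point it shows is that the caller never sees the failure; it only sees a warning and a delay.

```python
import time


def alloc_with_retry(try_alloc, warn, pause=0.0):
    """Call try_alloc() until it returns something other than None.

    try_alloc and warn are hypothetical callbacks for this sketch:
    try_alloc returns None on failure, warn receives a message string.
    """
    result = try_alloc()
    if result is None:
        # Warn once, not on every failed attempt.
        warn("out of memory, retrying")
        while result is None:
            time.sleep(pause)  # give other work a chance to free memory
            result = try_alloc()
    return result
```

Because the loop only exits on success, an allocation failure of this kind can stall a write but cannot by itself truncate a file, which is consistent with looking for the cause elsewhere.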
There is a new version of ext3 being pushed out without the second debugging
message, and with more informative output about out-of-memory situations, but
the underlying low-memory problems may remain. We will let you know when that
build is available for you --- it has been built locally already and will be on
I really suspect that the remaining part of the problem is VM-related, not
filesystem-related (although we know for certain that ext3's pattern of VM use
does cause problems for the VM that ext2 does not provoke). The results from
the newer build will be useful in determining this.
This defect is considered SHOULD-FIX for Fairfax.
Could you try the kernel from the Roswell beta or from rawhide? There is still
VM tuning being done, but the ext3 debug logging has been cleaned up enormously.
The Roswell kernel does seem to have cleared this up - we no longer get those