Red Hat Bugzilla – Bug 119033
Random ext3 filesystem corruption under heavy disk activity load
Last modified: 2007-11-30 17:07:01 EST
From Bugzilla Helper:
User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.0.1)
Description of problem:
We originally installed a RHEL3 system on a dual-processor
hyperthreaded P4 Xeon machine. After about three weeks of uptime, it
developed ext3 filesystem corruption (for example, random files would
suddenly appear to have sizes in the multi-terabyte range). The
corruption kept recurring even after the filesystem was fscked, so we
replaced the server with a nearly identical machine running RH9 with
a single processor (a hyperthreaded P4 Xeon).
It _also_ developed ext3 filesystem corruption after about 3 weeks of
uptime. When I attempted to delete a corrupted file entry, the entire
server crashed and could not be recovered using fsck.
Version-Release number of selected component (if applicable):
Steps to Reproduce:
1. Install RHEL3/RH9 on a dual or single P4 Xeon system with
hyperthreading enabled, a 3ware SATA RAID5 array, and 1 gigabyte of
RAM. Disable 'atime' for the partitions.
2. Install the qmail mail server.
3. Run under sustained mail traffic load (~40,000 messages per day) for
roughly 3 weeks.
4. Run nightly rsync backups of the entire server.
Actual Results: Corruption in random places of the ext3 filesystem -
the corruption appears _anywhere_ in the filesystem, even in
directories where nothing has been modified.
Expected Results: No filesystem corruption
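For anyone trying to spot the same symptom before touching the damaged
entries, here is a minimal Python sketch (not something we ran on the
affected servers) that walks a directory tree and reports regular files
whose reported size is larger than a limit. By default the limit is the
total capacity of the filesystem holding the starting directory. The
function name find_implausible_sizes and its limit parameter are my own
invention for illustration. Legitimate sparse files can also exceed the
capacity, so treat any hit as a hint to investigate, not as proof of
corruption.

```python
import os
import stat
import sys


def find_implausible_sizes(root, limit=None):
    """Yield (path, size) for regular files under root larger than limit.

    If limit is None, it defaults to the total capacity in bytes of the
    filesystem that contains root (hypothetical heuristic, see note above).
    """
    if limit is None:
        vfs = os.statvfs(root)
        limit = vfs.f_frsize * vfs.f_blocks
    root_dev = os.lstat(root).st_dev
    for dirpath, dirnames, filenames in os.walk(root):
        # Prune directories on other devices so the capacity limit
        # stays meaningful for everything we inspect.
        kept = []
        for d in dirnames:
            try:
                if os.lstat(os.path.join(dirpath, d)).st_dev == root_dev:
                    kept.append(d)
            except OSError:
                pass  # unreadable entry: skip it rather than abort the scan
        dirnames[:] = kept
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                st = os.lstat(path)
            except OSError:
                continue
            if stat.S_ISREG(st.st_mode) and st.st_size > limit:
                yield path, st.st_size


if __name__ == "__main__":
    start = sys.argv[1] if len(sys.argv) > 1 else "."
    for path, size in find_implausible_sizes(start):
        print(f"{size:>20d}  {path}")
```

The scan only calls lstat on each entry and never opens or removes
anything, which matters here because deleting a corrupted entry is
what took our server down.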
Our RHEL3 server id is 1004130933. The second box is identical, except
it was running RH9 and only had one processor instead of two.
I've been doing some Google digging, and discovered this may be a
3ware hardware issue. There is a thread indicating that 3ware 66MHz
products have a serious problem on the Intel 750X chipset and some
AMD boards, particularly when used with a manufacturer riser board.
3ware appears to be trying to keep a low profile on it, but there is a
technical brief on it.
As the second comment pointed out, this would appear to be a 3ware issue. We
didn't get any other reports of ext3 corruption like this. I'm closing this bug
out as NOTABUG since it appears it was a hardware issue.