Bug 57225

Summary: FileSystem corruption using RAID-1
Product: [Retired] Red Hat Linux
Reporter: Gilles CHAUVIN <gilles.chauvin>
Component: raidtools
Assignee: Dave Jones <davej>
Status: CLOSED WONTFIX
QA Contact: David Lawrence <dkl>
Severity: high
Priority: high
Version: 7.2
CC: pfrields
Target Milestone: ---   
Target Release: ---   
Hardware: i686   
OS: Linux   
Doc Type: Bug Fix
Last Closed: 2004-11-25 07:08:23 UTC

Description Gilles CHAUVIN 2001-12-07 11:02:57 UTC
From Bugzilla Helper:
User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:0.9.6) Gecko/20011120

Description of problem:
When I install Red Hat Linux 7.2 with RAID-1, I get repeated filesystem
corruption.

Hardware/Software Description:
- Mobo MS-6309 V2.X [MSI M6309B] (BIOS v3.7)
- HDDs Western Digital 40GB (Model=WDC WD400BB-00CLB0, FwRev=05.04E05)
- HDDs installed using RAID-1
- EXT3 filesystem

After installing the distro, all the RPMs were updated to the latest
versions (using up2date).

A large amount of data (about 14GB) is copied onto the largest partition
on the system (/home). After doing this, I unmount /home and then run
"fsck". Several errors are found. If I run "fsck" several times, it finds
errors on the filesystem every time.
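
The repeated check described above could be scripted roughly as follows. This is only a sketch: repeat_fsck and the FSCK_CMD variable are hypothetical names, the default checker is assumed to be e2fsck (ext2/ext3), and the device must be unmounted first.

```shell
#!/bin/sh
# Hypothetical helper: run a forced, read-only check several times on an
# unmounted device and print the exit status of each run.
# FSCK_CMD is an assumption so the checker can be swapped out; the default
# uses e2fsck with -f (force a check) and -n (read-only, answer "no").
FSCK_CMD=${FSCK_CMD:-"e2fsck -f -n"}

repeat_fsck() {
    device=$1
    runs=${2:-3}
    i=1
    while [ "$i" -le "$runs" ]; do
        # Word splitting of $FSCK_CMD is intentional (command plus options).
        $FSCK_CMD "$device" >/dev/null 2>&1
        echo "run $i: exit status $?"
        i=$((i + 1))
    done
}

# Usage (device name is an assumption): repeat_fsck /dev/md2 5
```

With e2fsck, exit status 0 means no errors were found and 4 means errors were left uncorrected, so a healthy filesystem should report 0 on every run.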

Strange things sometimes happen, and I have seen this on several
partitions. The example below is for the / partition:
# ls /usr/bin/tic -l
-rwxr-xr-x    1 100204544 589824   2539828910527436 Jul 20 14:25
/usr/bin/tic

The first HDDs were IBM Deskstar 60GXP drives. I thought they were the
cause of the problem and replaced them with WDC HDDs... Same problem.

Running md5sum on _ALL THE FILES_ in the system and then verifying the
files immediately afterwards shows errors for some files that were not
modified between the 1st and the 2nd md5 check.

Switching back to EXT2 doesn't solve the problem. Connecting the HDDs to
another PC doesn't solve it either (the other PC is totally different:
different motherboard, different chipset, etc.).

Installing RH7.2 on a single-HDD system (not using RAID) is OK and no
file corruption happens. That's why I think it is a RAID-1 related problem.

Version-Release number of selected component (if applicable):


How reproducible:
Always

Steps to Reproduce:
- Install RH 7.2 on a system with two HDDs, with EXT3/RAID-1 support.
- Use up2date to upgrade the distro (<= I'm sure this step isn't required)
- Copy a large number of files onto a partition (e.g. /home)
- run:
  # find / -type f -exec md5sum {} \; > /MD5SUM

  and then
  # cd /
  # md5sum --check /MD5SUM 2>&1 | grep -v ": OK$"

  You'll probably get some errors :(

  # Run fsck to check the partitions for errors.
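
The two md5sum passes above can be wrapped into a small script so they are easy to repeat on a single directory tree. This is a minimal sketch, not part of the original report: record_sums and check_sums are hypothetical names, GNU find and md5sum are assumed, and -xdev plus the -path exclusion are additions that keep find on one filesystem and keep the checksum list out of its own listing.

```shell
#!/bin/sh
# record_sums ROOT LIST (hypothetical helper):
# write one md5sum line for every regular file under ROOT into LIST.
record_sums() {
    # -xdev stays on ROOT's filesystem (skips /proc and other mounts);
    # "! -path" leaves the list file itself out of the listing.
    find "$1" -xdev -type f ! -path "$2" -exec md5sum {} + > "$2"
}

# check_sums LIST (hypothetical helper):
# re-read every file named in LIST and print only the lines that are not OK.
check_sums() {
    md5sum --check "$1" 2>&1 | grep -v ': OK$'
}

# Usage (paths are assumptions):
#   record_sums /home /root/MD5SUM
#   check_sums /root/MD5SUM
```

If no file changed between the two passes, check_sums should print nothing; any FAILED line for an untouched file points to the kind of corruption reported here.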

Actual Results:  Filesystem corruption

Expected Results:  No errors, so that my fileserver runs well :)

Additional info:

I have not posted all the tests I ran to identify this problem,
as it would take very long to describe them :). But I can post any
additional information that could help to solve this....

Comment 1 Gilles CHAUVIN 2001-12-11 10:02:30 UTC
Tried RH7.1 on the same hardware.... Exactly the same problem...

I just decided to try another distro (Mandrake 8.0) to see if it changes
anything. The problem doesn't occur with that distro.