Red Hat Bugzilla – Bug 223217
Files on RAID experiencing corruption
Last modified: 2007-11-30 17:11:53 EST
Description of problem:
I have an Asus A8V-SE with a VIA VT6420 RAID controller. There are two hard
drives, hda (new) and hdb (old, possibly dying). Two other drives, sda and sdb,
are on the RAID controller and are configured as a RAID-1 volume, md0. This is
an FC6 install, with kernel 2.6.18-1.2798.fc6 #1 SMP x86_64. All filesystems
are ext3.
I mounted hdb5 and copied my old partition to md0. After a little while I
found a file with a strange one-bit error in it. Since the old drive is
probably dying of bad sectors, I assumed the fault was there. My plan for
correcting it was to make a second copy and diff the two: if the corruption is
random, the diff should show all the errors, and I can correct them manually.
So I made a second copy to hda2, and then diffed hda2 against md0. This
returned a large number of differences. I manually corrected the errors in one
subdirectory that I needed, and noticed that every error was in the md0 copy
and none were in the hda2 copy. I am now diffing hda2 against hdb5 and so far
there are no differences at all. This implies that the errors on md0 were
introduced by the RAID somehow.
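The comparison described above can be sketched in a few lines of Python. This is only an illustration of the approach, not the commands actually used for this report: the function names `bit_diffs` and `compare_trees` are made up for the example, and the mount points in the usage note are hypothetical. It walks a reference tree, compares each file byte by byte against the same path in the suspect tree, and flags differences where exactly one bit is flipped.

```python
import os

CHUNK = 1 << 20  # read files 1 MiB at a time


def bit_diffs(path_a, path_b):
    """Yield (byte_offset, xor) for each differing byte in two files.

    xor is None if one file is shorter than the other at that offset.
    """
    with open(path_a, "rb") as fa, open(path_b, "rb") as fb:
        offset = 0
        while True:
            a = fa.read(CHUNK)
            b = fb.read(CHUNK)
            if not a and not b:
                break
            if a != b:
                for i, (x, y) in enumerate(zip(a, b)):
                    if x != y:
                        yield offset + i, x ^ y
                if len(a) != len(b):
                    # Length mismatch: report where the shorter file ends.
                    yield offset + min(len(a), len(b)), None
                    return
            offset += len(a)


def compare_trees(ref_root, suspect_root):
    """Return a sorted list of (relative_path, offset, xor, is_single_bit)."""
    results = []
    for dirpath, _dirnames, filenames in os.walk(ref_root):
        for name in filenames:
            ref_path = os.path.join(dirpath, name)
            rel = os.path.relpath(ref_path, ref_root)
            suspect_path = os.path.join(suspect_root, rel)
            if not os.path.isfile(suspect_path):
                continue  # file missing from the suspect copy; skipped here
            for offset, xor in bit_diffs(ref_path, suspect_path):
                single = xor is not None and bin(xor).count("1") == 1
                results.append((rel, offset, xor, single))
    return sorted(results, key=lambda r: (r[0], r[1]))
```

Assuming the two copies were mounted at, say, /mnt/hda2 and /mnt/md0 (hypothetical paths), `compare_trees("/mnt/hda2", "/mnt/md0")` would list each differing byte, and the last field shows whether the difference is a single flipped bit.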
Version-Release number of selected component (if applicable):
Stock FC6 install.
How reproducible:
The partition is 30 GB and I've only got this one machine, so I can't attempt
to reproduce it, but I suspect it's reproducible.
Steps to Reproduce:
Actual results:
Files on md0 show one-bit corruption. Somewhere between 200 and 400 such errors
in 30 GB of files.
Expected results:
Files are corruption-free.
Additional info:
Willing to test. I'd desperately like the RAID to work. I haul very large sets
of files around, and early testing has shown the RAID-1 to cut time by 66%.
*** This bug has been marked as a duplicate of 223216 ***