Red Hat Bugzilla – Bug 159590
corrupted data using software-raid (md)
Last modified: 2007-11-30 17:11:07 EST
From Bugzilla Helper:
User-Agent: Mozilla/5.0 (Windows; U; Windows NT 5.1; nl-NL; rv:1.7.5) Gecko/20041202 Firefox/1.0
Description of problem:
My system was an updated version of FC3. It had three 180 GB drives (PATA) that I combined using software RAID, level 5 (as /dev/md0). I created the array at install time using the default partitioning tools. On top of that I used LVM, and on top of that I had a few partitions (ext3), including the root directory (booting from a separate hard disk that was not included in the RAID set).
After a few months of use all the filesystems were suddenly completely corrupted. The system could not boot anymore (it couldn't find init), and when I tried to mount the partitions using a rescue CD, a live CD or a completely new install on a separate drive, I could not get /dev/md0 mounted.
I tried reinstalling everything, and now at install time, when the installer is formatting the partitions, it fails and reboots (after giving a message that something serious happened).
I have now reinstalled FC3 on the separate hard disk (not one of the three 180 GB drives) without creating the RAID array at all. When I create a RAID-5 set after the install, using mdadm, the filesystem on the newly created /dev/md0 becomes corrupted after a few hours of use. After unmounting, it won't remount. Because I wanted to rule out the possibility of drive (or controller) failure, I partitioned the individual drives with fdisk and put an ext3 partition directly on each of them. I filled the three drives up with 1 GB files. No problem. Reading back a few of them (e.g. cat < 1gbfile > /dev/null) gives no problem either.
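The read-back check described above (cat < 1gbfile > /dev/null) shows only that a file can be read, not that its contents are unchanged. A minimal sketch of a stricter check follows, assuming Python is available on the test system: it writes random data, records a SHA-256 digest while writing, and compares it with the digest of the data read back. The function names and the file name are hypothetical, and this is not the procedure used in this report.

```python
import hashlib
import os
import tempfile


def write_test_file(path, size_bytes, chunk_size=1 << 20):
    """Write size_bytes of random data to path and return its SHA-256 hex digest."""
    digest = hashlib.sha256()
    remaining = size_bytes
    with open(path, "wb") as f:
        while remaining > 0:
            chunk = os.urandom(min(chunk_size, remaining))
            f.write(chunk)
            digest.update(chunk)
            remaining -= len(chunk)
        f.flush()
        os.fsync(f.fileno())  # ask the OS to push the data out to the device
    return digest.hexdigest()


def file_digest(path, chunk_size=1 << 20):
    """Read the whole file back and return its SHA-256 hex digest."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


if __name__ == "__main__":
    # Small demo in a temporary directory. For a real test, point the path at
    # the mounted filesystem under test and use a much larger size.
    with tempfile.TemporaryDirectory() as d:
        path = os.path.join(d, "testfile.bin")  # hypothetical file name
        expected = write_test_file(path, 4 * 1024 * 1024)
        print("match" if file_digest(path) == expected else "MISMATCH")
```

Note that a file read back shortly after it was written may be served from the page cache rather than from the disk, so for a meaningful result the filesystem should be unmounted and remounted (or the machine rebooted) between writing and verifying.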
This means I have tried the following "chains":
direct partitions on the drives: no problems
combine the drives using raid-5: corruption
combine the drives using raid-5 and then using LVM on top of that: corruption.
This problem may be related to bug-nr 152162 but I'm not sure.
Version-Release number of selected component (if applicable):
kernel-2.6.9-1.667 (but probably later updates as well)
Steps to Reproduce:
Scenario 1 (at install time):
1. Start installation of FC3.
2. Create a RAID-5 set using three 180 GB disks (for all three disks: partition 1: 256 MB swap, partition 2: remaining size of disk for software RAID).
3. Continue the installation process.
4. At formatting time, just before the formatting is finished, the installer gives an error message indicating something serious went wrong and reboots.
Scenario 2 (after install):
1. Install FC3 on a 40 GB hard drive, leaving the 180 GB disks empty (no partitions).
2. Once the system is running, create a partition on each 180 GB drive (type 0xfd).
3. Use mdadm to create a RAID-5 set of the 3 partitions.
4. Mount /dev/md0 at a directory.
5. Start adding random data.
6. After a few GB, the filesystem becomes corrupted (ls displays irregularities).
7. Unmount /dev/md0.
8. Mounting /dev/md0 again gives an error message.
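For reference, the mdadm-based steps might correspond to commands like the ones generated below. This is a hedged sketch, not a record of what was actually typed: the partition names and mount point are placeholders, the helper name is hypothetical, and the mkfs.ext3 step is an assumption (the report does not say which filesystem was put on /dev/md0 in this test). The script only prints the commands and does not run them.

```python
import shlex


def build_raid5_commands(partitions, md_device="/dev/md0", mount_point="/mnt/raidtest"):
    """Return argv lists for creating, formatting and mounting a RAID-5 array."""
    if len(partitions) < 3:
        raise ValueError("RAID-5 needs at least three member devices")
    return [
        # Create the array from the type-0xfd partitions.
        ["mdadm", "--create", md_device, "--level=5",
         f"--raid-devices={len(partitions)}", *partitions],
        # Assumption: an ext3 filesystem is created directly on the array.
        ["mkfs.ext3", md_device],
        ["mount", md_device, mount_point],
    ]


if __name__ == "__main__":
    # Placeholder partition names; substitute the real 0xfd partitions.
    for argv in build_raid5_commands(["/dev/sdX1", "/dev/sdY1", "/dev/sdZ1"]):
        print(shlex.join(argv))
```

Printing rather than executing keeps the sketch safe to run on any machine; the commands themselves need root and will destroy data on the named partitions.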
Actual Results: see steps
Expected Results: the installer should have finished formatting; no corruption
I get emails from smartd with subject:
SMART error (CurrentPendingSector) detected on host: bio.lan
Device: /dev/hdh, 11 Currently unreadable (pending) sectors
and in another mail:
Device: /dev/hdg, 2 Currently unreadable (pending) sectors
After more investigation it seems that the hardware was faulty after all (filling the hard disks up with data didn't give any problems, but I have now dumped all the data to /dev/null and did get errors). Apologies.
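A full read of this kind can be sketched as follows, assuming Python is available. The function reads a file or block device from start to end, discards the data, and records the offsets at which the kernel reports an I/O error. The function name and its parameters are hypothetical; the report itself only states that the data was dumped to /dev/null.

```python
import os
import tempfile


def scan_for_read_errors(path, chunk_size=1 << 20, max_errors=100):
    """Read path from start to end, discarding the data.

    Returns (bytes_read, errors), where errors is a list of
    (offset, message) tuples for chunks that could not be read.
    Scanning stops early once max_errors failures have been recorded.
    """
    errors = []
    offset = 0
    bytes_read = 0
    with open(path, "rb", buffering=0) as f:
        while len(errors) < max_errors:
            try:
                f.seek(offset)
                data = f.read(chunk_size)
            except OSError as exc:
                errors.append((offset, str(exc)))
                offset += chunk_size  # skip the failing chunk and carry on
                continue
            if not data:
                break  # end of file or device
            offset += len(data)
            bytes_read += len(data)
    return bytes_read, errors


if __name__ == "__main__":
    # Demo on a temporary file; on a real system the path would be a block
    # device such as /dev/hdg (root access required).
    with tempfile.NamedTemporaryFile(delete=False) as tmp:
        tmp.write(b"\0" * (3 * 1024 * 1024))
    try:
        print(scan_for_read_errors(tmp.name))
    finally:
        os.unlink(tmp.name)
```

Unlike writing files until the disk is full, a sequential read touches every sector that already holds data, which is consistent with the pending-sector warnings from smartd only showing up as errors when reading.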