Red Hat Bugzilla – Bug 44327
cp command produces different file on RAID 5 IDE disks
Last modified: 2007-04-18 12:33:40 EDT
From Bugzilla Helper:
User-Agent: Mozilla/4.0 (compatible; MSIE 5.5; Windows NT 4.0)
Description of problem:
I installed the RH 7.1 server system on a PC containing four IBM UDMA disks.
When those disks are configured as a RAID 5 array, the file produced by the
cp (and dd) command is DIFFERENT from the original file! However, on the same
hardware, cp (and dd) produces a target file that is the SAME as the
original if I DO NOT use a RAID array.
Steps to Reproduce:
1. cp dx100.tar dx100.tar1 (dx100.tar is 650 MB in size)
2. cmp dx100.tar dx100.tar1
3. cp dx100.tar dx100.tar2 (just repeat the procedure)
4. cmp dx100.tar dx100.tar2
Actual Results: the cmp result (of step 2): "dx100.tar and dx100.tar1
differ: char 9359357 line 33367"
the cmp result (of step 4) "dx100.tar dx100.tar2 differ: char 106258429
Expected Results: cmp should return 0 (not 1), because both
dx100.tar1 and dx100.tar2 are direct copies of dx100.tar.
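The copy-and-compare steps above can be repeated in a loop to see how often the corruption shows up. The following is only a sketch of that procedure, not something taken from the report: the function name copy_and_compare, the .copyN suffix, and the default of three runs are all assumptions made for illustration.

```shell
#!/bin/sh
# copy_and_compare SRC [RUNS]
# Hypothetical helper: copies SRC to SRC.copyN with cp, compares each copy
# against SRC with cmp, and prints one line per run.
# Returns 0 if every copy was identical, 1 if any copy differed,
# 2 if cp itself failed.
copy_and_compare() {
    src=$1
    runs=${2:-3}
    failures=0
    i=1
    while [ "$i" -le "$runs" ]; do
        dst="$src.copy$i"
        cp "$src" "$dst" || return 2
        if cmp -s "$src" "$dst"; then
            echo "run $i: identical"
            rm -f "$dst"
        else
            # cmp -l prints one line per differing byte; count them.
            ndiff=$(cmp -l "$src" "$dst" | wc -l | tr -d ' ')
            echo "run $i: DIFFERENT ($ndiff bytes differ), kept $dst"
            failures=$((failures + 1))
        fi
        i=$((i + 1))
    done
    [ "$failures" -eq 0 ]
}
```

For example, `copy_and_compare dx100.tar 5` would make five copies in turn. Copies that differ are left in place so they can be inspected later; identical copies are removed.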
The hardware is as follows:
Motherboard: Epox (EP-D3VA)
CPU: PIII 866MHz x 2 (dual cpu)
Hard disks: (4 x) IBM UDMA 41.0GB
I also installed RH 7.1 server on the same PC without configuring the hard
drives as a RAID array. In that setup the cp command produced the correct result.
Furthermore, I have two other servers using SCSI RAID 5 disks (under Red Hat
6.2 and Red Hat 7.0). The cp command works correctly on both.
I also have a server using IDE RAID1 under RedHat 7.0. The cp command
cp and dd don't distinguish between RAID and non-RAID devices. I tend to
believe this is a RAID driver bug.
I assume this is Linux software RAID.
Could you try booting with "ide=nodma" as a kernel command line option to rule out
IDE DMA corruption?
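One way to confirm that the option actually reached the kernel is to look at /proc/cmdline after booting. This is a rough sketch only; the helper name kernel_has_option is made up for illustration, and the optional second argument exists just so the check can be pointed at a different file.

```shell
#!/bin/sh
# kernel_has_option OPTION [CMDLINE_FILE]
# Hypothetical helper: succeeds if OPTION appears as a whole word in the
# kernel command line (read from /proc/cmdline unless another file is given).
kernel_has_option() {
    opt=$1
    file=${2:-/proc/cmdline}
    # Put each option on its own line, then look for an exact fixed-string match.
    tr ' ' '\n' < "$file" | grep -qxF -- "$opt"
}
```

Assuming LILO is the boot loader and the image is labelled "linux", the option can be given once at the boot prompt as `linux ide=nodma`. After boot, `kernel_has_option ide=nodma && echo "DMA disabled at boot"` reports whether it took effect, and `hdparm -d /dev/hde` shows the using_dma setting of a single drive.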
After further testing, I believe that the problem might be in the driver for the
onboard HPT370 IDE adapter.
I have been using a motherboard (Epox D3VA) with two conventional UDMA66
channels and two further UDMA100 channels based on an HPT370 (HighPoint 370)
chipset. Under Red Hat 7.1, the disks on the conventional UDMA66 channels
are /dev/hda, /dev/hdb, ..., /dev/hdd and the disks on the HPT channels
are /dev/hde, ..., /dev/hdh.
Using Red Hat 7.1 software RAID 5 on the conventional UDMA66 channels
(/dev/hda, ..., /dev/hdd), the result is OK and there is NO data corruption!
This indicates that the RAID 5 software in Red Hat 7.1 is OK.
However, I got the reported data corruption when connecting the SAME
disks, with the SAME RAID 5 configuration (just changing /dev/hda to /dev/hde, etc.
in /etc/raidtab), to the HPT channels. Furthermore, when I move the disks back to
the UDMA66 channels (using /dev/hda, etc.) with the reportedly corrupted
data still on them and run the cmp command, cmp magically returns 0 (i.e. no
Further tests show NO data corruption on individual disks when connected to the
HPT channels (i.e. not running RAID 5). Therefore, I now suspect that the problem
is caused by the combination of RAID 5 and the HPT370 driver.
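The raidtab change mentioned above (moving the array from the conventional channels to the other controller) could be scripted roughly as follows. This is a sketch under assumptions: the one-to-one mapping hda to hde, hdb to hdf, hdc to hdg, hdd to hdh is a guess based on "changing /dev/hda to /dev/hde, etc", partition numbers are assumed to stay the same, and the function name remap_raidtab is made up.

```shell
#!/bin/sh
# remap_raidtab < old_raidtab > new_raidtab
# Hypothetical helper: rewrites device names so that an array defined on
# /dev/hda../dev/hdd is redefined on /dev/hde../dev/hdh.
# The mapping is an assumption; adjust it to match the actual cabling.
remap_raidtab() {
    sed -e 's|/dev/hda|/dev/hde|g' \
        -e 's|/dev/hdb|/dev/hdf|g' \
        -e 's|/dev/hdc|/dev/hdg|g' \
        -e 's|/dev/hdd|/dev/hdh|g'
}
```

Usage would be along the lines of `remap_raidtab < /etc/raidtab > /tmp/raidtab.new`, followed by a manual review of the result before it replaces the original file.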
Sorry, I already took that machine apart and replaced it with a different
motherboard (SuperMicro 370DLE) and a 3ware (http://www.3ware.com) Escalade
6800 RAID card. However, I still chose to use software RAID, as it offers
better performance than the hardware RAID 5. Because the 3ware uses a SCSI
interface on the UDMA disks, the software RAID 5 (on SCSI disks) does suffer data
I will try ide=nodma the next time I set up the Epox D3VA hardware and
let you know the result.
Closing: this appears to be the old VIA chipset hardware problem, which has since been worked around.