Bug 456002 - Very Large File Corruption
Status: CLOSED NOTABUG
Product: Fedora
Classification: Fedora
Component: kernel
Version: 9
Hardware: x86_64 Linux
Priority: low
Severity: high
Assigned To: Kernel Maintainer List
QA Contact: Fedora Extras Quality Assurance
Reported: 2008-07-20 03:08 EDT by B. Britt
Modified: 2008-07-20 19:56 EDT

Last Closed: 2008-07-20 19:56:28 EDT
Description B. Britt 2008-07-20 03:08:32 EDT
Description of problem:

Very large files are corrupted when written.

Version-Release number of selected component (if applicable):


How reproducible:

Very easy on my system (uname -a):

Linux liv.local 2.6.25.10-86.fc9.x86_64 #1 SMP Mon Jul 7 20:23:46 EDT 2008
x86_64 x86_64 x86_64 GNU/Linux

Steps to Reproduce:
1. Run this shell script:
#!/bin/csh

# dd's default block size is 512 bytes, so 5,000,000 blocks yield a
# 2,560,000,000-byte (~2.4 GB) file of zeros.
set count = 10000000
@ count = $count / 2
dd if=/dev/zero of=zerofile count=$count

# Checksum the same file twice; the two digests should be identical.
echo 1
md5sum zerofile
echo 2
md5sum zerofile

  
Actual results:
(Results vary)

1
bd14343288d1a834bfb0ae933bf8e4b6  zerofile
2
81096f2d72f4b010f8cc8f6c7b19b812  zerofile

Expected results:

The output of the first md5sum and the second should be the same, but they
are not.  'sum' shows the same problem.

Additional info:
No data errors are reported in /var/log/messages.

I started seeing this problem on Fedora 8, but only with very large files (in
the above case the file size is 2.4 GB).  The following C program reports the
corruption in the above file (0x00 bytes that become 0x80 bytes):
#include <stdio.h>

FILE *f;
unsigned char b[4096];
int i;
int n;
long long total;  /* byte offset; a plain int would overflow past 2 GB */

int main(int argc, char *argv[]) {
  f = fopen("zerofile", "r");
  if (f) {
    total = 0;
    /* Report every nonzero byte with its offset in decimal and hex. */
    while ((n = fread(b, 1, sizeof(b), f)) > 0) {
      for (i = 0; i < n; i++) {
        if (b[i]) {
          printf("%lld (0x%llx): 0x%x\n", total, total, b[i]);
        }
        total++;
      }
    }
    fclose(f);
  }
  return 0;
}
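
To build and run it (assuming the source is saved as zerocheck.c; the
filename is not part of the original report):

gcc -O2 -o zerocheck zerocheck.c
./zerocheck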

Sample output:
1149587513 (0x44855039): 0x80
1149587545 (0x44855059): 0x80
1149587577 (0x44855079): 0x80
1149587641 (0x448550b9): 0x80
1149587673 (0x448550d9): 0x80
1149587769 (0x44855139): 0x80
1149587801 (0x44855159): 0x80
1149587897 (0x448551b9): 0x80
1149587929 (0x448551d9): 0x80
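
Every flip in this sample sets bit 7 (0x00 becomes 0x80), and the corrupt
offsets are spaced at multiples of 32 bytes.  A minimal sketch, using the
offsets copied from the output above, that prints the gaps:

#include <stdio.h>

int main(void) {
  /* Corrupt byte offsets taken from the sample output above. */
  long long off[] = {
    1149587513LL, 1149587545LL, 1149587577LL, 1149587641LL,
    1149587673LL, 1149587769LL, 1149587801LL, 1149587897LL,
    1149587929LL
  };
  int i, n = sizeof(off) / sizeof(off[0]);
  for (i = 1; i < n; i++) {
    long long d = off[i] - off[i - 1];
    /* Every gap prints as an exact multiple of 32 bytes. */
    printf("delta %lld = 32 * %lld\n", d, d / 32);
  }
  return 0;
}

Regular strides like this tend to suggest a hardware-level fault rather than
filesystem logic.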

The drives are Seagate SATA 500 GB drives.  I've had problems with two
different drives: one was attached via a USB enclosure and one is internally
attached via SATA.  Both were connected via the logical volume mechanism
(different LVs, though).

Lastly, I've not seen a file under 1 GB have a problem, though I haven't run
the above script intensively on smaller files.
Comment 1 Dave Jones 2008-07-20 03:21:56 EDT
Not reproducible here.
0x00 -> 0x80 is a single-bit flip, which is sometimes indicative of bad memory.
You might want to try running memtest86 for a while.
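For reference, a quick check that 0x00 -> 0x80 differs in exactly one bit (a
minimal sketch using GCC's __builtin_popcount):

#include <stdio.h>

int main(void) {
  unsigned char expected = 0x00, observed = 0x80;
  /* XOR isolates the differing bits; the popcount counts them. */
  printf("bits flipped: %d\n", __builtin_popcount(expected ^ observed));
  return 0;  /* prints "bits flipped: 1" */
}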
Comment 2 B. Britt 2008-07-20 19:56:28 EDT
Memtest86 indicated errors in both memory DIMMs.  I swapped them out and there
are no errors when I copy large files.

Thanks!
