Bug 73846 - file system read call returns corrupted data
Summary: file system read call returns corrupted data
Keywords:
Status: CLOSED NOTABUG
Alias: None
Product: Red Hat Linux
Classification: Retired
Component: libaio
Version: 6.2
Hardware: i686
OS: Linux
Priority: medium
Severity: high
Target Milestone: ---
Assignee: Ben LaHaise
QA Contact:
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2002-09-12 04:13 UTC by Armin Haken
Modified: 2008-05-01 15:38 UTC (History)
CC List: 0 users

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2002-09-12 04:27:54 UTC
Embargoed:


Attachments
Test app that demonstrates problem (2.78 KB, text/plain)
2002-09-12 04:16 UTC, Armin Haken
sample makefile (47 bytes, text/plain)
2002-09-12 04:23 UTC, Armin Haken

Description Armin Haken 2002-09-12 04:13:28 UTC
From Bugzilla Helper:
User-Agent: Mozilla/4.0 (compatible; MSIE 5.5; Windows NT 4.0; MSNDE; T312461)

Description of problem:
Under heavy disk system load, "read" calls sometimes return data that 
contains a sequence of 16 incorrect bytes. High memory usage and heap 
allocation also seem to be involved in reproducing the problem. I'm attaching an 
example app that shows the failure on each of the 3 RedHat 6.2 machines I have 
available, but does not fail on other Linux machines, including RedHat 7.3.

Version-Release number of selected component (if applicable):


How reproducible:
Always

Steps to Reproduce:
1. Build the attached drt.cpp (sample Makefile attached).
2. Run the "drt" app. On 2-processor systems it seems to fail more reliably 
when 2 copies of the app are running (in separate directories).

Actual Results:  The app prints text including "read buffer does not 
match" as well as the number of the iteration (out of 10) that is running. 
Typically I see around 3 failure messages per run. When a failure is detected, 
the correct and the misread buffers (length 40 KB) are written to the files 
errorfile_should and errorfile_reality. Comparing these files shows a sequence 
of 16 bytes on which they differ.
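
To illustrate the comparison described above, here is a minimal sketch of how the differing byte range between the expected and the misread buffer could be located. It is not taken from the attached drt.cpp; the function name diff_runs and its return type are assumptions made for this example.

```cpp
#include <algorithm>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical helper (not from the attached drt.cpp).
// Returns every maximal run of differing bytes as an (offset, length) pair.
// Only the common prefix of the two buffers is compared.
std::vector<std::pair<std::size_t, std::size_t>>
diff_runs(const std::vector<unsigned char>& expected,
          const std::vector<unsigned char>& actual)
{
    std::vector<std::pair<std::size_t, std::size_t>> runs;
    const std::size_t n = std::min(expected.size(), actual.size());
    std::size_t i = 0;
    while (i < n) {
        if (expected[i] == actual[i]) {
            ++i;
            continue;
        }
        const std::size_t start = i;           // first differing byte
        while (i < n && expected[i] != actual[i]) {
            ++i;                               // extend the run
        }
        runs.emplace_back(start, i - start);
    }
    return runs;
}
```

Loading errorfile_should and errorfile_reality into two vectors and passing them to this function would be expected to report the corrupted region. Note that a corrupted region in which one byte happens to match the expected value would show up as two shorter runs.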

Expected Results:  Non-error operation results in output that is only an 
iteration count from 0 to 9.
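
For readers without access to the attachment, the kind of check described here can be sketched roughly as follows. This is an assumption-laden illustration, not the attached drt.cpp: the function name count_read_mismatches, the byte pattern, and the file handling are invented for the example, and it does not generate the disk and memory load that the report says is needed to trigger the failure.

```cpp
#include <cstddef>
#include <cstdio>
#include <fcntl.h>
#include <unistd.h>
#include <vector>

// Write exactly n bytes, looping over short writes. Errors (including
// EINTR) are simply treated as failure in this sketch.
static bool write_all(int fd, const unsigned char* p, std::size_t n) {
    while (n > 0) {
        ssize_t w = write(fd, p, n);
        if (w <= 0) return false;
        p += w;
        n -= static_cast<std::size_t>(w);
    }
    return true;
}

// Read exactly n bytes, looping over short reads.
static bool read_all(int fd, unsigned char* p, std::size_t n) {
    while (n > 0) {
        ssize_t r = read(fd, p, n);
        if (r <= 0) return false;
        p += r;
        n -= static_cast<std::size_t>(r);
    }
    return true;
}

// Hypothetical sketch: per iteration, write a known pattern, read it back,
// and compare. Returns the number of mismatching iterations, or -1 on an
// I/O error.
int count_read_mismatches(const char* path, int iterations, std::size_t buf_size) {
    int mismatches = 0;
    for (int iter = 0; iter < iterations; ++iter) {
        std::vector<unsigned char> expected(buf_size), actual(buf_size);
        for (std::size_t i = 0; i < buf_size; ++i)
            expected[i] = static_cast<unsigned char>((i * 31 + iter) & 0xFF);

        int fd = open(path, O_CREAT | O_TRUNC | O_RDWR, 0600);
        if (fd < 0) return -1;
        bool io_ok = write_all(fd, expected.data(), buf_size) &&
                     lseek(fd, 0, SEEK_SET) == 0 &&
                     read_all(fd, actual.data(), buf_size);
        close(fd);
        if (!io_ok) {
            unlink(path);
            return -1;
        }
        if (expected != actual) {
            std::printf("read buffer does not match (iteration %d)\n", iter);
            ++mismatches;
        } else {
            std::printf("%d\n", iter);
        }
    }
    unlink(path);
    return mismatches;
}
```

Calling count_read_mismatches("testfile.tmp", 10, 40 * 1024) on a healthy system should print the iteration numbers 0 through 9 and return 0, which mirrors the expected result described above.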

Additional info:

The behavior of the test app on all the various platforms I tried is the same 
when I build it on RedHat 6.2 as when I build it on Debian. (I link statically.)

Comment 1 Ben LaHaise 2002-09-12 04:16:25 UTC
libaio does not exist in Red Hat 6.2.

Comment 2 Armin Haken 2002-09-12 04:16:37 UTC
Created attachment 75844
Test app that demonstrates problem

Comment 3 Ben LaHaise 2002-09-12 04:22:17 UTC
What version of the kernel is this with?  At least one of the 2.2 errata
affected page cache I/O on SMP.

Comment 4 Armin Haken 2002-09-12 04:23:22 UTC
Created attachment 75845
sample makefile

Comment 5 Armin Haken 2002-09-12 04:27:47 UTC
The kernel is 2.2.14-5.0, and the error shows up on single- as well as 
dual-processor machines.

Comment 6 Ben LaHaise 2002-09-12 04:54:43 UTC
That kernel is obsolete and vulnerable to the aforementioned bug.  Upgrade and
reopen if the problem persists.  This is not a bug in libaio, but in older
kernels for which errata were already released.

Comment 7 Armin Haken 2002-09-16 21:24:20 UTC
The problem did indeed disappear after upgrading to kernel 2.2.17-14.

