Bug 114941 - Data corruption on SCSI disks
Status: CLOSED DUPLICATE of bug 112426
Product: Red Hat Enterprise Linux 3
Classification: Red Hat
Component: kernel
Version: 3.0
Hardware: All Linux
Priority: medium Severity: medium
Assigned To: Doug Ledford
QA Contact: Brian Brock
Reported: 2004-02-04 13:32 EST by Martin Peschke
Modified: 2007-11-30 17:07 EST
Doc Type: Bug Fix
Last Closed: 2006-02-21 14:01:05 EST
Description Martin Peschke 2004-02-04 13:32:51 EST
Description of problem:
We have been running a disk exerciser (blast, an IBM-internal tool
that can verify written data) on 40 LUNs attached via one adapter
driven by zfcp. Within a few hours, verification of sectors written
to some disks failed: the data read back did not match the data
previously written.
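
For readers without access to blast, the write-then-verify idea can be
sketched roughly as follows. This is not blast and makes no claim about
how blast works; the 512-byte sector size, the hash-based pattern, the
function names, and the scratch file standing in for a disk are all
assumptions made for illustration.

```python
import hashlib
import os
import tempfile

SECTOR_SIZE = 512  # assumed sector size, not taken from the report


def pattern_for(sector, seed=b"demo"):
    """Build a deterministic payload that is unique per sector number,
    so stale or misplaced data shows up as a mismatch."""
    out = b""
    counter = 0
    while len(out) < SECTOR_SIZE:
        block = seed + sector.to_bytes(8, "big") + counter.to_bytes(4, "big")
        out += hashlib.sha256(block).digest()
        counter += 1
    return out[:SECTOR_SIZE]


def write_sectors(path, sectors):
    """Write the expected pattern to each listed sector and flush it."""
    with open(path, "r+b") as f:
        for s in sectors:
            f.seek(s * SECTOR_SIZE)
            f.write(pattern_for(s))
        f.flush()
        os.fsync(f.fileno())


def verify_sectors(path, sectors):
    """Return the sector numbers whose content differs from the pattern."""
    bad = []
    with open(path, "rb") as f:
        for s in sectors:
            f.seek(s * SECTOR_SIZE)
            if f.read(SECTOR_SIZE) != pattern_for(s):
                bad.append(s)
    return bad


def self_check(num_sectors=16):
    """Exercise a scratch file, then damage sector 5 on purpose."""
    with tempfile.TemporaryDirectory() as d:
        path = os.path.join(d, "scratch.img")
        with open(path, "wb") as f:
            f.truncate(num_sectors * SECTOR_SIZE)
        sectors = range(num_sectors)
        write_sectors(path, sectors)
        clean = verify_sectors(path, sectors)
        with open(path, "r+b") as f:
            f.seek(5 * SECTOR_SIZE)
            f.write(b"\x00" * SECTOR_SIZE)
        damaged = verify_sectors(path, sectors)
    return clean, damaged
```

A real exerciser would target the block devices directly and would need
to keep reads from being served out of the page cache; the sketch above
does neither.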

Version-Release number of selected component (if applicable):
2.4.21-EL.9

How reproducible:
Almost 100%; tried 3-4 times.

Steps to Reproduce:
1. Load scsi_mod, zfcp, and sd_mod.
2. Configure 40 disks by means of add-single-device.
3. Start the disk exerciser.
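
Step 2 could look roughly like the sketch below, which builds the
add-single-device command strings and writes them to /proc/scsi/scsi,
the 2.4 kernel interface for adding devices on the fly. The host,
channel, target, and LUN numbers are placeholders, not the values used
in this setup, and any zfcp-specific configuration that the report does
not describe is left out.

```python
PROC_SCSI = "/proc/scsi/scsi"


def add_single_device_commands(host, channel, target, luns):
    """Build one 'scsi add-single-device' command per LUN."""
    return [
        "scsi add-single-device %d %d %d %d" % (host, channel, target, lun)
        for lun in luns
    ]


def apply_commands(commands, path=PROC_SCSI):
    """Write each command to the proc file (needs root on a 2.4 kernel)."""
    for cmd in commands:
        with open(path, "w") as f:
            f.write(cmd + "\n")


# Placeholder addressing: host 0, channel 0, target 1, LUNs 0..39.
commands = add_single_device_commands(0, 0, 1, range(40))
```

Calling apply_commands(commands) would then register the 40 devices;
it is kept separate so the command list can be inspected first.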

Additional info:
I can attach blast reports about failed sectors, if requested.
Comment 1 Martin Peschke 2004-02-06 11:22:43 EST
The problem was most likely caused by a missing scsi_eh thread. This
kernel thread is essential for SCSI I/O: without proper error
recovery, the outcome of SCSI commands that need recovery appears to
be unpredictable, and the system seems to drift silently (at the
default logging level) into data corruption. The eh thread was not
created because not a single SCSI device/host was available when the
modules were loaded. I have just realized that our setup only included
devices/hosts that were added on the fly via procfs. A first retest
with at least one device per host available when loading the SCSI
modules has not shown any data corruption so far. I will close this
Bugzilla entry as a duplicate of either 112426 or 106214 if the
problem does not occur again.
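
A quick way to check for the condition described in this comment before
starting I/O is to look for the error handler thread in the process
list. The sketch below assumes the thread's name starts with "scsi_eh"
(for example scsi_eh_0 for host 0); that prefix and the helper names
are assumptions for illustration, not something this report confirms.

```python
import glob


def scsi_eh_threads(names):
    """Return the process names that look like SCSI error handler threads."""
    return sorted(n for n in names if n.startswith("scsi_eh"))


def process_names(proc_root="/proc"):
    """Collect the command name of every process listed under /proc."""
    names = []
    for path in glob.glob(proc_root + "/[0-9]*/stat"):
        try:
            with open(path) as f:
                data = f.read()
        except OSError:
            continue  # the process exited while we were scanning
        # The command name is the parenthesised second field of 'stat'.
        start, end = data.find("("), data.rfind(")")
        if start != -1 and end > start:
            names.append(data[start + 1:end])
    return names


def eh_thread_missing(proc_root="/proc"):
    """True if no thread with the assumed scsi_eh prefix is running."""
    return not scsi_eh_threads(process_names(proc_root))
```

If eh_thread_missing() returns True after the modules are loaded and
the devices are added, the setup matches the suspected failure
condition and the exerciser results should not be trusted.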
Comment 2 Martin Peschke 2004-02-09 17:02:57 EST

*** This bug has been marked as a duplicate of 112426 ***
Comment 3 Red Hat Bugzilla 2006-02-21 14:01:05 EST
Changed to 'CLOSED' state since 'RESOLVED' has been deprecated.
