Bug 77582

Summary: SCSI emulation causes excessive load on bad CD media
Product: [Retired] Red Hat Linux Reporter: Craig Lawson <craig.lawson>
Component: kernel Assignee: Arjan van de Ven <arjanv>
Status: CLOSED CURRENTRELEASE QA Contact: Mike McLean <mikem>
Severity: high Docs Contact:
Priority: medium    
Version: 7.3   
Target Milestone: ---   
Target Release: ---   
Hardware: i386   
OS: Linux   
Whiteboard:
Fixed In Version: Doc Type: Bug Fix
Last Closed: 2004-09-30 15:40:11 UTC Type: ---

Description Craig Lawson 2002-11-09 18:42:56 UTC
Description of Problem:
The SCSI emulation layer causes excessive system load when attempting to read
from an ATAPI CD drive with bad media.

Version-Release number of selected component (if applicable):
Kernel 2.4.18.17.7.x  i686

How Reproducible:
100%

Steps to Reproduce:
1. Create a CD-R with media flaws. I used a CD full of JPEGs with a few
scratches on the ink side of the CD. As it happened, the directory structure was
intact, and only the file content was damaged.
2. Try to read the entire CD. Use "cp -vr" to locate the damaged files by name.
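
For anyone who would rather get a list of damaged files than watch the cp output,
step 2 can be approximated with a short script. This is only a sketch and was not
used for the original report. The mount point /mnt/cdrom and the function name
find_unreadable_files are assumptions for illustration.

```python
import os


def find_unreadable_files(root, chunk_size=64 * 1024):
    """Read every file under root; return (path, reason) for each read failure."""
    failures = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in sorted(filenames):
            path = os.path.join(dirpath, name)
            try:
                with open(path, "rb") as handle:
                    # Read the whole file in chunks so a media error surfaces
                    # as OSError (for example EIO) on the damaged sectors.
                    while handle.read(chunk_size):
                        pass
            except OSError as exc:
                failures.append((path, exc.strerror or str(exc)))
    return failures


if __name__ == "__main__":
    # Assumed mount point; adjust to wherever the CD is mounted.
    for path, reason in find_unreadable_files("/mnt/cdrom"):
        print(f"{path}: {reason}")
```

Like cp, this blocks on each bad file until the kernel gives up, so it should
reproduce the same load and log messages.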

Actual Results:
When cp encounters a damaged file, it stays on that file for a long time and
eventually reports an I/O error. The system log "/var/log/messages" fills with
many messages like these:

  Nov  9 01:57:45 localhost kernel: SCSI bus is being reset for host 0 channel 0.
  Nov  9 01:57:45 localhost kernel: hdc: irq timeout: status=0x80 { Busy }
  Nov  9 01:57:45 localhost kernel: scsi : aborting command due to timeout : pid 6787, scsi0, channel 0, id 0, lun 0 Prevent/Allow Medium Removal 00 00 00 01 00
  Nov  9 01:57:45 localhost kernel: SCSI host 0 abort (pid 6787) timed out - resetting

some of these:

  Nov  9 01:57:45 localhost kernel: SCSI bus is being reset for host 0 channel 0.
  Nov  9 01:57:45 localhost kernel: scsi : aborting command due to timeout : pid 6786, scsi0, channel 0, id 0, lun 0 Read (10) 00 00 00 00 60 00 00 01 00
  Nov  9 01:57:45 localhost kernel: SCSI host 0 abort (pid 6786) timed out - resetting

and some of these:

  Nov  9 01:57:45 localhost kernel: SCSI bus is being reset for host 0 channel 0.
  Nov  9 01:57:45 localhost kernel: hdc: ATAPI reset complete
  Nov  9 01:57:52 localhost kernel: hdc: irq timeout: status=0x80 { Busy }
  Nov  9 01:57:52 localhost kernel: hdc: status timeout: status=0x80 { Busy }
  Nov  9 01:57:52 localhost kernel: hdc: drive not ready for command
  Nov  9 01:57:52 localhost kernel: scsi : aborting command due to timeout : pid 6787, scsi0, channel 0, id 0, lun 0 Prevent/Allow Medium Removal 00 00 00 01 00
  Nov  9 01:57:52 localhost kernel: SCSI host 0 abort (pid 6787) timed out - resetting
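
To put numbers on "lots of messages", excerpts like those above can be tallied
with a small script. This is a rough sketch: the regular expressions are written
against the message wording shown in this report only, and the function name
summarize_scsi_log is hypothetical. It works on the whole text, so it does not
matter whether long log lines are wrapped.

```python
import re
from collections import Counter

# Patterns follow the message text quoted in this report; other kernel
# versions may word these messages differently.
RESET_RE = re.compile(r"SCSI bus is being reset for host \d+ channel \d+")
IRQ_TIMEOUT_RE = re.compile(r"(hd[a-z]): irq timeout")
ABORT_RE = re.compile(r"aborting command due to timeout : pid\s+(\d+)")


def summarize_scsi_log(text):
    """Count bus resets, per-drive irq timeouts, and aborts per logged pid value."""
    return {
        "bus_resets": len(RESET_RE.findall(text)),
        "irq_timeouts": Counter(IRQ_TIMEOUT_RE.findall(text)),
        "aborts_by_pid": Counter(ABORT_RE.findall(text)),
    }


if __name__ == "__main__":
    import sys

    # Usage: python summarize.py < /var/log/messages
    print(summarize_scsi_log(sys.stdin.read()))
```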

It takes a long time before the system gives up on a file and moves on to the
next one.

top reports that the system load average rises to around 20.
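
The load figure can also be recorded over time instead of read off top. The
sketch below samples the 1-minute load average with Python's os.getloadavg();
the function name sample_load and the sampling interval are assumptions, and it
was not used to obtain the figure above.

```python
import os
import time


def sample_load(samples=5, interval=2.0):
    """Return a list of 1-minute load averages, one per sample."""
    readings = []
    for index in range(samples):
        if index:
            # Wait between samples, but not before the first one.
            time.sleep(interval)
        readings.append(os.getloadavg()[0])
    return readings


if __name__ == "__main__":
    # Run this in a second terminal while the copy is in progress.
    for value in sample_load():
        print(f"1-minute load average: {value:.2f}")
```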

The system becomes extremely unresponsive.

The system eventually completes the recursive copy, with bad files omitted, and
the system load returns to normal.

If there are enough flaws (on my CD, I had approximately 50 bad JPEG files, each
about 600 KB), after a couple of hours of attempting to copy bad files, the system
gets into a state where it cannot eject the media. Attempting to eject causes
more SCSI bus resets, increased system load, unresponsiveness, and after several
minutes the eject command fails.


Expected Results:
On encountering media read errors, I expected the system to retry a few times
but not increase the system load to unreasonable levels. Eject should not fail.


Additional Information:
My system is:
  933 MHz P-III
  512 MB RAM
  hda: ATA disk
  hdc: PLEXTOR CD-R PX-W1610A, ATAPI CD/DVD-ROM drive
  hdd: LITEON DVD-ROM LTD163, ATAPI CD/DVD-ROM drive
The problem occurs when reading bad CDs from either drive. Only one drive is in
use at a time.

When using xcdroast to read the same bad CD, it reports bad sectors. It takes a
long time to read through the CD before eventually failing. However, the system
load does not increase to an unreasonable level.

Comment 1 Bugzilla owner 2004-09-30 15:40:11 UTC
Thanks for the bug report. However, Red Hat no longer maintains this version of
the product. Please upgrade to the latest version and open a new bug if the problem
persists.

The Fedora Legacy project (http://fedoralegacy.org/) maintains some older releases, 
and if you believe this bug is interesting to them, please report the problem in
the bug tracker at: http://bugzilla.fedora.us/