Bug 102498

Summary: CD-ROM : timeout waiting for DMA & tray open
Product: Red Hat Enterprise Linux 3
Reporter: Larry Troan <ltroan>
Component: kernel
Assignee: Jeff Moyer <jmoyer>
Status: CLOSED ERRATA
QA Contact: Brian Brock <bbrock>
Severity: medium
Priority: medium
Version: 3.0
CC: ichute, jgarzik, petrides, riel, tao
Hardware: ia64
OS: Linux
Doc Type: Bug Fix
Last Closed: 2005-10-05 23:27:12 UTC
Attachments:
Description                        Flags
testcdrom-tray_open                none
lsmod.base                         none
lspci.base                         none
meminfo.base                       none
cpuinfo.base (new)                 none
lsmod.base (new)                   none
lspci.base (new)                   none
meninfo.base (new)                 none
patch to clear the PG_error bit    none

Description Larry Troan 2003-08-15 22:42:41 UTC
testcdrom is started (at 10:51) with DMA enabled, followed by the cdrw test, etc.
Within ~1 minute of testcdrom starting, the following shows up in /var/log/messages:

10:52:39 hdg: timeout waiting for DMA
10:52:39 hdg: status timeout: status = 0xd0 {Busy}
10:52:39 hdg: status timeout: error: 0xd0LastFailedSense 0x0d
10:52:39 hdg: tray open
  .
  .
  .
14:26:31 hdg: tray open
14:26:32 hdg: timeout waiting for DMA
14:26:33 hdg: timeout waiting for DMA
14:27:14 hdg: DMA disabled
14:27:17 hdg: DMA disabled
  . 
  .
  .
Note that /dev/hdg is the CD-ROM device and the tray is not open! The testcdrom
process is long since gone, yet the "tray open" message continues to appear. In
addition, the initial "timeout waiting for DMA" is not expected.
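
To see how often each of these messages recurs and for which device, the log can be tallied. The following is only a sketch: the function name and the regular expression are made up for illustration, and the pattern is based on the short excerpt above, so real syslog lines with additional prefix fields may need a different pattern.

```python
import re
from collections import Counter

# Matches lines shaped like the excerpt above, e.g.
# "10:52:39 hdg: timeout waiting for DMA". The device name is searched for
# anywhere in the line, so extra syslog prefix fields are tolerated.
# Assumption: only these three messages are of interest.
LINE_RE = re.compile(
    r"\b(hd[a-z]): (timeout waiting for DMA|tray open|DMA disabled)\s*$"
)


def count_ide_messages(lines):
    """Return a Counter keyed by (device, message) for the matching lines."""
    counts = Counter()
    for line in lines:
        match = LINE_RE.search(line)
        if match:
            counts[match.groups()] += 1
    return counts
```

It can be run over the whole log with count_ide_messages(open("/var/log/messages", errors="replace")). Lines such as "status timeout: ..." are deliberately not counted.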
----------
Action by: kim.jensen
Issue Registered
----------
Action by: kim.jensen
Attached are the system log files that were captured after the 14:27 timeframe
lspci -vv > lspci.base
lsmod > lsmod.base
cat /proc/meminfo > meminfo.base
cat /proc/cpuinfo > cpuinfo.base
cp /var/log/messages var-log-messages



Status set to: Waiting on Tech (Long Term)
File uploaded: tray_open_data.tar.gz

----------
Action by: ltroan
What code level was being tested, and in particular which kernel? x86 or IPF?
With sushi drops, alpha4 and beta1 are not descriptive enough.

kim.jensen assigned to issue for HP-WS.

Category set to: Kernel
Status set to: Waiting on Client

----------
Action by: ltroan


Summary edited.

----------
Action by: kim.jensen
IPF
RHEL3.0 Beta1 
linux-2.4.21-1.1931.2.349.2.2.ent 

Status set to: Waiting on Tech

----------
Action by: kim.jensen
Narrowed the issue down to a simple script.  Found that this same problem occurs
on RHEL3.0 Beta 1 (2.4.21-1.1931.2.349.2.2.ent) and does not occur on AS 2.1 QU2
(2.4.18-e.31).

Reading from the cdrom fails if you cat
/proc/ide/ide3/<cdrom-device>/identify while the read is in progress.
/var/log/messages shows errors from the ide-dma driver and ll_rw_blk.c. These
messages continue after the process completes, especially when DMA is disabled.

See testcdrom-tray_open script attached.
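
The attached script is not reproduced in this report. As a rough illustration of the access pattern described above (one sequential read of the device while the identify file is read repeatedly), a sketch could look like the following. The function names are hypothetical, both paths are parameters rather than fixed values, and this is not claimed to match what testcdrom-tray_open actually does.

```python
import threading


def read_device(path, chunk_size=64 * 1024):
    """Read `path` sequentially to the end; return the number of bytes read."""
    total = 0
    with open(path, "rb") as dev:
        while True:
            chunk = dev.read(chunk_size)
            if not chunk:
                return total
            total += len(chunk)


def read_while_polling(device_path, identify_path):
    """Read the device in a worker thread while repeatedly reading identify.

    Returns (bytes_read, polls, error). `error` is the OSError raised by the
    device read, or None if the read ran to completion.
    """
    result = {"bytes": 0, "error": None}

    def worker():
        try:
            result["bytes"] = read_device(device_path)
        except OSError as exc:  # an I/O error here is the reported failure
            result["error"] = exc

    reader = threading.Thread(target=worker)
    reader.start()
    polls = 0
    while True:
        # Equivalent of "cat <identify file>" while the read is in progress.
        with open(identify_path, "rb") as ident:
            ident.read()
        polls += 1
        reader.join(timeout=0.1)
        if not reader.is_alive():
            break
    return result["bytes"], polls, result["error"]
```

On the affected system the arguments would be /dev/hdg and the identify file under /proc/ide/ide3/ for that drive. A non-None error in the result corresponds to the read failure reported here.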

File uploaded: testcdrom-tray_open


ISSUE TRACKER 26814 opened as sev 2

Comment 1 Larry Troan 2003-08-15 22:43:49 UTC
Created attachment 93674
testcdrom-tray_open

Comment 2 Larry Troan 2003-08-15 22:47:09 UTC
FILE tray_open_data.tar.gz TOO BIG TO ATTACH TO BUGZILLA. SEE ISSUE TRACKER
26814 FOR THIS APPEND (will try to break up and append later) 

Comment 3 Larry Troan 2003-08-21 23:59:04 UTC
Files from .gz file (too big to append to Bugzilla)
cpuinfo.base
lsmod.base
lspci.base
meminfo.base
var-log-messages

Comment 4 Larry Troan 2003-08-22 00:01:40 UTC
Created attachment 93846
lsmod.base

Comment 5 Larry Troan 2003-08-22 00:02:10 UTC
Created attachment 93847
lspci.base

Comment 6 Larry Troan 2003-08-22 00:02:38 UTC
Created attachment 93848
meminfo.base

Comment 7 Larry Troan 2003-08-22 00:08:44 UTC
/proc/cpuinfo from tray_open_data.tar.gz is 36.7MB of junk. /var/log/messages
and /var/log/dmesg missing..... Will ask HP to resend.

Comment 8 Larry Troan 2003-08-28 00:01:45 UTC
FROM ISSUE TRACKER....
Event posted 08-22-2003 03:03pm by kim.jensen with duration of 0.00
tray_open.tar.gz
Attached tray_open.tar.gz to replace tray_open_info.tar.gz.

Have you been able to reproduce the problem with the testcdrom-tray_open script?


Comment 9 Larry Troan 2003-08-28 00:05:06 UTC
tar -zxvf tray_open.tar.gz (too big to append on Bugzilla - expanding)
cpuinfo.base
lsmod.base
lspci.base
meninfo.base
var-log-messages


Comment 10 Larry Troan 2003-08-28 00:06:02 UTC
Created attachment 94003
cpuinfo.base (new)

Comment 11 Larry Troan 2003-08-28 00:06:35 UTC
Created attachment 94004
lsmod.base (new)

Comment 12 Larry Troan 2003-08-28 00:07:17 UTC
Created attachment 94005
lspci.base (new)

Comment 13 Larry Troan 2003-08-28 00:07:55 UTC
Created attachment 94006
meninfo.base (new)

Comment 14 Larry Troan 2003-08-28 00:11:00 UTC
36.7MB /var/log/messages is included in tray_open.tar.gz attached to Issue
Tracker 26814. TOO BIG TO ATTACH TO BUGZILLA.

Comment 15 Jeff Moyer 2003-10-14 19:00:29 UTC
I have been unable to reproduce this problem using the test script provided.  I
tried with dma enabled and disabled, and neither case causes problems.

Comment 16 Jeff Moyer 2003-11-24 16:25:39 UTC
I have now managed to reproduce the problem.  There are a couple of
issues.  First, the cdrom device should not return an I/O error (or
at least we should cleanly recover from it).  Next, once the
error is reported, it should be cleaned up.

Currently, if a device returns an I/O error, the PG_error bit is set
in the page struct, but never cleared.  I wrote a patch which
addresses this issue, but the first issue of why the I/O error occurs
remains.

The behaviour after this patch is applied is that the I/O error will
still be reported to the application.  However, subsequent requests
will succeed.
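
The difference between the two behaviours can be illustrated with a toy model. This is not kernel code and not the attached patch: ToyPage, read_page and the clear_error_on_retry switch are invented for illustration, and the model simply assumes that a page whose error flag is still set is treated as failed without going back to the device.

```python
class ToyPage:
    """Stand-in for a cached page; `error` plays the role of PG_error."""

    def __init__(self):
        self.error = False
        self.uptodate = False


def read_page(page, device_ok, clear_error_on_retry):
    """Toy read path: return True if the read succeeds, False on I/O error.

    clear_error_on_retry=False models the behaviour described before the
    patch (the flag, once set, is never cleared). True models the behaviour
    described after it.
    """
    if page.error:
        if not clear_error_on_retry:
            return False      # stale flag: fail without retrying the device
        page.error = False    # forget the old failure and try again
    if page.uptodate:
        return True
    if device_ok:
        page.uptodate = True
        return True
    page.error = True         # record the failure on the page
    return False
```

In both modes the first failing read returns False, which matches the I/O error still being reported to the application. Only with clear_error_on_retry=True does a later read succeed once the device responds normally again.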

Comment 17 Jeff Moyer 2003-11-24 16:28:16 UTC
Created attachment 96152
patch to clear the PG_error bit

Comment 18 Jeff Moyer 2003-11-26 18:17:41 UTC
The patch to clear PG_error has been accepted for U1.

Comment 19 Jeff Moyer 2005-09-19 21:51:06 UTC
I'm changing this from MODIFIED to NEEDINFO, since the core problem here wasn't
really addressed.

Larry, are you still experiencing this issue?

Comment 20 Ernie Petrides 2005-10-05 23:27:12 UTC
Closing due to lack of response.  This is believed to have been fixed in U1.