Bug 228238

Summary: Updated kernel 2.6.19-1.2895 introduces timeouts on ATAPI device
Product: [Fedora] Fedora
Reporter: Bryan J. Smith <b.j.smith>
Component: kernel
Assignee: Kernel Maintainer List <kernel-maint>
Status: CLOSED WONTFIX
QA Contact: Brian Brock <bbrock>
Severity: medium
Priority: medium
Version: 6
CC: davej, triage, wtogami
Hardware: All
OS: Linux
Whiteboard: bzcl34nup
Doc Type: Bug Fix
Last Closed: 2008-05-06 19:13:03 UTC

Description Bryan J. Smith 2007-02-12 01:36:28 UTC
Description of problem:

Updated to kernel 2.6.19-1.2895 from the prior (and working quite well) kernel
2.6.18-1.2869, running x86-64 kernel.  Immediately started experiencing ATA
device timeouts on hda (stock ata driver), which is the ATAPI device (SuperMulti
16x DVD-R/RW/RAM/+R/+RW device), causing the system to "hang" every few seconds
for 5-10 seconds at a time while reset is thrown.  Occurs more quickly when
CD/DVD is in drive, but still occurs eventually.

No issues at all with SATA fixed disk (sata_nv).   This is a nVidia GeForce Go
6150 w/nForce 430 MCP (HP Pavilion dv9000).

Have not personally seen this issue since old 2.4.18/19 (whenever Hedrick
[temporarily] removed his IDE/ATA codebase) long ago.  Did some regression or
"politically motivated" change occur in the stock 2.6.19/20 kernel?  I haven't
researched this any further; I just remember that fiasco from one kernel
version back (don't know if it applies, probably not).

Version-Release number of selected component (if applicable):

kernel 2.6.19-1.2895 (x86-64)
Did _not_ have issue on immediately previous kernel 2.6.18-1.2869 (x86-64)
(that is what is most troubling, and different from bug 215884)

How reproducible:
Always

Steps to Reproduce:
1.  Use system for a while (happens sooner with disc in ATAPI device)

Actual results:
timeout (blah, blah, blah, can post exact message if needed, but it's the common
ATA device timeout/reset)

Expected results:
none of the above ;->

Additional info:

NOTE:  I don't think this is related to bug 215884 for 2 reasons:
1) it doesn't affect SATA (unless that user is using an ATAPI device over SATA)
2) that user was having issues with the earlier kernel (I am not)

System Info:  HP Pavilion dv9000 (Turion x2 TL-50, GeForce Go 6150/nForce 430)
Release:  Fedora Core 6 x86-64

Also have a desktop GeForce 6150/nForce 430 (also with a SuperMulti
DVD-R/RW/RAM/+R/+RW ATAPI device) that I will update to kernel 2.6.19-1.2895 to
see if the same occurs.  Had avoided doing so because of the ATA issues with
2.6.19-1.2895.

Comment 1 Bryan J. Smith 2007-02-12 01:40:19 UTC
NOTE:  These SuperMulti 16x DVD-R/RW/RAM/+R/+RW drives are fairly new or have
updated firmware.  The HP Pavilion dv9000 series isn't even 9 months old (will
post exact make/model of drive), and my LG GSA-4167 in the desktop has an
updated firmware from about 6 months ago (haven't tried 2.6.19-1.2895, still
running 2.6.18-1.2869 -- although I will later in the week to see if I get the
same issues).  So I don't think it's firmware or otherwise related to any prior,
long-removed SuperMulti firmware-driver issues.  There clearly was no issue with
2.6.18-1.2869 -- both on the notebook and on the desktop.

Comment 2 Bryan J. Smith 2007-02-12 04:39:23 UTC
Here's the exact message when no disc is in the system ...

(date) (systemname) kernel: hda: status timeout: status=0xd0 ( Busy )
(date) (systemname) kernel: ide: failed opcode was: unknown
(date) (systemname) kernel: hda: drive not ready for command

I guess I misspoke earlier, there is no "reset" message (at least not one with
no disc in the drive).  If I boot back into kernel 2.6.18-1.2869, I can't
reproduce this error, not even with a disc in the drive and hammering it hard. 
I'll try again with both kernels and hammer the drive hard, as well as the
desktop with the newer kernel (especially if I can reproduce it there with the
newer kernel).
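
For anyone reading the message above, here is a minimal sketch of how the status byte breaks down. The bit names follow the labels I believe the legacy IDE driver prints and are an assumption, not something taken from this report; the helper names are hypothetical. Note that while the Busy bit is set, the other bits are not guaranteed to be meaningful, which would explain why the kernel only reports "Busy" for 0xd0.

```python
import re

# ATA status register bits, highest bit first. The names mirror what the
# legacy Linux IDE driver is believed to print (an assumption for illustration).
STATUS_BITS = [
    (0x80, "Busy"),
    (0x40, "DriveReady"),
    (0x20, "DeviceFault"),
    (0x10, "SeekComplete"),
    (0x08, "DataRequest"),
    (0x04, "CorrectedError"),
    (0x02, "Index"),
    (0x01, "Error"),
]

# Matches lines such as "hda: status timeout: status=0xd0 ( Busy )".
STATUS_RE = re.compile(r"(hd[a-z]): status timeout: status=0x([0-9a-fA-F]{2})")


def decode_status(value):
    """Return the names of all bits set in an ATA status byte."""
    return [name for mask, name in STATUS_BITS if value & mask]


def parse_timeout(line):
    """Extract (device, status byte) from a 'status timeout' line, or None."""
    match = STATUS_RE.search(line)
    if match is None:
        return None
    return match.group(1), int(match.group(2), 16)


line = "kernel: hda: status timeout: status=0xd0 ( Busy )"
device, status = parse_timeout(line)
print(device, hex(status), decode_status(status))
```

Running the same parser over a saved log would also give a rough count of how often the timeout fires under each kernel.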



Comment 3 Bryan J. Smith 2007-02-12 04:54:44 UTC
Okay, now I've been guilty of some ignorance ...

This drive in my dv9000 is _not_ a Hitachi SuperMulti, but some sort of Toshiba
(Sony/Philips-based firmware?) +R/RW Multi (w/-R/RW support), make/model
TSSTcorp TS-L632D.
Ugh, I bought the dv9000 because I thought it was a SuperMulti (stupid me), like
most other HP Pavilions have been shipping with (I should have checked).

Anyhoo, it's supposed to be capable of UltraDMA Mode 2 (Ultra33, DDR 33.3MBps,
CRC) c/o the following URL (among others).
http://support2.jp.dell.com/docs/storage/P117843/en/spec.htm

Unfortunately, in both kernel 2.6.19-1.2895 and 2.6.18-1.2869, it's only coming
up with MultiWord DMA Mode 2 (old SDR 16.6MBps, no CRC).
Trying "hdparm [-d1] -X66 /dev/hda" on the device is a no go; it returns an
error "error=0x04 { Aborted Command }" as well as "ide: failed opcode was: 0xef".
I don't think that's the primary issue, although improper support of the ATA
channel could be part of the problem in 2.6.19-1.2895.
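
As a side note on what -X66 asks for: hdparm documents the -X value as a base offset plus the mode number (8 for PIO, 32 for multiword DMA, 64 for UltraDMA), so 66 is UDMA mode 2 and 34 would be the MDMA mode 2 the drive is actually running. Opcode 0xef is the ATA SET FEATURES command, and error bit 0x04 is the abort bit, so the drive itself is rejecting the mode change. A minimal sketch of the encoding, with hypothetical helper names:

```python
# Base offsets for the value passed to `hdparm -X`, per the hdparm man page.
# The same byte is what the drive receives with SET FEATURES (opcode 0xef).
MODE_BASE = {"pio": 8, "sdma": 16, "mdma": 32, "udma": 64}


def xfer_mode_value(kind, number):
    """Return the -X value for a mode such as ("udma", 2)."""
    if not 0 <= number <= 7:
        raise ValueError("mode number must be between 0 and 7")
    return MODE_BASE[kind] + number


def describe_xfer_mode(value):
    """Turn a -X value such as 66 back into a name such as 'udma2'."""
    for kind, base in MODE_BASE.items():
        if 0 <= value - base <= 7:
            return f"{kind}{value - base}"
    raise ValueError(f"not a recognized transfer mode value: {value}")


print(xfer_mode_value("udma", 2))   # value requested with -X66
print(describe_xfer_mode(34))       # the mode the drive reports as active
```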

I'm going to try an updated HP firmware to see if that fixes the channel issue.
That would at least eliminate one potential cause.

Again, I'll test out the LG GSA-4167 SuperMulti on the desktop with the latest
2.6.19-1.2895 kernel tomorrow evening to see if it runs into the same issues.


Comment 4 Bryan J. Smith 2007-02-12 05:20:15 UTC
Last comment on this, as it may be drive-specific (although it is bothering me
that the prior 2.6.18-1.2869 release worked perfectly).

I have firmware HH15 (HP), which is the latest I can find on the Internet as
well.  All Google pages for other hardware -- Acer, Dell, etc... -- show it
reporting itself as UDMA capable, including udma0/1/2 from hdparm -i.

I'm starting to think this may be a BIOS/Windows setup issue, as many people
have complained about Windows putting it in PIO or MDMA modes.  It could also be
that the BIOS setting/Windows driver is causing UDMA to be disabled on the channel.

Until I can confirm this, I can't say it's a (probably stock) kernel issue.  I
can only state that it didn't occur until the latest 2.6.19-1.2895 kernel, and I
can't reproduce it under 2.6.18-1.2869.  But it could just be exposing an issue
that has nothing to do with Linux.


Comment 5 Bryan J. Smith 2007-02-12 06:05:15 UTC
Again, I don't know if it's the firmware/UltraDMA issue (I'm skeptical since I
didn't have any issues with the immediately previous kernel update), or the
kernel's ATA/ATAPI support, but I've opened a related post on an optical
firmware support site if anyone is interested:  
  http://forum.rpc1.org/viewtopic.php?p=199178  


Comment 6 Chuck Ebbert 2007-02-12 15:44:06 UTC
What driver is the system using for the DVD drive? Please
post the relevant lines from bootup.

e.g. I have:

Uniform Multi-Platform E-IDE driver Revision: 7.00alpha2
ide: Assuming 33MHz system bus speed for PIO modes; override with idebus=xx
ICH7: IDE controller at PCI slot 0000:00:1f.1
ACPI: PCI Interrupt 0000:00:1f.1[A] -> GSI 16 (level, low) -> IRQ 16
ICH7: chipset revision 1
ICH7: not 100% native mode: will probe irqs later
    ide0: BM-DMA at 0xffa0-0xffa7, BIOS settings: hda:DMA, hdb:DMA
Probing IDE interface ide0...
hda: SONY CD-RW CRX217E, ATAPI CD/DVD-ROM drive
hdb: PHILIPS DVD+/-RW DVD8801, ATAPI CD/DVD-ROM drive
ide0 at 0x1f0-0x1f7,0x3f6 on irq 14


Comment 7 Bryan J. Smith 2007-02-13 01:03:44 UTC
Here's Kernel 2.6.18-1.2869 ...

 ... 
Uniform Multi-Platform E-IDE driver Revision: 7.00alpha2
ide: Assuming 33MHz system bus speed for PIO modes; override with idebus=xx
NFORCE-MCP51: IDE controller at PCI slot 0000:00:0d.0
NFORCE-MCP51: chipset revision 241
NFORCE-MCP51: not 100% native mode: will probe irqs later
NFORCE-MCP51: BIOS didn't set cable bits correctly. Enabling workaround.
NFORCE-MCP51: 0000:00:0d.0 (rev f1) UDMA133 controller
    ide0: BM-DMA at 0x3080-0x3087, BIOS settings: hda:DMA, hdb:pio
Probing IDE interface ide0...
hda: TSSTcorpCD/DVDW TS-L632D, ATAPI CD/DVD-ROM drive
ide0 at 0x1f0-0x1f7,0x3f6 on irq 14
Probing IDE interface ide1...
 ... 

And just FYI, that is for the ATAPI device connected to the PATA channel 0
(ide0).  The hard drive is connected to SATA channel 0 (sata_nv driver).

 ... 
SCSI subsystem initialized
libata version 2.00 loaded.
sata_nv 0000:00:0e.0: version 2.0
PCI: Enabling device 0000:00:0e.0 (0005 -> 0007)
ACPI: PCI Interrupt Link [LTID] enabled at IRQ 21
GSI 18 sharing vector 0xE1 and IRQ 18
ACPI: PCI Interrupt 0000:00:0e.0[A] -> Link [LTID] -> GSI 21 (level, high) -> IRQ 225
PCI: Setting latency timer of device 0000:00:0e.0 to 64
ata1: SATA max UDMA/133 cmd 0x30C0 ctl 0x30B6 bmdma 0x3090 irq 225
ata2: SATA max UDMA/133 cmd 0x30B8 ctl 0x30B2 bmdma 0x3098 irq 225
scsi0 : sata_nv
 ... 
ata1: SATA link up 1.5 Gbps (SStatus 113 SControl 300)
ata1.00: ATA-7, max UDMA/133, 312581808 sectors: LBA48 NCQ (depth 0/32)
ata1.00: ata1: dev 0 multi count 16
ata1.00: configured for UDMA/133
scsi1 : sata_nv
 ... 
ata2: SATA link down (SStatus 0 SControl 300)
ATA: abnormal status 0x7F on port 0x30BF
  Vendor: ATA       Model: WDC WD1600BEVS-0  Rev: 04.0
  Type:   Direct-Access                      ANSI SCSI revision: 05
SCSI device sda: 312581808 512-byte hdwr sectors (160042 MB)
sda: Write Protect is off
sda: Mode Sense: 00 3a 00 00
SCSI device sda: drive cache: write back
SCSI device sda: 312581808 512-byte hdwr sectors (160042 MB)
sda: Write Protect is off
sda: Mode Sense: 00 3a 00 00
SCSI device sda: drive cache: write back
 sda: sda1 sda2 sda3 sda4 < sda5 sda6 sda7 sda8 sda9 >
sd 0:0:0:0: Attached scsi disk sda
 ... 

Here's the hdparm -I output ...

/dev/hda:

ATAPI CD-ROM, with removable media
        Model Number:       TSSTcorpCD/DVDW TS-L632D                
        Serial Number:      
        Firmware Revision:  HH15    
Standards:
        Likely used CD-ROM ATAPI-1
Configuration:
        DRQ response: 50us.
        Packet size: 12 bytes
Capabilities:
        LBA, IORDY(can be disabled)
        DMA: mdma0 mdma1 *mdma2 
             Cycle time: min=120ns recommended=120ns
        PIO: pio0 pio1 pio2 pio3 pio4 
             Cycle time: no flow control=227ns  IORDY flow control=120ns
Commands/features:
        Enabled Supported:
HW reset results:
        CBLID- below Vih
        Device num = 0

I'll post the newer kernel's results in a moment, but I believe they match exactly.
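
The key lines in that output are the "DMA:" list, where the asterisk marks the active mode and no udma entries appear at all, and the 120ns cycle time. Below is a rough sketch for pulling those out and turning a cycle time into a peak transfer rate, assuming a 16-bit (2-byte) transfer per cycle. The function names are hypothetical, and the 60ns figure used for UDMA mode 2 comes from my reading of the ATA timing tables, not from this report.

```python
def parse_dma_modes(hdparm_output):
    """Return (supported modes, active mode) from the 'DMA:' line of hdparm -I."""
    for line in hdparm_output.splitlines():
        stripped = line.strip()
        if stripped.startswith("DMA:"):
            tokens = stripped[len("DMA:"):].split()
            supported = [token.lstrip("*") for token in tokens]
            # hdparm prefixes the currently selected mode with an asterisk.
            active = next((t[1:] for t in tokens if t.startswith("*")), None)
            return supported, active
    return [], None


def peak_rate_mb_per_s(cycle_ns, bytes_per_cycle=2):
    """Peak rate in MB/s for one transfer of bytes_per_cycle every cycle_ns."""
    return bytes_per_cycle / cycle_ns * 1000.0


sample = """\
Capabilities:
        LBA, IORDY(can be disabled)
        DMA: mdma0 mdma1 *mdma2
             Cycle time: min=120ns recommended=120ns
"""

supported, active = parse_dma_modes(sample)
has_udma = any(mode.startswith("udma") for mode in supported)
print(supported, active, has_udma)
print(round(peak_rate_mb_per_s(120), 1))  # MDMA mode 2 cycle time from above
print(round(peak_rate_mb_per_s(60), 1))   # assumed UDMA mode 2 cycle time
```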

Comment 8 Bryan J. Smith 2007-02-13 01:06:12 UTC
Again, NO ISSUES with the SATA WD1600BEVS hard drive, just the ATAPI TS-L632D
optical drive.

Comment 9 Bryan J. Smith 2007-02-13 01:14:57 UTC
And here's the "TROUBLEMAKER" ;) kernel 2.6.19-1.2895 ...

Uniform Multi-Platform E-IDE driver Revision: 7.00alpha2
ide: Assuming 33MHz system bus speed for PIO modes; override with idebus=xx
NFORCE-MCP51: IDE controller at PCI slot 0000:00:0d.0
NFORCE-MCP51: chipset revision 241
NFORCE-MCP51: not 100% native mode: will probe irqs later
NFORCE-MCP51: BIOS didn't set cable bits correctly. Enabling workaround.
NFORCE-MCP51: 0000:00:0d.0 (rev f1) UDMA133 controller
    ide0: BM-DMA at 0x3080-0x3087, BIOS settings: hda:DMA, hdb:pio
Probing IDE interface ide0...
hda: TSSTcorpCD/DVDW TS-L632D, ATAPI CD/DVD-ROM drive
ide0 at 0x1f0-0x1f7,0x3f6 on irq 14
Probing IDE interface ide1...

 ... 
SCSI subsystem initialized
libata version 2.00 loaded.
sata_nv 0000:00:0e.0: version 2.0
PCI: Enabling device 0000:00:0e.0 (0005 -> 0007)
ACPI: PCI Interrupt Link [LTID] enabled at IRQ 21
ACPI: PCI Interrupt 0000:00:0e.0[A] -> Link [LTID] -> GSI 21 (level, high) -> IRQ 21
PCI: Setting latency timer of device 0000:00:0e.0 to 64
ata1: SATA max UDMA/133 cmd 0x30C0 ctl 0x30B6 bmdma 0x3090 irq 21
ata2: SATA max UDMA/133 cmd 0x30B8 ctl 0x30B2 bmdma 0x3098 irq 21
scsi0 : sata_nv
 ... 
ata1: SATA link up 1.5 Gbps (SStatus 113 SControl 300)
ata1.00: ATA-7, max UDMA/133, 312581808 sectors: LBA48 NCQ (depth 0/32)
ata1.00: ata1: dev 0 multi count 16
ata1.00: configured for UDMA/133
scsi1 : sata_nv
 ... 
ata2: SATA link down (SStatus 0 SControl 300)
ATA: abnormal status 0x7F on port 0x30BF
scsi 0:0:0:0: Direct-Access     ATA      WDC WD1600BEVS-0 04.0 PQ: 0 ANSI: 5
SCSI device sda: 312581808 512-byte hdwr sectors (160042 MB)
sda: Write Protect is off
sda: Mode Sense: 00 3a 00 00
SCSI device sda: drive cache: write back
SCSI device sda: 312581808 512-byte hdwr sectors (160042 MB)
sda: Write Protect is off
sda: Mode Sense: 00 3a 00 00
SCSI device sda: drive cache: write back
 sda: sda1 sda2 sda3 sda4 < sda5 sda6 sda7 sda8 sda9 >
sd 0:0:0:0: Attached scsi disk sda

And hdparm -I output ...

/dev/hda:

ATAPI CD-ROM, with removable media
        Model Number:       TSSTcorpCD/DVDW TS-L632D                
        Serial Number:      
        Firmware Revision:  HH15    
Standards:
        Likely used CD-ROM ATAPI-1
Configuration:
        DRQ response: 50us.
        Packet size: 12 bytes
Capabilities:
        LBA, IORDY(can be disabled)
        DMA: mdma0 mdma1 *mdma2 
             Cycle time: min=120ns recommended=120ns
        PIO: pio0 pio1 pio2 pio3 pio4 
             Cycle time: no flow control=227ns  IORDY flow control=120ns
Commands/features:
        Enabled Supported:
HW reset results:
        CBLID- below Vih
        Device num = 0

With both kernels, no SATA issues/timeouts.
With 2.6.18-1.2869, no PATA issues/timeouts.
But with 2.6.19-1.2895, continuous PATA issues/timeouts as aforementioned.
Takes a little while with the system running for them to occur, but it happens
within 15 minutes if a CD/DVD is in the drive (within an hour or two if not).
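
For comparing the two boot captures, a small diff helps confirm what actually changed. The sketch below uses a few lines copied from the logs above. It suggests that the IDE probe lines are identical between kernels while the sata_nv interrupt number differs (225 versus 21), which may or may not be related. The function name is hypothetical.

```python
import difflib


def changed_lines(old_log, new_log):
    """Return (removed, added) lines between two boot-log captures."""
    diff = list(difflib.ndiff(old_log, new_log))
    removed = [entry[2:] for entry in diff if entry.startswith("- ")]
    added = [entry[2:] for entry in diff if entry.startswith("+ ")]
    return removed, added


# A few lines taken from the 2.6.18-1.2869 capture.
old = [
    "NFORCE-MCP51: 0000:00:0d.0 (rev f1) UDMA133 controller",
    "hda: TSSTcorpCD/DVDW TS-L632D, ATAPI CD/DVD-ROM drive",
    "ata1: SATA max UDMA/133 cmd 0x30C0 ctl 0x30B6 bmdma 0x3090 irq 225",
]

# The same lines from the 2.6.19-1.2895 capture.
new = [
    "NFORCE-MCP51: 0000:00:0d.0 (rev f1) UDMA133 controller",
    "hda: TSSTcorpCD/DVDW TS-L632D, ATAPI CD/DVD-ROM drive",
    "ata1: SATA max UDMA/133 cmd 0x30C0 ctl 0x30B6 bmdma 0x3090 irq 21",
]

removed, added = changed_lines(old, new)
print("only in old:", removed)
print("only in new:", added)
```

Feeding it the full dmesg output from each kernel, one line per list entry, would show every difference rather than just this excerpt.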


Comment 10 Bryan J. Smith 2007-02-23 09:20:37 UTC
Still occurring on latest kernel 2.6.19-1.2911 build.
As before, when I go back to kernel 2.6.18-1.2869, no issues.

I've found an indirect workaround, which is largely obvious.
It won't occur if I "rmmod ide_cd cdrom".
Just wanted to pass along that removing the 2 modules prevents the issues (which
would occur, even if no disc was in the drive).


Comment 11 Chuck Ebbert 2007-03-14 22:06:09 UTC
Please test kernel 2.6.20-1.2925.fc6


Comment 12 Bryan J. Smith 2007-03-15 00:53:51 UTC
What's the ETA for kernel 2.6.20-1.2925 on x86-64?
I only noted it available for i386 updates, not x86-64.

BTW, thank you for your continued efforts.  I haven't had the time to diff
kernel trees between the last working kernel (2.6.18-1.2869) and the newer
kernels (2.6.19-*) I've had issues with.  But I was going to put in the time by
next weekend to see what had changed between the two kernels, to see if I could
provide any assistance in kernel debugging.

SIDE NOTE:

There are now other reports of HP Pavilion owners with TS-L632D drives getting
only Multi-word DMA signaling out of clearly Ultra DMA capable drives.  I don't
think it's the root cause since it did work with 2.6.18-1.2869 and earlier
without timeouts, but I do think it's an added performance and recovery issue
(especially the latter since Multi-word DMA does not offer CRC checking, unlike
Ultra DMA).

I haven't had time to take it up with HP, but I'm starting to gather a group of
users who would like to do so.  The lack of any settings in the HP BIOS on
Pavilions other than boot order or date/time is rather troublesome, especially
if the ATA channels can be forced into Ultra DMA mode -- like virtually all
"regular" BIOSes from the "big three" allow normally.  There seems to be no
utility in Windows XP to edit those settings either.


Comment 13 Bryan J. Smith 2007-03-17 15:22:38 UTC
Upgraded to 2.6.20-1.2925 ... good for a couple of hours, then BAM! timeouts.

It's funny, I watched a full hour of DVD, no timeouts, and no choppiness
(clearly better performance than before with even 2.6.18-* that worked and
didn't have timeouts).  Then about an hour after that, I was just browsing with
_no_ disc in the drive and BAM!  The timeouts started (which virtually hangs the
system).

Once again, after patiently and persistently switching to a console, getting
logged in as root, and getting "rmmod ide_cd cdrom" to execute and accept that
the drive isn't busy (typically it takes 3-4 tries), the system will unfreeze.

As such, "rmmod ide_cd cdrom" is going back into /etc/rc.local for now.  ;)
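
Since the rmmod usually needs several tries before the drive stops reporting busy, the workaround could be wrapped in a retry loop. This is only a sketch under assumptions: the attempt count and delay are arbitrary, the function name is hypothetical, and the modules are listed with ide_cd first because, as I understand it, ide_cd depends on cdrom and must be removed before it.

```python
import subprocess
import time


def unload_modules(modules, attempts=5, delay=2.0,
                   run=subprocess.run, sleep=time.sleep):
    """Unload each module in order, retrying while rmmod reports failure.

    Returns True once every module is unloaded, or False as soon as one
    module still fails after the given number of attempts.
    """
    for module in modules:
        for attempt in range(1, attempts + 1):
            result = run(["rmmod", module])
            if result.returncode == 0:
                break
            if attempt < attempts:
                sleep(delay)  # give the drive time to stop reporting busy
        else:
            return False
    return True
```

Run as root, a call such as unload_modules(["ide_cd", "cdrom"]) would stand in for the manual retries described above. The run and sleep parameters exist only so the loop can be exercised without touching real modules.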


Comment 14 Bug Zapper 2008-04-04 06:12:07 UTC
Fedora apologizes that these issues have not been resolved yet. We're
sorry it's taken so long for your bug to be properly triaged and acted
on. We appreciate the time you took to report this issue and want to
make sure no important bugs slip through the cracks.

If you're currently running a version of Fedora Core between 1 and 6,
please note that Fedora no longer maintains these releases. We strongly
encourage you to upgrade to a current Fedora release. In order to
refocus our efforts as a project we are flagging all of the open bugs
for releases which are no longer maintained and closing them.
http://fedoraproject.org/wiki/LifeCycle/EOL

If this bug is still open against Fedora Core 1 through 6, thirty days
from now, it will be closed 'WONTFIX'. If you can reproduce this bug in
the latest Fedora version, please change to the respective version. If
you are unable to do this, please add a comment to this bug requesting
the change.

Thanks for your help, and we apologize again that we haven't handled
these issues to this point.

The process we are following is outlined here:
http://fedoraproject.org/wiki/BugZappers/F9CleanUp

We will be following the process here:
http://fedoraproject.org/wiki/BugZappers/HouseKeeping to ensure this
doesn't happen again.

And if you'd like to join the bug triage team to help make things
better, check out http://fedoraproject.org/wiki/BugZappers

Comment 15 Bug Zapper 2008-05-06 19:13:02 UTC
This bug is open for a Fedora version that is no longer maintained and
will not be fixed by Fedora. Therefore we are closing this bug.

If you can reproduce this bug against a currently maintained version of
Fedora please feel free to reopen this bug against that version.

Thank you for reporting this bug and we are sorry it could not be fixed.