Bug 214566
Summary: DMA timeout errors on VT82C586A/B/VT82C686/A/B/VT823x/A/C
Product: [Fedora] Fedora
Component: kernel
Version: 6
Hardware: All
OS: Linux
Status: CLOSED INSUFFICIENT_DATA
Severity: high
Priority: medium
Reporter: Tuomas Mursu <tuomas.mursu>
Assignee: Kernel Maintainer List <kernel-maint>
QA Contact: Brian Brock <bbrock>
CC: db64, jonstanley, tedkaz, twaugh, wtogami
Doc Type: Bug Fix
Last Closed: 2008-01-08 00:32:34 UTC
Description
Tuomas Mursu
2006-11-08 12:25:18 UTC
Created attachment 140644 [details]
Full startup log from /var/log/messages
Hi, I have a similar problem after upgrading from FC5 to FC6, with repeated messages like:

hda: dma_intr: status=0x51 { DriveReady SeekComplete Error }
hda: dma_intr: error=0x84 { DriveStatusError BadCRC }
ide: failed opcode was: unknown

The HD is a few weeks old and in good condition; I have tested it with the full diagnostics software provided by the manufacturer (Hitachi). On the same HD I have installed Knoppix 5.0.1, and there are no problems at all, so I can rule out hardware problems with 100% certainty as well. However, I cannot use DMA and have disabled it. This is not a solution: my system is very slow on long file transfers. On the net I found only one possible solution: patching the kernel iosched, see http://lkml.org/lkml/2006/8/27/108.

How reproducible:
/sbin/hdparm -Tt /dev/hda

I confirm: Fedora Core 5 kernels (2.6.17) didn't cause any errors. The problems affect the 2.6.18-1.2849.fc6 kernel as well.

Also on 2.6.18-1.2849.fc6:

hda: dma_intr: status=0x51 { DriveReady SeekComplete Error }
hda: dma_intr: error=0x84 { DriveStatusError BadCRC }
ide: failed opcode was: unknown

lspci | grep VIA
00:00.0 Host bridge: VIA Technologies, Inc. VT8366/A/7 [Apollo KT266/A/333]
00:01.0 PCI bridge: VIA Technologies, Inc. VT8366/A/7 [Apollo KT266/A/333 AGP]
00:11.0 ISA bridge: VIA Technologies, Inc. VT8233A ISA Bridge
00:11.1 IDE interface: VIA Technologies, Inc. VT82C586A/B/VT82C686/A/B/VT823x/A/C PIPC Bus Master IDE (rev 06)

In my opinion this bug should be set to URGENT severity.

I see these errors too, but with a different IDE controller:

# lspci | grep IDE
00:08.0 IDE interface: nVidia Corporation nForce3 IDE (rev a5)
# cat /proc/ide/hda/model
ExcelStor Technology J8160

Created attachment 142946 [details]
rtf of lspci -vvv
Created attachment 142947 [details]
lsmod dump
Created attachment 142948 [details]
cpuinfo, ioports, iomem
2.6.18-1.2868.fc6 #1 SMP still has this problem. Is anyone really working on it? It definitely hurts performance, and it is annoying after so many weeks.

I agree; because of this problem I am seriously thinking of changing Linux distribution. This is my HD configuration on a Compaq Presario 2701EA:

/sbin/lspci | grep Intel
00:00.0 Host bridge: Intel Corporation 82830 830 Chipset Host Bridge (rev 02)
00:01.0 PCI bridge: Intel Corporation 82830 830 Chipset AGP Bridge (rev 02)
00:1d.0 USB Controller: Intel Corporation 82801CA/CAM USB (Hub #1) (rev 01)
00:1d.1 USB Controller: Intel Corporation 82801CA/CAM USB (Hub #2) (rev 01)
00:1d.2 USB Controller: Intel Corporation 82801CA/CAM USB (Hub #3) (rev 01)
00:1e.0 PCI bridge: Intel Corporation 82801 Mobile PCI Bridge (rev 41)
00:1f.0 ISA bridge: Intel Corporation 82801CAM ISA Bridge (LPC) (rev 01)
00:1f.1 IDE interface: Intel Corporation 82801CAM IDE U100 (rev 01)
00:1f.3 SMBus: Intel Corporation 82801CA/CAM SMBus Controller (rev 01)
00:1f.5 Multimedia audio controller: Intel Corporation 82801CA/CAM AC'97 Audio Controller (rev 01)
02:08.0 Ethernet controller: Intel Corporation 82801CAM (ICH3) PRO/100 VE (LOM) Ethernet Controller (rev 41)

/sbin/hdparm -i /dev/hda

/dev/hda:
 Model=HTS541080G9AT00, FwRev=MB4OA60A, SerialNo=MPB4LAXKJMS2WM
 Config={ HardSect NotMFM HdSw>15uSec Fixed DTR>10Mbs }
 RawCHS=16383/16/63, TrkSize=0, SectSize=0, ECCbytes=4
 BuffType=DualPortCache, BuffSize=7539kB, MaxMultSect=16, MultSect=16
 CurCHS=16383/16/63, CurSects=16514064, LBA=yes, LBAsects=156301488
 IORDY=on/off, tPIO={min:240,w/IORDY:120}, tDMA={min:120,rec:120}
 PIO modes: pio0 pio1 pio2 pio3 pio4
 DMA modes: mdma0 mdma1 mdma2
 UDMA modes: udma0 udma1 udma2 udma3 udma4 *udma5
 AdvancedPM=yes: mode=0x80 (128) WriteCache=enabled
 Drive conforms to: ATA/ATAPI-6 T13 1410D revision 3a: ATA/ATAPI-2 ATA/ATAPI-3 ATA/ATAPI-4 ATA/ATAPI-5 ATA/ATAPI-6

There's a kernel update to 2.6.19 coming sometime soon; I'm hoping it will fix this.
Two months later, and with 2.6.18-1.2869.fc6 #1 SMP it's still the same.

I updated to 2.6.19-1.2895.fc6 from updates-testing a couple of days ago and haven't seen any DMA errors since. But I guess others should try it too and report whether these errors are really fixed.

I am still seeing these errors with 2.6.19-1.2895.fc6 on x86_64:

hda: dma_intr: status=0x51 { DriveReady SeekComplete Error }
hda: dma_intr: error=0x84 { DriveStatusError BadCRC }
ide: failed opcode was: unknown

Confirmed. I rebooted my box yesterday and the shutdown process hung while unmounting the partitions, followed by very familiar errors. *sigh*

Me too... I am still seeing these errors with 2.6.19-1.2895.fc6 on x86_64:

hda: dma_intr: status=0x51 { DriveReady SeekComplete Error }
hda: dma_intr: error=0x84 { DriveStatusError BadCRC }
ide: failed opcode was: unknown

FC5 was better than FC6... I think it is getting worse.

Still the same with 2.6.19-1.2911.fc6 #1 SMP...

2.6.19-1.2911.fc6 takes this even further. Now it's affecting my CD/DVD drive (HL-DT-ST DVDRAM GSA-4163B) too, rendering it totally unusable. I put a disc in the drive and close the tray:

hdc: irq timeout: status=0xd0 { Busy }
ide: failed opcode was: unknown
hdc: status error: status=0x58 { DriveReady SeekComplete DataRequest }
ide: failed opcode was: unknown
hdc: drive not ready for command
hdc: status error: status=0x58 { DriveReady SeekComplete DataRequest }
ide: failed opcode was: unknown
hdc: drive not ready for command
(The same messages as above repeat continuously...)

I have to reboot the box to get the disc out. Not nice.

(In reply to comment #17)
> I am still seeing these errors with 2.6.19-1.2895.fc6 on x86_64:
>
> hda: dma_intr: status=0x51 { DriveReady SeekComplete Error }
> hda: dma_intr: error=0x84 { DriveStatusError BadCRC }
> ide: failed opcode was: unknown

Those are real hardware errors. Your drive is probably failing. BadCRC means there is a real problem...
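For readers comparing the hex values in the messages quoted in this thread, here is a minimal Python sketch of how the status and error bytes map to the names the kernel prints. The bit positions are the standard ATA status and error register layout. The function names `decode_status` and `decode_error` are hypothetical, and the names for bits that do not appear anywhere in this report (for example DeviceFault or SectorIdNotFound) are assumptions based on the old IDE driver's wording, not something confirmed here.

```python
# Sketch only: maps ATA status/error register bytes to the names seen in the
# kernel messages quoted in this bug. Names for bits that never appear in this
# report are assumptions about the old IDE driver's wording.

# Status register bits below the Busy bit, highest bit first.
STATUS_BITS = [
    (0x40, "DriveReady"),
    (0x20, "DeviceFault"),      # assumed name, not seen in this report
    (0x10, "SeekComplete"),
    (0x08, "DataRequest"),
    (0x04, "CorrectedError"),   # assumed name, not seen in this report
    (0x02, "Index"),            # assumed name, not seen in this report
    (0x01, "Error"),
]


def decode_status(value):
    """Return the flag names for a status byte such as 0x51."""
    # When the Busy bit (0x80) is set the other bits are not meaningful,
    # which matches the log line "status=0xd0 { Busy }".
    if value & 0x80:
        return ["Busy"]
    return [name for mask, name in STATUS_BITS if value & mask]


def decode_error(value):
    """Return the flag names for a disk error byte such as 0x84."""
    names = []
    aborted = bool(value & 0x04)  # command aborted
    if aborted:
        names.append("DriveStatusError")
    if value & 0x80:
        # Bit 7 together with the abort bit is reported as an interface CRC
        # error; the "BadSector" alternative is an assumption.
        names.append("BadCRC" if aborted else "BadSector")
    if value & 0x40:
        names.append("UncorrectableError")  # assumed name
    if value & 0x10:
        names.append("SectorIdNotFound")    # assumed name
    if value & 0x02:
        names.append("TrackZeroNotFound")   # assumed name
    if value & 0x01:
        names.append("AddrMarkNotFound")    # assumed name
    return names


if __name__ == "__main__":
    print("status=0x51", decode_status(0x51))
    print("error=0x84", decode_error(0x84))
    print("status=0xd0", decode_status(0xD0))
    print("status=0x58", decode_status(0x58))
```

Decoding only restates what the kernel already printed. A CRC error on the interface says that a transfer between controller and drive was corrupted, but by itself it does not tell whether the drive, the cable, the controller, or the driver is at fault.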
I've noticed that my second hard drive's performance is really poor. Reading and writing are both significantly slower than on the first drive. Copying a CD image over Ethernet directly to hdb takes several minutes, and I can hear the drive "shriek". It sounds like the heads are reading/writing like hell but still can't keep up. Copying the same CD image over Ethernet to hda takes only about a minute, and the drive stays quiet like it's supposed to (I benchmarked this today).

So, to sum it up: hda works fine, hdb doesn't. I tried to find out what the difference between the two is, but I don't know how to gather all the info. So far, here's the hdparm output:

--------------------
# hdparm -i /dev/hda

/dev/hda:
 Model=SAMSUNG SP0822N, FwRev=WA100-32, SerialNo=S06QJ10Y959079
 Config={ HardSect NotMFM HdSw>15uSec Fixed DTR>10Mbs }
 RawCHS=16383/16/63, TrkSize=34902, SectSize=554, ECCbytes=4
 BuffType=DualPortCache, BuffSize=2048kB, MaxMultSect=16, MultSect=16
 CurCHS=4047/16/255, CurSects=16511760, LBA=yes, LBAsects=156368016
 IORDY=on/off, tPIO={min:120,w/IORDY:120}, tDMA={min:120,rec:120}
 PIO modes: pio0 pio1 pio2 pio3 pio4
 DMA modes: mdma0 mdma1 mdma2
 UDMA modes: udma0 udma1 udma2 udma3 udma4 *udma5
 AdvancedPM=no WriteCache=enabled
 Drive conforms to: ATA/ATAPI-6 T13 1410D revision 1: ATA/ATAPI-1 ATA/ATAPI-2 ATA/ATAPI-3 ATA/ATAPI-4 ATA/ATAPI-5 ATA/ATAPI-6 ATA/ATAPI-7

 * signifies the current active mode

# hdparm -i /dev/hdb

/dev/hdb:
 Model=SAMSUNG SV0802N, FwRev=TP100-24, SerialNo=S019J10X679393
 Config={ HardSect NotMFM HdSw>15uSec Fixed DTR>10Mbs }
 RawCHS=16383/16/63, TrkSize=34902, SectSize=554, ECCbytes=4
 BuffType=DualPortCache, BuffSize=2048kB, MaxMultSect=16, MultSect=16
 CurCHS=4047/16/255, CurSects=16511760, LBA=yes, LBAsects=156368016
 IORDY=on/off, tPIO={min:240,w/IORDY:120}, tDMA={min:120,rec:120}
 PIO modes: pio0 pio1 pio2 pio3 pio4
 DMA modes: mdma0 mdma1 mdma2
 UDMA modes: udma0 udma1 udma2 udma3 udma4 *udma5
 AdvancedPM=no WriteCache=enabled
 Drive conforms to: ATA/ATAPI-7 T13 1532D revision 0: ATA/ATAPI-1 ATA/ATAPI-2 ATA/ATAPI-3 ATA/ATAPI-4 ATA/ATAPI-5 ATA/ATAPI-6 ATA/ATAPI-7

 * signifies the current active mode
--------------------

Only the last line seems to differ. Well, this didn't help much. Is there anything else I could look at?

SOLVED!!!
There is an incompatibility with HAL, just stop the HAL daemon:
> /etc/rc.d/rc.hald stop
That's all.
Bye
D.
"Narrowed down" rather than solved, I think. Can we work out why HAL causes this error message?

No worries, autofs is much better and more reliable!

Hello,

I'm reviewing this bug as part of the kernel bug triage project, an attempt to isolate current bugs in the Fedora kernel.

http://fedoraproject.org/wiki/KernelBugTriage

I am CC'ing myself to this bug; however, this version of Fedora is no longer maintained. Please attempt to reproduce this bug with a current version of Fedora (presently Fedora 8). If the bug no longer exists, please close the bug, or I'll do so in a few days if no further information is lodged.

Thanks for using Fedora!

Closing per previous comment. If you can provide the requested information, please feel free to re-open this bug.