466141 – RHEL 5.2: dma_timer_expiry: dma status==0x24

Summary: RHEL 5.2: dma_timer_expiry: dma status==0x24

Keywords:
Status:	CLOSED NOTABUG
Alias:	None
Product:	Red Hat Enterprise Linux 5
Classification:	Red Hat
Component:	kernel
Sub Component:
Version:	5.2
Hardware:	x86_64
OS:	Linux
Priority:	high
Severity:	high
Target Milestone:	rc
Target Release:	---
Assignee:	Red Hat Kernel Manager
QA Contact:	Martin Jenner
Docs Contact:
URL:
Whiteboard:
Depends On:
Blocks:

Reported:	2008-10-08 17:12 UTC by Paul Batkowski
Modified:	2008-10-09 14:05 UTC
CC List:	0 users
Fixed In Version:
Doc Type:	Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed:	2008-10-09 14:05:26 UTC
Target Upstream Version:
Embargoed:
Dependent Products:

Description Paul Batkowski 2008-10-08 17:12:49 UTC

Description of problem:

Recently installed a new IDE hard drive in my workstation. With RHEL 5.2 (kernel-2.6.18-92.1.13.el5.x86_64) I get the following messages in /var/log/messages when writing to the hard drive:

Oct  8 13:01:18 localhost kernel: hda: dma_timer_expiry: dma status == 0x24
Oct  8 13:01:28 localhost kernel: hda: DMA interrupt recovery
Oct  8 13:01:28 localhost kernel: hda: lost interrupt
Oct  8 13:01:28 localhost kernel: hda: dma_intr: status=0x58 { DriveReady SeekComplete DataRequest }
Oct  8 13:01:28 localhost kernel: ide: failed opcode was: unknown

These messages loop continuously until my server hard-locks (I need to power-cycle the box).
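
To gauge how often the timeout recurs, the messages quoted above can be counted in the log. This is a minimal sketch; the helper name `count_dma_timeouts` is hypothetical and not part of the original report, and it assumes the kernel messages land in a plain-text log such as /var/log/messages.

```shell
# Hypothetical helper: count "dma_timer_expiry" messages for one IDE
# device (e.g. hda) in a given log file. Prints the number of matching lines.
count_dma_timeouts() {
    dev="$1"
    logfile="$2"
    # grep -c prints 0 and exits non-zero when nothing matches;
    # "|| true" keeps that case from being treated as a failure.
    grep -c "${dev}: dma_timer_expiry" "$logfile" || true
}

# Example:
#   count_dma_timeouts hda /var/log/messages
```

A count that keeps growing while the drive is being written to matches the looping behaviour described here.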

I've also tried running with kernel-2.6.18-118.el5.x86_64 and the same issue occurs.

Version-Release number of selected component (if applicable):

RHEL 5.2 (kernel-2.6.18-92.1.13.el5.x86_64)

How reproducible:

Consistently occurs.

Steps to Reproduce:
1. Start dd of /dev/zero to a file stored on the hard drive.
2. After several minutes, note the messages in /var/log/messages (and dmesg at boot time).
3. Watch system lock up.
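
Step 1 can be sketched roughly as follows. The helper name `write_zeros`, the target path, and the size are assumptions for illustration; the original report does not give the exact dd invocation.

```shell
# Hypothetical helper: write COUNT_MIB mebibytes of zeros from /dev/zero
# to TARGET, which should be a file on the affected drive.
write_zeros() {
    target="$1"
    count_mib="$2"
    # bs is given in bytes (1048576 = 1 MiB) so it works across dd variants;
    # dd's transfer statistics on stderr are discarded.
    dd if=/dev/zero of="$target" bs=1048576 count="$count_mib" 2>/dev/null
}

# Example (path and size are assumptions):
#   write_zeros /mnt/hda/ddtest.img 4096
```

A write large enough to run for several minutes is needed, since the messages only appear after sustained writing.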
  
Actual results:

Messages are continuously logged and the system locks up.

Expected results:

No such messages are logged and the system does not lock up.

Additional info:

# hdparm -i /dev/hda

/dev/hda:

 Model=ST3250823A, FwRev=3.03, SerialNo=5NF0QTWZ
 Config={ HardSect NotMFM HdSw>15uSec Fixed DTR>10Mbs RotSpdTol>.5% }
 RawCHS=16383/16/63, TrkSize=0, SectSize=0, ECCbytes=4
 BuffType=unknown, BuffSize=8192kB, MaxMultSect=16, MultSect=16
 CurCHS=16383/16/63, CurSects=16514064, LBA=yes, LBAsects=268435455
 IORDY=on/off, tPIO={min:240,w/IORDY:120}, tDMA={min:120,rec:120}
 PIO modes:  pio0 pio1 pio2 pio3 pio4 
 DMA modes:  mdma0 mdma1 mdma2 
 UDMA modes: udma0 udma1 udma2 udma3 udma4 *udma5 
 AdvancedPM=no WriteCache=enabled
 Drive conforms to: Unspecified:  ATA/ATAPI-1 ATA/ATAPI-2 ATA/ATAPI-3 ATA/ATAPI-4 ATA/ATAPI-5 ATA/ATAPI-6 ATA/ATAPI-7

 * signifies the current active mode

I found a BZ opened for FC6 that seems related:

https://bugzilla.redhat.com/show_bug.cgi?id=234936

and one for FC4 that seems related:

https://bugzilla.redhat.com/show_bug.cgi?id=132584

Both were closed without resolution.

Comment 1 Paul Batkowski 2008-10-09 14:05:26 UTC

Got an email from Alan Cox stating that this may be a hardware-related issue. So I did some more investigation, and it turns out I had two IDE devices connected on the same channel with both jumpers set to 'Master'. How this system even booted in this configuration is beyond me... but once I set one of the devices to slave, everything started working smoothly.

Alan also mentioned that some other possible reasons for this message are: a faulty disk, power problems (not enough power to properly power the disk), or a controller problem.

Closing this bz out.