Bug 230109 - bug in pdc202xx_old.c hangs system
Status: CLOSED CURRENTRELEASE
Product: Fedora
Classification: Fedora
Component: kernel
Version: 5
Hardware: All
OS: Linux
Priority: medium
Severity: medium
Assigned To: Kernel Maintainer List
QA Contact: Brian Brock
Whiteboard: bzcl34nup
Reported: 2007-02-26 13:23 EST by Trevor Cordes
Modified: 2008-04-04 13:35 EDT
Fixed In Version: F8
Doc Type: Bug Fix
Last Closed: 2008-04-04 13:35:04 EDT


Description Trevor Cordes 2007-02-26 13:23:46 EST
Description of problem:
A DMA error (the drive may be failing) triggers
BUG: warning at drivers/ide/pci/pdc202xx_old.c:469/pdc202xx_reset_host() (Not tainted)
warnings, followed by a near-complete system hang.
Yes, I know that failing disks do nasty things, but this one is in a RAID 1
config, so the system should survive such errors, and it certainly should not
give a "kernel: BUG" warning.
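
For anyone triaging similar reports, the warning lines can be pulled out of a
syslog mechanically. This is only a minimal Python sketch, assuming the
2.6.18-era message format quoted above
("BUG: warning at <file>:<line>/<function>()"). The helper name
find_bug_warnings is made up for illustration and is not part of any kernel
tooling; the sample text is copied from the log in this report.

```python
import re

# Matches the warning format seen in this report:
#   BUG: warning at <file>:<line>/<function>() (<taint state>)
# \s+ after "at" tolerates the line wrap that pasted syslog output often has.
BUG_RE = re.compile(r"BUG: warning at\s+(\S+?):(\d+)/(\w+)\(\)")

def find_bug_warnings(log_text):
    """Return a (file, line, function) tuple for each warning found."""
    return [(path, int(line), func)
            for path, line, func in BUG_RE.findall(log_text)]

sample = (
    "Feb 26 06:10:28 firewall kernel: BUG: warning at\n"
    "drivers/ide/pci/pdc202xx_old.c:467/pdc202xx_reset_host() (Not tainted)\n"
)
print(find_bug_warnings(sample))
```

Run against the full log below, this would report both warning sites (lines
467 and 469 of pdc202xx_old.c), which makes it easier to see that the reset
path warned twice within the same second.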

Version-Release number of selected component (if applicable):
2.6.18-1.2257.fc5

How reproducible:
Rarely; only when a DMA error occurs.

Steps to Reproduce:
1. Run two drives in software (md) RAID 1 off a Promise IDE card
2. Wait for a DMA error
3. Watch it crash
  
Actual results:
hang

Expected results:
keeps going (it is RAID 1 after all)
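
To confirm whether the array actually stayed up after a member failed, one
option is to look at the member map in /proc/mdstat. Below is a rough Python
sketch of that check. The helper name degraded_arrays and the sample text are
hypothetical (the sample is NOT taken from the affected machine; only hde
appears in this report, hdg is invented for the example).

```python
import re

def degraded_arrays(mdstat_text):
    """Return names of md arrays whose member map shows a missing device."""
    degraded = []
    current = None
    for line in mdstat_text.splitlines():
        header = re.match(r"^(md\d+)\s*:", line)
        if header:
            current = header.group(1)
            continue
        # The status line ends with a map such as [UU] (all members up)
        # or [_U] (one member missing).
        member_map = re.search(r"\[([U_]+)\]\s*$", line)
        if member_map and current and "_" in member_map.group(1):
            degraded.append(current)
    return degraded

# Hypothetical sample for illustration only.
sample = """\
Personalities : [raid1]
md0 : active raid1 hdg1[1] hde1[2](F)
      104320 blocks [2/1] [_U]
unused devices: <none>
"""
print(degraded_arrays(sample))
```

On a live system the text would come from open("/proc/mdstat").read(). A
degraded but running array is the expected outcome here; a hang is not.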

Additional info:
Feb 26 06:10:24 firewall kernel: hde: dma_intr: status=0x51 { DriveReady
SeekComplete Error }
Feb 26 06:10:24 firewall kernel: hde: dma_intr: error=0x84 { DriveStatusError
BadCRC }
Feb 26 06:10:24 firewall kernel: ide: failed opcode was: unknown
Feb 26 06:10:24 firewall kernel: hde: dma_intr: status=0x51 { DriveReady
SeekComplete Error }
Feb 26 06:10:24 firewall kernel: hde: dma_intr: error=0x84 { DriveStatusError
BadCRC }
Feb 26 06:10:24 firewall kernel: ide: failed opcode was: unknown
Feb 26 06:10:24 firewall kernel: hde: dma_intr: status=0x51 { DriveReady
SeekComplete Error }
Feb 26 06:10:24 firewall kernel: hde: dma_intr: error=0x84 { DriveStatusError
BadCRC }
Feb 26 06:10:24 firewall kernel: ide: failed opcode was: unknown
Feb 26 06:10:28 firewall kernel: hde: dma_intr: status=0x51 { DriveReady
SeekComplete Error }
Feb 26 06:10:28 firewall kernel: hde: dma_intr: error=0x84 { DriveStatusError
BadCRC }
Feb 26 06:10:28 firewall kernel: ide: failed opcode was: unknown
Feb 26 06:10:28 firewall kernel: BUG: warning at
drivers/ide/pci/pdc202xx_old.c:467/pdc202xx_reset_host() (Not tainted)
Feb 26 06:10:28 firewall kernel:  [<c0403f28>] dump_trace+0x69/0x1af
Feb 26 06:10:28 firewall kernel:  [<c0404086>] show_trace_log_lvl+0x18/0x2c
Feb 26 06:10:28 firewall kernel:  [<c0404601>] show_trace+0xf/0x11
Feb 26 06:10:28 firewall kernel:  [<c040468b>] dump_stack+0x15/0x17
Feb 26 06:10:28 firewall kernel:  [<c05497c5>] pdc202xx_reset_host+0x7a/0x12f
Feb 26 06:10:28 firewall kernel:  [<c054988c>] pdc202xx_reset+0x12/0x2a
Feb 26 06:10:28 firewall kernel:  [<c055302d>] do_reset1+0x176/0x191
Feb 26 06:10:28 firewall kernel:  [<c05523bc>] __ide_error+0x197/0x1aa
Feb 26 06:10:28 firewall kernel:  [<c055242b>] ide_error+0x5c/0x72
Feb 26 06:10:28 firewall kernel:  [<c055214b>] ide_intr+0x146/0x1a7
Feb 26 06:10:28 firewall kernel:  [<c043f79e>] handle_IRQ_event+0x23/0x49
Feb 26 06:10:28 firewall kernel:  [<c043f846>] __do_IRQ+0x82/0xde
Feb 26 06:10:28 firewall kernel:  [<c0405385>] do_IRQ+0x9a/0xb8
Feb 26 06:10:28 firewall kernel:  =======================
Feb 26 06:10:28 firewall kernel: BUG: warning at
drivers/ide/pci/pdc202xx_old.c:469/pdc202xx_reset_host() (Not tainted)
Feb 26 06:10:28 firewall kernel:  [<c0403f28>] dump_trace+0x69/0x1af
Feb 26 06:10:28 firewall kernel:  [<c0404086>] show_trace_log_lvl+0x18/0x2c
Feb 26 06:10:28 firewall kernel:  [<c0404601>] show_trace+0xf/0x11
Feb 26 06:10:28 firewall kernel:  [<c040468b>] dump_stack+0x15/0x17
Feb 26 06:10:28 firewall kernel:  [<c0549839>] pdc202xx_reset_host+0xee/0x12f
Feb 26 06:10:28 firewall kernel:  [<c054988c>] pdc202xx_reset+0x12/0x2a
Feb 26 06:10:28 firewall kernel:  [<c055302d>] do_reset1+0x176/0x191
Feb 26 06:10:28 firewall kernel:  [<c05523bc>] __ide_error+0x197/0x1aa
Feb 26 06:10:28 firewall kernel:  [<c055242b>] ide_error+0x5c/0x72
Feb 26 06:10:28 firewall kernel:  [<c055214b>] ide_intr+0x146/0x1a7
Feb 26 06:10:28 firewall kernel:  [<c043f79e>] handle_IRQ_event+0x23/0x49
Feb 26 06:10:28 firewall kernel:  [<c043f846>] __do_IRQ+0x82/0xde
Feb 26 06:10:28 firewall kernel:  [<c0405385>] do_IRQ+0x9a/0xb8
Feb 26 06:10:28 firewall kernel:  =======================
Feb 26 06:10:28 firewall kernel: PDC202XX: Primary channel reset.
Feb 26 06:10:28 firewall kernel: PDC202XX: Secondary channel reset.
Feb 26 06:10:28 firewall kernel: ide2: reset: master: error (0x00?)
-- end of log output until rebooted --
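
For reference, the status=0x51 and error=0x84 values in the log are bit
fields from the ATA status and error registers. The sketch below decodes them
in Python. The bit names mirror the strings the old IDE driver printed, as far
as I recall them; only the names that appear in the log above are confirmed by
this report, so treat the rest of the tables as an assumption to check against
the kernel source. The function names are made up for illustration.

```python
# ATA status register bits, using the labels seen in the old IDE driver output.
STATUS_BITS = [
    (0x40, "DriveReady"),
    (0x20, "DeviceFault"),
    (0x10, "SeekComplete"),
    (0x08, "DataRequest"),
    (0x04, "CorrectedError"),
    (0x02, "Index"),
    (0x01, "Error"),
]

def decode_status(stat):
    """Return the flag names set in an ATA status byte."""
    if stat & 0x80:
        return ["Busy"]  # the other bits are not meaningful while busy
    return [name for mask, name in STATUS_BITS if stat & mask]

def decode_error(err):
    """Return the flag names set in an ATA error byte."""
    names = []
    aborted = bool(err & 0x04)
    if aborted:
        names.append("DriveStatusError")
    if err & 0x80:
        # Bit 7 together with the abort bit is reported as a CRC error,
        # otherwise as a bad sector.
        names.append("BadCRC" if aborted else "BadSector")
    if err & 0x40:
        names.append("UncorrectableError")
    if err & 0x10:
        names.append("SectorIdNotFound")
    if err & 0x02:
        names.append("TrackZeroNotFound")
    if err & 0x01:
        names.append("AddrMarkNotFound")
    return names

print(decode_status(0x51))
print(decode_error(0x84))
```

The decoded names match the braces in the log lines. BadCRC generally points
at a transfer-level (UltraDMA CRC) error, so the cable or controller may be
worth checking in addition to the drive itself.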
Comment 1 Trevor Cordes 2007-02-26 13:26:57 EST
potentially related unresolved bugs:
bug #140788
bug #130810
Comment 2 Bug Zapper 2008-04-04 02:22:40 EDT
Fedora apologizes that these issues have not been resolved yet. We're
sorry it's taken so long for your bug to be properly triaged and acted
on. We appreciate the time you took to report this issue and want to
make sure no important bugs slip through the cracks.

If you're currently running a version of Fedora Core between 1 and 6,
please note that Fedora no longer maintains these releases. We strongly
encourage you to upgrade to a current Fedora release. In order to
refocus our efforts as a project we are flagging all of the open bugs
for releases which are no longer maintained and closing them.
http://fedoraproject.org/wiki/LifeCycle/EOL

If this bug is still open against Fedora Core 1 through 6, thirty days
from now, it will be closed 'WONTFIX'. If you can reproduce this bug in
the latest Fedora version, please change to the respective version. If
you are unable to do this, please add a comment to this bug requesting
the change.

Thanks for your help, and we apologize again that we haven't handled
these issues to this point.

The process we are following is outlined here:
http://fedoraproject.org/wiki/BugZappers/F9CleanUp

We will be following the process here:
http://fedoraproject.org/wiki/BugZappers/HouseKeeping to ensure this
doesn't happen again.

And if you'd like to join the bug triage team to help make things
better, check out http://fedoraproject.org/wiki/BugZappers
Comment 3 Trevor Cordes 2008-04-04 08:53:16 EDT
I haven't seen this bug since the initial report, but it is an obscure bug
requiring a rare set of circumstances, so I'm not surprised.  Maybe it's fixed
upstream, maybe not.
Comment 4 Dave Jones 2008-04-04 13:35:04 EDT
We switched to a completely different set of ATA drivers after that kernel, so
chances are these messages no longer exist; if there is still a problem, it
would exhibit a different failure mode.

If everything works fine, we're probably ok to just close this.
