Bug 185803 - ata port failure leads to spinlock recursion bug
Summary: ata port failure leads to spinlock recursion bug
Keywords:
Status: CLOSED INSUFFICIENT_DATA
Alias: None
Product: Fedora
Classification: Fedora
Component: kernel
Version: 5
Hardware: All
OS: Linux
Priority: medium
Severity: medium
Target Milestone: ---
Assignee: Kernel Maintainer List
QA Contact: Brian Brock
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2006-03-18 14:32 UTC by Matt Domsch
Modified: 2007-11-30 22:11 UTC

Fixed In Version:
Clone Of:
Environment:
Last Closed: 2006-10-05 12:12:19 UTC
Type: ---
Embargoed:


Attachments
crash.txt (23.49 KB, text/plain), 2006-03-18 14:32 UTC, Matt Domsch
smart.txt (5.27 KB, text/plain), 2006-03-18 14:34 UTC, Matt Domsch

Description Matt Domsch 2006-03-18 14:32:31 UTC
Description of problem:
FC4 x86_64 kernel 2.6.15-1.1833_FC4smp
Dell Precision 370 Workstation
Problematic disk:
Device Model:     Maxtor 7V250F0
Serial Number:    V59145RG    V59145RG
Firmware Version: VA131610
User Capacity:    250,000,000,000 bytes
Device is:        Not in smartctl database [for details use: -P showall]
ATA Version is:   7
ATA Standard is:  ATA/ATAPI-7 T13 1532D revision 0
Local Time is:    Sat Mar 18 08:35:44 2006 CST
SMART support is: Available - device has SMART capability.
SMART support is: Enabled

=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED

Something like this has happened twice now in the last week on this system.

ata2: port reset, p_is 40000001 is 2 pis 0 cmd 4017 tf 37 ss 113 se 0
ata2: handling error/timeout
ata2: port reset, p_is 0 is 0 pis 0 cmd c017 tf b7 ss 113 se 0
ata2: translated ATA stat/err 0x37/00 to SCSI SK/ASC/ASCQ 0x4/00/00
ata2: status=0x37 { DeviceFault SeekComplete CorrectedError Index Error }
end_request: I/O error, dev sdb, sector 104438335
raid1: Disk failure on sdb2, disabling device.
        Operation continuing on 1 devices
BUG: spinlock recursion on CPU#0, scsi_eh_1/502 (Not tainted)
 lock: ffff81007e74f2c0, .magic: dead4ead, .owner: scsi_eh_1/502, .owner_cpu: 0

So of course we're dead.  I'll upload the full crash log.
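
For readers decoding the "status=0x37" line, here is a minimal sketch of how the 8-bit ATA status value maps to the flag names printed in the log. The bit positions follow the legacy ATA status register layout. The function name `decode_ata_status` is hypothetical, and the labels "Busy", "DriveReady" and "DataRequest" do not appear in this log; they are assumed names for the remaining bits.

```python
# Legacy ATA status register bits, highest bit first.
# Labels for bits seen in the log match its wording; the other three
# ("Busy", "DriveReady", "DataRequest") are assumed names.
ATA_STATUS_BITS = [
    (0x80, "Busy"),
    (0x40, "DriveReady"),
    (0x20, "DeviceFault"),
    (0x10, "SeekComplete"),
    (0x08, "DataRequest"),
    (0x04, "CorrectedError"),
    (0x02, "Index"),
    (0x01, "Error"),
]


def decode_ata_status(status):
    """Return the names of the bits set in an 8-bit ATA status value."""
    return [name for mask, name in ATA_STATUS_BITS if status & mask]


if __name__ == "__main__":
    # 0x37 is the status in the error line; 0xb7 is the "tf" value
    # reported by the second port reset line.
    for value in (0x37, 0xB7):
        flags = " ".join(decode_ata_status(value))
        print(f"status=0x{value:02x} {{ {flags} }}")
```

Decoded this way, 0x37 yields the five flags shown in the log, and the "tf b7" value in the second port reset line is the same set of flags plus the busy bit.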

Version-Release number of selected component (if applicable):
2.6.15-1.1833_FC4smp

How reproducible:
occasional

Steps to Reproduce:
1. wait
  
Actual results:
crash

Expected results:
a) no port timeout failure
b) disk taken offline if there was a failure, no crash.
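
As background on the "BUG: spinlock recursion" message in the log above: it is what kernel spinlock debugging reports when a task tries to take a lock it already owns, which matches the log, where the reporting task and the lock's .owner are both scsi_eh_1/502. The following toy model is only a sketch of that ownership check, not the kernel's implementation. The class `DebugSpinlock` and its methods are hypothetical, and real contention (another task spinning on the lock) is not modelled.

```python
SPINLOCK_MAGIC = 0xDEAD4EAD  # same value as ".magic: dead4ead" in the log


class DebugSpinlock:
    """Toy, single-threaded model of a spinlock with owner tracking."""

    def __init__(self):
        self.magic = SPINLOCK_MAGIC
        self.owner = None      # name of the task holding the lock
        self.owner_cpu = -1    # CPU the owner took the lock on

    def lock(self, task, cpu):
        if self.magic != SPINLOCK_MAGIC:
            raise RuntimeError("BUG: spinlock bad magic")
        if self.owner == task:
            # The same task asks for a lock it already holds: on real
            # hardware it would spin forever waiting for itself.
            raise RuntimeError(f"BUG: spinlock recursion on CPU#{cpu}, {task}")
        self.owner = task
        self.owner_cpu = cpu

    def unlock(self, task):
        if self.owner != task:
            raise RuntimeError("BUG: spinlock wrong owner")
        self.owner = None
        self.owner_cpu = -1


if __name__ == "__main__":
    host_lock = DebugSpinlock()
    host_lock.lock("scsi_eh_1/502", cpu=0)
    try:
        # An error-handling path re-entering code that takes the same lock.
        host_lock.lock("scsi_eh_1/502", cpu=0)
    except RuntimeError as err:
        print(err)
```

Under this model, expected result (b) amounts to the failure path marking the disk offline without re-acquiring a lock the error-handler thread already holds.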

Additional info:

Comment 1 Matt Domsch 2006-03-18 14:32:32 UTC
Created attachment 126303
crash.txt

Comment 2 Matt Domsch 2006-03-18 14:34:02 UTC
Created attachment 126304
smart.txt

disk SMART data shows no problem

Comment 3 Dave Jones 2006-09-17 01:48:42 UTC
[This comment added as part of a mass-update to all open FC4 kernel bugs]

FC4 has now transitioned to the Fedora legacy project, which will continue to
release security related updates for the kernel.  As this bug is not security
related, it is unlikely to be fixed in an update for FC4, and has been migrated
to FC5.

Please retest with Fedora Core 5.

Thank you.

Comment 4 Matt Domsch 2006-10-05 12:12:19 UTC
Failure no longer seen in FC5, though I've replaced the disks too.  Closing
as insufficient data.

