Bug 243033 - kernel 2.6.21-1.3207.fc8 takes a very loooong time to lose disks due to ata problems
Status: CLOSED CURRENTRELEASE
Product: Fedora
Classification: Fedora
Component: kernel (Show other bugs)
Version: 8
Hardware: All Linux
Priority: low, Severity: high
Assigned To: Kernel Maintainer List
QA Contact: Brian Brock
Whiteboard: bzcl34nup
Keywords: Reopened
Reported: 2007-06-06 19:15 EDT by Michal Jaegermann
Modified: 2008-11-26 11:52 EST (History)
Last Closed: 2008-11-26 11:52:59 EST
Attachments
dmesg for 2.6.21-1.3207.fc8 with "lost" disks (18.82 KB, text/plain)
2007-06-06 19:15 EDT, Michal Jaegermann
dmesg for 2.6.21-1.3206.fc8 (disks are still there) (19.01 KB, text/plain)
2007-06-06 19:17 EDT, Michal Jaegermann
dmesg for 2.6.21-1.3194.fc7 (the last one still usable) (18.45 KB, text/plain)
2007-06-06 19:19 EDT, Michal Jaegermann

Description Michal Jaegermann 2007-06-06 19:15:56 EDT
Description of problem:

While trying to boot, the 2.6.21-1.3207.fc8 kernel prints the following

ata3.00: applying bridge limits

followed by a long timeout and

ata3.00: qc timeout (cmd 0xef)
ata3.00: failed to set xfermode (err_mask=0x4)
ata3: failed to recover some devices, retrying in 5 secs

a number of times, with the better part of a minute's wait
before every "qc timeout".  The first time I was quite
convinced that the boot process had already locked up.

The above is repeated a few times and it eventually ends up with

ata3.00: disabled

After that we go through the whole exercise again for ata4
before the whole thing gets far enough to boot, but disks /dev/sdb
and /dev/sdc are no longer accessible.
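
The failure pattern described above can be picked out of a saved dmesg mechanically. What follows is a minimal sketch in Python, not a tool used in this report. The function name `summarize_ata_failures` and the `SAMPLE` text are hypothetical; the sample is assembled from the messages quoted above, and the regular expressions assume exactly that message wording.

```python
import re
from collections import defaultdict

# Patterns assume the libata message wording quoted in this report.
TIMEOUT_RE = re.compile(r"(ata\d+)\.\d+: qc timeout \(cmd (0x[0-9a-f]+)\)")
XFER_RE = re.compile(r"(ata\d+)\.\d+: failed to set xfermode \(err_mask=(0x[0-9a-f]+)\)")
DISABLED_RE = re.compile(r"(ata\d+)\.\d+: disabled")

# Illustrative input only: built from the lines quoted in the description.
SAMPLE = """\
ata3.00: applying bridge limits
ata3.00: qc timeout (cmd 0xef)
ata3.00: failed to set xfermode (err_mask=0x4)
ata3: failed to recover some devices, retrying in 5 secs
ata3.00: qc timeout (cmd 0xef)
ata3.00: failed to set xfermode (err_mask=0x4)
ata3.00: disabled
"""


def summarize_ata_failures(dmesg_text):
    """Count timeouts and xfermode failures per port and note disabled devices."""
    summary = defaultdict(
        lambda: {"timeouts": 0, "xfermode_failures": 0, "disabled": False}
    )
    for line in dmesg_text.splitlines():
        if m := TIMEOUT_RE.search(line):
            summary[m.group(1)]["timeouts"] += 1
        elif m := XFER_RE.search(line):
            summary[m.group(1)]["xfermode_failures"] += 1
        elif m := DISABLED_RE.search(line):
            summary[m.group(1)]["disabled"] = True
    return dict(summary)


if __name__ == "__main__":
    for port, info in sorted(summarize_ata_failures(SAMPLE).items()):
        print(port, info)
```

Running the same function over the attached dmesg files should show at a glance which ports ended up disabled under each kernel.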

Attached are the dmesg from 2.6.21-1.3207.fc8 and, for comparison,
the one from 2.6.21-1.3206.fc8, which does not suffer from the same
affliction.  Also attached is the dmesg from 2.6.21-1.3194.fc7, as
this is the last kernel that is actually usable.

Version-Release number of selected component (if applicable):
kernel-2.6.21-1.3207.fc8

How reproducible:
always
Comment 1 Michal Jaegermann 2007-06-06 19:15:56 EDT
Created attachment 156408
dmesg for 2.6.21-1.3207.fc8 with "lost" disks
Comment 2 Michal Jaegermann 2007-06-06 19:17:33 EDT
Created attachment 156409
dmesg for 2.6.21-1.3206.fc8 (disks are still there)
Comment 3 Michal Jaegermann 2007-06-06 19:19:56 EDT
Created attachment 156410
dmesg for 2.6.21-1.3194.fc7 (the last one still usable)
Comment 4 Chuck Ebbert 2007-06-06 19:49:31 EDT
This is apparently caused by the update from 2.6.22-rc3-git7 to 2.6.22-rc4,
which added the patch: "libata: always use polling SETXFER"

The hardware is sata_promise and sata_via; please confirm that it's the
drives attached to the Promise controller that stopped working.
Comment 5 Michal Jaegermann 2007-06-06 20:59:46 EDT
> please confirm that it's the drives
> attached to the Promise controller that stopped working.

As best I can interpret the messages from the working case,
i.e. this:

sata_promise 0000:00:08.0: version 2.00
ACPI: PCI Interrupt 0000:00:08.0[A] -> GSI 18 (level, low) -> IRQ 18
sata_promise PATA port found
ata3: SATA max UDMA/133 cmd 0xffffc20000020200 ctl 0xffffc20000020238 bmdma 0x0000000000000000 irq 18
ata4: SATA max UDMA/133 cmd 0xffffc20000020280 ctl 0xffffc200000202b8 bmdma 0x0000000000000000 irq 18
ata5: PATA max UDMA/133 cmd 0xffffc20000020300 ctl 0xffffc20000020338 bmdma 0x0000000000000000 irq 18
scsi2 : sata_promise
ata3: SATA link up 1.5 Gbps (SStatus 113 SControl 300)
ata3.00: ata_hpa_resize 1: sectors = 488397168, hpa_sectors = 488397168
ata3.00: ATA-6: WDC WD2500JD-00GBB0, 02.05D02, max UDMA/100
ata3.00: 488397168 sectors, multi 0: LBA48
ata3.00: applying bridge limits
ata3.00: ata_hpa_resize 1: sectors = 488397168, hpa_sectors = 488397168
ata3.00: configured for UDMA/100
scsi3 : sata_promise
ata4: SATA link up 1.5 Gbps (SStatus 113 SControl 300)
ata4.00: ata_hpa_resize 1: sectors = 488397168, hpa_sectors = 488397168
ata4.00: ATA-6: WDC WD2500JD-00GBB0, 02.05D02, max UDMA/100
ata4.00: 488397168 sectors, multi 0: LBA48
ata4.00: applying bridge limits
ata4.00: ata_hpa_resize 1: sectors = 488397168, hpa_sectors = 488397168
ata4.00: configured for UDMA/100
scsi4 : sata_promise

this is indeed the case.
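
The reading of the log above (ata3, ata4 and ata5 belong to sata_promise) can be reproduced with a short script. This is only a sketch under a stated heuristic: it assumes each port line ("ataN: SATA max ..." or "ataN: PATA max ...") belongs to the most recent "<driver> <PCI address>: version ..." banner before it, and that log lines carry no timestamp prefix, as in the excerpt quoted here. The name `ports_by_driver` is hypothetical.

```python
import re

# "<driver> <domain:bus:dev.fn>: version ..." as in the sata_promise line above.
BANNER_RE = re.compile(r"^(\w+) [0-9a-f]{4}:[0-9a-f]{2}:[0-9a-f]{2}\.[0-7]: version")
# "ataN: SATA max ..." or "ataN: PATA max ..." port announcements.
PORT_RE = re.compile(r"^(ata\d+): (SATA|PATA) max")

EXCERPT = """\
sata_promise 0000:00:08.0: version 2.00
ACPI: PCI Interrupt 0000:00:08.0[A] -> GSI 18 (level, low) -> IRQ 18
sata_promise PATA port found
ata3: SATA max UDMA/133 cmd 0xffffc20000020200 ctl 0xffffc20000020238 bmdma 0x0000000000000000 irq 18
ata4: SATA max UDMA/133 cmd 0xffffc20000020280 ctl 0xffffc200000202b8 bmdma 0x0000000000000000 irq 18
ata5: PATA max UDMA/133 cmd 0xffffc20000020300 ctl 0xffffc20000020338 bmdma 0x0000000000000000 irq 18
"""


def ports_by_driver(dmesg_text):
    """Map each ataN port to (driver, link type) using the nearest preceding banner."""
    current_driver = None
    mapping = {}
    for raw in dmesg_text.splitlines():
        line = raw.strip()
        banner = BANNER_RE.match(line)
        if banner:
            current_driver = banner.group(1)
            continue
        port = PORT_RE.match(line)
        if port and current_driver is not None:
            mapping[port.group(1)] = (current_driver, port.group(2))
    return mapping


if __name__ == "__main__":
    for port, (driver, kind) in sorted(ports_by_driver(EXCERPT).items()):
        print(f"{port}: {driver} ({kind})")
```

Since the heuristic depends on the order in which drivers print their messages, the result is worth checking by eye against the full dmesg.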
Comment 6 Michal Jaegermann 2007-06-10 15:17:22 EDT
2.6.21-1.3218.fc8 again sees the disks on the Promise controller.
The dmesg fragment where there were troubles now looks like the
one quoted in comment #5.
Comment 7 Michal Jaegermann 2007-06-18 22:26:05 EDT
2.6.21-1.3221 was booting OK.  2.6.21-1.3223 is broken again in
the same way as before.

ata3: failed to recover some devices, retrying in 5 secs
ata3: SATA link up 1.5 Gbps (SStatus 113 SControl 300)
ata3.00: ata_hpa_resize 1: sectors = 488397168, hpa_sectors = 488397168
ata3.00: qc timeout (cmd 0xef)
ata3.00: failed to set xfermode (err_mask=0x4)
ata3.00: disabled

and that disk is a goner.
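
For readers decoding the two numbers in these messages: 0xef is the ATA SET FEATURES command (which is what SETXFER uses to set the transfer mode), and err_mask is a bit mask of libata AC_ERR_* flags. The sketch below is an illustration in Python, not kernel code. The flag list is my recollection of the low nine bits from the kernel's include/linux/libata.h and should be checked against the tree for the kernel in question; later kernels define further bits. The names `decode_err_mask` and `describe` are hypothetical.

```python
# ATA command opcodes relevant to the messages in this report.
ATA_COMMANDS = {0xEF: "SET FEATURES"}

# Assumed libata AC_ERR_* flags, index == bit position (verify against libata.h).
AC_ERR_FLAGS = [
    "AC_ERR_DEV",       # bit 0: device reported error
    "AC_ERR_HSM",       # bit 1: host state machine violation
    "AC_ERR_TIMEOUT",   # bit 2: command timed out
    "AC_ERR_MEDIA",     # bit 3: media error
    "AC_ERR_ATA_BUS",   # bit 4: ATA bus error
    "AC_ERR_HOST_BUS",  # bit 5: host bus error
    "AC_ERR_SYSTEM",    # bit 6: system error
    "AC_ERR_INVALID",   # bit 7: invalid argument
    "AC_ERR_OTHER",     # bit 8: unknown error
]


def decode_err_mask(mask):
    """Return the names of all known flags set in an err_mask value."""
    return [name for bit, name in enumerate(AC_ERR_FLAGS) if mask & (1 << bit)]


def describe(cmd, err_mask):
    """Render a one-line explanation of a cmd / err_mask pair."""
    cmd_name = ATA_COMMANDS.get(cmd, "unknown")
    flags = decode_err_mask(err_mask) or ["none"]
    return f"cmd {cmd:#04x} ({cmd_name}), err_mask {err_mask:#x}: {', '.join(flags)}"


if __name__ == "__main__":
    print(describe(0xEF, 0x4))
```

Read this way, "qc timeout (cmd 0xef)" followed by "err_mask=0x4" says the same thing twice: the SET FEATURES command never completed.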
Comment 8 Michal Jaegermann 2007-06-18 23:40:17 EDT
Kernel 2.6.21-1.3225.fc8 sees the sata_promise drives again.
Changelogs do not provide enough information to make it possible
to tell if this is a "lucky accident" or really a fix.
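
One way to narrow down which changelog ranges matter is to line up the builds reported in this bug and look only at the points where the behaviour flips. The sketch below does that in Python. The build numbers and results are taken from the comments above ("True" meaning the Promise disks were seen); the names `OBSERVATIONS` and `transitions` are hypothetical, and builds that were never tested are simply absent.

```python
# (build number from 2.6.21-1.<build>, Promise disks visible?) as reported in this bug.
OBSERVATIONS = [
    (3194, True),
    (3206, True),
    (3207, False),
    (3218, True),
    (3221, True),
    (3223, False),
    (3225, True),
]


def transitions(observations):
    """Return (older, newer, 'broke' | 'fixed') for each adjacent pair that differs."""
    ordered = sorted(observations)
    return [
        (older, newer, "broke" if older_ok else "fixed")
        for (older, older_ok), (newer, newer_ok) in zip(ordered, ordered[1:])
        if older_ok != newer_ok
    ]


if __name__ == "__main__":
    for older, newer, what in transitions(OBSERVATIONS):
        print(f"{older} -> {newer}: {what}")
```

Each reported pair bounds the set of changes worth reading; it does not by itself say whether a "fixed" step was a real fix or an accident.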
Comment 9 Bug Zapper 2008-04-04 06:59:11 EDT
Based on the date this bug was created, it appears to have been reported
during the development of Fedora 8. In order to refocus our efforts as
a project we are changing the version of this bug to '8'.

If this bug still exists in rawhide, please change the version back to
rawhide.
(If you're unable to change the bug's version, add a comment to the bug
and someone will change it for you.)

Thanks for your help and we apologize for the interruption.

The process we're following is outlined here:
http://fedoraproject.org/wiki/BugZappers/F9CleanUp

We will be following the process here:
http://fedoraproject.org/wiki/BugZappers/HouseKeeping to ensure this
doesn't happen again.
Comment 10 Bug Zapper 2008-11-26 02:18:43 EST
This message is a reminder that Fedora 8 is nearing its end of life.
Approximately 30 (thirty) days from now Fedora will stop maintaining
and issuing updates for Fedora 8.  It is Fedora's policy to close all
bug reports from releases that are no longer maintained.  At that time
this bug will be closed as WONTFIX if it remains open with a Fedora 
'version' of '8'.

Package Maintainer: If you wish for this bug to remain open because you
plan to fix it in a currently maintained version, simply change the 'version' 
to a later Fedora version prior to Fedora 8's end of life.

Bug Reporter: Thank you for reporting this issue and we are sorry that 
we may not be able to fix it before Fedora 8 is end of life.  If you 
would still like to see this bug fixed and are able to reproduce it 
against a later version of Fedora please change the 'version' of this 
bug to the applicable version.  If you are unable to change the version, 
please add a comment here and someone will do it for you.

Although we aim to fix as many bugs as possible during every release's 
lifetime, sometimes those efforts are overtaken by events.  Often a 
more recent Fedora release includes newer upstream software that fixes 
bugs or makes them obsolete.

The process we are following is described here: 
http://fedoraproject.org/wiki/BugZappers/HouseKeeping
