Bug 231294 - ATA: abnormal status message loops booting 2.6.20-1.2966.fc7
Status: CLOSED CURRENTRELEASE
Product: Fedora
Classification: Fedora
Component: kernel
Version: rawhide
Hardware: i386 Linux
Priority: medium, Severity: high
Assigned To: Alan Cox
QA Contact: Brian Brock
Reported: 2007-03-07 10:41 EST by cje
Modified: 2007-11-30 17:11 EST (History)
Fixed In Version: 2.6.20-1.2982
Doc Type: Bug Fix
Last Closed: 2007-04-19 06:31:10 EDT
Attachments
dmesg output (29.90 KB, application/octet-stream)
2007-03-08 06:36 EST, cje
portege p2000 lspci (7.70 KB, application/octet-stream)
2007-03-08 06:37 EST, cje
Description cje 2007-03-07 10:41:36 EST
Description of problem:

i can't boot 2.6.20-1.2966.fc7 on my toshiba portege 2000.  2.6.20-1.2962.fc7 is
fine.

i get the following:

ATA: abnormal status 0x58 on port 0x000101F7

(followed by a few lines beginning "ata1.00") repeated over and over again.
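
For readers unfamiliar with the status byte in that message, here is a minimal sketch that splits 0x58 into the conventional ATA status register bits. The bit names are the traditional ones and are an assumption on the editor's part, not something stated in this report; later revisions of the ATA specification rename or retire some of the lower bits. The helper name decode_ata_status is hypothetical.

```python
# Traditional ATA status register bit names (assumed; some are obsolete
# in newer spec revisions, so treat the labels as a rough guide).
ATA_STATUS_BITS = [
    (0x80, "BSY"),   # device busy
    (0x40, "DRDY"),  # device ready
    (0x20, "DF"),    # device fault
    (0x10, "DSC"),   # seek complete
    (0x08, "DRQ"),   # data request
    (0x04, "CORR"),  # corrected data
    (0x02, "IDX"),   # index
    (0x01, "ERR"),   # error
]


def decode_ata_status(status):
    """Return the names of the bits set in an ATA status byte."""
    return [name for mask, name in ATA_STATUS_BITS if status & mask]


print(decode_ata_status(0x58))  # ['DRDY', 'DSC', 'DRQ']
```

Read this way, 0x58 would mean the drive reports ready with a data request pending and no error bit set. The low bits of the port value, 0x1F7, match the legacy primary-channel status register address.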


Version-Release number of selected component (if applicable):
2.6.20-1.2966.fc7

How reproducible:
just boot.

Steps to Reproduce:
1. upgrade to that kernel
2. boot
3.
  
Actual results:
looping error messages

Expected results:
boot

Additional info:
this is an old laptop (at least five years) with a PATA disk but it's been
working fine up until this kernel.  and it's running fine right now with the
previous kernel.
Comment 1 Alan Cox 2007-03-07 14:38:59 EST
Can you attach the dmesg of a successful boot and the output of lspci -vxx?

Thanks
Comment 2 cje 2007-03-08 06:36:47 EST
Created attachment 149559
dmesg output

here's one
Comment 3 cje 2007-03-08 06:37:39 EST
Created attachment 149560
portege p2000 lspci

and the other
Comment 4 cje 2007-03-08 07:22:45 EST
just tried 2.6.20-1.2967 and got the same errors.  here are some more details:

the following messages are in there somewhere - 

simplex DMA is claimed by other device, disabling DMA
configured for PIO0
EH complete
(HSM violation)

whilst noting all that i actually left it running longer than before and it got
past that bit.  the last bits are:

ata1.00: configured for PIO0
sd 0:0:0:0: SCSI error: return code = 0x08000002
sda: Current [descriptor[: sense key: Aborted Command
    Additional sense: Scsi parity error
Descriptor

urk.  it's gone now.  the system seems to be running but it's extremely slow. 
if it ever makes it to a login prompt at a useable speed i'll try to get another
dmesg output before i go back to 2.6.20-1.2962.fc7.
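
As a reading aid for the "return code = 0x08000002" line, the sketch below splits the value into the four byte fields that the Linux SCSI layer conventionally packs into a result word (status, message, host and driver bytes, lowest byte first). That layout, and the interpretation of the individual values, are the editor's assumptions rather than anything stated in this report; the function name split_scsi_result is hypothetical.

```python
def split_scsi_result(result):
    """Split a 32-bit SCSI result word into its four byte fields.

    Assumed layout: bits 0-7 status, 8-15 message, 16-23 host,
    24-31 driver.
    """
    return {
        "status": result & 0xFF,
        "msg": (result >> 8) & 0xFF,
        "host": (result >> 16) & 0xFF,
        "driver": (result >> 24) & 0xFF,
    }


fields = split_scsi_result(0x08000002)
print(fields)  # {'status': 2, 'msg': 0, 'host': 0, 'driver': 8}
```

Under that layout, a status byte of 0x02 corresponds to CHECK CONDITION and a driver byte of 0x08 indicates that sense data is available, which would be consistent with the "sense key: Aborted Command" line that follows.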
Comment 5 Alan Cox 2007-03-08 08:07:14 EST
"Simplex DMA is claimed by other device, disabling DMA" is the root cause of the
very slow performance. I partly fixed that, and Petr posted a fix to my fix
last night.

ata1.00: configured for PIO0
sd 0:0:0:0: SCSI error: return code = 0x08000002
sda: Current [descriptor[: sense key: Aborted Command
    Additional sense: Scsi parity error

That bit is most peculiar, however.

Comment 6 cje 2007-03-08 08:26:55 EST
okay.  so it's possible there's some problem with PIO mode?

how about we wait for your fix to turn up in devel and then see if we can
reproduce this weird error by manually disabling DMA mode?

(by the way, the messages are hand-typed .. it should be "[descriptor]", not
"[descriptor[" .. in case you were wondering!)

for completeness ... i've tried booting to just a shell but i can't write
anything to the disk.  (i can remount rw with lots of those errors but actually
writing anything to the disk fails with another error) but i can 'dmesg | more'
to make a more careful copy of the messages.

looks like there's a pattern.  you get six copies of a 7 line block which starts
with

ATA: abnormal status 0x58 on port 0x000101f7

includes the DMA messages and ends with

ata1: EH complete

and those six blocks are followed by the SCSI error.
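
The pattern described above could be checked mechanically in a saved log rather than by eye. The following is a minimal sketch, assuming the log is available as a list of text lines and that each block can be recognised by the two marker strings quoted in this comment. The function name error_block_sizes and its parameters are hypothetical.

```python
def error_block_sizes(lines, start="ATA: abnormal status", end="EH complete"):
    """Return the number of lines in each start..end block found in lines."""
    sizes = []
    current = None  # line count of the block being read, or None if outside one
    for line in lines:
        if start in line:
            current = 0  # a new block begins (restarts if one was still open)
        if current is not None:
            current += 1
            if end in line:
                sizes.append(current)
                current = None
    return sizes
```

For the boot described here, calling it on the dmesg lines would be expected to return six entries of 7, one per repeated block, with the SCSI error line falling outside any block.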
Comment 7 cje 2007-03-12 10:12:30 EDT
just updated to 2.6.20-1.2982 and it boots fine.  :-)

tried booting with "nodma" option.  i'm not sure if that's still supposed to
have an effect but it didn't make any difference.  'hdparm -d /dev/sda' just
returns '/dev/sda:' either way.

anyway, i guess the weird messages were just weird and nothing more.  i'm happy
to try some things out if you do want to investigate that further but otherwise
i'm equally happy for you to close this call.  many thanks for the responses and
fix.
