Bug 454269 - ata exception under load, system hangs
Summary: ata exception under load, system hangs
Keywords:
Status: CLOSED NOTABUG
Alias: None
Product: Fedora
Classification: Fedora
Component: kernel
Version: 9
Hardware: i686
OS: Linux
Priority: low
Severity: high
Target Milestone: ---
Assignee: Kernel Maintainer List
QA Contact: Fedora Extras Quality Assurance
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2008-07-07 11:37 UTC by cam
Modified: 2008-08-16 09:49 UTC (History)
0 users

Fixed In Version:
Clone Of:
Environment:
Last Closed: 2008-08-16 09:49:11 UTC
Type: ---
Embargoed:


Description cam 2008-07-07 11:37:17 UTC
Starting with kernel 2.6.25.6-55.fc9.i686 I began to see problems where the
machine would boot and login was possible, but any activity (e.g. starting a web
browser) would cause the system to become completely unresponsive. The
degradation was gradual: the system monitor showed the load average creeping
up, then the mouse pointer stopped moving and the caps lock light no longer
responded to the Caps Lock key. Powering off was the only way out.

At first I suspected the hard disk because it is old and past its rated number
of load cycles. However, the system is stable under Windows for several hours of
web surfing, software updates, etc. The system also works fine when booted from
USB with the Fedora 9 live image and the hard disk mounted manually. I hope this
rules out hardware failure of the hard disk.

I managed to pick up a newer kernel, and using dmesg in a text console got the
following (transcription from photo):
ata1: device not ready (errno=-16), forcing hardreset
ata1: soft resetting link
ata1.00: configured for UDMA/100
ata1: EH complete
[drm] Num pipes: 1
ata1.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x2 frozen
ata1.00: cmd c8/00:80:de:b7:04/80:00:00:00:00/e6 tag 0 dma 65536 in
         res 48/00:00:00:4f:c2/00:00:00:00:00/00 Emask 0x4 (timeout)
ata1.00: status {DRDY}
ata1: port is slow to respond, please be patient (Status 0xd0)

NB some of the hex values above may be inaccurate!
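
For anyone reading the log above, here is a minimal sketch of how the taskfile and status values could be decoded. It assumes the usual libata layout for the "cmd" line (command/feature:nsect:lbal:lbam:lbah/five HOB bytes/device) and the classic ATA status register bit names; the function names decode_status and parse_cmd are made up for this illustration and are not part of any kernel tool.

```python
import re

# ATA status register bits (names as in the older ATA specs; bit 1 is
# obsolete in later revisions).
STATUS_BITS = [
    (0x80, "BSY"),   # device busy
    (0x40, "DRDY"),  # device ready
    (0x20, "DF"),    # device fault
    (0x10, "DSC"),   # seek complete
    (0x08, "DRQ"),   # data request
    (0x04, "CORR"),  # corrected data
    (0x02, "IDX"),   # index
    (0x01, "ERR"),   # error
]


def decode_status(status):
    """Return the names of the bits set in an ATA status byte."""
    return [name for mask, name in STATUS_BITS if status & mask]


def parse_cmd(line):
    """Pull the 12 taskfile bytes out of a libata 'cmd' log line.

    Hypothetical helper: assumes the field order described above and a
    28-bit LBA command with 512-byte sectors.
    """
    m = re.search(r"cmd ((?:[0-9a-f]{2}[/:]){11}[0-9a-f]{2})", line)
    if m is None:
        raise ValueError("no taskfile found in line")
    fields = [int(x, 16) for x in re.split(r"[/:]", m.group(1))]
    command, feature, nsect, lbal, lbam, lbah = fields[:6]
    device = fields[11]
    # 28-bit LBA: low nibble of the device register holds bits 24..27.
    lba28 = ((device & 0x0F) << 24) | (lbah << 16) | (lbam << 8) | lbal
    # A sector count of 0 means 256 sectors for 28-bit commands.
    sectors = nsect or 256
    return {
        "command": command,
        "nsect": nsect,
        "lba28": lba28,
        "bytes": sectors * 512,
    }


if __name__ == "__main__":
    line = "ata1.00: cmd c8/00:80:de:b7:04/80:00:00:00:00/e6 tag 0 dma 65536 in"
    print(parse_cmd(line))
    print(decode_status(0xD0))
```

Under those assumptions, command 0xc8 is READ DMA and the sector count 0x80 (128 sectors of 512 bytes) agrees with the "dma 65536 in" part of the same line. Status 0xd0 decodes to BSY, DRDY and DSC, i.e. the drive was still busy when the port was reported as slow to respond. The transcribed result status 0x48 would decode to DRDY plus DRQ, while the kernel printed only {DRDY}, so that byte is probably one of the transcription errors mentioned above.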

Version-Release number of selected component (if applicable):
First seen problems: 2.6.25.6-55.fc9.i686
Error messages from later kernel: 2.6.25.9-76.fc9.i686
Problems also exist in a recent rawhide kernel (sorry, I don't have the
screenshot with the exact version to hand)

How reproducible: 100%

Steps to Reproduce:
1. Dell Inspiron 6000 with Fedora 9 previously working
2. Upgrade to affected kernel with yum, reboot
3. Problem reproduced after login with medium load (e.g. start web browser)
  
Actual results:
Load average climbs, system becomes unresponsive as disk access stops

Expected results:
Continued reliable disk access

Additional info:
Laptop has optional Radeon x300 graphics (using default Fedora driver)
Disk is probably Samsung Spinpoint M MP0804H

Comment 1 cam 2008-07-08 10:33:18 UTC
I have since investigated further and also had problems when running the kernel
from the live CD. I now suspect the disk is faulty and have ordered a
replacement. If Windows works, it may be because the partition with Windows on
it has hardly been used.

Comment 2 cam 2008-08-16 09:49:11 UTC
closed/notabug

The problem was resolved by replacing the hard disk. Although the disk made occasional unusual noises (clicks and screeches), it didn't show SMART errors, and it still works in a USB enclosure. Cheers!

