Bug 242902

Summary: Intel 82801G pata hangs for 30 seconds
Product: [Fedora] Fedora
Component: kernel
Version: 9
Hardware: i686
OS: Linux
Status: CLOSED WONTFIX
Severity: high
Priority: low
Keywords: Reopened
Reporter: Robert Spanton <rds204>
Assignee: Peter Martuccelli <peterm>
QA Contact: Brian Brock <bbrock>
CC: bert, chris.brown
Target Milestone: ---
Target Release: ---
Doc Type: Bug Fix
Last Closed: 2009-07-14 18:15:00 UTC
Attachments:
Description                                            Flags
dmesg output                                           none
kernel-2.6.22.1-27.fc7 dmesg output (grep for ata1)    none
kernel-2.6.22.5-76.fc7 dmesg output                    none
kernel-2.6.22.9-91.fc7 dmesg                           none
2.6.23.1-10.fc7 dmesg                                  none
2.6.23.1-42.fc8 dmesg                                  none
2.6.23.8-63.fc8 dmesg                                  none
2.6.23.9-85.fc8 dmesg                                  none
2.6.23.14-107.fc8 dmesg                                none
2.6.23.15-137.fc8 dmesg                                none
2.6.24.3-12.fc8 dmesg                                  none
kernel-2.6.25.3-18.fc9 dmesg                           none
kernel-2.6.26.3-29.fc9 dmesg                           none

Description Robert Spanton 2007-06-06 11:59:03 UTC
Description of problem:
Access to hard disk stops for 30 seconds whilst error messages appear in dmesg:

ata1.01: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x2 frozen
ata1.01: cmd a0/01:00:00:00:00/00:00:00:00:00/b0 tag 0 cdb 0x25 data 8 in
         res 40/00:03:00:00:00/00:00:00:00:00/b0 Emask 0x4 (timeout)
ata1: port is slow to respond, please be patient (Status 0xd0)
ata1: port failed to respond (30 secs, Status 0xd0)
ata1: soft resetting port
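
For anyone decoding the port messages above: the value in "Status 0xd0" is the ATA Status register byte. A minimal Python sketch of splitting it into flag names follows. The bit names are the classic ATA ones (some of the low bits were renamed or retired in later revisions of the standard), and decode_status is an illustrative helper, not part of any kernel tool.

```python
# Classic ATA Status register bits, highest bit first.
# Names for 0x10, 0x04 and 0x02 vary between ATA revisions; treat them as labels only.
STATUS_BITS = [
    (0x80, "BSY"),   # device busy
    (0x40, "DRDY"),  # device ready
    (0x20, "DF"),    # device fault
    (0x10, "DSC"),   # seek complete (historical name)
    (0x08, "DRQ"),   # data request
    (0x04, "CORR"),  # corrected data (historical name)
    (0x02, "IDX"),   # index (historical name)
    (0x01, "ERR"),   # error
]


def decode_status(value):
    """Return the names of the bits set in an ATA status byte."""
    return [name for mask, name in STATUS_BITS if value & mask]


print(decode_status(0xD0))  # ['BSY', 'DRDY', 'DSC']
```

BSY still being set after the 30 second wait is consistent with the "port failed to respond" message: the device never reported itself idle again before the soft reset.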

Also seeing some errors similar to those in #242351, but that bug doesn't
mention any 30 second hangs.

Version-Release number of selected component (if applicable):
kernel-2.6.21-1.3194.fc7

How reproducible:
Appears to happen at random intervals - perhaps 5-10 minutes apart.
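
To put numbers on "random intervals", the gaps between port resets can be measured from the system log. The sketch below is only an illustration under stated assumptions: it expects syslog-style lines that begin with a "Mon DD HH:MM:SS" timestamp (plain dmesg output may not carry one), it treats every line containing "soft resetting port" as one event, and the year must be supplied by the caller because syslog does not record it. The function names are hypothetical.

```python
from datetime import datetime

MARKER = "soft resetting port"


def reset_times(lines, year):
    """Return a datetime for every log line that mentions a soft reset.

    Assumes each line starts with a syslog timestamp such as
    'Jan  8 21:17:17'; the year is passed in because the log omits it.
    """
    times = []
    for line in lines:
        if MARKER not in line:
            continue
        # Collapse the variable spacing in the timestamp: 'Jan 8 21:17:17'.
        stamp = " ".join(line.split()[:3])
        times.append(datetime.strptime(f"{year} {stamp}", "%Y %b %d %H:%M:%S"))
    return times


def reset_gaps(lines, year):
    """Return the seconds elapsed between consecutive soft resets."""
    times = reset_times(lines, year)
    return [(later - earlier).total_seconds()
            for earlier, later in zip(times, times[1:])]
```

Passing an open log file (for example /var/log/messages, if that is where kernel messages go on the system in question) as the lines argument yields the list of gaps in seconds, which can then be compared against the "5-10 minutes" estimate.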

Steps to Reproduce:
1. Install F7 on Samsung Q35 laptop.
2. Use the laptop - when playing music it's obvious when it happens.
  
Actual results:
Applications that require disk access hang for 30 seconds (music stops for 30
seconds).

Expected results:
No hanging (continuous music).

Additional info:
lspci says (pata) IDE controller is:
00:1f.1 IDE interface: Intel Corporation 82801G (ICH7 Family) IDE Controller (rev 02)

Comment 1 Robert Spanton 2007-06-06 11:59:03 UTC
Created attachment 156339 [details]
dmesg output

Comment 2 Robert Spanton 2007-06-06 12:18:25 UTC
I should add that there were no problems with hard disk access whilst using FC6
on the same laptop.

Comment 3 Robert Spanton 2007-06-18 23:55:09 UTC
Still happens with kernel 2.6.21-1.3228.fc7, but seems to be less often.

Comment 4 Robert Spanton 2007-07-24 21:42:11 UTC
Created attachment 159891 [details]
kernel-2.6.22.1-27.fc7 dmesg output (grep for ata1)

Also still happens with kernel-2.6.22.1-27.fc7.  I retract the previous
statement about it happening less often - it seems fairly random.

Please find dmesg attached - grep for ata1 to see messages.

Comment 5 Robert Spanton 2007-07-29 11:15:20 UTC
Again, still occurs in kernel-2.6.22.1-33.fc7

Comment 6 Robert Spanton 2007-08-02 20:18:56 UTC
And kernel-2.6.22.1-41.fc7

Comment 7 Christopher Brown 2007-09-13 21:51:51 UTC
Hello,

I'm reviewing this bug as part of the kernel bug triage project, an attempt to
isolate current bugs in the Fedora kernel.

http://fedoraproject.org/wiki/KernelBugTriage

I am CC'ing myself to this bug and will try and assist you in resolving it if I can.

There hasn't been much activity on this bug for a while. Could you tell me if
you are still having problems with the latest kernel?

I'm re-assigning this bug to the PATA maintainer which will hopefully prompt a
review.

If the problem no longer exists then please close this bug or I'll do so in a
few days if there is no additional information lodged.

Cheers
Chris

Comment 8 Alan Cox 2007-09-14 14:48:32 UTC
Thanks for assigning it over. I'll take a look next week.

Comment 9 Robert Spanton 2007-09-15 21:29:03 UTC
Hi,

Sorry, I've been away from my machine for a week - hence the lack of updates.  I
still get the problem with 2.6.22.4-65.fc7.

I've just updated to 2.6.22.5-76.fc7, and should know within the next couple of
days if the bug persists.

Cheers,
Rob

Comment 10 Robert Spanton 2007-09-15 23:07:54 UTC
Created attachment 196591 [details]
kernel-2.6.22.5-76.fc7 dmesg output

Comment 11 Robert Spanton 2007-09-15 23:09:12 UTC
After around an hour of laptop use, 2.6.22.5-76.fc7 showed the same bug.  I've
attached the dmesg just in case there are some subtle differences between it and
the previous versions.

Rob

Comment 12 Robert Spanton 2007-10-09 00:33:24 UTC
Problems persist in kernel-2.6.22.9-91.fc7

Comment 13 Robert Spanton 2007-10-09 00:36:57 UTC
Created attachment 220391 [details]
kernel-2.6.22.9-91.fc7 dmesg

dmesg for kernel-2.6.22.9-91.fc7

Comment 14 Robert Spanton 2007-10-30 19:03:22 UTC
Created attachment 243811 [details]
2.6.23.1-10.fc7 dmesg

Problem persists in kernel-2.6.23.1-10.fc7.

Getting fairly annoying - I can't give presentations etc without having to
sometimes wait 30 seconds between slides :-S

Comment 15 Robert Spanton 2007-11-09 02:03:35 UTC
Created attachment 252311 [details]
2.6.23.1-42.fc8 dmesg

Comment 16 Robert Spanton 2007-11-09 02:04:16 UTC
Same problem happens in F8 - see attached dmesg.

Comment 17 Robert Spanton 2007-12-03 13:04:20 UTC
Problem does not seem to happen in kernel-2.6.23.1-49.fc8.

Comment 18 Chuck Ebbert 2007-12-06 23:33:47 UTC
Closing; reopen bug if problem occurs again.

Comment 19 Robert Spanton 2007-12-13 12:02:00 UTC
Created attachment 287351 [details]
2.6.23.8-63.fc8 dmesg

Turns out that the bug is still present in kernel-2.6.23.8-63.fc8.  Please find
dmesg attached.  I'm reopening this bug.

Comment 20 Robert Spanton 2007-12-21 13:20:54 UTC
Created attachment 290234 [details]
2.6.23.9-85.fc8 dmesg

Bug still in kernel-2.6.23.9-85.fc8.

Comment 21 Robert Spanton 2007-12-21 13:36:58 UTC
I've run smartctl -t short /dev/sda on it.  I had to try this several times to
get the test to complete, because the drive kept getting reset when the error
this bug is about happened.  When it did complete, it reported no errors.

Is there any more information that I should be providing?

Cheers,

Rob

Comment 22 josip 2007-12-31 04:15:22 UTC
I see the same bug on an old reliable uniprocessor system after upgrading to
Fedora 8 and kernel 2.6.23.9-85.fc8 -- this hardware never had this problem
before (using fc6 and kernel 2.6.22.14-72.fc6, before PATA hd drivers got merged
with sd).  This system uses an Intel motherboard with an 82801 chip, plus a
couple of PATA cards (PDC20267 and 20269).

The symptoms are bursts of error messages about 5 seconds apart similar to the
following:

kernel: ata4.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x2
kernel: ata4.00: BMDMA stat 0x64
kernel: ata4.00: cmd ca/00:10:1f:35:00/00:00:00:00:00/e0 tag 0 cdb 0x0 data 8192 out
kernel:          res 51/84:10:1f:35:00/00:00:00:00:00/e0 Emask 0x10 (ATA bus error)
kernel: ata4: soft resetting port
kernel: ata4.00: configured for UDMA/100
kernel: ata4.01: configured for UDMA/100
kernel: ata4: EH complete
kernel: sd 3:0:0:0: [sdd] 320173056 512-byte hardware sectors (163929 MB)
kernel: sd 3:0:0:0: [sdd] Write Protect is off
kernel: sd 3:0:0:0: [sdd] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
kernel: sd 3:0:1:0: [sde] 320173056 512-byte hardware sectors (163929 MB)
kernel: sd 3:0:1:0: [sde] Write Protect is off
kernel: sd 3:0:1:0: [sde] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
kernel: sd 3:0:0:0: [sdd] 320173056 512-byte hardware sectors (163929 MB)
kernel: sd 3:0:0:0: [sdd] Write Protect is off
kernel: sd 3:0:0:0: [sdd] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
kernel: sd 3:0:1:0: [sde] 320173056 512-byte hardware sectors (163929 MB)
kernel: sd 3:0:1:0: [sde] Write Protect is off
kernel: sd 3:0:1:0: [sde] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA


Since there were no hardware changes in upgrading FC6->F8, I suspect the new
merged hd/sd driver.
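
For reference, the "res" line in the excerpt above can be picked apart mechanically. The sketch below assumes, from my reading of the libata message format, that the first two hex bytes after "res" are the Status and Error registers of the result taskfile; that reading, the regular expression, and the parse_result helper are illustrative assumptions rather than anything stated in this report. The Error bit names follow the classic ATA definitions.

```python
import re

# Classic ATA Error register bits, highest bit first.
ERROR_BITS = [
    (0x80, "ICRC"),    # interface CRC error (UDMA transfers)
    (0x40, "UNC"),     # uncorrectable data
    (0x20, "MC"),      # media changed
    (0x10, "IDNF"),    # sector ID not found
    (0x08, "MCR"),     # media change requested
    (0x04, "ABRT"),    # command aborted
    (0x02, "TRK0NF"),  # track 0 not found
    (0x01, "AMNF"),    # address mark not found
]

# Matches e.g. 'res 51/84:10:1f:35:00/.../e0 Emask 0x10 (ATA bus error)'.
RES_RE = re.compile(
    r"res ([0-9a-f]{2})/([0-9a-f]{2}):\S* Emask (0x[0-9a-f]+) \(([^)]*)\)"
)


def parse_result(line):
    """Parse one libata 'res' line; return None if the line does not match."""
    match = RES_RE.search(line)
    if match is None:
        return None
    status = int(match.group(1), 16)
    error = int(match.group(2), 16)
    return {
        "status": status,
        "error": error,
        "emask": int(match.group(3), 16),
        "reason": match.group(4),
        "error_flags": [name for mask, name in ERROR_BITS if error & mask],
    }


line = "kernel:          res 51/84:10:1f:35:00/00:00:00:00:00/e0 Emask 0x10 (ATA bus error)"
print(parse_result(line)["error_flags"])  # ['ICRC', 'ABRT']
```

Under those assumptions the bus error here decodes to ICRC plus ABRT, whereas the timeout in the original report ("res 40/00:...") carries no error bits at all, which fits the later observation that the two are different problems.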

Comment 23 josip 2007-12-31 04:30:36 UTC
(In reply to comment #22)
> I see the same bug on an old reliable uniprocessor system after upgrading to
> Fedora 8 and kernel 2.6.23.9-85.fc8 -- this hardware never had this problem
> before (using fc6 and kernel 2.6.22.14-72.fc6, before PATA hd drivers got merged
> with sd).  This system uses an Intel motherboard with an 82801 chip, plus a
> couple of PATA cards (PDC20267 and 20269).

Here is the relevant lspci output:

00:1f.1 IDE interface: Intel Corporation 82801BA IDE U100 Controller (rev 04)
02:0b.0 Mass storage controller: Promise Technology, Inc. 20269 (rev 02)
02:0c.0 Mass storage controller: Promise Technology, Inc. PDC20267 (FastTrak100/Ultra100) (rev 02)

I checked the past 3 months of logs -- there were NO errors of this type before
upgrading to Fedora 8 and the latest kernel 2.6.23.9-85.fc8 -- and there were no
hardware changes -- so the most reasonable hypothesis is that there is something
wrong with the merged hd/sd driver.

Comment 24 josip 2008-01-08 05:43:46 UTC
My machine's problem seems to be related to a Promise controller driver.

See http://www.opensubscriber.com/message/linux-ide@vger.kernel.org/8302327.html
which says that for Promise 20267 which uses pata_pdc202xx_old module, one needs
burst mode -- it is pretty essential and should not be optional.

I haven't tested this patch yet.

Comment 25 josip 2008-01-09 04:21:08 UTC
...but the pata_pdc202xx_old module patch doesn't help:

Jan  8 21:17:16 fw kernel: ata3.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x2 frozen
Jan  8 21:17:16 fw kernel: ata3.00: cmd ca/00:08:97:0c:00/00:00:00:00:00/e0 tag 0 cdb 0x0 data 4096 out
Jan  8 21:17:16 fw kernel:          res 40/00:08:77:16:00/00:00:00:00:00/e0 Emask 0x4 (timeout)
Jan  8 21:17:17 fw kernel: ata3: soft resetting port
Jan  8 21:17:17 fw kernel: ata3.00: configured for UDMA/33
Jan  8 21:17:17 fw kernel: ata3.01: configured for UDMA/100
Jan  8 21:17:17 fw kernel: ata3: EH complete
Jan  8 21:17:17 fw kernel: sd 2:0:0:0: [sdb] 320173056 512-byte hardware sectors (163929 MB)
Jan  8 21:17:17 fw kernel: sd 2:0:0:0: [sdb] Write Protect is off
Jan  8 21:17:17 fw kernel: sd 2:0:0:0: [sdb] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
Jan  8 21:17:17 fw kernel: sd 2:0:1:0: [sdc] 320173056 512-byte hardware sectors (163929 MB)
Jan  8 21:17:17 fw kernel: sd 2:0:1:0: [sdc] Write Protect is off
Jan  8 21:17:17 fw kernel: sd 2:0:1:0: [sdc] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
Jan  8 21:17:17 fw kernel: sd 2:0:0:0: [sdb] 320173056 512-byte hardware sectors (163929 MB)
Jan  8 21:17:17 fw kernel: sd 2:0:0:0: [sdb] Write Protect is off
Jan  8 21:17:17 fw kernel: sd 2:0:0:0: [sdb] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
Jan  8 21:17:17 fw kernel: sd 2:0:1:0: [sdc] 320173056 512-byte hardware sectors (163929 MB)
Jan  8 21:17:17 fw kernel: sd 2:0:1:0: [sdc] Write Protect is off
Jan  8 21:17:17 fw kernel: sd 2:0:1:0: [sdc] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA


Comment 26 josip 2008-01-13 20:08:40 UTC
FYI: Reverting to hd style drivers fixes the sd style driver's "ata ... frozen"
messages -- although there are some "SeekComplete" errors reported by the hd*
device, the raid5 driver sails through them.

Comment 27 Robert Spanton 2008-01-26 14:10:28 UTC
Created attachment 293049 [details]
2.6.23.14-107.fc8 dmesg

Bug still present in kernel-2.6.23.14-107.fc8.

Comment 28 Alan Cox 2008-02-07 13:03:13 UTC
Comments #22-#26 are a different bug. I know why the Promise one occurs, and it
is being worked on upstream and here. Promise controllers sometimes complete an
I/O and don't send us an interrupt. Old IDE cleans up in that case; libata
doesn't (because the cleanup technique that works for PATA hangs a lot of SATA
controllers).

The original bug is a drive simply "going away" for a bit but without providing
any more meaningful diagnostics.
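
As a rough illustration of the lost-interrupt case described above, here is a toy Python model of a timeout handler. It is not taken from the old IDE driver or from libata; the function name, the read_status callback, and the recover_lost_irq switch are all hypothetical. It only shows the general idea: when no interrupt arrives in time, a driver can either look at the status register and finish the command if the device is evidently done, or give up and reset the port.

```python
# ATA Status register bits used by this toy model.
BSY = 0x80  # device busy
DRQ = 0x08  # device still wants to transfer data
ERR = 0x01  # command ended with an error


def on_command_timeout(read_status, recover_lost_irq):
    """Decide what to do when a command times out without an interrupt.

    read_status: callable returning the current status byte (hypothetical).
    recover_lost_irq: whether to attempt the status-polling cleanup; the
        comment above notes this is unsafe on many SATA controllers.
    """
    if recover_lost_irq:
        status = read_status()
        if not status & (BSY | DRQ | ERR):
            # Device is idle and reports no error: assume the interrupt
            # was lost and treat the command as completed.
            return "complete"
    # Otherwise fall back to error handling (for example a soft reset).
    return "reset"
```

In this model a device that is idle at the timeout (status 0x50) is completed quietly when the cleanup is enabled and reset when it is not, while a device that is still busy (status 0xd0, as in the original report) is reset either way.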


Comment 29 josip 2008-02-10 20:43:56 UTC
Thanks, Alan, that's a lucid explanation.  As a workaround for comments #22-#26,
I'd recommend that Promise users recompile their kernels with hd-style PATA
drivers.  This worked for me.

Comment 30 Robert Spanton 2008-02-22 11:56:35 UTC
Created attachment 295618 [details]
2.6.23.15-137.fc8 dmesg

Bug still present in 2.6.23.15-137.fc8.

Comment 31 Robert Spanton 2008-03-04 11:57:57 UTC
After reading some of the comments on the kernel bug that I linked to, I tried
leaving a CD in my CD drive.  This appears to make the errors go away.  The CD
drive isn't the drive that features in the error messages.

I have no idea what this means! 

Comment 32 Robert Spanton 2008-03-10 14:10:16 UTC
Created attachment 297436 [details]
2.6.24.3-12.fc8 dmesg

Bug still present in 2.6.24.3-12.fc8.

Still only happens when there isn't a CD in the drive.

Comment 33 Robert Spanton 2008-06-11 10:44:59 UTC
Created attachment 308918 [details]
kernel-2.6.25.3-18.fc9 dmesg

Bug still occurring in F9.

Comment 34 Robert Spanton 2008-10-04 14:02:28 UTC
Created attachment 319462 [details]
kernel-2.6.26.3-29.fc9 dmesg

Bug still present in kernel-2.6.26.3-29.fc9.

Comment 35 Bug Zapper 2008-11-26 07:18:12 UTC
This message is a reminder that Fedora 8 is nearing its end of life.
Approximately 30 (thirty) days from now Fedora will stop maintaining
and issuing updates for Fedora 8.  It is Fedora's policy to close all
bug reports from releases that are no longer maintained.  At that time
this bug will be closed as WONTFIX if it remains open with a Fedora 
'version' of '8'.

Package Maintainer: If you wish for this bug to remain open because you
plan to fix it in a currently maintained version, simply change the 'version' 
to a later Fedora version prior to Fedora 8's end of life.

Bug Reporter: Thank you for reporting this issue and we are sorry that 
we may not be able to fix it before Fedora 8 is end of life.  If you 
would still like to see this bug fixed and are able to reproduce it 
against a later version of Fedora please change the 'version' of this 
bug to the applicable version.  If you are unable to change the version, 
please add a comment here and someone will do it for you.

Although we aim to fix as many bugs as possible during every release's 
lifetime, sometimes those efforts are overtaken by events.  Often a 
more recent Fedora release includes newer upstream software that fixes 
bugs or makes them obsolete.

The process we are following is described here: 
http://fedoraproject.org/wiki/BugZappers/HouseKeeping

Comment 37 Bug Zapper 2009-06-09 22:38:20 UTC
This message is a reminder that Fedora 9 is nearing its end of life.
Approximately 30 (thirty) days from now Fedora will stop maintaining
and issuing updates for Fedora 9.  It is Fedora's policy to close all
bug reports from releases that are no longer maintained.  At that time
this bug will be closed as WONTFIX if it remains open with a Fedora 
'version' of '9'.

Package Maintainer: If you wish for this bug to remain open because you
plan to fix it in a currently maintained version, simply change the 'version' 
to a later Fedora version prior to Fedora 9's end of life.

Bug Reporter: Thank you for reporting this issue and we are sorry that 
we may not be able to fix it before Fedora 9 is end of life.  If you 
would still like to see this bug fixed and are able to reproduce it 
against a later version of Fedora please change the 'version' of this 
bug to the applicable version.  If you are unable to change the version, 
please add a comment here and someone will do it for you.

Although we aim to fix as many bugs as possible during every release's 
lifetime, sometimes those efforts are overtaken by events.  Often a 
more recent Fedora release includes newer upstream software that fixes 
bugs or makes them obsolete.

The process we are following is described here: 
http://fedoraproject.org/wiki/BugZappers/HouseKeeping

Comment 38 Bug Zapper 2009-07-14 18:15:00 UTC
Fedora 9 changed to end-of-life (EOL) status on 2009-07-10. Fedora 9 is 
no longer maintained, which means that it will not receive any further 
security or bug fix updates. As a result we are closing this bug.

If you can reproduce this bug against a currently maintained version of 
Fedora please feel free to reopen this bug against that version.

Thank you for reporting this bug and we are sorry it could not be fixed.