Bug 727998 - commands may get lost or stuck
Summary: commands may get lost or stuck
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Fedora
Classification: Fedora
Component: kernel
Version: 15
Hardware: All
OS: Linux
Priority: unspecified
Severity: high
Target Milestone: ---
Assignee: Kernel Maintainer List
QA Contact: Fedora Extras Quality Assurance
URL:
Whiteboard:
Depends On:
Blocks:
 
Reported: 2011-08-03 21:28 UTC by Mike Miller (OS Dev)
Modified: 2011-08-17 21:09 UTC (History)

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2011-08-17 21:09:15 UTC


Attachments
fix lost command issue and prevent controller lockups (788 bytes, application/octet-stream)
2011-08-03 21:28 UTC, Mike Miller (OS Dev)

Description Mike Miller (OS Dev) 2011-08-03 21:28:33 UTC
Created attachment 516582
fix lost command issue and prevent controller lockups

Description of problem: An extra readl() is required to prevent commands from being lost or stuck under certain workloads.


Version-Release number of selected component (if applicable):


How reproducible: Hard to reproduce.


Steps to Reproduce:
1. Run a heavy read workload on a Smart Array controller.
  
Actual results: A command may get lost or stuck, resulting in system instability.


Expected results: All commands should complete.


Additional info:

Comment 1 Dave Jones 2011-08-03 21:41:09 UTC
The 2.6.40 (3.0.0) update we have in f15 has this line (from upstream)

(void) readl(h->vaddr + SA5_REQUEST_PORT_OFFSET);


I assume the actual register being read is irrelevant, and it's just working around write posting?

Comment 2 Mike Miller (OS Dev) 2011-08-04 15:14:49 UTC
(In reply to comment #1)
> The 2.6.40 (3.0.0) update we have in f15 has this line (from upstream)
> 
> (void) readl(h->vaddr + SA5_REQUEST_PORT_OFFSET);
> 
> 
> I assume the actual register being read is irrelevant, and it's just working
> around write posting?

It actually does matter. At the time of the patch you reference, it was assumed to be safe to read back from the request queue register. The spec defines that register as write-only, but the controllers in the field tolerated the read. The next generation of controllers, however, locks up when that register is read, so the code was changed to read from the scratchpad register instead. So please use the patch I posted to the bz.

This is (hopefully) a temporary workaround. But you are right, it only acts as a fence to ensure all commands get posted to the controller. I'm working with our third-level support engineers to root-cause the issue.
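
For readers unfamiliar with the fence described above, here is a minimal userspace sketch of the idea. It is not the attached patch and not the driver code: the `mock_*` helpers, the `mock_ctlr` structure, and the register offsets are hypothetical stand-ins. It models a bus that buffers ("posts") writes and delivers them only when a later read forces a flush, and it reads back from a scratchpad register rather than from the request port.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical register offsets, chosen for illustration only. */
#define REQUEST_PORT_OFFSET 0x40u
#define SCRATCHPAD_OFFSET   0xB0u

#define NUM_REGS   64   /* size of the mock register file */
#define MAX_POSTED 8    /* capacity of the mock posted-write buffer */

struct posted_write {
    uint32_t offset;
    uint32_t value;
};

/* Mock controller: writes sit in a buffer until something flushes them. */
struct mock_ctlr {
    uint32_t regs[NUM_REGS];            /* indexed by offset / 4 */
    struct posted_write posted[MAX_POSTED];
    size_t n_posted;
    unsigned commands_received;         /* writes that reached the request port */
};

/* Deliver every buffered write to the device, in order. */
void mock_flush(struct mock_ctlr *h)
{
    for (size_t i = 0; i < h->n_posted; i++) {
        h->regs[h->posted[i].offset / 4] = h->posted[i].value;
        if (h->posted[i].offset == REQUEST_PORT_OFFSET)
            h->commands_received++;
    }
    h->n_posted = 0;
}

/* A posted write returns immediately; the device may not have seen it yet. */
void mock_writel(struct mock_ctlr *h, uint32_t value, uint32_t offset)
{
    assert(offset / 4 < NUM_REGS);
    if (h->n_posted == MAX_POSTED)
        mock_flush(h);
    h->posted[h->n_posted].offset = offset;
    h->posted[h->n_posted].value = value;
    h->n_posted++;
}

/* A read cannot pass earlier posted writes, so it flushes them first. */
uint32_t mock_readl(struct mock_ctlr *h, uint32_t offset)
{
    assert(offset / 4 < NUM_REGS);
    mock_flush(h);
    return h->regs[offset / 4];
}

/* Submit a command, then read the scratchpad purely as a fence. */
void submit_command(struct mock_ctlr *h, uint32_t busaddr)
{
    mock_writel(h, busaddr, REQUEST_PORT_OFFSET);
    (void)mock_readl(h, SCRATCHPAD_OFFSET);
}
```

In this model, a bare `mock_writel()` to the request port leaves the command sitting in the buffer, whereas `submit_command()` guarantees it has reached the device by the time it returns. The value read back is discarded; only the ordering side effect matters.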

Comment 3 Dave Jones 2011-08-04 17:26:26 UTC
OK, I'll switch it to use the version you attached in the next update.
Can you make sure this gets into Linus' tree and Greg's -stable branches?

thanks.

Comment 4 Dave Jones 2011-08-04 17:32:01 UTC
Actually, disregard that. I see this made it into 3.0.1rc, which we've already picked up for the next f15 update.

