Bug 727998

Summary: commands may get lost or stuck
Product: [Fedora] Fedora
Reporter: Mike Miller (OS Dev) <mike.miller>
Component: kernel
Assignee: Kernel Maintainer List <kernel-maint>
Status: CLOSED ERRATA
QA Contact: Fedora Extras Quality Assurance <extras-qa>
Severity: high
Docs Contact:
Priority: unspecified
Version: 15
CC: aquini, gansalmon, itamar, jonathan, kernel-maint, madhu.chinakonda
Hardware: All
OS: Linux
Doc Type: Bug Fix
Last Closed: 2011-08-17 21:09:15 UTC
Attachments:
fix lost command issue and prevent controller lockups (flags: none)

Description Mike Miller (OS Dev) 2011-08-03 21:28:33 UTC
Created attachment 516582
fix lost command issue and prevent controller lockups

Description of problem: An extra readl() is required to prevent commands from being lost or stuck under certain workloads.


Version-Release number of selected component (if applicable):


How reproducible: Hard to reproduce.


Steps to Reproduce:
1. Run a heavy read workload on a Smart Array controller.
  
Actual results: A command may get lost or stuck, resulting in system instability.


Expected results: All commands should complete.


Additional info:

Comment 1 Dave Jones 2011-08-03 21:41:09 UTC
The 2.6.40 (3.0.0) update we have in f15 has this line (from upstream)

(void) readl(h->vaddr + SA5_REQUEST_PORT_OFFSET);


I assume the actual register being written is irrelevant, and it's just working around write posting?

Comment 2 Mike Miller (OS Dev) 2011-08-04 15:14:49 UTC
(In reply to comment #1)
> The 2.6.40 (3.0.0) update we have in f15 has this line (from upstream)
> 
> (void) readl(h->vaddr + SA5_REQUEST_PORT_OFFSET);
> 
> 
> I assume the actual register being written is irrelevant, and it's just working
> around write posting?

It actually does matter. When the patch you reference was written, reading back from the request queue register was assumed to be safe. The spec defines that register as write-only, but the controllers in the field tolerated the read. The next generation of controllers, however, locks up when that register is read, so the readback was changed to use the scratchpad register instead. Please use the patch I posted to this bz.

This is (hopefully) a temporary workaround. But you are right: the read only acts as a fence to ensure all commands get posted to the controller. I'm working with our third-level support engineers to find the root cause.
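
For illustration, the submit-then-read-back pattern can be sketched in user-space C. This is a simulation under stated assumptions, not the attached patch: the register offsets, the `sim_writel`/`sim_readl` helpers, `fake_regs`, and `submit_command` are all hypothetical names standing in for the driver's real MMIO accessors and register map. On real hardware, the read from the device forces the posted write ahead of it to reach the controller; here the array only shows which register is written and which one is read back.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical register offsets, chosen for illustration only. */
#define REQUEST_PORT_OFFSET 0x40u /* command submission; treated as write-only */
#define SCRATCHPAD_OFFSET   0xB0u /* register assumed safe to read back */

/* Plain memory standing in for the controller's mapped registers. */
static uint32_t fake_regs[64];
/* Records the offset of the most recent read so the behavior can be checked. */
static unsigned last_read_offset;

/* Simulated 32-bit register write (stand-in for the kernel's writel). */
static void sim_writel(uint32_t val, volatile uint32_t *base, unsigned off)
{
    assert(off % 4 == 0 && off / 4 < 64);
    base[off / 4] = val;
}

/* Simulated 32-bit register read (stand-in for the kernel's readl). */
static uint32_t sim_readl(volatile uint32_t *base, unsigned off)
{
    assert(off % 4 == 0 && off / 4 < 64);
    last_read_offset = off;
    return base[off / 4];
}

/* Post a command, then read a readable register to flush the posted write. */
static void submit_command(volatile uint32_t *base, uint32_t cmd_busaddr)
{
    sim_writel(cmd_busaddr, base, REQUEST_PORT_OFFSET);
    /* The value is discarded; the read exists only for its ordering effect. */
    (void) sim_readl(base, SCRATCHPAD_OFFSET);
}
```

Calling `submit_command(fake_regs, 0x1000u)` stores the command address in the simulated request port and touches only the scratchpad on the read side, which is the property that matters for controllers that cannot tolerate a read of the request port.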

Comment 3 Dave Jones 2011-08-04 17:26:26 UTC
OK, I'll switch it to use the version you attached in the next update.
Can you make sure this gets into Linus' tree and Greg's -stable branches?

thanks.

Comment 4 Dave Jones 2011-08-04 17:32:01 UTC
Actually, disregard that. I see this made it into 3.0.1rc, which we've already picked up for the next f15 update.