Bug 456911

Summary: RHEL4 scheduler optimizations for financial applications
Product: Red Hat Enterprise Linux 4 Reporter: Goutham Kandiar <gkandiar>
Component: kernel Assignee: Larry Woodman <lwoodman>
Status: CLOSED ERRATA QA Contact: Red Hat Kernel QE team <kernel-qe>
Severity: medium Docs Contact:
Priority: medium    
Version: 4.8 CC: csnook, cward, dshaks, duck, fhirtz, herrold, lwoodman, moakley, rlerch, tao, tburke
Target Milestone: rc   
Target Release: ---   
Hardware: All   
OS: Linux   
Whiteboard:
Fixed In Version: Doc Type: Bug Fix
Doc Text:
A new allowable value has been added to the /proc/sys/kernel/wake_balance tunable parameter. Setting wake_balance to a value of 2 will instruct the scheduler to run the thread on any available CPU rather than scheduling it on the optimal CPU. Setting this kernel parameter to 2 will force the scheduler to reduce the overall latency even at the cost of total system throughput.
Story Points: ---
Clone Of: Environment:
Last Closed: 2009-05-18 19:32:07 UTC Type: ---
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Bug Depends On:    
Bug Blocks: 458752, 461304    
Attachments:
Description Flags
Wombat Performance w/ 2-1Gbit nics on Intel quad-core none

Description Goutham Kandiar 2008-07-28 15:49:09 UTC
Description of problem:

A large financial customer is seeing poor performance from the RHEL4 scheduler
in terms of sub-millisecond CPU response.

The application is a high-frequency financial trading application.

The customer's kernel developer changed the code to allocate the job to free CPUs
and managed to reduce latency in their application from milliseconds to
microseconds.

Red Hat has identified the following steps:
 
1) Research kernel patches for scheduler optimization from customer and share
with Red Hat kernel team for review

2) Test kernel patches against real production applications and assess potential
benefits and/or drawbacks, including assessment of potential performance regressions

3) Leverage scheduler optimizations in future RHEL updates/versions as appropriate

4) Help push Red Hat-reviewed code changes upstream for Fedora 10/11 and Red Hat
Enterprise Linux 6

The customer has supplied Red Hat with their patch.

Comment 1 John Shakshober 2008-08-04 17:57:48 UTC
Attached are the improved performance results from Tom Tracy's Wombat testing on a kernel from Larry Woodman.

Completed testing the scheduler patch using Wombat. Throughput increased from 44K to 74K. Latency is comparable to RHEL5.2. I have attached results showing the throughput and latency comparisons with the scheduler patch.

Comment 2 John Shakshober 2008-08-04 20:57:20 UTC
Created attachment 313402 [details]
Wombat Performance w/ 2-1Gbit nics on Intel quad-core

Comment 3 Chris Snook 2008-08-11 21:37:31 UTC
Looking at the patch that was posted to the list, we may want to differentiate between static (realtime) priority threads and dynamic priority threads for the default behavior of the migrate-on-clone code.  For realtime threads, it's safe to assume that latency is top priority, so we should probably enable it by default there.  For dynamic priority threads, there will often be a throughput benefit to avoiding the migration, due to cache effects, as well as a power saving benefit to keeping cores idle longer, so it should be disabled by default there.  Since we're going to have a sysctl tunable anyway, we might as well default to the settings that will make the most people happy.
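To make the proposed defaults concrete, here is a minimal sketch of the decision described in this comment. It is only an illustration of the suggestion, not the code in the customer's patch or in the kernel; the names `PriorityClass`, `default_migrate`, `should_migrate` and `sysctl_override` are hypothetical.

```python
from enum import Enum
from typing import Optional


class PriorityClass(Enum):
    """Hypothetical labels for the two kinds of thread discussed above."""
    REALTIME = "realtime"  # static priority: latency assumed to matter most
    DYNAMIC = "dynamic"    # dynamic priority: cache warmth and idle cores favored


def default_migrate(priority_class: PriorityClass) -> bool:
    """Proposed default: migrate realtime threads, leave dynamic ones in place."""
    return priority_class is PriorityClass.REALTIME


def should_migrate(priority_class: PriorityClass,
                   sysctl_override: Optional[bool] = None) -> bool:
    """An explicit tunable setting wins; otherwise use the per-class default."""
    if sysctl_override is not None:
        return sysctl_override
    return default_migrate(priority_class)
```

With no override, only realtime threads are migrated; an administrator who sets the tunable gets the requested behavior for both classes.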

Comment 4 RHEL Program Management 2008-09-03 13:09:03 UTC
Updating PM score.

Comment 5 RHEL Program Management 2008-12-17 20:19:53 UTC
This request was evaluated by Red Hat Product Management for inclusion in a Red
Hat Enterprise Linux maintenance release.  Product Management has requested
further review of this request by Red Hat Engineering, for potential
inclusion in a Red Hat Enterprise Linux Update release for currently deployed
products.  This request is not yet committed for inclusion in an Update
release.

Comment 6 Vivek Goyal 2009-01-12 14:50:24 UTC
Committed in 78.27.EL. RPMs are available at http://people.redhat.com/vgoyal/rhel4/

Comment 7 Larry Woodman 2009-01-23 19:10:48 UTC
Release note added. If any revisions are required, please set the 
"requires_release_notes" flag to "?" and edit the "Release Notes" field accordingly.
All revisions will be proofread by the Engineering Content Services team.

New Contents:
A new allowable value was added to the /proc/sys/kernel/wake_balance tunable parameter.  
Setting wake_balance to 2 will cause the scheduler to run the thread being awakened on any available CPU rather than scheduling it on the optimal CPU based on a combination of cache footprint and idleness of the CPU in question. This will cause the scheduler to reduce the overall latency even at the cost of total system throughput.
 
Large financial applications that experience poor latency performance from the RHEL4 scheduler
and would like to see sub-millisecond CPU response times should set /proc/sys/kernel/wake_balance = 2.
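As a rough sketch of applying the setting described above, the tunable can be written like any other /proc/sys file. The helper name `set_wake_balance` is hypothetical; the path is the one given in the release note and is only expected to exist on a kernel that carries this change. Writing to it requires root, and the value does not persist across reboots unless it is also configured elsewhere.

```python
from pathlib import Path

# Path taken from the release note; present only on kernels with this tunable.
WAKE_BALANCE_PATH = Path("/proc/sys/kernel/wake_balance")


def set_wake_balance(value: int, path: Path = WAKE_BALANCE_PATH) -> str:
    """Write `value` to the tunable and return what is read back, stripped.

    `path` is a parameter so the helper can be tried against an ordinary
    file before touching the live system.
    """
    path.write_text(f"{value}\n")
    return path.read_text().strip()


# Example (as root, on a kernel that has the tunable):
#     set_wake_balance(2)
```

Reading the value back after the write is a cheap check that the kernel accepted it.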

Comment 11 Ryan Lerch 2009-02-25 06:55:17 UTC
Release note updated. If any revisions are required, please set the 
"requires_release_notes"  flag to "?" and edit the "Release Notes" field accordingly.
All revisions will be proofread by the Engineering Content Services team.

Diffed Contents:
@@ -1,5 +1 @@
-A new allowable value was added to the /proc/sys/kernel/wake_balance tunable parameter.  
+A new allowable value has been added to the /proc/sys/kernel/wake_balance tunable parameter. Setting wake_balance to a value of 2 will instruct the scheduler to run the thread on any available CPU rather than scheduling it on the optimal CPU. Setting this kernel parameter to 2 will force the scheduler to reduce the overall latency even at the cost of total system throughput.
-Setting wake_balance to 2 will cause the scheduler to run the thread being awakened on any available CPU rather than scheduling it on the optimal CPU based on a combination of cache footprint and idleness of the CPU in question. This will cause the scheduler to reduce the overall latency even at the cost of total system throughput.
- 
-Large financial applications that experience poor latency performance from the RHEL4 scheduler
-and would like to see sub-millisecond CPU response times should set /proc/sys/kernel/wake_balance = 2.

Comment 14 Chris Ward 2009-05-05 13:57:23 UTC
Any updates here? Has this issue been resolved in the RHEL 4.8 Beta, or in a later kernel?

Comment 15 John Shakshober 2009-05-05 14:08:29 UTC
We have tested this for performance in 4.8 and ack this for 4.8.
/proc/sys/kernel/wake_balance = 2.

Comment 16 Larry Woodman 2009-05-05 14:11:29 UTC
Chris, this change is in RHEL4-U8.  You enable it by setting /proc/sys/kernel/wake_balance = 2.

Larry Woodman

Comment 17 Chris Ward 2009-05-05 14:22:08 UTC
Sorry for the confusion. I meant to ask whether this issue had been tested by QA, customer or partner and if so, whether or not it has been VERIFIED.

Comment 19 errata-xmlrpc 2009-05-18 19:32:07 UTC
An advisory has been issued which should help the problem
described in this bug report. This report is therefore being
closed with a resolution of ERRATA. For more information
on the solution and/or where to find the updated files,
please follow the link below. You may reopen this bug report
if the solution does not work for you.

http://rhn.redhat.com/errata/RHSA-2009-1024.html