Bug 1331562

Summary: rt: fix idle_balance iterating over all CPUs if a runnable task shows up partway through
Product: Red Hat Enterprise Linux 7 Reporter: Clark Williams <williams>
Component: kernel-rt Assignee: Steven Rostedt <srostedt>
kernel-rt sub component: Process management QA Contact: Jiri Kastner <jkastner>
Status: CLOSED ERRATA Docs Contact:
Severity: medium    
Priority: medium CC: bhu, bperkins, lgoncalv, riel, srostedt, williams
Version: 7.3   
Target Milestone: rc   
Target Release: ---   
Hardware: x86_64   
OS: Linux   
Whiteboard:
Fixed In Version: Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2016-11-03 19:48:49 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Bug Depends On:    
Bug Blocks: 1274397    
Attachments:
Description | Flags
break out of idle_balance if an RT task is ready to run | none
Enable irqs in idle_balance() routine | none
Move call to idle_balance to post-schedule | none

Description Clark Williams 2016-04-28 19:40:23 UTC
The idle_balance() kernel function is responsible for balancing SCHED_OTHER tasks when a core goes idle. It is called with interrupts disabled, so on systems with a large number of cores (>=64), hundreds of microseconds can be spent in the balance function with no opportunity for a high-priority task to preempt it and run.

This behavior can be seen when running rteval on a 72-core HP DL580gen9 and observing the lock contention on the run-queue locks when idle_balance() is called. The cyclictest thread on the core that is running idle_balance() cannot run: its timer fires, but the interrupt is held off while idle_balance() runs, and the cyclictest thread misses its deadline by hundreds of microseconds.
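
For readers unfamiliar with how such a missed deadline is measured, here is a minimal user-space sketch in the style of cyclictest. It is not cyclictest's actual code. The function name measure_wakeup_latency_ns and the helper ts_to_ns are made up for illustration. The sketch arms an absolute deadline on CLOCK_MONOTONIC, sleeps until it, and reports how late the thread actually resumed.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdint.h>
#include <time.h>

/* Convert a timespec to nanoseconds. */
static int64_t ts_to_ns(const struct timespec *ts)
{
    return (int64_t)ts->tv_sec * 1000000000LL + ts->tv_nsec;
}

/*
 * Hypothetical helper: sleep until an absolute deadline interval_ns from
 * now, then return how many nanoseconds past that deadline the thread
 * resumed. Returns -1 if a clock or sleep call fails (including EINTR).
 */
int64_t measure_wakeup_latency_ns(int64_t interval_ns)
{
    struct timespec now, deadline;

    assert(interval_ns >= 0);

    if (clock_gettime(CLOCK_MONOTONIC, &now) != 0)
        return -1;

    int64_t target = ts_to_ns(&now) + interval_ns;
    deadline.tv_sec = (time_t)(target / 1000000000LL);
    deadline.tv_nsec = (long)(target % 1000000000LL);

    /* TIMER_ABSTIME: wake at the absolute deadline, not after a delta. */
    if (clock_nanosleep(CLOCK_MONOTONIC, TIMER_ABSTIME, &deadline, NULL) != 0)
        return -1;

    if (clock_gettime(CLOCK_MONOTONIC, &now) != 0)
        return -1;

    /* Time between the programmed deadline and the actual wakeup. */
    return ts_to_ns(&now) - target;
}
```

If the timer interrupt is held off while the sleeping thread's CPU runs with interrupts disabled, that delay shows up directly in the returned value. A real measurement would also run the thread under a real-time policy such as SCHED_FIFO and pin it to one core, which this sketch leaves out.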

Comment 1 Rik van Riel 2016-04-28 19:46:57 UTC
There are a few separate issues here:
1) idle_balance is currently called with irqs disabled; Steven Rostedt has a patch to fix that.
2) idle_balance continues to iterate over all CPUs even if a runnable task shows up during balancing; I have a patch to fix that.

We need both of these fixes together to get the system to behave better.
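
To illustrate issue 2, here is a minimal user-space sketch of the early-exit idea. It is neither kernel code nor the attached patch. struct sim_cpu and sim_idle_balance are hypothetical stand-ins, and the scan is reduced to a simple loop over an array. The point is only that the local queue is re-checked before each remote CPU is examined, so the scan stops as soon as local work appears.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical per-CPU state; a stand-in, not the kernel's struct rq. */
struct sim_cpu {
    int nr_running;   /* tasks currently queued on this CPU */
};

/*
 * Scan the other CPUs for one with spare work (more than one queued task).
 * Before looking at each remote CPU, re-check the local queue and stop as
 * soon as a task has become runnable there.
 *
 * Returns the number of remote CPUs examined. *src is set to the CPU a
 * task could be pulled from, or -1 if the scan ended without finding one.
 */
size_t sim_idle_balance(const volatile struct sim_cpu *cpus, size_t ncpus,
                        size_t this_cpu, int *src)
{
    size_t examined = 0;

    assert(ncpus > 0 && this_cpu < ncpus);
    *src = -1;

    for (size_t cpu = 0; cpu < ncpus; cpu++) {
        if (cpu == this_cpu)
            continue;

        /* Early exit: local work showed up, so stop balancing. */
        if (cpus[this_cpu].nr_running > 0)
            break;

        examined++;
        if (cpus[cpu].nr_running > 1) {
            *src = (int)cpu;   /* candidate to pull a task from */
            break;
        }
    }
    return examined;
}
```

Without the local re-check, the loop would visit every remote CPU in the worst case even though the local CPU already has something to run. With it, the scan is bounded by the time taken to examine one remote CPU after the local task becomes runnable.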

Comment 2 Clark Williams 2016-04-28 19:49:13 UTC
Created attachment 1152052 [details]
break out of idle_balance if an RT task is ready to run

Comment 3 Clark Williams 2016-04-28 19:49:46 UTC
Created attachment 1152053 [details]
Enable irqs in idle_balance() routine

Comment 4 Clark Williams 2016-04-28 19:50:21 UTC
Created attachment 1152054 [details]
Move call to idle_balance to post-schedule

Comment 5 Clark Williams 2016-04-28 19:51:10 UTC
The above three patches have been applied to a scratch build based on kernel-rt-3.10.0-382.rt56.263.el7 and are now under test.

Comment 7 Jiri Kastner 2016-10-04 10:09:31 UTC
see bug 1209987 comment 20

Comment 9 errata-xmlrpc 2016-11-03 19:48:49 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://rhn.redhat.com/errata/RHSA-2016-2584.html