Red Hat Bugzilla – Bug 1331562
rt: fix idle_balance iterating over all CPUs if a runnable task shows up partway through
Last modified: 2016-11-03 15:48:49 EDT
The idle_balance() kernel function is responsible for balancing SCHED_OTHER tasks when a core goes idle. This function is called with interrupts disabled, so on systems with a large number of cores (>=64) hundreds of microseconds can be spent in the balance function with no opportunity for a high-priority task to preempt it and run. This behavior can be seen by running rteval on a 72-core HP DL580 Gen9 and observing the contention on the run-queue locks when idle_balance() is called. The cyclictest thread on the core that is running idle_balance() cannot run: its timer fires, but the interrupt is held off while idle_balance() runs, and the cyclictest thread misses its deadline by hundreds of microseconds.
There are two separate issues here:
1) idle_balance() is currently called with irqs disabled. Steven Rostedt has a patch to fix that.
2) idle_balance() continues to iterate over all CPUs even if a runnable task shows up during balancing. I have a patch to fix that.
We need both of these fixes together to get the system to behave better.
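To illustrate the second issue, here is a minimal user-space sketch of a balance loop that stops scanning once a local task becomes runnable. It is not the attached patch and not kernel code: model_idle_balance(), struct balance_result, and the rt_wakeup_at and early_exit parameters are hypothetical names invented for this model, and incrementing a counter stands in for taking a remote run-queue lock.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical user-space model; none of these names exist in the kernel. */
struct balance_result {
    size_t visited;   /* remote run queues examined (each one models a lock) */
    int pulled_from;  /* index of the queue a task was pulled from, or -1 */
};

/*
 * nr_queued[i]  number of tasks queued on CPU i
 * rt_wakeup_at  scan step at which a task becomes runnable locally
 * early_exit    true models the fixed behavior, false the original one
 */
static struct balance_result
model_idle_balance(int *nr_queued, size_t ncpus, size_t rt_wakeup_at,
                   bool early_exit)
{
    struct balance_result res = { 0, -1 };
    bool rt_runnable = false;

    for (size_t cpu = 0; cpu < ncpus; cpu++) {
        /* Simulate a task becoming runnable locally partway through. */
        if (cpu == rt_wakeup_at)
            rt_runnable = true;

        /* The modeled fix: stop scanning once local work has shown up. */
        if (early_exit && rt_runnable)
            break;

        res.visited++;

        /* Pull only from a queue that has more than one task. */
        if (nr_queued[cpu] > 1) {
            nr_queued[cpu]--;
            res.pulled_from = (int)cpu;
            break;
        }
    }
    return res;
}
```

In this model, with 72 queues that have nothing to pull and a wakeup at step 10, the loop without the early exit visits all 72 queues, while the loop with it visits only 10 before returning.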
Created attachment 1152052: break out of idle_balance if an RT task is ready to run
Created attachment 1152053: Enable irqs in idle_balance() routine
Created attachment 1152054: Move call to idle_balance to post-schedule
The above three patches have been applied to a scratch build based on kernel-rt-3.10.0-382.rt56.263.el7 and are now under testing.
See bug 1209987, comment 20.
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://rhn.redhat.com/errata/RHSA-2016-2584.html