Bug 1331562 - rt: fix idle_balance iterating over all CPUs if a runnable task shows up partway through
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Enterprise Linux 7
Classification: Red Hat
Component: kernel-rt
Version: 7.3
Hardware: x86_64
OS: Linux
Priority: medium
Severity: medium
Target Milestone: rc
Target Release: ---
Assignee: Steven Rostedt
QA Contact: Jiri Kastner
URL:
Whiteboard:
Depends On:
Blocks: 1274397
Reported: 2016-04-28 19:40 UTC by Clark Williams
Modified: 2016-11-03 19:48 UTC (History)

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2016-11-03 19:48:49 UTC
Target Upstream Version:
Embargoed:


Attachments
break out of idle_balance if an RT task is ready to run (3.76 KB, patch)
2016-04-28 19:49 UTC, Clark Williams
Enable irqs in idle_balance() routine (1.13 KB, patch)
2016-04-28 19:49 UTC, Clark Williams
Move call to idle_balance to post-schedule (2.42 KB, patch)
2016-04-28 19:50 UTC, Clark Williams


Links
System ID Private Priority Status Summary Last Updated
Red Hat Product Errata RHSA-2016:2584 0 normal SHIPPED_LIVE Important: kernel-rt security, bug fix, and enhancement update 2016-11-03 12:08:49 UTC

Description Clark Williams 2016-04-28 19:40:23 UTC
The idle_balance() kernel function is responsible for balancing SCHED_OTHER tasks when a core goes idle. It is called with interrupts disabled, so on systems with a large number of cores (>=64) it can spend hundreds of microseconds in the balance function, during which a high-priority task has no opportunity to preempt and run.

This behavior can be seen by running rteval on a 72-core HP DL580gen9 and observing the lock contention on the run-queue locks when idle_balance() is called. The cyclictest thread on the core that is running idle_balance() cannot run: its timer fires, but the interrupt is held off while idle_balance() runs, and the cyclictest thread misses its deadline by hundreds of microseconds.
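
The hold-off described above can be illustrated with a toy model. This is only a sketch, not kernel code: the function name, its parameters, and the per-run-queue cost used in the note below are hypothetical and were not measured on the system in this report.

```python
def wakeup_latency_us(timer_fire_us, balance_start_us, balance_len_us,
                      irqs_disabled=True):
    """Toy model: delay between a timer firing and its interrupt being serviced.

    All arguments are illustrative values in microseconds. If the timer fires
    while the balance pass is running with interrupts disabled, the interrupt
    is only serviced once the pass ends.
    """
    balance_end_us = balance_start_us + balance_len_us
    fired_during_balance = balance_start_us <= timer_fire_us < balance_end_us
    if irqs_disabled and fired_during_balance:
        # Interrupt stays pending until the balance pass finishes.
        return balance_end_us - timer_fire_us
    # Interrupt is serviced immediately in this simplified model.
    return 0
```

As an example under assumed numbers, a balance pass that takes 5 us per run queue on 72 cores lasts 360 us; a timer that fires 10 us into that pass gives wakeup_latency_us(10, 0, 360) == 350, while the same call with irqs_disabled=False gives 0.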

Comment 1 Rik van Riel 2016-04-28 19:46:57 UTC
There are a few separate issues here:
1) idle_balance is currently called with irqs disabled; Steven Rostedt has a patch to fix that
2) idle_balance continues to iterate over all CPUs even if a runnable task shows up during balancing; I have a patch to fix that

We need both of these fixes together to get the system to behave better.
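
The second issue can be sketched in user space as follows. This is a simplified model of the idea only, not the attached patch: the function name, the run-queue representation, and the runnable_after parameter are assumptions made for illustration.

```python
def idle_balance_model(runqueues, this_cpu, runnable_after=None, early_exit=True):
    """Toy model of an idle-balance scan over the other CPUs' run queues.

    runqueues:      list of task counts, one entry per CPU (hypothetical data).
    this_cpu:       index of the CPU that just went idle.
    runnable_after: number of run queues visited after which a task becomes
                    runnable on this_cpu (None means this never happens).
    early_exit:     if True, stop scanning once a local task is runnable.

    Returns (visited, pulled_from), where pulled_from is the CPU a task would
    be pulled from, or None if nothing was pulled.
    """
    visited = 0
    for cpu, nr_tasks in enumerate(runqueues):
        if cpu == this_cpu:
            continue
        if early_exit and runnable_after is not None and visited >= runnable_after:
            # A local task is ready to run: give up balancing right away.
            return visited, None
        visited += 1
        if nr_tasks > 1:
            # This CPU has more than one task, so one could be pulled over.
            return visited, cpu
    return visited, None
```

With 72 run queues that each hold a single task and a local task becoming runnable after 3 visits, idle_balance_model([1] * 72, 0, runnable_after=3) returns (3, None), whereas the same call with early_exit=False visits all 71 remote run queues and returns (71, None).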

Comment 2 Clark Williams 2016-04-28 19:49:13 UTC
Created attachment 1152052 [details]
break out of idle_balance if an RT task is ready to run

Comment 3 Clark Williams 2016-04-28 19:49:46 UTC
Created attachment 1152053 [details]
Enable irqs in idle_balance() routine

Comment 4 Clark Williams 2016-04-28 19:50:21 UTC
Created attachment 1152054 [details]
Move call to idle_balance to post-schedule

Comment 5 Clark Williams 2016-04-28 19:51:10 UTC
The above three patches have been applied to a scratch build based on kernel-rt-3.10.0-382.rt56.263.el7 and are under testing now.

Comment 7 Jiri Kastner 2016-10-04 10:09:31 UTC
see bug 1209987 comment 20

Comment 9 errata-xmlrpc 2016-11-03 19:48:49 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://rhn.redhat.com/errata/RHSA-2016-2584.html

