Bug 1332593
Summary: | rt: Use IPI to trigger RT task push migration instead of pulling
---|---
Product: | Red Hat Enterprise Linux 7
Reporter: | Clark Williams <williams>
Component: | kernel-rt
Assignee: | Clark Williams <williams>
kernel-rt sub component: | Process management
QA Contact: | Jiri Kastner <jkastner>
Status: | CLOSED ERRATA
Docs Contact: |
Severity: | medium
Priority: | high
CC: | bhu, lgoncalv, srostedt, williams
Version: | 7.3
Keywords: | ZStream
Target Milestone: | rc
Target Release: | ---
Hardware: | x86_64
OS: | Linux
Whiteboard: |
Fixed In Version: |
Doc Type: | Bug Fix
Doc Text: | In order to avoid thundering herd access to run-queue locks when performing task migration, a three-commit series was backported from upstream. By sending an IPI from an underutilized core to an overloaded core, requesting that the overloaded core push the task to the requesting core, the issue was avoided.
Story Points: | ---
Clone Of: |
: | 1334459
Environment: |
Last Closed: | 2016-11-03 19:49:35 UTC
Type: | Bug
Regression: | ---
Mount Type: | ---
Documentation: | ---
CRM: |
Verified Versions: |
Category: | ---
oVirt Team: | ---
RHEL 7.3 requirements from Atomic Host: |
Cloudforms Team: | ---
Target Upstream Version: |
Embargoed: |
Bug Depends On: |
Bug Blocks: | 1274397, 1334459
Attachments: |
Description
Clark Williams
2016-05-03 14:29:26 UTC
To reproduce: run the kernel on a system with >= 64 cores (usually a four-socket DL58x) and, after tuning the BIOS for low latency, run rteval for 1-2 hours. If latency spikes of 200-300 microseconds appear, then run with tracing.

In one login session run 'rteval --onlyload --duration=2h'

In a separate window:
1. trace-cmd start -e all -p function -l '*rt_spin*' -l '_raw_spin*'
2. cyclictest --numa -p95 -i100 -d0 -qmu -b 200 --tracemark --notrace

Wait for cyclictest to hit the breaktrace threshold and then run:
3. trace-cmd extract
4. trace-cmd report -l

Looking through the report you will see stretches where the system goes idle and tries to migrate RT workloads to idle cores, with lots of calls to spin_lock/spin_unlock of the run-queue (rq) locks.

Created attachment 1153839 [details]
sched/rt: Use IPI to trigger RT task push migration
Rather than have all idle cpus try to pull tasks from an overloaded cpu, have each idle cpu send an IPI to the overloaded cpu, which then pushes tasks to the requesting idle cpu.
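
To illustrate why this reduces contention, here is a minimal toy model in Python. It is not kernel code: the `RunQueue` class, the two model functions, and the default of 63 idle CPUs with 3 RT tasks are all assumptions made for illustration, and the real scheduler logic (priority checks, IPI chaining between CPUs) is not modelled. It only counts how often a run-queue lock is taken by a CPU other than its owner.

```python
from dataclasses import dataclass, field


@dataclass
class RunQueue:
    """Hypothetical stand-in for a per-CPU run queue and its lock counters."""
    cpu: int
    tasks: list = field(default_factory=list)
    remote_locks: int = 0  # lock taken by some other CPU (cross-CPU contention)
    local_locks: int = 0   # lock taken by the owning CPU

    def lock(self, by_cpu: int) -> None:
        if by_cpu == self.cpu:
            self.local_locks += 1
        else:
            self.remote_locks += 1


def pull_model(overloaded, idle_rqs):
    """Every idle CPU takes the overloaded rq lock and tries to pull a task."""
    for rq in idle_rqs:
        overloaded.lock(by_cpu=rq.cpu)
        if len(overloaded.tasks) > 1:  # one task keeps running where it is
            rq.tasks.append(overloaded.tasks.pop())


def ipi_push_model(overloaded, idle_rqs):
    """Idle CPUs only send an IPI; the overloaded CPU does the pushing itself."""
    for requester in idle_rqs:  # each iteration stands in for one handled IPI
        overloaded.lock(by_cpu=overloaded.cpu)
        if len(overloaded.tasks) > 1:
            requester.lock(by_cpu=overloaded.cpu)  # target rq locked only on a real push
            requester.tasks.append(overloaded.tasks.pop())


def simulate(model, n_idle=63, n_tasks=3):
    """Run one model and report lock and migration counts."""
    overloaded = RunQueue(cpu=0, tasks=[f"rt{i}" for i in range(n_tasks)])
    idle_rqs = [RunQueue(cpu=c) for c in range(1, n_idle + 1)]
    model(overloaded, idle_rqs)
    return {
        "remote_locks_on_overloaded": overloaded.remote_locks,
        "local_locks_on_overloaded": overloaded.local_locks,
        "remote_locks_on_idle": sum(rq.remote_locks for rq in idle_rqs),
        "tasks_migrated": sum(len(rq.tasks) for rq in idle_rqs),
    }


if __name__ == "__main__":
    print("pull:", simulate(pull_model))
    print("push:", simulate(ipi_push_model))
```

With the defaults, the pull model takes the overloaded run queue's lock 63 times from remote CPUs to migrate 2 tasks. The push model migrates the same 2 tasks with no remote acquisitions of that lock, and only 2 remote acquisitions in total, on the target run queues.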
Created attachment 1153841 [details]
sched/rt: Hide the push_irq_work_func() declaration
Get rid of a compiler warning from the previous IPI patch.
Created attachment 1153842 [details]
sched/rt: Have the schedule IPI irq_work run in hard irq
As the sched rt pull work has moved to using irq_work IPI, having it
delayed to a thread pretty much defeats the purpose. The handler also
expects interrupts to be disabled when called, as it takes the rq locks.
Set the rt push ipi irq_work handler flag to HARD_IRQ.
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://rhn.redhat.com/errata/RHSA-2016-2584.html