Bug 1401061
Summary: | RFE: Improve RT throttling mechanism
---|---|---|---
Product: | Red Hat Enterprise Linux 7 | Reporter: | Daniel Bristot de Oliveira <daolivei>
Component: | kernel-rt | Assignee: | Daniel Bristot de Oliveira <daolivei>
kernel-rt sub component: | Memory Management | QA Contact: | Jiri Kastner <jkastner>
Status: | CLOSED ERRATA | Docs Contact: | Jana Heves <jsvarova>
Severity: | medium
Priority: | high | CC: | bhu, cww, daolivei, dhoward, mkolaja, salmy, stalexan, toneata, williams
Version: | 7.4 | Keywords: | FutureFeature, ZStream
Target Milestone: | rc
Target Release: | ---
Hardware: | Unspecified
OS: | Unspecified
Whiteboard: |
Fixed In Version: | | Doc Type: | Enhancement
Doc Text: |
Improved RT throttling mechanism
The current real-time throttling mechanism prevents CPU-intensive real-time tasks from starving non-real-time tasks. When a real-time run queue is throttled, non-real-time tasks are allowed to run; if there are none, the CPU goes idle. To safely maximize CPU usage by reducing this idle time, the "RT_RUNTIME_GREED" scheduler feature has been implemented. When enabled, this feature checks whether non-real-time tasks are starving before throttling the real-time task. As a result, the "RT_RUNTIME_GREED" scheduler option guarantees some run time on all CPUs for non-real-time tasks, while keeping real-time tasks running as much as possible.
|
Story Points: | ---
Clone Of: |
: | 1505158 | Environment: |
Last Closed: | 2018-04-10 09:07:09 UTC | Type: | Bug
Regression: | --- | Mount Type: | ---
Documentation: | --- | CRM: |
Verified Versions: | | Category: | ---
oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: |
Cloudforms Team: | --- | Target Upstream Version: |
Embargoed: |
Bug Depends On: |
Bug Blocks: | 1420851, 1442258, 1505158
Attachments: |
Description
Daniel Bristot de Oliveira
2016-12-02 16:29:01 UTC
Hey,

BNP complained about this thread being blocked because of a CPU with a spinning -rt task:

crash> bt 6949
PID: 6949  TASK: ffff880418466300  CPU: 10  COMMAND: "force"
 #0 [ffff8800b8793918] __schedule at ffffffff815f31dc
 #1 [ffff8800b87939b0] schedule at ffffffff815f38f4
 #2 [ffff8800b87939d0] wait_transaction_locked at ffffffffa0311a05 [jbd2]
 #3 [ffff8800b8793a40] add_transaction_credits at ffffffffa0311e89 [jbd2]
 #4 [ffff8800b8793ac0] start_this_handle at ffffffffa0312131 [jbd2]
 #5 [ffff8800b8793b60] jbd2__journal_start at ffffffffa0312640 [jbd2]
 #6 [ffff8800b8793bc0] __ext4_journal_start_sb at ffffffffa0370889 [ext4]
 #7 [ffff8800b8793c10] ext4_dirty_inode at ffffffffa0341934 [ext4]
 #8 [ffff8800b8793c30] __mark_inode_dirty at ffffffff811dbd9b
 #9 [ffff8800b8793c60] update_time at ffffffff811c8d41
#10 [ffff8800b8793c90] file_update_time at ffffffff811c8e28
#11 [ffff8800b8793cf0] __generic_file_aio_write at ffffffff8114c028
#12 [ffff8800b8793d80] generic_file_aio_write at ffffffff8114c2b5
#13 [ffff8800b8793dd0] ext4_file_write at ffffffffa0339954 [ext4]
#14 [ffff8800b8793e10] do_sync_write at ffffffff811acdff
#15 [ffff8800b8793ef0] vfs_write at ffffffff811ad31f
#16 [ffff8800b8793f20] sys_write at ffffffff811addd0
#17 [ffff8800b8793f80] tracesys at ffffffff815fdca8 (via system_call)
    RIP: 0000003eb900e6fd  RSP: 00007fdf14f02d60  RFLAGS: 00000293
    RAX: ffffffffffffffda  RBX: ffffffff815fdca8  RCX: ffffffffffffffff
    RDX: 000000000000001f  RSI: 00000000007b9a0c  RDI: 0000000000000022
    RBP: 0000000000000022   R8: 00000000007b99d0   R9: 00000000000001f0
    R10: 00007fdf20909718  R11: 0000000000000293  R12: 00007fdeec1fbe10
    R13: 00007fdf14f02db0  R14: 000000000000001f  R15: 00000000007b9a0c
    ORIG_RAX: 0000000000000001  CS: 0033  SS: 002b

This is the old BZ about it not being possible to avoid a jbd2 thread on an isolated CPU: BZ1306341.

One possible workaround for this problem is to add the patch suggested in this BZ. The other would be to try to make the jbd2 per-cpu kworkers not per-cpu.
But that would be really complex.

How to reproduce the problem:

1) Prepare a busy-loop task, like:

f.c:
------------- %< ------------------
int main (void)
{
	for(;;);
}
------------- >% -------------------

# gcc -o rt f.c
# gcc -o nonrt f.c

2) Disable rt runtime sharing:

# echo NO_RT_RUNTIME_SHARE > /sys/kernel/debug/sched_features

3) Run the "rt" busy-loop task in the FIFO policy, pinned to a CPU, for instance CPU 1:

# taskset -c 1 chrt -f 1 ./rt &

4) Check the CPU 1 usage. It should show about 95% busy with the "rt" task and about 5% idle.

5) Then enable the RT_RUNTIME_GREED feature:

# echo RT_RUNTIME_GREED > /sys/kernel/debug/sched_features

and check the CPU 1 usage again. Now the "rt" task should be taking about 100% of CPU time. The system should be able to run for a long period without causing problems like hung tasks because of the busy-loop task (that is the feature implemented by this patch).

6) Finally, run the "nonrt" task on CPU 1 as non-rt:

# taskset -c 1 ./nonrt &

Now the "rt" task should be taking 95% and the "nonrt" task 5%.

Created attachment 1298514
[RT PATCH] sched/rt: RT_RUNTIME_GREED sched feature
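
The CPU usage expected in steps 4 to 6 of the reproducer can be illustrated with a toy model. The sketch below is not the kernel patch. It assumes a single CPU, an always-runnable RT busy loop, and a throttling period divided into abstract ticks. The names `simulate_period` and `struct period_usage` are hypothetical and chosen only for illustration.

```c
#include <assert.h>
#include <stdbool.h>

/* Ticks spent in each state during one throttling period (toy model). */
struct period_usage {
	int rt_ticks;    /* consumed by the RT busy-loop task */
	int nonrt_ticks; /* consumed by non-RT tasks */
	int idle_ticks;  /* CPU idle */
};

/*
 * Simulate one period on one CPU (hypothetical helper, not kernel code).
 * The RT task may run for `runtime` ticks out of `period`. Once that
 * budget is used up, the RT run queue is throttled, unless `greed` is
 * set and no non-RT task is waiting to run.
 */
struct period_usage simulate_period(int period, int runtime,
				    bool greed, bool nonrt_runnable)
{
	struct period_usage u = { 0, 0, 0 };

	assert(period > 0 && runtime >= 0 && runtime <= period);

	for (int tick = 0; tick < period; tick++) {
		bool budget_left = u.rt_ticks < runtime;

		if (budget_left || (greed && !nonrt_runnable))
			u.rt_ticks++;    /* RT task keeps the CPU */
		else if (nonrt_runnable)
			u.nonrt_ticks++; /* throttled: non-RT task runs */
		else
			u.idle_ticks++;  /* throttled and nothing else to run */
	}
	return u;
}
```

With a period of 100 ticks and an RT budget of 95, this model gives 95 RT and 5 idle ticks without greed and with no non-RT task, 100 RT ticks with greed and no non-RT task, and 95 RT and 5 non-RT ticks once a non-RT task is runnable, which matches the percentages listed in the reproducer.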
Patch posted to the internal list:
http://post-office.corp.redhat.com/archives/kernel-rt-team/2017-July/msg00005.html

The patch was merged into version 3.10.0-695.rt56.620.

Hello All,

The 7.5 flag is not required, as kernel-rt is approved directly for zstream.

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA.

For information on the advisory, and where to find the updated files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2018:0676