Bug 2093847

Summary: [RFE] Automate determining initial cpu isolation setup and update tuned-profiles-realtime
Product: Red Hat Enterprise Linux 9
Reporter: Juri Lelli <jlelli>
Component: tuned
Assignee: Jaroslav Škarvada <jskarvad>
Status: CLOSED ERRATA
QA Contact: Robin Hack <rhack>
Severity: unspecified
Docs Contact: Šárka Jana <sjanderk>
Priority: unspecified
Version: 9.1
CC: ailan, bhu, gfialova, jeder, jkacur, jskarvad, jzerdik, kazen, kcarcia, mhou, mstowell, mtosatti, rlandry, rt-qe, shichen, sjanderk, skurup, williams
Target Milestone: rc
Keywords: FutureFeature, Patch, TestCaseNeeded, Triaged
Target Release: ---
Flags: pm-rhel: mirror+
Hardware: Unspecified
OS: Unspecified
Whiteboard:
Fixed In Version: tuned-2.19.0-0.1.rc1.el9
Doc Type: Enhancement
Doc Text:
.TuneD real-time profiles now auto determine initial CPU isolation setup
TuneD is a service for monitoring your system and optimizing the performance profile. You can also isolate central processing units (CPUs) using the `tuned-profiles-realtime` package to give application threads the most execution time possible.

Previously, the real-time profiles for systems running the real-time kernel did not load if you did not specify the list of CPUs to isolate in the `isolated_cores` parameter.

With this enhancement, TuneD introduces the `calc_isolated_cores` built-in function that automatically calculates housekeeping and isolated cores lists, and applies the calculation to the `isolated_cores` parameter. With the automatic preset, one core from each socket is reserved for housekeeping, and you can start using the real-time profile without any additional steps. If you want to change the preset, customize the `isolated_cores` parameter by specifying the list of CPUs to isolate.
Story Points: ---
Clone Of:
Environment:
Last Closed: 2022-11-15 11:17:00 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:

Description Juri Lelli 2022-06-06 08:11:33 UTC
Description of problem:
The realtime tuned profile can't be used right away after install, since it
requires a valid 'isolated_cores' parameter in realtime-variables.conf.

What we (RT team and RT-QE team) do for testing (gating) kernels is to
define it as 'isolated_cores=' (void) so that the additional tuning done by
applying the profile is performed (disabling watchdogs, intel_pstate, etc.).

This has been OK so far for relatively small single-socket systems, but it
turns out that:

1. Larger (dual+ sockets) boxes can experience high latencies w/o isolation
2. Most RT customers, of course, don't run in production w/o some kind of isolation
3. (minor) It might be convenient for users and automation to have the realtime
   profile available to use right after install w/o any additional steps

It looks like a sane default (as possibly a base for further tuning) is to use
1 or 2 CPUs per socket as housekeeping and isolate the rest of the CPUs for the
actual workload. So, e.g., in a system with 2 sockets and 40 cores per socket
(socket#0 [0-39] socket#1 [40-79]) a sane initial default would be to configure
'isolated_cores=2-39,42-79'.

Request for this RFE is to implement logic to automatically inspect a system's
topology (using for example lstopo-no-graphics), calculate housekeeping and
isolated cores lists and apply the calculation to isolated_cores parameter.
With this automated process in place a user can then simply start the profile
and run tests on the system w/o any initial need for hands on tuning.
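
The requested calculation could look roughly like the sketch below. It is an
illustration only, not the logic that ended up in tuned: the function names
(split_cores, to_cpulist) are hypothetical, and the CPU-to-socket mapping is
passed in by hand instead of being detected with lstopo-no-graphics. It also
assumes the lowest-numbered CPUs of each socket are the ones kept for
housekeeping, as in the 2-socket example above.

```python
from collections import defaultdict


def split_cores(cpu_to_socket, housekeeping_per_socket=1):
    """Return (housekeeping, isolated) as sorted lists of CPU numbers.

    Assumption: the lowest-numbered CPUs of each socket do housekeeping.
    """
    by_socket = defaultdict(list)
    for cpu, socket in cpu_to_socket.items():
        by_socket[socket].append(cpu)

    housekeeping, isolated = [], []
    for socket in sorted(by_socket):
        cpus = sorted(by_socket[socket])
        housekeeping.extend(cpus[:housekeeping_per_socket])
        isolated.extend(cpus[housekeeping_per_socket:])
    return sorted(housekeeping), sorted(isolated)


def to_cpulist(cpus):
    """Format sorted CPU numbers as a compact list such as '2-39,42-79'."""
    parts = []
    start = prev = None
    for cpu in cpus:
        if start is None:
            start = prev = cpu
        elif cpu == prev + 1:
            prev = cpu
        else:
            # The current run ended; emit it and start a new one.
            parts.append(str(start) if start == prev else f"{start}-{prev}")
            start = prev = cpu
    if start is not None:
        parts.append(str(start) if start == prev else f"{start}-{prev}")
    return ",".join(parts)


# Layout from the description: socket#0 [0-39], socket#1 [40-79].
topology = {cpu: cpu // 40 for cpu in range(80)}
housekeeping, isolated = split_cores(topology, housekeeping_per_socket=2)
print("isolated_cores=" + to_cpulist(isolated))
```

With 2 housekeeping CPUs per socket this prints isolated_cores=2-39,42-79,
which matches the default suggested above.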

Version-Release number of selected component (if applicable):
Possibly implement this in 9.2, so that the feature can be used for kernel-rt
gating tests and RHEL9-RT certifications.

Comment 2 Luiz Capitulino 2022-06-23 20:33:41 UTC
Hi Juri,

This looks like a great idea!

Do we know if/how this will affect the virtual real-time profiles? Would be great if you or Jarda include the Virt-RT team in the patches so that we can review and test them with the KVM-RT test-cases.

Comment 5 Jaroslav Škarvada 2022-08-08 21:46:08 UTC
Upstream PR:
https://github.com/redhat-performance/tuned/pull/453

Comment 6 Jaroslav Škarvada 2022-08-08 21:52:29 UTC
(In reply to Luiz Capitulino from comment #2)
> Hi Juri,
> 
> This looks like a great idea!
> 
> Do we know if/how this will affect the virtual real-time profiles? Would be
> great if you or Jarda include the Virt-RT team in the patches so that we can
> review and test them with the KVM-RT test-cases.

IMHO the difference is:
- previously, the stock realtime profiles refused to load if the user hadn't specified isolated_cores
- now, with the patch, isolated_cores is preset as described in comment 0

Users can still customize isolated_cores (i.e. change what was preset for them). If they have already customized it, nothing changes for them.

I preset one housekeeping core per socket in the patch. If two are needed, please let me know and I will update the patch.

Comment 26 Jaroslav Škarvada 2022-11-01 13:46:10 UTC
With this enhancement, TuneD introduces the `calc_isolated_cores` built-in function that automatically calculates housekeeping and isolated cores lists, and applies the calculation to the `isolated_cores` parameter. With the automatic preset, one core from each socket is reserved for housekeeping, and you can start using the real-time profile without any additional steps.
If you want to change the preset, customize the `isolated_cores` parameter by specifying the list of CPUs to isolate.

Example (reserving two housekeeping cores per socket):
isolated_cores=${f:calc_isolated_cores:2}

On a machine with 2 sockets, each with 4 cores, it will expand to:
isolated_cores=2, 3, 6, 7

I.e., cores 0, 1 and 4, 5 will be used for housekeeping.
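
To see which socket each CPU belongs to on a given machine, and so which cores
an expansion like the one above would leave isolated, the kernel's sysfs
topology files can be read directly. The sketch below is an assumption-laden
illustration, not how `calc_isolated_cores` is implemented: the helper names
are hypothetical, it relies on
/sys/devices/system/cpu/cpuN/topology/physical_package_id being present, it
treats every logical CPU as a core (no SMT sibling handling), and it assumes
the lowest-numbered CPUs of each socket are the housekeeping ones.

```python
import re
from pathlib import Path


def read_cpu_sockets(sysfs_cpu_dir="/sys/devices/system/cpu"):
    """Map CPU number -> socket (physical package) id as reported by sysfs."""
    sockets = {}
    for cpu_dir in Path(sysfs_cpu_dir).glob("cpu[0-9]*"):
        match = re.fullmatch(r"cpu(\d+)", cpu_dir.name)
        pkg_file = cpu_dir / "topology" / "physical_package_id"
        # Skip entries that are not CPUs or that lack topology information.
        if match and pkg_file.is_file():
            sockets[int(match.group(1))] = int(pkg_file.read_text().strip())
    return sockets


def isolated_cores(sockets, housekeeping_per_socket):
    """Drop the first N CPUs of every socket and return the remaining ones."""
    per_socket = {}
    for cpu in sorted(sockets):
        per_socket.setdefault(sockets[cpu], []).append(cpu)
    isolated = []
    for cpus in per_socket.values():
        isolated.extend(cpus[housekeeping_per_socket:])
    return sorted(isolated)


if __name__ == "__main__":
    # Result depends on the machine this runs on.
    print(isolated_cores(read_cpu_sockets(), 2))
```

On a box laid out like the example (2 sockets, CPUs 0-3 and 4-7), calling
isolated_cores(read_cpu_sockets(), 2) gives [2, 3, 6, 7].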

Comment 30 errata-xmlrpc 2022-11-15 11:17:00 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (tuned bug fix and enhancement update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2022:8321