Bug 1403309

Summary: [RHEL7][RFE] tuned should implement a whitelist of processes that should not be re-pinned on isolated_cores when starting
Product: Red Hat Enterprise Linux 7 Reporter: Franck Baudin <fbaudin>
Component: tuned Assignee: Jaroslav Škarvada <jskarvad>
Status: CLOSED ERRATA QA Contact: Tereza Cerna <tcerna>
Severity: urgent Docs Contact:
Priority: high    
Version: 7.4 CC: atelang, atheurer, atragler, fbaudin, fiezzi, jeder, jskarvad, lcapitulino, salmy, tcerna, thozza
Target Milestone: rc Keywords: FutureFeature, Patch, Upstream
Target Release: ---   
Hardware: All   
OS: Linux   
Whiteboard:
Fixed In Version: tuned-2.8.0-1.el7 Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of:
: 1442229 1442230 (view as bug list) Environment:
Last Closed: 2017-08-01 12:32:51 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Bug Depends On:    
Bug Blocks: 1393869, 1394537    

Description Franck Baudin 2016-12-09 15:29:55 UTC
Description of problem:
tuned moves all existing threads off the isolated_cores when it starts. The problem is that in the case of OVS-DPDK (https://bugzilla.redhat.com/show_bug.cgi?id=1394537) the PMD threads should not be re-pinned, as the goal of the isolation is to isolate these threads.



Version-Release number of selected component (if applicable): 
RHEL 7.3

How reproducible:
See https://bugzilla.redhat.com/show_bug.cgi?id=1394537

Expected results:

Add a new configuration option holding a whitelist of regexes: threads whose name matches any of the regexes should not be re-pinned. Example:

isolated_cores=4,6,8,10,12,14,20,22,24,26,28,30
ignore_processes=*pmd*,*PMD*,^DPDK

Comment 6 Jaroslav Škarvada 2017-03-27 13:21:14 UTC
Is this also a request to add "*pmd*,*PMD*,^DPDK" to the cpu-partitioning Tuned profile whitelist? Or to a different profile? Or no default whitelist, just a request for the whitelisting feature?

Comment 7 Jaroslav Škarvada 2017-03-27 13:35:08 UTC
Currently Tuned uses Tuna for core isolation, and unfortunately Tuna doesn't have whitelisting functionality. For 7.4 I am trying to work around it with multiple Tuna calls and some scripting-fu. For 7.5 we should probably introduce a Tuned plugin for core isolation.

Comment 8 Jaroslav Škarvada 2017-03-27 13:48:55 UTC
Is the following workaround OK?:

- snapshot CPU affinity of processes
- isolate cores by:
# tuna -c isolated_cores -i  # now all processes are moved outside of the isolated cores
- revert the affinity for whitelisted processes

Is it OK? Or is it unacceptable to touch the whitelisted processes (e.g. due to performance related reasons)?

If the presented workaround is OK, I will be able to script it with Tuna. If not, I will probably have to write the Tuned plugin.
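
The three steps above could be scripted roughly as follows. This is only a sketch under stated assumptions: the function names and the pid-to-name mapping are made up for illustration, the affinity calls are Python's os.sched_getaffinity / os.sched_setaffinity (Linux only), and the isolation step itself (the tuna call) is left out.

```python
import os
import re

def snapshot_affinity(pids, get_affinity=None):
    """Step 1: record the current CPU affinity of each pid."""
    get_affinity = get_affinity or os.sched_getaffinity
    snapshot = {}
    for pid in pids:
        try:
            snapshot[pid] = set(get_affinity(pid))
        except ProcessLookupError:
            pass  # the process exited in the meantime
    return snapshot

# Step 2 (not shown): isolate the cores, e.g. by calling tuna.

def restore_whitelisted(snapshot, names, patterns, set_affinity=None):
    """Step 3: put back the saved affinity of processes whose name matches.

    names is a hypothetical {pid: name} mapping; patterns is a list of regexes.
    Returns the pids whose affinity was restored.
    """
    set_affinity = set_affinity or os.sched_setaffinity
    restored = []
    for pid, cpus in snapshot.items():
        name = names.get(pid, "")
        if not any(re.search(p, name) for p in patterns):
            continue
        try:
            set_affinity(pid, cpus)
            restored.append(pid)
        except OSError:
            pass  # process gone or not permitted; skip it
    return restored
```

Note that between step 2 and step 3 the whitelisted processes do run with a changed affinity for a short time, which is exactly the concern raised in the question above.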

Comment 9 Federico Iezzi 2017-03-27 17:38:43 UTC
In case you go for the regex approach, please also add the qemu-kvm process since it will impact customer workloads.

Comment 10 Jaroslav Škarvada 2017-03-27 17:45:10 UTC
(In reply to fiezzi from comment #9)
> In case you go for the regex approach, please also add the qemu-kvm process
> since it will impact customer workloads.

What's the right regex for it? '.*qemu-kvm.*'? just guessing :)
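
One way to sanity-check a candidate pattern is to try it against a few names. The sketch below assumes Python re search semantics and uses made-up process names; it is not taken from Tuned itself.

```python
import re

def matches_any(name, patterns):
    """Return True if any regex in patterns is found in name."""
    return any(re.search(p, name) for p in patterns)

patterns = [".*pmd.*", ".*PMD.*", "^DPDK", ".*qemu-kvm.*"]

# Hypothetical names, chosen only to exercise the patterns.
for name in ["pmd42", "DPDK-worker", "qemu-kvm", "sshd"]:
    print(name, matches_any(name, patterns))

# Glob-style patterns such as "*pmd*" are not valid regexes.
try:
    re.compile("*pmd*")
except re.error as exc:
    print("invalid regex:", exc)
```

With the leading and trailing '.*', '.*qemu-kvm.*' behaves like a plain substring test for 'qemu-kvm', while '^DPDK' only matches names that start with DPDK.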

Comment 11 Jaroslav Škarvada 2017-03-27 17:47:49 UTC
(In reply to Jaroslav Škarvada from comment #8)
> Is the following workaround OK?:
> 
> - snapshot CPU affinity of processes
> - isolate cores by:
> # tuna -C isolated_cores -i  # now all processes moved outside of isolated
> cores
> - revert the affinity for whitelisted processes
> 
> Is it OK? Or is it unacceptable to touch the whitelisted processes (e.g. due
> to performance related reasons)?
> 
> If the presented workaround is OK, I will be able to script it with Tuna. If
> not, I will probably have to write the Tuned plugin.

TL;DR: the workaround consists of moving all processes out of the isolated cores and then putting the whitelisted processes back (i.e. restoring their previous affinity).

Comment 12 Federico Iezzi 2017-03-28 07:18:55 UTC
(In reply to Jaroslav Škarvada from comment #10)
> (In reply to fiezzi from comment #9)
> > In case you go for the regex approach, please also add the qemu-kvm process
> > since it will impact customer workloads.
> 
> What's the right regex for it? '.*qemu-kvm.*'? just guessing :)

Correct.

About the workaround: someone should check what would happen if the PMD threads are moved to a different NUMA node. E.g. NUMA1 has the NIC and the PMD threads as well, and because of tuned the PMDs get relocated to NUMA0. I believe OVS-DPDK, at least, will complain in the logs.
Someone from the NFV team should check this behavior.

Comment 13 Jaroslav Škarvada 2017-04-07 16:02:30 UTC
Instead of hacking, I implemented the feature in the scheduler plugin:

The scheduler plugin can now do core isolation on its own; Tuna is not needed for it, e.g.:

[scheduler]
isolated_cores=2-4
ps_blacklist=.*pmd.*;.*PMD.*;^DPDK;.*qemu-kvm.*

This isolates cores 2-4 and ignores processes that match the ps_blacklist regexes. Multiple regexes can be separated by ';'. An escaped semicolon, i.e. '\;', is taken literally.
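
The separator rule can be illustrated with a short sketch. This is an assumption about how such a value could be split, not a copy of the plugin code, and the function name split_patterns is made up.

```python
import re

def split_patterns(value):
    # Split on ';' unless it is preceded by a backslash,
    # then turn each escaped semicolon back into a literal ';'.
    parts = re.split(r"(?<!\\);", value)
    return [p.replace(r"\;", ";") for p in parts if p]

print(split_patterns(r".*pmd.*;.*PMD.*;^DPDK;.*qemu-kvm.*"))
print(split_patterns(r"a\;b;c"))
```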

It also supports 'ps_whitelist', which is set to '.*' by default.

It takes all processes that match ps_whitelist, removes those that match ps_blacklist, and moves the remaining processes off the isolated_cores. When the profile is unloaded, it allows all matching processes to run on all cores.
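
The selection logic described above can be sketched like this. It is a simplified illustration, not the plugin code: the function names, the {pid: (name, affinity)} input shape, the single-regex blacklist, and the choice to skip processes that would be left with no CPU at all are all assumptions.

```python
import re

def parse_cpulist(text):
    """Expand a CPU list such as '2-4' or '4,6,8' into a set of ints."""
    cpus = set()
    for part in text.split(","):
        part = part.strip()
        if "-" in part:
            lo, hi = part.split("-")
            cpus.update(range(int(lo), int(hi) + 1))
        elif part:
            cpus.add(int(part))
    return cpus

def plan_affinities(procs, isolated, whitelist=".*", blacklist=None):
    """Return {pid: new_affinity} for processes that should be moved.

    procs maps pid -> (name, current_affinity_set).
    """
    plan = {}
    for pid, (name, affinity) in procs.items():
        if not re.search(whitelist, name):
            continue  # not selected at all
        if blacklist and re.search(blacklist, name):
            continue  # explicitly left alone
        new = affinity - isolated
        # Assumption: leave a process untouched if nothing would remain.
        if new and new != affinity:
            plan[pid] = new
    return plan
```

For example, with isolated cores "2-4", a process named bash allowed on CPUs 0-5 would be planned onto CPUs 0, 1 and 5, while a process matching the blacklist keeps its affinity.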

It changes process affinities, thread affinities, and IRQ affinities,
and it sets default_smp_affinity for IRQs.
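
For the IRQ part, /proc/irq/default_smp_affinity holds a hexadecimal CPU bitmask. A minimal sketch of computing a mask that excludes the isolated cores follows; the function names are made up, and the sketch does not produce the comma-separated 32-bit groups used on systems with more than 32 CPUs.

```python
def cpus_to_hex_mask(cpus):
    """Build a hex bitmask with one bit set per CPU number."""
    mask = 0
    for cpu in cpus:
        mask |= 1 << cpu
    return format(mask, "x")

def housekeeping_mask(online_cpus, isolated):
    """Mask of all online CPUs except the isolated ones."""
    return cpus_to_hex_mask(set(online_cpus) - set(isolated))

# 8 CPUs with cores 2-4 isolated leaves CPUs 0, 1, 5, 6, 7.
print(housekeeping_mask(range(8), {2, 3, 4}))
```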

Also, the cpu-partitioning profile has been switched from Tuna to this mechanism.

Upstream commit adding this feature:
https://github.com/redhat-performance/tuned/commit/ac78f90c773cc97573844f521c2f67291f15d354

Available for testing as tuned-2.7.1-1.20170407gitac78f90c.el7 from:
https://jskarvad.fedorapeople.org/tuned/devel/repo/

Comment 16 Jaroslav Škarvada 2017-04-12 08:11:04 UTC
I received a bug report that may or may not be related to this new feature (I wasn't able to reproduce the problem myself). The truth is that this feature is very new and it would be good to give it some time to stabilize.

So the question is, could we do the following for 7.4?

- keep the code / support in RHEL-7.4 Tuned, so everybody who is interested can test it
- for safety, do not use it in the cpu-partitioning profile by default yet; switch back to Tuna for core isolation
- file a new bug for RHEL-7.5 requesting the switch from Tuna to this new code in the cpu-partitioning profile

Comment 17 Jaroslav Škarvada 2017-04-12 18:50:34 UTC
It turned out to be python-linux-procfs; a workaround for it is already in Tuned upstream git.

Comment 18 Luiz Capitulino 2017-04-12 19:13:53 UTC
That plan looks great to me.

Comment 19 Jaroslav Škarvada 2017-04-13 19:58:43 UTC
Upstream commit making 'tuna' the default for 7.4 in cpu-partitioning; instructions on how to enable Tuned for core isolation with process blacklisting support (it was previously wrongly called whitelisting) are provided in the tuned.conf:
https://github.com/redhat-performance/tuned/commit/8cde3e1b3c9103f0b3e46175aff671f85cd8dcc1

Cloning the BZ for 7.5 to make Tuned the default for core isolation in the cpu-partitioning profile.

Comment 20 Jaroslav Škarvada 2017-04-13 20:03:38 UTC
(In reply to Jaroslav Škarvada from comment #19)
> Upstream commit making 'tuna' the default for 7.4 in cpu-partitioning,
> instructions how to enable Tuned for cores isolation with process
> blacklisting support (it was previously wrongly called whitelisting) are
> provided in the tuned.conf:
> https://github.com/redhat-performance/tuned/commit/
> 8cde3e1b3c9103f0b3e46175aff671f85cd8dcc1
> 
> Cloning the BZ for 7.5 to make Tuned the default for cores isolation in
> cpu-partitioning profile.

Bug 1442230.

Comment 21 Franck Baudin 2017-04-18 07:53:11 UTC
(In reply to Jaroslav Škarvada from comment #6)
> Is it also request for adding "*pmd*,*PMD*,^DPDK" into cpu-partitioning
> Tuned profile whitelist?
Yes

> Or to different profile? Or no default whitelist,
> just request for the whitelisting feature?
Ideally this feature could be used in any profile using CPU pinning, as it makes it possible not to re-pin the processes matching the regex.

Comment 22 Jaroslav Škarvada 2017-04-18 08:02:09 UTC
(In reply to Franck Baudin from comment #21)
> (In reply to Jaroslav Škarvada from comment #6)
> > Is it also request for adding "*pmd*,*PMD*,^DPDK" into cpu-partitioning
> > Tuned profile whitelist?
> Yes
>
Tracking in bug 1442230.

> > Or to different profile? Or no default whitelist,
> > just request for the whitelisting feature?
> Ideally this feature could be used in any profile using CPU pinning, as this
> feature permit not to re-pin the processes matching the regex.

Understood. The core functionality is already in tuned-2.8.0, but it's disabled by default in Tuned profiles. We are going to enable it by default in RHEL-7.5.

Comment 24 errata-xmlrpc 2017-08-01 12:32:51 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2017:2102