Bug 2013321 - TuneD: high CPU utilization of the TuneD daemon.
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: Node Tuning Operator
Version: 4.10
Hardware: Unspecified
OS: Unspecified
Priority: high
Severity: high
Target Milestone: ---
Target Release: 4.10.0
Assignee: Jiří Mencák
QA Contact: Simon
URL:
Whiteboard:
Depends On:
Blocks: 2013653
Reported: 2021-10-12 15:12 UTC by Jiří Mencák
Modified: 2022-03-10 16:19 UTC (History)

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2022-03-10 16:18:42 UTC
Target Upstream Version:
Embargoed:




Links
System | ID | Private | Priority | Status | Summary | Last Updated
Github | openshift cluster-node-tuning-operator pull 278 | 0 | None | open | Bug 2013321: TuneD: workaround for high CPU utilization of [scheduler] plug-in. | 2021-10-12 15:13:29 UTC
Red Hat Product Errata | RHSA-2022:0056 | 0 | None | None | None | 2022-03-10 16:19:09 UTC

Description Jiří Mencák 2021-10-12 15:12:26 UTC
Description of problem:
The fix for rhbz#1979352 introduced the [scheduler] plug-in as a standard part of the OpenShift TuneD profiles. Unfortunately, the [scheduler] plug-in can be very CPU intensive, especially on the OpenShift platform; that issue is tracked by rhbz#1921738. As a result, the TuneD process can use around 1% of one core.

Version-Release number of selected component (if applicable):
4.8->4.10

How reproducible:
Always.

Steps to Reproduce:
1. Install OCP
2. Watch the tuned process CPU utilization, either via top -p <pid> or by sampling the CPU time counters in /proc/<pid>/stat
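
One way to take the measurement in step 2 without top is to sample the utime and stime counters that the kernel exposes in /proc/<pid>/stat (fields 14 and 15) twice and divide the difference by the elapsed time. The following is a minimal sketch, not part of the original report: the function names and the 10-second default interval are assumptions chosen for illustration, and the PID has to be looked up separately on the node.

```python
import os
import sys
import time


def cpu_ticks(stat_line: str) -> int:
    """Return utime + stime (in clock ticks) from a /proc/<pid>/stat line."""
    # Field 2 (comm) may contain spaces or parentheses, so split after the last ')'.
    rest = stat_line.rsplit(")", 1)[1].split()
    # rest[0] is field 3 (state), so fields 14 and 15 are rest[11] and rest[12].
    return int(rest[11]) + int(rest[12])


def cpu_percent(ticks_before: int, ticks_after: int, interval_s: float, hz: int) -> float:
    """CPU usage over the interval, as a percentage of one core."""
    return 100.0 * (ticks_after - ticks_before) / hz / interval_s


def sample(pid: int, interval_s: float = 10.0) -> float:
    """Read /proc/<pid>/stat twice, interval_s apart (hypothetical helper)."""
    hz = os.sysconf("SC_CLK_TCK")  # clock ticks per second
    path = f"/proc/{pid}/stat"
    with open(path) as f:
        before = cpu_ticks(f.read())
    time.sleep(interval_s)
    with open(path) as f:
        after = cpu_ticks(f.read())
    return cpu_percent(before, after, interval_s, hz)


if __name__ == "__main__" and len(sys.argv) == 2:
    print(f"{sample(int(sys.argv[1])):.2f}% of one core")
```

Run on the node as, for example, python3 <script> <pid>. A longer interval smooths out short bursts, which matters when the expected value is close to zero.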

Actual results:
~1% of CPU

Expected results:
~0% of CPU

Additional info:
https://github.com/openshift/cluster-node-tuning-operator/pull/278

Comment 2 Simon 2021-10-13 19:59:59 UTC
$ oc get clusterversion
NAME      VERSION      AVAILABLE   PROGRESSING   SINCE   STATUS
version   4.9.0-rc.8   True        False         3h37m   Cluster version is 4.9.0-rc.8

$ oc get nodes
NAME                                                       STATUS   ROLES    AGE     VERSION
skordas1013-qjh6c-master-0.c.openshift-qe.internal         Ready    master   3h56m   v1.22.0-rc.0+894a78b
skordas1013-qjh6c-master-1.c.openshift-qe.internal         Ready    master   3h56m   v1.22.0-rc.0+894a78b
skordas1013-qjh6c-master-2.c.openshift-qe.internal         Ready    master   3h56m   v1.22.0-rc.0+894a78b
skordas1013-qjh6c-worker-a-h8cgr.c.openshift-qe.internal   Ready    worker   3h44m   v1.22.0-rc.0+894a78b
skordas1013-qjh6c-worker-b-flv8z.c.openshift-qe.internal   Ready    worker   3h44m   v1.22.0-rc.0+894a78b
skordas1013-qjh6c-worker-c-67tvp.c.openshift-qe.internal   Ready    worker   3h45m   v1.22.0-rc.0+894a78b

# Debug master node
sh-4.4# top -p 20213
  PID   USER      PR  NI    VIRT    RES    SHR S  %CPU  %MEM     TIME+ COMMAND
20213   root      20   0  408308  34824  17028 S   1.0   0.2   2:28.88 tuned

# Debug worker node
sh-4.4# top -p 4744
 PID USER      PR  NI    VIRT    RES    SHR S  %CPU  %MEM     TIME+ COMMAND
4744 root      20   0  407796  33320  16840 S   0.3   0.2   1:16.83 tuned

# ^^ %CPU > 0

$ oc get clusterversions.config.openshift.io 
NAME      VERSION                              AVAILABLE   PROGRESSING   SINCE   STATUS
version   4.10.0-0.nightly-2021-10-13-152930   True        False         34m     Cluster version is 4.10.0-0.nightly-2021-10-13-152930

$ oc get nodes
NAME                                                        STATUS   ROLES    AGE   VERSION
skordas1013a-cbhs8-master-0.c.openshift-qe.internal         Ready    master   57m   v1.22.1+9312243
skordas1013a-cbhs8-master-1.c.openshift-qe.internal         Ready    master   57m   v1.22.1+9312243
skordas1013a-cbhs8-master-2.c.openshift-qe.internal         Ready    master   58m   v1.22.1+9312243
skordas1013a-cbhs8-worker-a-m9jkc.c.openshift-qe.internal   Ready    worker   43m   v1.22.1+9312243
skordas1013a-cbhs8-worker-b-gxztz.c.openshift-qe.internal   Ready    worker   43m   v1.22.1+9312243
skordas1013a-cbhs8-worker-c-jb8gz.c.openshift-qe.internal   Ready    worker   43m   v1.22.1+9312243

# Debug master node
sh-4.4# top -p 14146
  PID USER      PR  NI    VIRT    RES    SHR S  %CPU  %MEM     TIME+ COMMAND
14146 root      20   0 1496068  46124  26120 S   0.0   0.3   0:00.44 openshift-tuned

# Debug worker node
sh-4.4# top -p 2456
 PID USER      PR  NI    VIRT    RES    SHR S  %CPU  %MEM     TIME+ COMMAND
2456 root      20   0 1496068  44880  25616 S   0.0   0.3   0:00.44 openshift-tuned

# %CPU = 0

Comment 5 errata-xmlrpc 2022-03-10 16:18:42 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Moderate: OpenShift Container Platform 4.10.3 security update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2022:0056

