Bug 1910006 - Accounting of steal time as CPU usage
Summary: Accounting of steal time as CPU usage
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: Monitoring
Version: 4.7
Hardware: s390x
OS: Unspecified
Target Milestone: ---
Target Release: 4.7.0
Assignee: Jayapriya Pai
QA Contact: hongyan li
Depends On: 1878766
Blocks: ocp-47-z-tracker
Reported: 2020-12-22 10:03 UTC by wvoesch
Modified: 2021-07-26 17:35 UTC

Fixed In Version:
Doc Type: No Doc Update
Doc Text:
Clone Of:
Last Closed: 2021-07-26 17:35:21 UTC
Target Upstream Version:
janantha: needinfo-

System | ID | Private | Priority | Status | Summary | Last Updated
Github | openshift cluster-monitoring-operator pull 993 | 0 | None | closed | Add Thanos sidecar metrics + alerts | 2021-05-25 13:08:25 UTC
Github | prometheus-operator/kube-prometheus/commit/87ddb30a41253dce66bde0006634f30817ccb07a | 0 | None | None | None | 2021-05-25 13:08:25 UTC
Red Hat Product Errata | RHBA-2021:2762 | 0 | None | None | None | 2021-07-26 17:35:37 UTC

Description wvoesch 2020-12-22 10:03:40 UTC
Description of problem:

I have observed that when the steal time increases, the available CPU shown by Prometheus decreases, which is expected.
However, the CPU consumption shown by Prometheus also increases, although no additional CPU load has been scheduled on that node.
It seems as if the CPU usage is calculated like this: CPU usage = CPU count - available CPU
To reflect the correct CPU usage, I think it should either be: CPU usage = CPU count - available CPU - steal time
or be calculated as: CPU usage = sum of the CPU consumption of all processes
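
To make the difference between the two formulas concrete, here is a small sketch with made-up numbers. The node size (4 CPUs) and the available and steal figures are assumptions chosen only for illustration, not measurements from the affected cluster.

```python
def usage_as_observed(cpu_count, available_cpu):
    """CPU usage as it appears to be computed: steal time ends up counted as usage."""
    return cpu_count - available_cpu


def usage_as_proposed(cpu_count, available_cpu, steal):
    """CPU usage with steal time subtracted, as suggested in this report."""
    return cpu_count - available_cpu - steal


# Hypothetical node: 4 CPUs, 2.5 CPUs shown as available, 1.0 CPU lost to steal.
cpu_count = 4.0
available_cpu = 2.5
steal = 1.0

print(usage_as_observed(cpu_count, available_cpu))         # 1.5
print(usage_as_proposed(cpu_count, available_cpu, steal))  # 0.5
```

With these numbers the node would appear to use 1.5 CPUs, although the workload scheduled on it accounts for only 0.5 CPUs.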

Version-Release number of selected component (if applicable):

Steps to reproduce: 
1. Monitor the available CPU resources and the CPU usage of a particular node, say node A, in Prometheus. 
2. Increase the steal time on that particular node. 
   Possible ways to achieve this: 
   a. Configure CPU overcommitment for node A and another node B. Schedule a CPU-intensive workload (stress-ng) on node B. Due to the CPU overcommitment, node A will experience steal time. 
   b. On node A, start an I/O-intensive process, for example copying huge files. This will result in steal time because “z/VM was executing on behalf of the Linux virtual processor” [1].
3. Observe that the steal time will be counted as CPU usage of node A. 
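
To confirm on the node itself that step 2 really produced steal time, one option is to compare two snapshots of the aggregate `cpu` line in `/proc/stat`. The following is a minimal sketch, not part of the original reproduction steps; the helper names and the sample counter values are hypothetical, and the field order follows the Linux `/proc/stat` layout.

```python
# Field order of the aggregate "cpu" line in /proc/stat.
FIELDS = ("user", "nice", "system", "idle", "iowait",
          "irq", "softirq", "steal", "guest", "guest_nice")


def parse_cpu_line(line):
    """Parse the aggregate 'cpu' line of /proc/stat into a dict of tick counters."""
    parts = line.split()
    if not parts or parts[0] != "cpu":
        raise ValueError("expected the aggregate 'cpu' line")
    return dict(zip(FIELDS, (int(value) for value in parts[1:])))


def steal_share(before, after):
    """Fraction of elapsed CPU time spent in steal between two snapshots."""
    # guest and guest_nice are already included in user and nice,
    # so only the first eight fields are summed for the total.
    keys = FIELDS[:8]
    total = sum(after[key] - before[key] for key in keys)
    if total == 0:
        return 0.0
    return (after["steal"] - before["steal"]) / total


def read_cpu_line(path="/proc/stat"):
    """Read the first line of /proc/stat (Linux only); not called in this example."""
    with open(path) as stat_file:
        return stat_file.readline()


# Hypothetical snapshots taken some seconds apart.
before = parse_cpu_line("cpu  100 0 50 800 10 0 0 40 0 0")
after = parse_cpu_line("cpu  150 0 70 1000 20 0 0 160 0 0")
print(steal_share(before, after))  # 0.3
```

On a real node, `read_cpu_line()` would be called twice with a pause in between, instead of using the sample strings.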

Additional Information: 
This bug is a follow-up to BZ 1878766; see comments 29 and 31.

[1] https://www.vm.ibm.com/perf/tips/prgcom.html see solution to problem: “I see a non-trivial number in my Linux Reports for %Steal. Is this a problem?”

Comment 1 Pawel Krupa 2021-01-06 08:34:33 UTC
> Observe that the steal time will be counted as CPU usage of node A. 

Where do you observe this?

After including this change in kube-prometheus [1] and propagating it to cluster-monitoring-operator [2], we no longer treat steal time as part of CPU usage (a result of [3]). Also, if you are using the `instance:node_cpu:rate:sum` recording rule, CPU usage is counted as `node_cpu_seconds_total{mode!="idle",mode!="iowait",mode!="steal"}`, averaged over the last 3 minutes.

[1]: https://github.com/prometheus-operator/kube-prometheus/commit/87ddb30a41253dce66bde0006634f30817ccb07a
[2]: https://github.com/openshift/cluster-monitoring-operator/pull/993
[3]: https://bugzilla.redhat.com/show_bug.cgi?id=1878766

Comment 3 wvoesch 2021-01-07 09:25:37 UTC
Hi Pawel, 

I observe this in the WebUI overview for a particular node: https://console-openshift-console.apps.<cluster-name>.<domain>/k8s/cluster/nodes/<worker>.<cluster-name>.<domain>
The version on which I observed this was: 4.7.0-0.nightly-s390x-2020-12-15-081322

Comment 7 wvoesch 2021-05-25 11:55:51 UTC
Hi Jayapriya, 

could you please specify which information you need?

Comment 9 hongyan li 2021-05-31 10:32:38 UTC
I have no s390x machine, so I can't test this there. I tried to test on AWS: I deployed an app that needs 7 CPUs on a node with 4 CPUs. All the pods are running and use up all 4 CPUs, but I didn't see any CPU steal time on the other nodes.
Waiting for wvoesch to verify.

Comment 10 Junqi Zhao 2021-06-04 01:27:21 UTC
Can you help check this on s390x machines? We don't have the platform, and the issue does not happen on AWS/GCP.

Comment 11 Dan Li 2021-06-07 18:45:44 UTC
Making Junqi's request un-private, as Wolfgang is a Partner Engineer and cannot see private comment(s).

Comment 12 Junqi Zhao 2021-06-11 03:47:52 UTC
Tested with 4.8.0-0.nightly-2021-06-10-071057; steal time is removed from CPU usage:
    - expr: sum(rate(node_cpu_seconds_total{mode!="idle",mode!="iowait",mode!="steal"}[3m]))
        BY (instance)
      record: instance:node_cpu:rate:sum
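
As a rough illustration of what this rule computes, the sketch below sums per-second counter increases per instance while skipping the `idle`, `iowait` and `steal` modes. It is a simplification and not how Prometheus evaluates `rate()`: it takes only the first and last sample of the window and ignores counter resets and extrapolation. The function name, the sample layout and all values are hypothetical.

```python
EXCLUDED_MODES = {"idle", "iowait", "steal"}


def cpu_usage_by_instance(first, last, window_seconds):
    """Approximate sum(rate(node_cpu_seconds_total{...}[window])) BY (instance).

    first and last map (instance, cpu, mode) to the counter value in seconds
    at the start and at the end of the window.
    """
    usage = {}
    for (instance, cpu, mode), end_value in last.items():
        if mode in EXCLUDED_MODES:
            continue  # same effect as mode!="idle",mode!="iowait",mode!="steal"
        start_value = first.get((instance, cpu, mode))
        if start_value is None:
            continue  # series has no sample at the start of the window
        rate = (end_value - start_value) / window_seconds
        usage[instance] = usage.get(instance, 0.0) + rate
    return usage


# Hypothetical samples for one CPU of one node over a 3 minute window.
first = {
    ("node-a", "0", "user"): 0.0,
    ("node-a", "0", "system"): 0.0,
    ("node-a", "0", "steal"): 0.0,
    ("node-a", "0", "idle"): 0.0,
}
last = {
    ("node-a", "0", "user"): 90.0,
    ("node-a", "0", "system"): 18.0,
    ("node-a", "0", "steal"): 72.0,
    ("node-a", "0", "idle"): 0.0,
}
print(cpu_usage_by_instance(first, last, 180))
```

In this example the 72 seconds of steal time do not contribute to the result for node-a; only the user and system seconds are counted.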

Comment 15 errata-xmlrpc 2021-07-26 17:35:21 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (OpenShift Container Platform 4.7.21 bug fix update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

