Bug 2035046 - SNO: Recover Platform CPU by Reducing the Kubelet Service Monitor Scrape Interval
Summary: SNO: Recover Platform CPU by Reducing the Kubelet Service Monitor Scrape Inte...
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: Telco Edge
Version: 4.10
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: high
Target Milestone: ---
Target Release: ---
Assignee: Nahian
QA Contact: yliu1
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2021-12-22 19:13 UTC by Ken Young
Modified: 2022-12-13 20:29 UTC (History)

Fixed In Version: OCP 4.11
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2022-12-13 20:29:02 UTC
Target Upstream Version:
Embargoed:


Description Ken Young 2021-12-22 19:13:22 UTC
Description of problem:

An SNO deployed at the Telco Far Edge allocates limited CPU to the platform, reserving as many of the CPU cores as possible for the revenue-generating workload.  Every opportunity to reduce platform overhead creates more room for that workload.

A low-hanging opportunity to recover a significant amount of platform CPU is to reduce the Kubelet Service Monitor Scrape Interval.  The interval is currently hard-coded; the goal is to make it configurable while leaving the default behaviour unchanged.  The change would add a mechanism to configure the interval through annotations, leveraging a new feature of Prometheus.
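
As a rough illustration of why the scrape setting matters, the sketch below estimates how many kubelet scrapes Prometheus performs per hour for a given interval. It is only back-of-the-envelope arithmetic: the function names are hypothetical, the 30-second and 90-second values are assumed examples rather than figures from this bug, and scrape count is only a proxy for CPU cost, not a measurement of it.

```python
def scrapes_per_hour(interval_seconds: float, endpoints: int = 1) -> float:
    """Estimate scrapes per hour for `endpoints` targets scraped every `interval_seconds`."""
    if interval_seconds <= 0:
        raise ValueError("interval_seconds must be positive")
    return endpoints * 3600 / interval_seconds


def relative_scrape_load(old_interval: float, new_interval: float) -> float:
    """Ratio of new scrape count to old scrape count when the interval changes."""
    if old_interval <= 0 or new_interval <= 0:
        raise ValueError("intervals must be positive")
    return old_interval / new_interval


if __name__ == "__main__":
    # Assumed example values, not taken from the bug report.
    for interval in (30, 60, 90):
        print(f"interval={interval}s -> {scrapes_per_hour(interval):.0f} scrapes/hour per endpoint")
    print(f"30s -> 90s keeps {relative_scrape_load(30, 90):.0%} of the original scrapes")
```

Under these assumptions, scraping less often lowers the number of scrapes proportionally; the actual CPU saving on an SNO would still have to be measured as described in the steps below.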

Version-Release number of selected component (if applicable):

4.10

How reproducible:

100%

Steps to Reproduce:
1.  Bring up an SNO and monitor it with Prometheus
2.  Measure CPU usage

Actual results:

The current CPU level

Expected results:

A non-trivial reduction in measured CPU usage

Additional info:

