Bug 1775826

Summary: kubelet needs --reserved-cpus to have fine control over isolation of system resources vs static-cpu pods
Product: OpenShift Container Platform Reporter: Andrew Theurer <atheurer>
Component: Node Assignee: Ryan Phillips <rphillips>
Status: CLOSED ERRATA QA Contact: Weinan Liu <weinliu>
Severity: unspecified Docs Contact:
Priority: unspecified    
Version: 4.3.0 CC: aos-bugs, augol, dblack, dshaks, dshchedr, eparis, fsimonce, jokerman, mfojtik, mpatel, msluiter, mtosatti, schoudha, vromanso, yquinn
Target Milestone: ---   
Target Release: 4.3.0   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version: Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of:
: 1779857 (view as bug list) Environment:
Last Closed: 2020-01-23 11:13:48 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Bug Depends On:    
Bug Blocks: 1771572, 1779348, 1779857, 1782893    

Description Andrew Theurer 2019-11-22 21:54:33 UTC
Description of problem:

Currently, the kubeReserved and systemReserved values affect how kubelet sets aside CPUs that it will not allocate to static-cpu-manager pods.  The user has no direct control over these CPUs, and which CPUs are chosen may not be predictable.  Documentation suggests that these CPUs start with 0 and are allocated incrementally, but we have observed that they follow a pattern of 0, 2, etc.

Different systems enumerate CPUs differently, and the assumptions kubelet makes when reserving these CPUs may not suit them.  We would like to ensure that the feature recently merged upstream, --reserved-cpus, is included in OCP as soon as possible.  It gives the administrator the ability to specify exactly which CPUs are not used for static-cpu pods.  That choice can then be closely coordinated with system tuning options, so that those (and only those) CPUs are used for system tasks.
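
To illustrate the intent of the request, here is a minimal Python sketch of how an explicit reserved list partitions a node's CPUs.  It is not kubelet code.  It assumes the value passed to --reserved-cpus is a Linux-style CPU list (for example "0,4" or "0-3,8"), and the function names parse_cpu_list and split_cpus are hypothetical helpers invented for this example.

```python
from typing import List, Set, Tuple


def parse_cpu_list(spec: str) -> Set[int]:
    """Parse a Linux-style CPU list such as "0-3,8" into a set of CPU ids."""
    cpus: Set[int] = set()
    for part in spec.split(","):
        part = part.strip()
        if not part:
            continue  # tolerate empty fields such as a trailing comma
        if "-" in part:
            low_text, high_text = part.split("-", 1)
            low, high = int(low_text), int(high_text)
            if low > high:
                raise ValueError(f"invalid CPU range: {part!r}")
            cpus.update(range(low, high + 1))
        else:
            cpus.add(int(part))
    return cpus


def split_cpus(online: str, reserved: str) -> Tuple[List[int], List[int]]:
    """Return (reserved CPUs, CPUs left over for static-cpu pods).

    Hypothetical helper: it only models the partition described in this
    bug, not how kubelet actually builds its CPU pools.
    """
    online_set = parse_cpu_list(online)
    reserved_set = parse_cpu_list(reserved)
    missing = reserved_set - online_set
    if missing:
        raise ValueError(f"reserved CPUs not present on node: {sorted(missing)}")
    return sorted(reserved_set), sorted(online_set - reserved_set)


if __name__ == "__main__":
    # Assumed example node: 8 CPUs, with 0 and 4 reserved for system tasks.
    reserved, for_static_pods = split_cpus("0-7", "0,4")
    print("reserved for system tasks:", reserved)
    print("available to static-cpu pods:", for_static_pods)
```

With an explicit list like this, the same CPU ids can be given to system tuning tools, so the reserved set no longer depends on how the platform enumerates its CPUs.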

Comment 1 Maciej Szulik 2019-11-25 08:30:36 UTC
This looks like an RFE against kubelet. I'm moving it to the node team, but I think this should rather be filed as an RFE.

Comment 3 Martin Sivák 2019-11-25 09:11:19 UTC
The backport was already proposed in https://github.com/kubernetes/kubernetes/pull/83592

Comment 4 Martin Sivák 2019-11-25 09:15:20 UTC
Ah sorry, that was the original patch. I need to check with Vladik, who played with this, about where his backport PR is.

Comment 5 Vladik Romanovsky 2019-11-26 19:17:50 UTC
I've opened a PR to backport the upstream 83592 to OpenShift 4.3: https://github.com/openshift/origin/pull/24224
However, I'm not sure if I'm following the right process here.
Please push me in the right direction if I'm doing something wrong.

Thanks,
Vladik

Comment 15 errata-xmlrpc 2020-01-23 11:13:48 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2020:0062