Bug 2106414

Summary: Kubelet slowly leaking memory and pods eventually unable to start
Product: OpenShift Container Platform
Reporter: OpenShift BugZilla Robot <openshift-bugzilla-robot>
Component: Node
Assignee: Ryan Phillips <rphillips>
Node sub component: Kubelet
QA Contact: Sunil Choudhary <schoudha>
Status: CLOSED ERRATA
Docs Contact:
Severity: high
Priority: high
CC: bsmitley, clasohm, dahernan, hshukla, jdee, mharri, nagrawal, rphillips, sychen, wking
Version: 4.9
Target Milestone: ---
Target Release: 4.10.z
Hardware: Unspecified
OS: Unspecified
Whiteboard:
Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2022-07-20 07:46:12 UTC
Type: ---
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:
Bug Depends On: 2065749
Bug Blocks: 2106655

Comment 3 Sunil Choudhary 2022-07-18 08:49:34 UTC
Checked on 4.10.0-0.nightly-2022-07-13-131411

% oc get clusterversion
NAME      VERSION                              AVAILABLE   PROGRESSING   SINCE   STATUS
version   4.10.0-0.nightly-2022-07-13-131411   True        False         3h46m   Cluster version is 4.10.0-0.nightly-2022-07-13-131411

% oc get nodes
NAME                                         STATUS   ROLES    AGE     VERSION
ip-10-0-132-61.us-east-2.compute.internal    Ready    master   4h      v1.23.5+8cfebb1
ip-10-0-150-4.us-east-2.compute.internal     Ready    worker   3h52m   v1.23.5+8cfebb1
ip-10-0-185-122.us-east-2.compute.internal   Ready    master   4h      v1.23.5+8cfebb1
ip-10-0-188-31.us-east-2.compute.internal    Ready    worker   3h54m   v1.23.5+8cfebb1
ip-10-0-205-38.us-east-2.compute.internal    Ready    master   4h      v1.23.5+8cfebb1
ip-10-0-210-175.us-east-2.compute.internal   Ready    worker   3h54m   v1.23.5+8cfebb1

% oc get clusterversion
NAME      VERSION                              AVAILABLE   PROGRESSING   SINCE   STATUS
version   4.10.0-0.nightly-2022-07-13-131411   True        False         6h44m   Cluster version is 4.10.0-0.nightly-2022-07-13-131411

% oc adm top node
NAME                                         CPU(cores)   CPU%   MEMORY(bytes)   MEMORY%   
ip-10-0-132-61.us-east-2.compute.internal    531m         15%    6444Mi          44%       
ip-10-0-150-4.us-east-2.compute.internal     66m          4%     1413Mi          21%       
ip-10-0-185-122.us-east-2.compute.internal   401m         11%    6061Mi          41%       
ip-10-0-188-31.us-east-2.compute.internal    261m         17%    3934Mi          58%       
ip-10-0-205-38.us-east-2.compute.internal    495m         14%    8911Mi          60%       
ip-10-0-210-175.us-east-2.compute.internal   286m         19%    3903Mi          58%
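
The check above reads node memory by eye after the cluster has been up for several hours. As a rough sketch only, the same table could be parsed and compared against a threshold when repeating the check over time. The function names and the 60% threshold are hypothetical and not part of the original verification. Note that `oc adm top node` reports whole-node usage, not kubelet memory specifically.

```python
# Hypothetical helper: parse `oc adm top node` output and flag busy nodes.
# Assumes the five-column layout shown in the comment above.

def parse_top_nodes(text):
    """Return {node_name: memory_percent} from `oc adm top node` output."""
    usage = {}
    for line in text.strip().splitlines():
        fields = line.split()
        # Skip the header row and anything that is not a 5-column data row.
        if len(fields) != 5 or fields[0] == "NAME":
            continue
        name, _cpu, _cpu_pct, _mem, mem_pct = fields
        usage[name] = int(mem_pct.rstrip("%"))
    return usage


def nodes_over(usage, threshold_pct):
    """Return the sorted names of nodes at or above threshold_pct memory."""
    return sorted(name for name, pct in usage.items() if pct >= threshold_pct)


# Three rows copied from the output above, used as sample input.
SAMPLE = """
NAME                                         CPU(cores)   CPU%   MEMORY(bytes)   MEMORY%
ip-10-0-132-61.us-east-2.compute.internal    531m         15%    6444Mi          44%
ip-10-0-150-4.us-east-2.compute.internal     66m          4%     1413Mi          21%
ip-10-0-205-38.us-east-2.compute.internal    495m         14%    8911Mi          60%
"""

if __name__ == "__main__":
    usage = parse_top_nodes(SAMPLE)
    # 60 is an arbitrary illustrative threshold, not a value from this bug.
    print(nodes_over(usage, 60))
```

Running this on snapshots taken hours apart and comparing the percentages would give a crude view of whether memory keeps growing; per-process kubelet figures would need a different data source.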

Comment 5 errata-xmlrpc 2022-07-20 07:46:12 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (OpenShift Container Platform 4.10.23 bug fix update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2022:5568