Description of problem:
Over time the kubelet slowly consumes memory until, at some point, pods are no longer able to start on the node; container runtime errors appear at the same time. Even rebooting the node does not resolve the issue once it occurs - the node has to be completely rebuilt.

How reproducible:
Consistently

Actual results:
Pods are eventually unable to start on the node; rebuilding the node is the only workaround.

Expected results:
kubelet/crio continue working as expected.
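A rough way to watch for this (a sketch only; <node> is a placeholder for an affected node's name) is to sample the kubelet's resident memory on the node a few hours apart and compare the RSS values:

% oc debug node/<node> -- chroot /host ps -o pid,rss,etime,args -C kubelet

A steadily climbing RSS across samples, together with the container runtime errors above, matches the behavior described here.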
Verified on 4.11.0-0.nightly-2022-06-14-172335 by running pods for over a day; no unexpectedly high memory usage by the kubelet was observed on the node.

% oc get clusterversion
NAME      VERSION                              AVAILABLE   PROGRESSING   SINCE   STATUS
version   4.11.0-0.nightly-2022-06-14-172335   True        False         8h      Cluster version is 4.11.0-0.nightly-2022-06-14-172335
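For reference, the sustained pod load used for this kind of check can be approximated with something like the following (the deployment name, image, and replica count are illustrative, not the exact workload used here):

% oc create deployment memtest --image=registry.access.redhat.com/ubi8/ubi -- sleep infinity
% oc scale deployment memtest --replicas=50

Left running for a day or more, this keeps the kubelet busy managing pods so its memory usage can be sampled periodically as shown earlier.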
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA.

For information on the advisory (Important: OpenShift Container Platform 4.11.0 bug fix and security update), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2022:5069
If you click "Show advanced fields" on this bug, you can see that it blocks bug 2106414, which shipped in 4.10.23 [1]. And bug 2106414 blocks bug 2106655, which shipped in 4.9.45 [2]. From there, tracking hopped to Jira [3], with a fix shipping in 4.8.51 [4].

[1]: https://bugzilla.redhat.com/show_bug.cgi?id=2106414#c5
[2]: https://bugzilla.redhat.com/show_bug.cgi?id=2106655#c7
[3]: https://issues.redhat.com//browse/OCPBUGS-1461
[4]: https://access.redhat.com/errata/RHSA-2022:6801
The needinfo request[s] on this closed bug have been removed as they have been unresolved for 120 days.