Bug 2111139

Summary: crio increases percpu memory and it can't be freed
Product: OpenShift Container Platform Reporter: roarora
Component: Unknown    Assignee: Sudha Ponnaganti <sponnaga>
Status: CLOSED DUPLICATE QA Contact: Jianwei Hou <jhou>
Severity: medium Docs Contact:
Priority: medium    
Version: 4.8    CC: bbaude, dwalsh, eparis, gscrivan, mheon, pthomas, rmanes, tsweeney
Target Milestone: ---   
Target Release: ---   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version:    Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2022-08-08 18:10:43 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:

Description roarora 2022-07-26 14:56:47 UTC
Description of problem:

Previous Bugzilla https://bugzilla.redhat.com/show_bug.cgi?id=2049289

Percpu memory keeps increasing over time and is not freed on OpenShift CoreOS nodes running CRI-O.

As per the recommendation in the previous Bugzilla, we tried switching the logging driver to journald, but the behavior is still the same.


$ cat uname 
Linux foo.bar.com 4.18.0-305.45.1.el8_4.x86_64 #1 SMP Wed Apr 6 13:48:37 EDT 2022 x86_64 x86_64 x86_64 GNU/Linux

containers-common-1.3.1-5.module+el8.4.0+11990+22932769.x86_64 Mon May 23 21:37:21 2022
container-selinux-2.170.0-2.rhaos4.8.el8.noarch             Mon May 23 21:37:20 2022
criu-3.15-1.module+el8.4.0+11822+6cc1e7d7.x86_64            Mon May 23 21:37:21 2022
cri-o-1.21.7-3.rhaos4.8.git57607b4.el8.x86_64               Mon May 23 21:37:21 2022
cri-tools-1.21.0-4.el8.x86_64    

$ find sys/fs/cgroup/memory -type d | wc -l
640

$ cat proc/cgroups | grep -E "name|memory"
#subsys_name	hierarchy	num_cgroups	enabled
memory	6	3948	1

$ grep -E "MemTotal|MemFree|Percpu" proc/meminfo
MemTotal:       32897520 kB
MemFree:          485260 kB
Percpu:          1455104 kB
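
(As a rough sketch, not part of the original report: one way to watch this on a live node is to sample Percpu from /proc/meminfo and compare the number of visible memory-cgroup directories against num_cgroups from /proc/cgroups. A num_cgroups value far above the directory count can indicate dying "zombie" cgroups still pinning percpu allocations. Paths below assume a running system rather than a sosreport, and a cgroup v1 memory controller; the script itself is hypothetical.)

#!/bin/bash
# Sketch: sample percpu usage and memory-cgroup counts every 5 minutes.
while true; do
    ts=$(date -u +%Y-%m-%dT%H:%M:%SZ)
    # Percpu allocator usage reported by the kernel, in kB
    percpu=$(awk '/^Percpu:/ {print $2}' /proc/meminfo)
    # Memory-controller cgroup directories currently visible
    dirs=$(find /sys/fs/cgroup/memory -type d | wc -l)
    # Total cgroups the kernel still tracks for the memory hierarchy
    # (includes dying cgroups that have been rmdir'd but not yet freed)
    total=$(awk '$1 == "memory" {print $3}' /proc/cgroups)
    echo "$ts Percpu=${percpu}kB memory_cgroup_dirs=$dirs num_cgroups=$total"
    sleep 300
done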

Comment 2 Tom Sweeney 2022-07-26 20:14:21 UTC
Giuseppe,

Can you take another dive into this, please?

Comment 3 Giuseppe Scrivano 2022-07-26 20:39:22 UTC
Is the systemd journal on a tmpfs?

Make sure the systemd journal is not on `/run/systemd/journal` (a tmpfs); otherwise it still consumes memory.
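
(As a minimal sketch, not from the original thread: one way to check whether journald is writing to the volatile tmpfs location or to disk, and to switch it to persistent storage. `Storage=persistent` and the `/var/log/journal` path are standard journald behavior, but verify against the systemd version in use; on OpenShift nodes this change would normally be rolled out via a MachineConfig rather than edited in place.)

# Volatile journal lives on a tmpfs under /run; persistent lives on disk
$ ls -d /run/log/journal /var/log/journal 2>/dev/null

# Current journal size (counts against memory when stored on tmpfs)
$ journalctl --disk-usage

# Switch to persistent storage and restart journald
$ sudo mkdir -p /var/log/journal
$ sudo sed -i 's/^#\?Storage=.*/Storage=persistent/' /etc/systemd/journald.conf
$ sudo systemctl restart systemd-journald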

Comment 5 Giuseppe Scrivano 2022-08-08 08:35:33 UTC
@roarora, yes, thanks for confirming; that is what I meant.

The previous bug linked in this report refers to an issue with Podman.

Since the current bug is related to CRI-O, I am reassigning it to the Node team.

Comment 6 Peter Hunt 2022-08-08 13:30:51 UTC
This looks like a duplicate of https://bugzilla.redhat.com/show_bug.cgi?id=2004037. Did they ever try the test kernel?

Comment 8 Peter Hunt 2022-08-08 18:10:43 UTC

*** This bug has been marked as a duplicate of bug 2004037 ***