Bug 2004037
| Field | Value |
|---|---|
| Summary | Percpu counter usage is gradually increasing during podman container recreation. |
| Product | Red Hat Enterprise Linux 8 |
| Reporter | rcheerla |
| Component | kernel |
| kernel sub component | Memory Management |
| Assignee | Waiman Long <llong> |
| QA Contact | Chao Ye <cye> |
| Status | CLOSED ERRATA |
| Severity | high |
| Priority | urgent |
| CC | akanekar, akesarka, aquini, atomlin, bhenders, bjohri, cchen, chris.bowles, chrzhang, cldavey, cye, David.Taylor, ddutile, dornelas, fboboc, fperalta, hfukumot, jaeshin, jarod, kjavier, kwwong, llong, mharri, michele, mmilgram, mschibli, ngirard, nsu, nyelle, palonsor, pauwebst, pehunt, pescorza, pifang, psingour, rgertzbe, rmanes, rnoma, roarora, ruud, saniyer, skamboj, skanniha, skrenger, snishika, tkimura, vagrawal, vbendel, vumrao, wwurzbac |
| Version | 8.4 |
| Keywords | Triaged, ZStream |
| Target Milestone | rc |
| Target Release | --- |
| Flags | nyelle: needinfo-, pehunt: needinfo- |
| Hardware | Unspecified |
| OS | Unspecified |
| Fixed In Version | kernel-4.18.0-404.el8 |
| Doc Type | If docs needed, set a value |
| Story Points | --- |
| Clones | 2054076, 2110039, 2110040 (view as bug list) |
| Last Closed | 2022-11-08 10:14:55 UTC |
| Type | Bug |
| Regression | --- |
| Bug Blocks | 2037529, 2054076, 2110039, 2110040 |
Description by rcheerla, 2021-09-14 12:02:02 UTC
Hello Team,

1] Executed the command below multiple times on the latest RHEL 8.4 kernel:

while :; do podman run --name=test1 --replace centos /bin/echo 'running'; done
..
while :; do podman run --name=test20 --replace centos /bin/echo 'running'; done

2] Monitored the system for about 12 hours and saw around 2 GB of growth in the Percpu counter value:

grep Per meminfo-a.out
Percpu:        2077248 kB

o We do see growth in the directory entries for the memory controller.
o However, I don't see many entries under the /sys/fs/cgroup/memory directory; I could see hardly 200 entries.

cat /proc/cgroups | column -t
#subsys_name  hierarchy  num_cgroups  enabled
cpuset        11         16           1
cpu           10         20           1
cpuacct       10         20           1
blkio         5          20           1
memory        4          31055        1
devices       6          67           1
freezer       8          16           1
net_cls       2          16           1
perf_event    3          16           1
net_prio      2          16           1
hugetlb       9          16           1
pids          12         100          1
rdma          7          1            1

o meminfo output:

MemTotal:       10015428 kB
MemFree:         1280632 kB
MemAvailable:    2362640 kB
Buffers:            3240 kB
Cached:          1925576 kB
Slab:            1641464 kB
SReclaimable:     943816 kB
SUnreclaim:       697648 kB
VmallocTotal:   34359738367 kB
VmallocUsed:           0 kB
VmallocChunk:          0 kB
Percpu:          2077248 kB

o vmallocinfo output:

pcpu_get_vm_areas+0x0/0x1140               2139095040 Bytes
pcpu_create_chunk+0x16c/0x1c0               174743552 Bytes
_stp_map_new_is.constprop.65+0x148/0x220    144994304 Bytes
pci_mmcfg_arch_map+0x31/0x70                134221824 Bytes
relay_open_buf.part.11+0x1af/0x2e0           42024960 Bytes
_do_fork+0x8f/0x350                          35246080 Bytes
alloc_large_system_hash+0x19e/0x261          29462528 Bytes
vmw_fb_init+0x1bd/0x3c0                       8413184 Bytes
layout_and_allocate+0x9c9/0xd40               6782976 Bytes
pcpu_create_chunk+0x77/0x1c0                  4153344 Bytes

o I have collected the stack traces leading to pcpu_alloc(). They show something like the following; I will attach the nohup.out file as well.

0x36714500bab0
0xffffffff964e6949 : kmem_cache_open+0x3b9/0x420 [kernel]
0xffffffff964e7262 : __kmem_cache_create+0x12/0x50 [kernel]
0xffffffff9648c7f9 : kmem_cache_create_usercopy+0x169/0x2d0 [kernel]
0xffffffff9648c972 : kmem_cache_create+0x12/0x20 [kernel]
0xffffffffc0889088 : 0xffffffffc0889088
0xffffffffc0889088 : 0xffffffffc0889088
0xffffffffc088903b : 0xffffffffc088903b
0xffffffff962027f6 : do_one_initcall+0x46/0x1c3 [kernel]
0xffffffff9638a3fa : do_init_module+0x5a/0x220 [kernel]
0xffffffff9638c835 : load_module+0x14c5/0x17f0 [kernel]
0xffffffff9638cc9b : __do_sys_init_module+0x13b/0x180 [kernel]
0xffffffff9620420b : do_syscall_64+0x5b/0x1a0 [kernel]
0xffffffff96c000ad : entry_SYSCALL_64_after_hwframe+0x65/0xca [kernel]

0x36714500bad0
0xffffffff9653ab2e : alloc_vfsmnt+0x7e/0x1e0 [kernel]
0xffffffff9653bfb3 : clone_mnt+0x33/0x330 [kernel]
0xffffffff9653d6dc : copy_tree+0x6c/0x300 [kernel]
0xffffffff9653d9d8 : __do_loopback.isra.61+0x68/0xd0 [kernel]
0xffffffff9653fe79 : do_mount+0x7c9/0x950 [kernel]
0xffffffff965403a6 : ksys_mount+0xb6/0xd0 [kernel]
0xffffffff965403e1 : __x64_sys_mount+0x21/0x30 [kernel]
0xffffffff9620420b : do_syscall_64+0x5b/0x1a0 [kernel]
0xffffffff96c000ad : entry_SYSCALL_64_after_hwframe+0x65/0xca [kernel]
0xffffffff96c000ad : entry_SYSCALL_64_after_hwframe+0x65/0xca [kernel]

0x367143219b20
0xffffffff9669fcf4 : __percpu_counter_init+0x24/0xa0 [kernel]
0xffffffff96b2788e : fprop_global_init+0x1e/0x30 [kernel]
0xffffffff96b42b64 : mem_cgroup_css_alloc+0x1f4/0x860 [kernel]
0xffffffff96399720 : cgroup_apply_control_enable+0x130/0x350 [kernel]
0xffffffff9639bc86 : cgroup_mkdir+0x216/0x4c0 [kernel]
0xffffffff965ada7a : kernfs_iop_mkdir+0x5a/0x90 [kernel]
0xffffffff96527572 : vfs_mkdir+0x102/0x1b0 [kernel]
0xffffffff9652b0ad : do_mkdirat+0x7d/0xf0 [kernel]
0xffffffff9620420b : do_syscall_64+0x5b/0x1a0 [kernel]
0xffffffff96c000ad : entry_SYSCALL_64_after_hwframe+0x65/0xca [kernel]
0xffffffff96c000ad : entry_SYSCALL_64_after_hwframe+0x65/0xca [kernel]

0x367145011ab0
0xffffffff96b42a14 : mem_cgroup_css_alloc+0xa4/0x860 [kernel]
0xffffffff96399720 : cgroup_apply_control_enable+0x130/0x350 [kernel]
0xffffffff9639bc86 : cgroup_mkdir+0x216/0x4c0 [kernel]
0xffffffff965ada7a : kernfs_iop_mkdir+0x5a/0x90 [kernel]
0xffffffff96527572 : vfs_mkdir+0x102/0x1b0 [kernel]
0xffffffff9652b0ad : do_mkdirat+0x7d/0xf0 [kernel]
0xffffffff9620420b : do_syscall_64+0x5b/0x1a0 [kernel]
0xffffffff96c000ad : entry_SYSCALL_64_after_hwframe+0x65/0xca [kernel]
0xffffffff96c000ad : entry_SYSCALL_64_after_hwframe+0x65/0xca [kernel]

o I have tried to drop caches using 'echo 2' and 'echo 3'; it did not help. I am not sure how to find the used memory in "Percpu" or how to reclaim it, and need further suggestions. Or this could be a bug in the kernel. Also, please confirm whether the patch set below is backported into any RHEL 8 kernel so that I can test and confirm the result.

https://yhbt.net/lore/all/20210407182618.2728388-4-guro@fb.com/T/

Regards,
Raju
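For long runs like the 12-hour test above, a minimal monitoring sketch along the lines of the reporter's loop can correlate Percpu growth with memory-cgroup churn. The container name, log path, and sample interval below are arbitrary placeholders, not values from this report:

```sh
#!/bin/bash
# Recreate a container in a loop and periodically sample the Percpu value in
# /proc/meminfo together with the memory controller's num_cgroups, so the
# growth rate can be correlated with cgroup churn afterwards.
# Assumes podman and a locally available 'centos' image, as in the report.
LOG=/tmp/percpu-growth.log    # placeholder path
while :; do
    podman run --name=leaktest --replace centos /bin/echo 'running' >/dev/null
    percpu_kb=$(awk '/^Percpu:/ {print $2}' /proc/meminfo)
    memcg_num=$(awk '$1 == "memory" {print $3}' /proc/cgroups)
    echo "$(date +%s) percpu_kb=$percpu_kb memory_num_cgroups=$memcg_num" >> "$LOG"
    sleep 5                   # arbitrary sample interval
done
```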
Thanks for the report. An increase in percpu memory consumption over time is inevitable due to memory fragmentation. However, we will backport some of the upstream percpu-related commits to reduce the rate of increase.

What I have found out is that the increase in percpu memory consumption is likely due to the percpu vmstat data in dying mem_cgroup structures being held in place by references stored in the page cache. Running "echo 1 > /proc/sys/vm/drop_caches" will allow those dying mem_cgroup structures to finally get freed. Could you try that to see if it helps to reduce the percpu memory consumption back to a more normal level?
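One way to see the gap described here, between memory cgroups the kernel still tracks (including dying, offlined ones) and cgroup directories actually visible, is a comparison like the following sketch. It assumes the cgroup v1 layout shown earlier in this report and must be run as root:

```sh
# Memory cgroups the kernel still accounts for (live plus dying/offlined).
awk '$1 == "memory" {print "memory num_cgroups:", $3}' /proc/cgroups

# Memory cgroup directories actually present in the v1 hierarchy.
echo -n "visible memory cgroup dirs: "
find /sys/fs/cgroup/memory -type d | wc -l

# Sample Percpu before and after dropping the page cache, which is what is
# expected to release dying mem_cgroups pinned by page-cache references.
grep Percpu /proc/meminfo
echo 1 > /proc/sys/vm/drop_caches
grep Percpu /proc/meminfo
```

A large and growing gap between the two counts (such as the 31055 vs. roughly 200 reported in the description) is consistent with dying mem_cgroup structures accumulating.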
Hi Waiman, I have tried "echo 1 > /proc/sys/vm/drop_caches" but it did not help.

o Before initiating the "podman run" command:

$ grep Per /proc/meminfo
Percpu:            20160 kB
$ cat /proc/cgroups | column -t
#subsys_name  hierarchy  num_cgroups  enabled
memory        3          163          1
$ date
Thu Sep 16 08:56:05 CDT 2021
$ grep Per /proc/meminfo
Percpu:            31872 kB

o After the loop has completed:

$ ps aux | grep -i "podman run"
root  3968422  0.0  0.0  12136  1196 pts/0  S+  09:32  0:00 grep --color=auto -i podman run
$ date
Thu Sep 16 09:36:26 CDT 2021
$ grep Per /proc/meminfo
Percpu:           111456 kB   <<---
$ cat /proc/cgroups | column -t
#subsys_name  hierarchy  num_cgroups  enabled
memory        3          1327         1
$ echo 1 > /proc/sys/vm/drop_caches
$ echo 2 > /proc/sys/vm/drop_caches
$ echo 3 > /proc/sys/vm/drop_caches
$ sync

o It still did not reduce:

$ grep Per /proc/meminfo
Percpu:           111456 kB   <<<---

(In reply to Waiman Long from comment #8)
> Well, it is what I expected. But is the memory increase slowing down or
> is it at the same rate?
>
> -Longman

Yes, the growth is at almost the same rate. However, I will run the loop for a longer time and will confirm the result.

FWIW, it seems like the previous versions of the podman packages exhibit the same problem at roughly the same rate.

test command: # podman run --name=test --replace centos /bin/echo 'running'

[root@rhel8 ~]# rpm -qa | grep podman
podman-catatonit-3.0.1-7.module+el8.4.0+11311+9da8acfb.x86_64
cockpit-podman-29-2.module+el8.4.0+11311+9da8acfb.noarch
podman-3.0.1-7.module+el8.4.0+11311+9da8acfb.x86_64
[root@rhel8 ~]# grep Percpu /proc/meminfo
Percpu:             1080 kB

* run the test 3k times *

[root@rhel8 ~]# grep Percpu /proc/meminfo
Percpu:             2832 kB

1752 KiB growth over 3000 runs

~~~~~~~~~~

After updating the podman packages and rebooting:

[root@rhel8 ~]# rpm -qa | grep podman
podman-catatonit-3.2.3-0.11.module+el8.4.0+12050+ef972f71.x86_64
podman-3.2.3-0.11.module+el8.4.0+12050+ef972f71.x86_64
cockpit-podman-32-2.module+el8.4.0+11990+22932769.noarch
[root@rhel8 ~]# grep Percpu /proc/meminfo
Percpu:             1064 kB

* run the test 3k times *

[root@rhel8 ~]# grep Percpu /proc/meminfo
Percpu:             2760 kB

1696 KiB growth over 3000 runs

Not sure if 3k runs is sufficient to highlight the problem.

(In reply to John Siddle from comment #12)
> FWIW, it seems like the previous versions of the podman packages exhibit the
> same problem at roughly the same rate.

Thanks for running the test. It does look like downgrading to the previous version of podman and/or the kernel will not help.

*** Bug 2004453 has been marked as a duplicate of this bug. ***

*** Bug 2044626 has been marked as a duplicate of this bug. ***

My ticket 2044626 was closed as a duplicate of this one; sadly, I did not have enough time to debug it fully and supply all the information needed. I did, however, find a possible workaround: after "swapoff -a", running the machine without any swap made my memory-leak issue go away. While this workaround works for me, on certain machines I would still prefer to have a small swap.

*** Bug 2037529 has been marked as a duplicate of this bug. ***

An upstream patch has been posted:
https://lore.kernel.org/lkml/20220421145845.1044652-1-longman@redhat.com/

*** Bug 2014136 has been marked as a duplicate of this bug. ***

MM tier tests passed with kernel-4.18.0-397.g2c67.el8.mr2872_220603_1814 from comment #72:
https://beaker.engineering.redhat.com/jobs/6725731
Set Verified:Tested

Hi, one of my customers hit this issue on their OCP 4.8. May I know whether there is any ETA for this fix to be ported to OCP?

Hi team, it seems this bug is affecting Elasticsearch in OCP 4.7 as well. Can you please look into a fix ported to OCP? Case 03269526 is attached to this bug, thanks.

*** Bug 2111139 has been marked as a duplicate of this bug. ***

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA.

For information on the advisory (Moderate: kernel security, bug fix, and enhancement update), and where to find the updated files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2022:7683
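As a closing note, a quick sketch for checking whether a host is already running a kernel that contains the fix; the version string is taken from the "Fixed In Version" field in the header above, and the comparison itself is only a simple version check, not an official verification procedure:

```sh
# Compare the running kernel against the Fixed In Version from this bug.
FIXED=4.18.0-404.el8                 # from "Fixed In Version" above
echo "running kernel : $(uname -r)"
echo "fixed in       : kernel-$FIXED"

# List installed kernel packages to see whether an updated one is present.
rpm -q kernel
```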