Bug 1743768 - Keepalived static pod cause to dentry cache increase to size of ~7GB
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: Machine Config Operator
Version: 4.2.0
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: unspecified
Target Milestone: ---
Target Release: 4.2.0
Assignee: Antonio Murdaca
QA Contact: Micah Abbott
URL:
Whiteboard:
Depends On:
Blocks:
 
Reported: 2019-08-20 15:53 UTC by Yossi Boaron
Modified: 2019-10-16 06:36 UTC (History)

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2019-10-16 06:36:40 UTC
Target Upstream Version:
Embargoed:




Links
System ID Private Priority Status Summary Last Updated
Github openshift machine-config-operator pull 1083 0 None closed Bug 1743768: avoid excessive dentries due to keepalived static pod health check 2020-07-14 08:34:12 UTC
Red Hat Product Errata RHBA-2019:2922 0 None None None 2019-10-16 06:36:53 UTC

Description Yossi Boaron 2019-08-20 15:53:31 UTC
Description of problem:
The keepalived static pod causes the dentry cache to grow to ~7GB.

curl causes heavy dentry cache pollution. The underlying bug is https://bugzilla.redhat.com/show_bug.cgi?id=1571183.

Because the keepalived static pod uses curl to get component health status every few seconds, the dentry cache on master nodes grows to ~7.5GB (in a dev-scripts environment).
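
One way to watch this growth without running slabtop is to sample /proc/sys/fs/dentry-state, whose first two fields are the total number of allocated dentries and the number of unused ones. The following is only a minimal sketch: the helper names are hypothetical and are not part of keepalived, curl, or the machine-config-operator.

```python
# Sketch: track dentry counts from /proc/sys/fs/dentry-state.
# Helper names are hypothetical, chosen for illustration only.

def parse_dentry_state(text):
    """Return nr_dentry and nr_unused from the contents of dentry-state.

    The file holds whitespace-separated integers; the first is the number
    of allocated dentries and the second the number of unused dentries.
    Later fields vary by kernel version and are ignored here.
    """
    fields = [int(tok) for tok in text.split()]
    if len(fields) < 2:
        raise ValueError("expected at least two integer fields")
    return {"nr_dentry": fields[0], "nr_unused": fields[1]}


def read_dentry_state(path="/proc/sys/fs/dentry-state"):
    """Read and parse the live dentry-state file (Linux only)."""
    with open(path) as f:
        return parse_dentry_state(f.read())


def dentry_growth(before, after):
    """Number of dentries allocated between two samples."""
    return after["nr_dentry"] - before["nr_dentry"]
```

Calling read_dentry_state() twice, some minutes apart, and passing both results to dentry_growth() gives a rough rate. On an affected master node the count would be expected to climb steadily between health checks.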

Version-Release number of selected component (if applicable):

4.2.0-0.ci-2019-07-31-123929-kni.0


How reproducible:
Deploy OCP on bare metal (BM) using dev-scripts

Steps to Reproduce:
1.
2.
3.

Actual results:

After the cluster has been up for ~24 hours, checking the slab cache status (run 'sudo slabtop') shows that the dentry cache size is ~7.5GB
 

 Active / Total Objects (% used)    : 48678986 / 48914256 (99.5%)
 Active / Total Slabs (% used)      : 2002767 / 2002767 (100.0%)
 Active / Total Caches (% used)     : 109 / 145 (75.2%)
 Active / Total Size (% used)       : 8056720.86K / 8085203.66K (99.6%)
 Minimum / Average / Maximum Object : 0.01K / 0.17K / 24.25K

  OBJS ACTIVE  USE OBJ SIZE  SLABS OBJ/SLAB CACHE SIZE NAME                   
35393371 35371279  99%    0.20K 1862809       19   7451236K dentry
11086976 10955912  98%    0.03K  86617      128    346468K kmalloc-32
437248 437248 100%    0.01K    854      512      3416K kmalloc-8
384710 384710 100%    0.02K   2263      170      9052K avtab_node
139026 101764  73%    0.04K   1363      102      5452K Acpi-Namespace
121344 121113  99%    0.02K    474      256      1896K kmalloc-16
118144 116432  98%    0.06K   1846       64      7384K kmalloc-64
116160 115584  99%    0.06K   1815       64      7260K pid
 73467  70292  95%    0.58K   2721       27     43536K radix_tree_node
 69828  69486  99%    0.09K   1518       46      6072K anon_vma
 66032  64057  97%    0.25K   4127       16     16508K filp
 65850  65715  99%    0.13K   2195       30      8780K kernfs_node_cache
 61438  61059  99%    0.23K   3614       17     14456K vm_area_struct
 57384  51982  90%    0.64K   2391       24     38256K inode_cache
 54208  53906  99%    0.12K   1694       32      6776K seq_file
 54152  54042  99%    0.07K    967       56      3868K eventpoll_pwq
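
The dentry row above can be sanity-checked with a little arithmetic. The sketch below parses that one row and converts its CACHE SIZE column, assuming that the row has the eight columns shown in the header and that slabtop's "K" suffix means KiB (1024 bytes). The function name is hypothetical.

```python
# Sketch: parse one slabtop row and check its figures against each other.
# Assumes the eight-column layout shown above and that "K" means KiB.

def parse_slabtop_row(line):
    """Split a slabtop row into named integer fields."""
    parts = line.split()
    if len(parts) != 8:
        raise ValueError("expected 8 columns, got %d" % len(parts))
    objs, active, _use, _obj_size, slabs, per_slab, cache_size, name = parts
    return {
        "name": name,
        "objs": int(objs),
        "active": int(active),
        "slabs": int(slabs),
        "objs_per_slab": int(per_slab),
        "cache_size_kib": int(cache_size.rstrip("K")),
    }


# The dentry row copied from the report.
row = parse_slabtop_row(
    "35393371 35371279  99%    0.20K 1862809       19   7451236K dentry"
)

# Total objects should equal slabs times objects per slab.
objs_consistent = row["objs"] == row["slabs"] * row["objs_per_slab"]

# Cache size divided by slab count gives the size of one slab in KiB.
kib_per_slab = row["cache_size_kib"] / row["slabs"]

# Convert the cache size to GiB and to decimal GB.
size_gib = row["cache_size_kib"] / (1024 * 1024)
size_gb = row["cache_size_kib"] * 1024 / 1e9

print("%s: %.2f GiB (%.2f GB), %.0f KiB per slab"
      % (row["name"], size_gib, size_gb, kib_per_slab))
```

For this row the figures agree with each other: 1862809 slabs of 19 objects each is 35393371 objects, each slab works out to 4 KiB, and the cache is about 7.1 GiB, or roughly 7.6 GB in decimal units.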
 
Expected results:


Additional info:

For more details see:
https://github.com/openshift/machine-config-operator/pull/705
https://github.com/openshift/openshift-ansible/pull/11829

Comment 3 errata-xmlrpc 2019-10-16 06:36:40 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2019:2922

