Bug 1735751 - Liveness https curl commands on the webconsole pods cause a huge amount of dentries
Summary: Liveness https curl commands on the webconsole pods cause a huge amount of de...
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: Management Console
Version: 3.11.0
Hardware: x86_64
OS: Linux
Priority: unspecified
Severity: medium
Target Milestone: ---
Target Release: 3.11.z
Assignee: Samuel Padgett
QA Contact: Yadan Pei
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2019-08-01 11:59 UTC by Angelo Gabrieli
Modified: 2019-09-09 17:35 UTC (History)

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
The web console liveness probe used the curl command to check health periodically, which caused an unusually high level of dentry cache usage. This could cause high CPU usage when draining nodes. The liveness probe has been changed to avoid excessive dentry cache usage.
Clone Of:
Environment:
Last Closed: 2019-09-03 15:56:03 UTC
Target Upstream Version:
Embargoed:


Links
System | ID | Private | Priority | Status | Summary | Last Updated
Github | openshift openshift-ansible pull 11829 | 0 | 'None' | closed | Bug 1735751: avoid excessive dentries due to console liveness probe | 2020-10-28 08:51:09 UTC
Red Hat Product Errata | RHBA-2019:2580 | 0 | None | None | None | 2019-09-03 15:56:11 UTC

Description Angelo Gabrieli 2019-08-01 11:59:52 UTC
Description of problem:
The HTTPS curl commands run by the liveness probe on the webconsole pods create a huge number of dentries in memory. Some actions, such as draining a node or restarting the docker daemon, trigger a flush of this large dentry slab cache.
This causes high CPU load and unresponsiveness on the node
(see case #02398979)

Version-Release number of selected component (if applicable):
OCPv3.11

How reproducible:
Not easily reproducible, because the dentry cache grows gradually over time
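
Because the growth is gradual, it can help to sample the kernel's dentry counters over time on an affected node. The following is a minimal sketch, not part of the original report. It assumes a Linux node where /proc/sys/fs/dentry-state is readable and relies only on the first two fields of that file (total dentries and unused dentries). The function names are hypothetical.

```python
from pathlib import Path

# Linux procfs file with dentry cache counters (assumed to be readable).
DENTRY_STATE = Path("/proc/sys/fs/dentry-state")


def parse_dentry_state(text):
    """Return (nr_dentry, nr_unused) parsed from dentry-state contents."""
    fields = text.split()
    if len(fields) < 2:
        raise ValueError(f"unexpected dentry-state contents: {text!r}")
    # Field 0: total allocated dentries; field 1: unused (cached) dentries.
    return int(fields[0]), int(fields[1])


def read_dentry_state(path=DENTRY_STATE):
    """Read and parse the counters from the live system."""
    return parse_dentry_state(Path(path).read_text())


def growth(samples):
    """Return the change in total dentries between first and last sample."""
    if len(samples) < 2:
        return 0
    return samples[-1][0] - samples[0][0]
```

Calling read_dentry_state() at intervals (for example once a minute) and passing the collected tuples to growth() gives a rough picture of whether the cache keeps growing while the probe runs.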

Steps to Reproduce:
1.
2.
3.

Actual results:
High CPU load and unresponsiveness on the node


Expected results:
The memory usage should not increase indefinitely

Additional info:
Workaround: set NSS_SDB_USE_CACHE=yes in the environment of the curl command (see case #02398979)
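
As an illustration of the workaround, the sketch below shows one way to run a curl-based health check with NSS_SDB_USE_CACHE=yes set in its environment. It is not the actual probe definition or the fix shipped in the erratum. The function names, the curl options chosen, and any URL passed in are assumptions for illustration only.

```python
import os
import subprocess


def probe_environment(base_env=None):
    """Return a copy of the environment with the NSS cache workaround set."""
    env = dict(os.environ if base_env is None else base_env)
    env["NSS_SDB_USE_CACHE"] = "yes"  # workaround named in this report
    return env


def curl_probe_command(url):
    """Build a curl invocation for a health check (options are assumptions)."""
    # -k: skip certificate verification, -f: fail on HTTP errors,
    # -s: silent, -o /dev/null: discard the response body.
    return ["curl", "-k", "-f", "-s", "-o", "/dev/null", url]


def run_probe(url, timeout=5):
    """Run the check once; True means curl exited with status 0."""
    result = subprocess.run(
        curl_probe_command(url),
        env=probe_environment(),
        timeout=timeout,
        check=False,
    )
    return result.returncode == 0
```

The environment is copied rather than modified in place, so the variable only affects the curl child process.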

Comment 3 XiaochuanWang 2019-08-27 02:53:30 UTC
dockerd-current now uses less than 5% CPU while the web-console liveness probe checks a pod.
I think this could be verified. 
Tested on Openshift v3.11.141

Comment 5 errata-xmlrpc 2019-09-03 15:56:03 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2019:2580

