Bug 1706625 - etcd-quorum-guard reporting extremely high memory usage
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: Etcd
Version: 4.1.0
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: high
Target Milestone: ---
Target Release: 4.1.0
Assignee: Robert Krawitz
QA Contact: ge liu
URL:
Whiteboard:
Duplicates: 1706635
Depends On:
Blocks:
 
Reported: 2019-05-05 18:47 UTC by Samuel Padgett
Modified: 2019-06-04 10:48 UTC
CC: 4 users

Fixed In Version:
Doc Type: No Doc Update
Doc Text:
Clone Of:
Environment:
Last Closed: 2019-06-04 10:48:23 UTC
Target Upstream Version:


Attachments (Terms of Use)
Prometheus 3 day view of one of the pods (214.90 KB, image/png), 2019-05-05 18:48 UTC, Samuel Padgett
container_memory_rss (1.53 MB, image/png), 2019-05-05 18:49 UTC, Samuel Padgett
container_memory_working_set_bytes (938.97 KB, image/png), 2019-05-05 18:49 UTC, Samuel Padgett


Links
System ID Priority Status Summary Last Updated
Red Hat Product Errata RHBA-2019:0758 None None None 2019-06-04 10:48:32 UTC

Description Samuel Padgett 2019-05-05 18:47:35 UTC
The etcd-quorum-guard pods are all reporting ~10Gi memory usage. The console is showing this using the query:

pod_name:container_memory_usage_bytes:sum{pod_name='etcd-quorum-guard-7b55ddf465-bsr42',namespace='openshift-machine-config-operator'}

`oc adm top` agrees:

❯ oc adm top pod etcd-quorum-guard-7b55ddf465-stgns -n openshift-machine-config-operator
NAME                                 CPU(cores)   MEMORY(bytes)
etcd-quorum-guard-7b55ddf465-stgns   5m           10166Mi

Version 4.1.0-0.ci-2019-05-02-194100
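
[Editor's note] The likely source of the discrepancy is the metric itself: container_memory_usage_bytes includes file-backed page cache charged to the container's cgroup, so a container whose processes hold almost no memory can still report gigabytes. The kubelet's working-set figure subtracts the reclaimable inactive file cache. A minimal sketch of that calculation, using hypothetical cgroup-v1-style counters (not values measured on this cluster):

```python
# Sketch: why container_memory_usage_bytes can dwarf actual process memory.
# All numbers below are hypothetical, chosen to mirror the symptom in this bug.

def working_set(usage_bytes: int, stat: dict) -> int:
    """Approximate the kubelet/cAdvisor working-set calculation:
    usage minus inactive file-backed page cache (reclaimable)."""
    return max(0, usage_bytes - stat.get("total_inactive_file", 0))

# Counters shaped like cgroup v1 memory.stat output.
stat = {
    "rss": 3 * 1024 * 1024,               # ~3 MiB actually resident
    "cache": 10 * 1024**3,                # ~10 GiB page cache charged to the cgroup
    "total_inactive_file": 10 * 1024**3,  # almost all of it reclaimable
}
usage = stat["rss"] + stat["cache"]       # roughly what *_usage_bytes reports

print(f"usage:       {usage / 1024**2:.0f} MiB")                      # ~10 GiB
print(f"working set: {working_set(usage, stat) / 1024**2:.0f} MiB")   # ~3 MiB
```

Under these assumed counters, the "usage" number matches the ~10Gi seen above while the working set matches the few MiB the processes actually hold.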

Comment 1 Samuel Padgett 2019-05-05 18:48:30 UTC
Created attachment 1564066 [details]
Prometheus 3 day view of one of the pods

Comment 2 Samuel Padgett 2019-05-05 18:49:21 UTC
Created attachment 1564067 [details]
container_memory_rss

Comment 3 Samuel Padgett 2019-05-05 18:49:48 UTC
Created attachment 1564068 [details]
container_memory_working_set_bytes

Comment 6 Samuel Padgett 2019-05-05 21:09:52 UTC
`ps aux` and `free` from inside the container:

sh-4.2# ps aux
USER        PID %CPU %MEM    VSZ   RSS TTY      STAT START   TIME COMMAND
root          1  0.0  0.0   4372   688 ?        Ss   May02   0:00 /bin/sleep infinity
root      16144  0.0  0.0  11828  2904 pts/0    Ss   20:59   0:00 sh
root      16446  0.0  0.0  51752  3492 pts/0    R+   21:04   0:00 ps aux

sh-4.2# free -h
              total        used        free      shared  buff/cache   available
Mem:            15G        3.1G        171M        8.8M         12G         11G
Swap:            0B          0B          0B
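
[Editor's note] The output above corroborates the cache explanation: only `sleep`, `sh`, and `ps` are running, and nearly all "used" memory on the node is buff/cache. Summing the RSS column of the `ps aux` output (a small parsing sketch, using the exact lines quoted above) gives only a few MiB, nowhere near 10 GiB:

```python
# Sketch: sum per-process RSS from `ps aux`-style output (6th column, in KiB).
# The input is copied verbatim from the comment above.

ps_output = """\
USER        PID %CPU %MEM    VSZ   RSS TTY      STAT START   TIME COMMAND
root          1  0.0  0.0   4372   688 ?        Ss   May02   0:00 /bin/sleep infinity
root      16144  0.0  0.0  11828  2904 pts/0    Ss   20:59   0:00 sh
root      16446  0.0  0.0  51752  3492 pts/0    R+   21:04   0:00 ps aux
"""

total_rss_kib = sum(
    int(line.split()[5])                    # RSS column, KiB
    for line in ps_output.splitlines()[1:]  # skip the header row
)
print(f"total RSS: {total_rss_kib} KiB (~{total_rss_kib / 1024:.1f} MiB)")
# -> total RSS: 7084 KiB (~6.9 MiB)
```

So the processes account for roughly 7 MiB, consistent with the ~3Mi reported after the fix in comment 16.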

Comment 9 Greg Blomquist 2019-05-06 15:39:01 UTC
*** Bug 1706635 has been marked as a duplicate of this bug. ***

Comment 15 ge liu 2019-05-08 04:29:31 UTC
Checked the latest payload (4.1.0-0.nightly-2019-05-08-012425); the PR has not been merged into it yet.

Comment 16 ge liu 2019-05-10 03:08:07 UTC
Recreated and verified with the Beta5 final build (4.1.0-rc.2); memory usage is now only 3Mi.
# oc adm top pods etcd-quorum-guard-9cdb6f6c4-l822f -n openshift-machine-config-operator
NAME                                CPU(cores)   MEMORY(bytes)   
etcd-quorum-guard-9cdb6f6c4-l822f   6m           3Mi

Comment 18 errata-xmlrpc 2019-06-04 10:48:23 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2019:0758

