Bug 1823677 - (regression) etcd-quorum-guard is reporting abnormal memory usage
Summary: (regression) etcd-quorum-guard is reporting abnormal memory usage
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: Machine Config Operator
Version: 4.5
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: high
Target Milestone: ---
Target Release: 4.5.0
Assignee: Sam Batschelet
QA Contact: Michael Nguyen
URL:
Whiteboard:
Depends On:
Blocks: 1824137
Reported: 2020-04-14 09:07 UTC by Vadim Rutkovsky
Modified: 2020-07-13 17:27 UTC (History)

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Clones: 1824137
Environment:
Last Closed: 2020-07-13 17:27:24 UTC
Target Upstream Version:
Embargoed:


Links
System | ID | Private | Priority | Status | Summary | Last Updated
Github | openshift machine-config-operator pull 1648 | 0 | None | closed | Bug 1823677:etcdquorumguard_deployment: pass NSS_SDB_USE_CACHE=no to curl | 2020-10-22 10:16:28 UTC
Red Hat Product Errata | RHBA-2020:2409 | 0 | None | None | None | 2020-07-13 17:27:47 UTC

Description Vadim Rutkovsky 2020-04-14 09:07:13 UTC
Description of problem:

https://github.com/openshift/machine-config-operator/pull/1552 reworked the etcd-quorum-guard deployment and left the NSS_SDB_USE_CACHE=no setting out of the curl command. As a result, the etcd quorum guards have started leaking memory in 4.4 and 4.5.

Similar issue: #1706625

Comment 3 Michael Nguyen 2020-04-20 14:53:58 UTC
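Verified on 4.5.0-0.nightly-2020-04-18-184707. The script in each etcd-quorum-guard pod now passes NSS_SDB_USE_CACHE=no to curl: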
$ oc get clusterversion
NAME      VERSION                             AVAILABLE   PROGRESSING   SINCE   STATUS
version   4.5.0-0.nightly-2020-04-18-184707   True        False         48m     Cluster version is 4.5.0-0.nightly-2020-04-18-184707


$ oc -n openshift-machine-config-operator get pods
NAME                                        READY   STATUS    RESTARTS   AGE
etcd-quorum-guard-848d7db55d-d8lpv          1/1     Running   0          57m
etcd-quorum-guard-848d7db55d-hdwz6          1/1     Running   0          57m
etcd-quorum-guard-848d7db55d-vldlz          1/1     Running   0          57m
machine-config-controller-b9f88cf4b-jpjnj   1/1     Running   0          61m
machine-config-daemon-2n55f                 2/2     Running   0          51m
machine-config-daemon-7f48d                 2/2     Running   0          51m
machine-config-daemon-fqkl5                 2/2     Running   0          61m
machine-config-daemon-qdnfk                 2/2     Running   0          60m
machine-config-daemon-rsv47                 2/2     Running   0          51m
machine-config-daemon-ztfpw                 2/2     Running   0          60m
machine-config-operator-8bc8b48d9-t6b5z     1/1     Running   0          73m
machine-config-server-286zg                 1/1     Running   0          60m
machine-config-server-5p5ms                 1/1     Running   0          60m
machine-config-server-qxgf4                 1/1     Running   0          61m
$ oc -n openshift-machine-config-operator rsh etcd-quorum-guard-848d7db55d-d8lpv  cat /usr/local/bin/etcd-quorum-guard.sh
env NSS_SDB_USE_CACHE=no curl --silent --max-time 2 --cert "/mnt/kube/system\:etcd-peer-ip-10-0-142-108.ec2.internal.crt" --key "/mnt/kube/system:etcd-peer-ip-10-0-142-108.ec2.internal.key" --cacert "/mnt/kube/ca.crt" "https://10.0.142.108:2379/health"
$ oc -n openshift-machine-config-operator rsh etcd-quorum-guard-848d7db55d-hdwz6  cat /usr/local/bin/etcd-quorum-guard.sh
env NSS_SDB_USE_CACHE=no curl --silent --max-time 2 --cert "/mnt/kube/system\:etcd-peer-ip-10-0-151-170.ec2.internal.crt" --key "/mnt/kube/system:etcd-peer-ip-10-0-151-170.ec2.internal.key" --cacert "/mnt/kube/ca.crt" "https://10.0.151.170:2379/health"
$ oc -n openshift-machine-config-operator rsh etcd-quorum-guard-848d7db55d-vldlz  cat /usr/local/bin/etcd-quorum-guard.sh
env NSS_SDB_USE_CACHE=no curl --silent --max-time 2 --cert "/mnt/kube/system\:etcd-peer-ip-10-0-142-255.ec2.internal.crt" --key "/mnt/kube/system:etcd-peer-ip-10-0-142-255.ec2.internal.key" --cacert "/mnt/kube/ca.crt" "https://10.0.142.255:2379/health"
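The manual check above could be scripted roughly as follows. This is a minimal sketch, not part of the operator: the function names has_nss_cache_disabled and check_quorum_guard_pods are hypothetical, and it assumes a logged-in oc client, the namespace and script path shown in this comment, and oc exec in place of oc rsh for non-interactive use.

```shell
#!/bin/sh
# Sketch only: both function names are hypothetical helpers.

# Reads a script on stdin; succeeds only if a curl invocation is
# directly prefixed with NSS_SDB_USE_CACHE=no (with or without "env").
has_nss_cache_disabled() {
  grep -Eq '(^|[[:space:]])NSS_SDB_USE_CACHE=no[[:space:]]+curl([[:space:]]|$)'
}

# Checks every etcd-quorum-guard pod; returns non-zero if any pod's
# script lacks the setting. Assumes the namespace and path used above.
check_quorum_guard_pods() {
  ns=openshift-machine-config-operator
  rc=0
  for pod in $(oc -n "$ns" get pods -o name | grep etcd-quorum-guard); do
    # "-o name" prints "pod/<name>"; strip the prefix for oc exec.
    name=${pod#pod/}
    if oc -n "$ns" exec "$name" -- cat /usr/local/bin/etcd-quorum-guard.sh \
        | has_nss_cache_disabled; then
      echo "OK      $name"
    else
      echo "MISSING $name"
      rc=1
    fi
  done
  return "$rc"
}
```

Running check_quorum_guard_pods against a cluster prints one line per pod. The text check is kept in its own function so it can be exercised on a saved copy of the script without cluster access.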

Comment 4 errata-xmlrpc 2020-07-13 17:27:24 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2020:2409

