Bug 1823677

Summary: (regression) etcd-quorum-guard is reporting abnormal memory usage
Product: OpenShift Container Platform
Reporter: Vadim Rutkovsky <vrutkovs>
Component: Machine Config Operator
Assignee: Sam Batschelet <sbatsche>
Status: CLOSED ERRATA
QA Contact: Michael Nguyen <mnguyen>
Severity: high
Docs Contact:
Priority: unspecified    
Version: 4.5   
Target Milestone: ---   
Target Release: 4.5.0   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of:
: 1824137
Environment:
Last Closed: 2020-07-13 17:27:24 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:
Bug Depends On:    
Bug Blocks: 1824137    

Description Vadim Rutkovsky 2020-04-14 09:07:13 UTC
Description of problem:

https://github.com/openshift/machine-config-operator/pull/1552 reworked the etcd-quorum-guard deployment and left the NSS_SDB_USE_CACHE=no setting out of the curl command. As a result, the etcd quorum guards have started leaking memory in 4.5/4.4.

Similar issue: #1706625
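
For illustration only (a minimal sketch, not the operator's actual script): the setting has to be present in the environment of the curl process itself, which `env` arranges for a single command without touching the calling shell. The wrapper name `run_without_nss_cache` is hypothetical.

```shell
# Hypothetical wrapper: run one command with the NSS database cache disabled.
# env sets NSS_SDB_USE_CACHE=no only for that command's process.
run_without_nss_cache() {
    env NSS_SDB_USE_CACHE=no "$@"
}

# Show that the variable is visible to the child process.
run_without_nss_cache sh -c 'echo "NSS_SDB_USE_CACHE=$NSS_SDB_USE_CACHE"'
```

The health check would then be invoked as `run_without_nss_cache curl --silent --max-time 2 ...`, with the certificate options and URL used by the quorum guard.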

Comment 3 Michael Nguyen 2020-04-20 14:53:58 UTC
Verified on:
$ oc get clusterversion
NAME      VERSION                             AVAILABLE   PROGRESSING   SINCE   STATUS
version   4.5.0-0.nightly-2020-04-18-184707   True        False         48m     Cluster version is 4.5.0-0.nightly-2020-04-18-184707


$ oc -n openshift-machine-config-operator get pods
NAME                                        READY   STATUS    RESTARTS   AGE
etcd-quorum-guard-848d7db55d-d8lpv          1/1     Running   0          57m
etcd-quorum-guard-848d7db55d-hdwz6          1/1     Running   0          57m
etcd-quorum-guard-848d7db55d-vldlz          1/1     Running   0          57m
machine-config-controller-b9f88cf4b-jpjnj   1/1     Running   0          61m
machine-config-daemon-2n55f                 2/2     Running   0          51m
machine-config-daemon-7f48d                 2/2     Running   0          51m
machine-config-daemon-fqkl5                 2/2     Running   0          61m
machine-config-daemon-qdnfk                 2/2     Running   0          60m
machine-config-daemon-rsv47                 2/2     Running   0          51m
machine-config-daemon-ztfpw                 2/2     Running   0          60m
machine-config-operator-8bc8b48d9-t6b5z     1/1     Running   0          73m
machine-config-server-286zg                 1/1     Running   0          60m
machine-config-server-5p5ms                 1/1     Running   0          60m
machine-config-server-qxgf4                 1/1     Running   0          61m
$ oc -n openshift-machine-config-operator rsh etcd-quorum-guard-848d7db55d-d8lpv  cat /usr/local/bin/etcd-quorum-guard.sh
env NSS_SDB_USE_CACHE=no curl --silent --max-time 2 --cert "/mnt/kube/system\:etcd-peer-ip-10-0-142-108.ec2.internal.crt" --key "/mnt/kube/system:etcd-peer-ip-10-0-142-108.ec2.internal.key" --cacert "/mnt/kube/ca.crt" "https://10.0.142.108:2379/health"
$ oc -n openshift-machine-config-operator rsh etcd-quorum-guard-848d7db55d-hdwz6  cat /usr/local/bin/etcd-quorum-guard.sh
env NSS_SDB_USE_CACHE=no curl --silent --max-time 2 --cert "/mnt/kube/system\:etcd-peer-ip-10-0-151-170.ec2.internal.crt" --key "/mnt/kube/system:etcd-peer-ip-10-0-151-170.ec2.internal.key" --cacert "/mnt/kube/ca.crt" "https://10.0.151.170:2379/health"
$ oc -n openshift-machine-config-operator rsh etcd-quorum-guard-848d7db55d-vldlz  cat /usr/local/bin/etcd-quorum-guard.sh
env NSS_SDB_USE_CACHE=no curl --silent --max-time 2 --cert "/mnt/kube/system\:etcd-peer-ip-10-0-142-255.ec2.internal.crt" --key "/mnt/kube/system:etcd-peer-ip-10-0-142-255.ec2.internal.key" --cacert "/mnt/kube/ca.crt" "https://10.0.142.255:2379/health"
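
The per-pod check above could be scripted roughly as follows. This is a sketch under assumptions: the function names are hypothetical, pods are matched by name rather than by label, and `check_quorum_guards` needs a logged-in `oc` client, so it is only defined here and not run.

```shell
# Succeeds if the script text on stdin sets NSS_SDB_USE_CACHE=no on the
# same line as, and before, the curl invocation.
script_disables_nss_cache() {
    grep -Eq '(^|[[:space:]])NSS_SDB_USE_CACHE=no[[:space:]].*curl'
}

# Hypothetical helper: report the setting for every etcd-quorum-guard pod.
# The namespace and script path are taken from the commands above.
check_quorum_guards() {
    ns=openshift-machine-config-operator
    for pod in $(oc -n "$ns" get pods -o name | grep etcd-quorum-guard); do
        if oc -n "$ns" exec "${pod#pod/}" -- cat /usr/local/bin/etcd-quorum-guard.sh \
            | script_disables_nss_cache; then
            echo "OK      $pod"
        else
            echo "MISSING $pod"
        fi
    done
}
```

Running `check_quorum_guards` against a fixed cluster should print one `OK` line per quorum guard pod.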

Comment 4 errata-xmlrpc 2020-07-13 17:27:24 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2020:2409