Bug 1882748

Summary: search-operator Pod OOMKilled After Upgrading OpenShift
Product: Red Hat Advanced Cluster Management for Kubernetes
Reporter: Chris Keller <ckeller>
Component: Search / Analytics
Assignee: Jorge Padilla <jpadilla>
Status: CLOSED ERRATA
QA Contact: Song Lai <slai>
Severity: high
Docs Contact: Mikela Dockery <mdockery>
Priority: high
Version: rhacm-2.0.z
CC: gghezzo
Flags: gghezzo: rhacm-2.0.z+
Target Release: rhacm-2.0.4
Hardware: Unspecified
OS: Unspecified
Fixed In Version: rhacm-2.0.4
Doc Type: If docs needed, set a value
Last Closed: 2020-10-22 11:23:44 UTC
Type: Bug
Attachments:
Graph of search-operator Pod Memory Utilization from Grafana (flags: none)
oc describe for search-operator (flags: none)

Description Chris Keller 2020-09-25 14:40:58 UTC
Description of problem:

After deploying ACM and subsequently upgrading OCP on the hub cluster, the search-operator pod remains in OOMKilled/CrashLoopBackOff status.


Version-Release number of selected component (if applicable):

ACM 2.0.3


How reproducible:

Varies.


Steps to Reproduce:
1. Install ACM 2.0.3
2. Upgrade OpenShift on hub cluster


Actual results:

search-operator pod remains in OOMKilled/CrashLoopBackOff status.


Expected results:

search-operator pod remains in Running state.


Additional info:

When the search operator restarts, initial memory consumption spikes above the 128Mi limit. After about 5 minutes it levels off at about 45Mi.
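
A memory limit is enforced against usage at each moment, not against an average, so a short startup spike is enough to get the container killed even when steady-state usage is well under the limit. A minimal sketch in Python of that comparison, assuming quantities use only the Ki/Mi/Gi suffixes; the helper names `to_bytes` and `fits` are hypothetical and not part of ACM or OpenShift:

```python
# Binary suffixes used by Kubernetes memory quantities (subset; assumption:
# only Ki/Mi/Gi values or plain byte counts are passed in).
_SUFFIXES = {"Ki": 1024, "Mi": 1024**2, "Gi": 1024**3}


def to_bytes(quantity: str) -> int:
    """Convert a quantity such as '128Mi' to a byte count."""
    for suffix, factor in _SUFFIXES.items():
        if quantity.endswith(suffix):
            return int(quantity[: -len(suffix)]) * factor
    return int(quantity)  # no suffix: already a plain byte count


def fits(usage: str, limit: str) -> bool:
    """True if a single usage sample is within the container limit."""
    return to_bytes(usage) <= to_bytes(limit)


# Steady-state figure from this report against the original limit.
print(fits("45Mi", "128Mi"))
```

Any sample taken during the startup spike would fail the same check against 128Mi, which matches the OOMKilled status reported here.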

I was able to work around the issue by raising the memory limit in the search-operator deployment to 256Mi.
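
For anyone applying the same workaround, here is a sketch in Python that builds a patch body setting the memory limit to 256Mi. The function name `memory_limit_patch` is hypothetical, and the container name "search-operator" is an assumption; check the attached oc describe output for the actual name in your deployment.

```python
import json


def memory_limit_patch(container: str, limit: str) -> dict:
    """Build a strategic-merge patch that sets one container's memory limit."""
    return {
        "spec": {
            "template": {
                "spec": {
                    "containers": [
                        {
                            "name": container,
                            "resources": {"limits": {"memory": limit}},
                        }
                    ]
                }
            }
        }
    }


# Assumption: the container is named "search-operator"; verify before use.
patch = memory_limit_patch("search-operator", "256Mi")
print(json.dumps(patch))
```

The printed JSON could then be supplied to `oc patch deployment <name> -n <namespace> --type=strategic -p '<json>'`, with the deployment name and namespace filled in for your cluster.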

Graph from Grafana showing memory utilization above the 128Mi limit is attached. Also attaching output of oc describe pod.

Comment 1 Chris Keller 2020-09-25 14:41:41 UTC
Created attachment 1716631
Graph of search-operator Pod Memory Utilization from Grafana

Comment 2 Chris Keller 2020-09-25 14:44:59 UTC
Created attachment 1716632
oc describe for search-operator

Comment 3 Jorge Padilla 2020-09-25 17:32:28 UTC
I've increased the default memory limit to 256Mi as a quick fix to prevent this from happening. The change is merged for the next z-release, 2.0.4, and for 2.1.

I'll keep this issue open to investigate why the memory consumption has increased and track a permanent code fix.

Comment 8 errata-xmlrpc 2020-10-22 11:23:44 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Moderate: Red Hat Advanced Cluster Management for Kubernetes version 2.0.4 images), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2020:4304