Bug 1882748 - search-operator Pod OOMKilled After Upgrading OpenShift
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Advanced Cluster Management for Kubernetes
Classification: Red Hat
Component: Search / Analytics
Version: rhacm-2.0.z
Hardware: Unspecified
OS: Unspecified
Priority: high
Severity: high
Target Milestone: ---
Target Release: rhacm-2.0.4
Assignee: Jorge Padilla
QA Contact: Song Lai
Docs Contact: Mikela Dockery
URL:
Whiteboard:
Depends On:
Blocks:
 
Reported: 2020-09-25 14:40 UTC by Chris Keller
Modified: 2021-04-09 16:35 UTC
CC List: 1 user

Fixed In Version: rhacm-2.0.4
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2020-10-22 11:23:44 UTC
Target Upstream Version:
Embargoed:
gghezzo: rhacm-2.0.z+


Attachments
Graph of search-operator Pod Memory Utilization from Grafana (20.88 KB, image/png), 2020-09-25 14:41 UTC, Chris Keller
oc describe for search-operator (4.96 KB, text/plain), 2020-09-25 14:44 UTC, Chris Keller


Links
GitHub open-cluster-management backlog issue 5712, last updated 2020-10-07 17:57:30 UTC
Red Hat Product Errata RHSA-2020:4304, last updated 2020-10-22 11:23:48 UTC

Description Chris Keller 2020-09-25 14:40:58 UTC
Description of problem:

After deploying ACM and subsequently upgrading OCP on the hub cluster, the search-operator pod remains in OOMKilled/CrashLoopBackOff status.


Version-Release number of selected component (if applicable):

ACM 2.0.3


How reproducible:

Varies.


Steps to Reproduce:
1. Install ACM 2.0.3
2. Upgrade OpenShift on the hub cluster (see the sketch below)
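
As a rough sketch of step 2 (exact channel and target version will vary, and cluster-admin access on the hub cluster is assumed), the upgrade can be driven from the CLI:

  # Check the current version and the available updates
  oc adm upgrade

  # Move to the latest release available in the current channel
  # (or use --to=<version> to target a specific release)
  oc adm upgrade --to-latest=true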


Actual results:

search-operator pod remains in OOMKilled/CrashLoopBackOff status.


Expected results:

search-operator pod remains in Running state.


Additional info:

When the search operator restarts, its initial memory consumption spikes above the 128Mi limit. After about 5 minutes it levels off at around 45Mi.
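
The commands below assume ACM is installed in the open-cluster-management namespace (adjust if it was installed elsewhere); this is roughly how the spike and the resulting pod state can be observed from the CLI:

  # Show current memory usage of the search-operator pod
  oc adm top pods -n open-cluster-management | grep search-operator

  # Check the restart count and the OOMKilled/CrashLoopBackOff state
  oc get pods -n open-cluster-management | grep search-operator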

I was able to work around the issue by raising the memory limit on the search-operator deployment to 256Mi, along the lines of the sketch below.
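
A minimal sketch of that workaround (deployment name and namespace assumed; note that the operator lifecycle manager may eventually reconcile the value back to the default):

  # Raise the memory limit on the search-operator deployment to 256Mi
  oc set resources deployment/search-operator \
      -n open-cluster-management \
      --limits=memory=256Mi

  # Wait for the updated pod to roll out and stay Running
  oc rollout status deployment/search-operator -n open-cluster-management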

A Grafana graph showing memory utilization above the 128Mi limit is attached, along with the output of oc describe pod.

Comment 1 Chris Keller 2020-09-25 14:41:41 UTC
Created attachment 1716631 [details]
Graph of search-operator Pod Memory Utilization from Grafana

Comment 2 Chris Keller 2020-09-25 14:44:59 UTC
Created attachment 1716632 [details]
oc describe for search-operator

Comment 3 Jorge Padilla 2020-09-25 17:32:28 UTC
I've increased the default memory limit to 256Mi as a quick fix to prevent this from happening. The change is merged for the next z-release, 2.0.4, and for 2.1.

I'll keep this issue open to investigate why the memory consumption has increased and track a permanent code fix.
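
Once a build with the change is installed, the bumped default can be verified with something along these lines (namespace and container index are assumptions):

  # Print the memory limit configured on the search-operator container
  oc get deployment search-operator -n open-cluster-management \
      -o jsonpath='{.spec.template.spec.containers[0].resources.limits.memory}'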

Comment 8 errata-xmlrpc 2020-10-22 11:23:44 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Moderate: Red Hat Advanced Cluster Management for Kubernetes version 2.0.4 images), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2020:4304

