Bug 2035995 - [GSS] odf-operator-controller-manager is in CLBO with OOM kill while upgrading OCS-4.8 to ODF-4.9
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat OpenShift Data Foundation
Classification: Red Hat Storage
Component: odf-operator
Version: 4.9
Hardware: Unspecified
OS: Unspecified
Priority: urgent
Severity: urgent
Target Milestone: ---
Target Release: ODF 4.10.0
Assignee: Nitin Goyal
QA Contact: akarsha
URL:
Whiteboard:
Depends On:
Blocks: 2036009
 
Reported: 2021-12-29 07:12 UTC by Bipin Kunal
Modified: 2023-08-09 17:00 UTC (History)
CC: 11 users

Fixed In Version: 4.10.0-113
Doc Type: No Doc Update
Doc Text:
Clone Of:
Clones: 2036009
Environment:
Last Closed: 2022-04-13 18:50:46 UTC
Embargoed:


Attachments


Links
System ID Private Priority Status Summary Last Updated
Github red-hat-storage odf-operator pull 161 0 None open bundle: increase memory of the controller 2021-12-29 09:42:34 UTC
Red Hat Product Errata RHSA-2022:1372 0 None None None 2022-04-13 18:53:23 UTC

Description Bipin Kunal 2021-12-29 07:12:32 UTC
Description of problem (please be as detailed as possible and provide log
snippets):
 odf-operator-controller-manager is in CrashLoopBackOff (CLBO).
 Looking at the describe output, we notice that the pod was OOM-killed.

 It seems that the default memory limit is lower than required. Unfortunately, we have only two known occurrences of this issue so far. It is still an open question why we did not see this issue in the QE environment.

Version of all relevant components (if applicable):

Upgrade from OCS-4.8 to ODF-4.9


Does this issue impact your ability to continue to work with the product
(please explain in detail what is the user impact)?

Yes.

Is there any workaround available to the best of your knowledge?

Yes, increasing the memory limit helps.
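
As a sketch of that workaround (assumptions: the operator runs in the `openshift-storage` namespace, the container is named `manager`, and the values shown are illustrative, not the shipped defaults), the limit would be raised on the manager container along these lines:

```yaml
# Hypothetical fragment of the odf-operator deployment spec
# (spec.template.spec.containers in the Deployment, or the
# corresponding deployments entry in the ClusterServiceVersion
# for an OLM-managed operator). Values are illustrative only.
containers:
  - name: manager
    resources:
      limits:
        cpu: 100m
        memory: 512Mi   # raised above the default limit that triggered the OOM kill
      requests:
        cpu: 100m
        memory: 256Mi
```

Note that for an OLM-managed operator, edits made directly to the Deployment are typically reverted by the operator lifecycle manager, which is presumably why the permanent fix landed in the operator bundle (the linked PR 161, "bundle: increase memory of the controller").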


Rate from 1 - 5 the complexity of the scenario you performed that caused this
bug (1 - very simple, 5 - very complex)?


Is this issue reproducible?
Two instances of the issue have been reported: one from a customer case and the other by an internal associate.

Can this issue reproduce from the UI?


If this is a regression, please provide more details to justify this:


Steps to Reproduce:
1. Upgrade from OCS-4.8 to ODF-4.9
2.
3.


Actual results:

The pod is in CrashLoopBackOff (CLBO).


Expected results:

The pod should be in Running state.

Comment 18 errata-xmlrpc 2022-04-13 18:50:46 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Important: Red Hat OpenShift Data Foundation 4.10.0 enhancement, security & bug fix update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2022:1372

Comment 19 errata-xmlrpc 2022-04-13 18:53:13 UTC
(Duplicate of the advisory notice in comment 18.)

Comment 20 Daniel Del Ciancio 2022-06-15 13:04:29 UTC
Hello,

My customer is still seeing this same behavior on 4.10.3, even with the fixed version (i.e. memory limits = 300Mi).

Does the limit need to be increased even further? Is this a legitimate increase in the memory needed, or is it indicative of a memory leak?

