Bug 2272664 - Default resource allocation for ODF Noobaa BackingStore is too low when used as storage for Internal Registry
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat OpenShift Data Foundation
Classification: Red Hat Storage
Component: Multi-Cloud Object Gateway
Version: 4.15
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: unspecified
Target Milestone: ---
Target Release: ODF 4.16.0
Assignee: Jacky Albo
QA Contact: Tiffany Nguyen
URL:
Whiteboard:
Depends On:
Blocks: 2260844
 
Reported: 2024-04-02 13:15 UTC by Matthew Secaur
Modified: 2024-07-17 14:12 UTC
CC List: 7 users

Fixed In Version: 4.16.0-75
Doc Type: Enhancement
Doc Text:
.Increase in resource allocation for OpenShift Data Foundation MCG BackingStore
The default CPU and memory resources for the PV pool are increased to 999m and 1Gi respectively, to enable more resource allocation for the OpenShift Data Foundation MCG BackingStore.
Clone Of:
Environment:
Last Closed: 2024-07-17 13:17:33 UTC
Embargoed:




Links
System ID Private Priority Status Summary Last Updated
Github noobaa noobaa-operator pull 1341 0 None open Fix default resource request/limit for PV pool pods 2024-04-10 11:38:43 UTC
Red Hat Product Errata RHSA-2024:4591 0 None None None 2024-07-17 13:17:47 UTC

Description Matthew Secaur 2024-04-02 13:15:35 UTC
Description of problem (please be as detailed as possible and provide log
snippets):
Application builds can fail with an error 500 when using the Internal Registry configured with Noobaa as the backend.

Version of all relevant components (if applicable):
4.14.z, 4.15.z

Does this issue impact your ability to continue to work with the product
(please explain in detail what is the user impact)?
When Noobaa is used as the backend for the default image registry and is configured with default resource settings, push operations fail because the backingstore pod gets OOMKilled.

Is there any workaround available to the best of your knowledge?
Increase the resource requests and limits as per https://access.redhat.com/solutions/7062227
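
As a rough sketch of the kind of override the linked article describes, the values from the "Additional info" below could be applied to the default backingstore. The spec.pvPool.resources field path is an assumption here; follow the KCS article for the supported procedure:

    # Field path assumed, values taken from this report's workaround; verify
    # against the KCS article before applying.
    oc -n openshift-storage patch backingstore noobaa-default-backing-store \
      --type merge \
      -p '{"spec":{"pvPool":{"resources":{"requests":{"cpu":"600m","memory":"800Mi"},"limits":{"cpu":"600m","memory":"800Mi"}}}}}'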

Rate from 1 - 5 the complexity of the scenario you performed that caused this
bug (1 - very simple, 5 - very complex)?
1

Is this issue reproducible?
Yes

Can this issue be reproduced from the UI?
Unknown

Steps to Reproduce:
    1. Install a cluster
    2. Install/Configure ODF
    3. Configure the internal OCP registry to use a Noobaa bucket as back-end storage via S3 (a sketch of this configuration follows these steps)
    4. Run an S2I build
    5. The noobaa-default-backing-store-noobaa-pod-XXXXXX pod in the openshift-storage namespace will get OOMKilled during the push to the registry
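
A minimal sketch of the configuration in step 3, assuming a bucket and S3 credentials have already been provisioned from Noobaa; the bucket name, keys, and in-cluster S3 endpoint below are placeholders/assumptions, and the documented registry storage procedure should be followed:

    # Hypothetical credentials; substitute the values generated by NooBaa.
    oc -n openshift-image-registry create secret generic \
      image-registry-private-configuration-user \
      --from-literal=REGISTRY_STORAGE_S3_ACCESSKEY=<access-key> \
      --from-literal=REGISTRY_STORAGE_S3_SECRETKEY=<secret-key>

    # Point the registry operator at the NooBaa S3 endpoint (assumed here to be
    # the in-cluster "s3" service in openshift-storage).
    oc patch configs.imageregistry.operator.openshift.io cluster --type merge \
      -p '{"spec":{"storage":{"s3":{"bucket":"<bucket-name>","region":"us-east-1","regionEndpoint":"https://s3.openshift-storage.svc"}}}}'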


Actual results:
Builds (and other operations to the internal registry) will fail with an error 500, and the Noobaa BackingStore will go into Phase:Rejected (because the backingstore pod gets OOMKilled).

Expected results:
Builds and other operations to the internal registry should succeed.

Additional info:
The default resource allocation for the backingstore pod appears to be 100m/400Mi. I increased these values to 600m/800Mi and the problem went away.
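
For anyone checking their own cluster, one way to inspect the current requests/limits on the backingstore pod (pod name pattern taken from the steps above) and to confirm the OOMKill:

    oc -n openshift-storage get pods | grep noobaa-default-backing-store-noobaa-pod
    oc -n openshift-storage get pod <pod-name> \
      -o jsonpath='{.spec.containers[0].resources}'
    # An OOMKilled restart shows up under "Last State" in the pod description:
    oc -n openshift-storage describe pod <pod-name> | grep -A3 'Last State'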

Comment 3 Nimrod Becker 2024-04-02 13:46:57 UTC
Changing the default behaviour is something for Eran to respond to

Comment 4 Nimrod Becker 2024-04-02 14:02:06 UTC
4.16 is at feature complete; updating flag

Comment 8 Elad 2024-04-10 09:32:37 UTC
Agreed. For verification, we should run regression over a cluster with a PV pool backingstore

Comment 13 Tiffany Nguyen 2024-04-24 05:40:20 UTC
Verified using build v4.16.0-81.
Ran regression PV test cases and didn't see any issues with noobaa-default-backing-store; all test cases passed as expected.
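
For reference, a couple of standard checks that can accompany such a regression run; the 999m/1Gi values come from the Doc Text above, and the exact pod name must be substituted:

    # The BackingStore should report Ready rather than Rejected
    oc -n openshift-storage get backingstore noobaa-default-backing-store \
      -o jsonpath='{.status.phase}'
    # The PV pool pod's default requests/limits should now reflect 999m / 1Gi
    oc -n openshift-storage get pod <noobaa-default-backing-store-noobaa-pod-name> \
      -o jsonpath='{.spec.containers[0].resources}'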

Comment 14 errata-xmlrpc 2024-07-17 13:17:33 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Important: Red Hat OpenShift Data Foundation 4.16.0 security, enhancement & bug fix update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2024:4591

