Bug 1901442

Summary: Backing store in a state of IO_ERROR (when using non-production memory limits for Noobaa)
Product: [Red Hat Storage] Red Hat OpenShift Data Foundation
Reporter: aberner
Component: Multi-Cloud Object Gateway
Assignee: Nimrod Becker <nbecker>
Status: CLOSED WONTFIX
QA Contact: Raz Tamir <ratamir>
Severity: low
Docs Contact:
Priority: low
Version: 4.6
CC: assingh, ebenahar, etamir, nbecker, ocs-bugs, odf-bz-bot
Target Milestone: ---   
Target Release: ---   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2021-09-01 06:51:07 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:
Attachments:
Description                Flags
Prometheus screenshot 1    none
Prometheus screenshot 2    none
Prometheus screenshot 3    none
Prometheus screenshot 4    none

Comment 7 aberner 2020-12-02 08:12:53 UTC
We were also able to reproduce the issue manually on vSphere on a resource-limited cluster (the same 500 MB endpoint memory limit), while the production cluster passed.
After further investigation with Ohad, we found that the cause of this issue is Node.js allocating memory beyond the pod's limit, which causes the pod to restart.
This is why the platform is not relevant: the issue will happen on any platform where the resource is limited.

Since the issue will not reproduce in a production environment (by default Node.js limits its memory usage to less than 2 GB), the next course of action should be to find a way to limit Node.js to the pod's memory limit (for the dev environment).
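
One possible way to tie the Node.js heap to the pod limit, shown only as a rough sketch and not as the fix adopted for this bug: a small wrapper could read the container's cgroup memory limit and derive a value for the V8 flag --max-old-space-size (in MB), which can be passed to Node.js through the NODE_OPTIONS environment variable. The cgroup file paths, the 0.75 headroom fraction, the 2048 MB cap, and all function names below are assumptions made for illustration; the flag only bounds the old-generation heap, so some headroom for buffers and native memory is still needed.

```python
from pathlib import Path

# Assumed cgroup locations (v2 first, then v1); the real image may differ.
CGROUP_LIMIT_FILES = (
    Path("/sys/fs/cgroup/memory.max"),
    Path("/sys/fs/cgroup/memory/memory.limit_in_bytes"),
)
NODE_DEFAULT_CAP_MB = 2048  # assumption: the "less than 2 GB" default noted above
HEAP_FRACTION = 0.75        # hypothetical headroom left for non-heap memory


def read_pod_limit_bytes(paths=CGROUP_LIMIT_FILES):
    """Return the container memory limit in bytes, or None if unknown/unlimited."""
    for path in paths:
        try:
            raw = path.read_text().strip()
        except OSError:
            continue  # file missing or unreadable, try the next candidate
        if raw.isdigit():
            return int(raw)
        # cgroup v2 reports "max" when no limit is set; fall through
    return None


def max_old_space_mb(limit_bytes, fraction=HEAP_FRACTION, cap_mb=NODE_DEFAULT_CAP_MB):
    """Heap size in MB derived from the pod limit, or None if no limit is known."""
    if limit_bytes is None:
        return None
    limit_mb = limit_bytes // (1024 * 1024)
    return max(1, min(cap_mb, int(limit_mb * fraction)))


def node_options(limit_bytes):
    """Build a NODE_OPTIONS string; empty when no limit could be determined."""
    heap_mb = max_old_space_mb(limit_bytes)
    return "" if heap_mb is None else f"--max-old-space-size={heap_mb}"


if __name__ == "__main__":
    # Print the value a startup script could export as NODE_OPTIONS.
    print(node_options(read_pod_limit_bytes()))
```

With the 500 MB limit used in the reproduction, this sketch would yield --max-old-space-size=375, leaving roughly a quarter of the pod limit for memory outside the V8 heap.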


I'm attaching screenshots from Prometheus that clearly show the memory spike right before the endpoint restarts.

Comment 8 aberner 2020-12-02 08:19:29 UTC
Created attachment 1735505
Prometheus screenshot 1

Comment 9 aberner 2020-12-02 08:21:31 UTC
Created attachment 1735506
Prometheus screenshot 2

Comment 10 aberner 2020-12-02 08:21:57 UTC
Created attachment 1735507
Prometheus screenshot 3

Comment 11 aberner 2020-12-02 08:23:06 UTC
Created attachment 1735508
Prometheus screenshot 4