We were also able to reproduce the issue manually in vSphere on a limited cluster (same 500 MB memory endpoint), while the production cluster passed. After further investigation with Ohad, we found that the cause is Node.js allocating memory beyond the pod's limit, which causes the pod to restart. The platform is therefore not relevant: this will happen on any platform where the resource is limited. The issue does not reproduce on a production environment, since by default Node.js limits its memory usage to less than 2 GB. The next course of action should be to find a way to limit Node.js to the pod's memory limit (for the dev env). I'm attaching screenshots from Prometheus that clearly show the memory spike right before the endpoint restarts.
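One possible way to cap Node.js at the pod's limit is to derive a V8 heap size from the pod memory limit and pass it through the `--max-old-space-size` flag (in MiB), for example via `NODE_OPTIONS`. The following is only a rough sketch, not what the endpoint currently does: the helper names `maxOldSpaceMiB` and `nodeOptionsFor` are hypothetical, the 25% headroom for non-heap memory is an assumed value that would need tuning, and the 500 MB limit is taken here to mean 500 MiB.

```typescript
// Hypothetical helper: derive a V8 old-space limit (in MiB) from a pod memory limit.
// headroomFraction is an ASSUMED safety margin for memory outside the V8 heap
// (buffers, native allocations, stacks); 0.25 is illustrative, not a measured value.
function maxOldSpaceMiB(podLimitBytes: number, headroomFraction: number = 0.25): number {
  if (!isFinite(podLimitBytes) || podLimitBytes <= 0) {
    throw new RangeError("podLimitBytes must be a positive, finite number");
  }
  if (headroomFraction < 0 || headroomFraction >= 1) {
    throw new RangeError("headroomFraction must be in [0, 1)");
  }
  const limitMiB = podLimitBytes / (1024 * 1024);
  // Round down so the heap cap never exceeds the intended share of the pod limit.
  return Math.max(1, Math.floor(limitMiB * (1 - headroomFraction)));
}

// Hypothetical helper: build the value one might assign to NODE_OPTIONS.
function nodeOptionsFor(podLimitBytes: number): string {
  return `--max-old-space-size=${maxOldSpaceMiB(podLimitBytes)}`;
}

// Example: a pod limited to 500 MiB (assumed reading of the 500 MB endpoint limit).
const devPodLimitBytes = 500 * 1024 * 1024;
const devNodeOptions = nodeOptionsFor(devPodLimitBytes); // "--max-old-space-size=375"
```

The resulting string could be set as the `NODE_OPTIONS` environment variable on the dev deployment, so that V8 starts collecting garbage more aggressively before the container reaches its limit. Note that this flag only bounds the V8 old-generation heap, so total process memory can still exceed it.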
Created attachment 1735505 [details] Prometheus screenshot 1
Created attachment 1735506 [details] Prometheus screenshot 2
Created attachment 1735507 [details] Prometheus screenshot 3
Created attachment 1735508 [details] Prometheus screenshot 4