Description of problem:

We have a build defined in our CI process. On a 4.2 OCP cluster it consumes much more memory than it did on the 3.11 cluster.

Version-Release number of selected component (if applicable):

$ oc get clusterversion --kubeconfig ./build01.config
NAME      VERSION                             AVAILABLE   PROGRESSING   SINCE   STATUS
version   4.2.0-0.nightly-2019-10-11-230724   True        False         61d     Cluster version is 4.2.0-0.nightly-2019-10-11-230724

How reproducible:
Always.

Steps to Reproduce:
1. Run a CI rehearsal of the master build on the OCP 4.2 cluster. See an example: https://github.com/openshift/release/pull/6331
2. Tune the memory limit for the artifacts job, starting from 1G (inherited from the 3.11 cluster) up to 16G.

Actual results:
Still failing even with 16G as the memory limit.

Expected results:
The build succeeds with limits comparable to the 3.11 cluster.

Additional info:
https://prow.svc.ci.openshift.org/view/gcs/origin-ci-test/pr-logs/pull/openshift_release/6331/rehearse-6331-pull-ci-openshift-origin-master-images/16
Will upload more logs later.
Some output while tuning the limits:

1G ...
$ oc --kubeconfig build01.config get pod -n ci-op-hkw72spt artifacts-build
NAME              READY   STATUS       RESTARTS   AGE
artifacts-build   0/1     Init:Error   0          58m

4G ...
$ oc --kubeconfig build01.config -n ci-op-hxnb16tj get pod
NAME              READY   STATUS           RESTARTS   AGE
artifacts-build   0/1     Init:OOMKilled   0          73s

8G ...
$ oc --kubeconfig build01.config -n ci-op-nbir7n6t get pod
NAME              READY   STATUS   RESTARTS   AGE
artifacts-build   0/1     Error    0          8m22s
The most interesting failure mode to me is this one, which happens at 8G+ limits:

error: build error: Image used by e53c8abb5114ef4ad8fd485b...3f144c0c9529f5a70f0dcbc7c94: image is in use by a container

We should get the buildah team's opinion on what could cause that. We will probably also need to engage them to profile the memory usage, since it's almost certainly all occurring within the buildah execution (to understand why the builds are consuming upwards of 8 GB of memory or more in 4.x). My guess is some sort of runaway recursive activity or bad file processing.
full buildah output for the "image in use" error: https://storage.googleapis.com/origin-ci-test/pr-logs/pull/openshift_release/6331/rehearse-6331-pull-ci-openshift-origin-master-images/13/build-log.txt
Moving to active development branch (4.4) for investigation.
If this happens in 4.2 but not 4.3, then I think that points to needing https://github.com/containers/storage/pull/435, which was first included in storage v1.13.4. The 4.2 branch currently uses v1.13.2, which predates that fix. The 4.3 branch uses a version between v1.15.2 and v1.15.3 which includes that fix.
After applying the workaround (foo=bar):

2G
$ oc --kubeconfig ~/.kube/build01.config -n ci-op-571cm788 get pod artifacts-build
NAME              READY   STATUS           RESTARTS   AGE
artifacts-build   0/1     Init:OOMKilled   0          113s

4G
Working.

It seems that memory usage on 4.2 is still worse than on 3.11; see the Prometheus screenshot: https://user-images.githubusercontent.com/4013349/71142154-8bcdc580-21e4-11ea-8f69-12aa237a7cc4.png But the workaround definitely unblocks me.
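For reference, this is roughly what the workaround amounts to at the BuildConfig level. A minimal sketch, assuming a docker-strategy build; the actual change is carried in ci-tools, and the name/value pair is arbitrary:

spec:
  strategy:
    dockerStrategy:
      env:
      - name: foo    # arbitrary variable; only its presence matters for the workaround
        value: bar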
@Nalin assigning this to you. Based on the state from last month:

1. Adding an arbitrary environment variable to the BuildConfig's build strategy eliminated "image in use" errors and reduced the memory consumed. Is this a separate bug we need to investigate?
2. This may have been fixed in 4.3 - you suggested we need to upgrade the version of containers/storage to 1.13.4. Is this correct?
$ oc get clusterversions.config.openshift.io --context build01
NAME      VERSION   AVAILABLE   PROGRESSING   SINCE   STATUS
version   4.3.0     True        False         4d7h    Cluster version is 4.3.0

https://user-images.githubusercontent.com/4013349/73690596-b42f5780-469e-11ea-8505-ebdb72113370.png
https://user-images.githubusercontent.com/4013349/73690613-c0b3b000-469e-11ea-9603-2fe874f0ad8a.png

With this change, it works well now: https://github.com/openshift/release/commit/e3a07db74c9a0bd03302e3eaaa4d87ad1ed552ca

Note that we still need to change the values in `request` compared to the 3.11 cluster, but the numbers are much better than on 4.2.0.
Comparison:

On 3.11: https://prometheus-k8s-openshift-monitoring.svc.ci.openshift.org/graph?g0.range_input=12h&g0.expr=sum(max_over_time(container_memory_rss%7Bpod_name%3D%22artifacts-build%22%7D%5B3h%5D))%20by%20(namespace%2C%20pod)&g0.tab=0
Usual usage is at ~80M.

On 4.3: https://prometheus-k8s-openshift-monitoring.apps.build01.ci.devcluster.openshift.com/graph?g0.range_input=6h&g0.expr=sum(max_over_time(container_memory_rss%7Bpod%3D%22artifacts-build%22%7D%5B3h%5D))%20by%20(namespace%2C%20pod)&g0.tab=0
Usual usage is ~500M.

This is a critical blocker for using OpenShift 4.x for the CI build farm. Could we get the priority of this to reflect that, please?
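Decoded, the queries behind those links are (note the label rename from pod_name on 3.11 to pod on 4.3):

# 3.11
sum(max_over_time(container_memory_rss{pod_name="artifacts-build"}[3h])) by (namespace, pod)

# 4.3
sum(max_over_time(container_memory_rss{pod="artifacts-build"}[3h])) by (namespace, pod)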
Created attachment 1657633 [details] 3.11 memory usage
Created attachment 1657634 [details] 4.3 memory usage
Also, to be clear, we are seeing this WITH the env-var workaround: https://github.com/openshift/ci-tools/pull/337/files
Split out the env-var workaround into https://bugzilla.redhat.com/show_bug.cgi?id=1798192 - this shouldn't be necessary.
Thanks @Steve for providing the memory graphs - these provided a key insight into how builds work in OCP 4 vs. 3.11.

In OCP 3.x, builds run in a Docker container that is completely unmanaged by Kubernetes. Build pods mount the Docker socket from the host and essentially run a "remote Docker" build. As a result these workloads are effectively invisible to the cluster - I don't think we had any means of reporting "total build resource consumption" in real time. The observed memory usage represents the build container sending commands to the Docker socket and awaiting responses. I find it hard to believe real OCP image builds would consume less than 80 MB of memory.

In OCP 4.x, builds run completely within the build container via buildah. The resources consumed represent the total consumed by buildah and the "sub-containers" that are created during the build process. Building OCP 4 projects could easily consume 400+ MB, and the metrics reported in Kubernetes are a better representation of what is consumed.

TL;DR: build containers in OCP 4.x will report more memory and CPU usage - by 1-2 orders of magnitude depending on the workload. However, this will represent the true usage of the build workloads.
I think everything Adam said is accurate, but I do want to note that even on 3.x we ran the docker build container under the same cgroup as the build pod, so the build was constrained by whatever resource limits were applied to the pod (and thus would have run into OOM kills if it exceeded whatever limit was configured on the build in 3.x). So if 3.x builds were able to run with a 100 MB limit (were they? I haven't seen what the build resource limit was configured to in 3.x), then indeed the builds were running within that memory limit. The only thing that wouldn't be accounted for in 3.x is the memory usage of the docker daemon itself (the logic needed to manage the container and commit/push the image).
Actually, I take that back. Since the pod cgroup was used in 3.x, the 3.x Prometheus data should reflect the full usage just like 4.x (though the query required may not be the query Steve used). So I think further investigation of docker build vs. buildah memory usage is warranted here.
Hi Ben,

This is the current limits/requests setup of openshift/origin master for 3.11: https://github.com/openshift/release/blob/ce3af828b5b6c1057accd6aafa8944fb5962065a/ci-operator/config/openshift/origin/openshift-origin-master.yaml#L111-L116

We hit the problem when we tried to run the same build with the same limits/requests, and then started tuning the numbers to make the build work.
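For context, the resources stanza in a ci-operator config has roughly this shape; the key names and values below are illustrative placeholders, not the actual numbers from the linked file:

resources:
  '*':
    requests:
      cpu: 100m
      memory: 200Mi
  artifacts:
    requests:
      cpu: 100m
      memory: 1Gi
    limits:
      memory: 1Gi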
> In OCP 3.x, builds run in a Docker container that is completely unmanaged by Kubernetes. Build pods mount the Docker socket from the host and essentially run a "remote Docker" build. As a result these workloads are effectively invisible to the cluster - I don't think we had any means of reporting "total build resource consumption" in real time. The observed memory usage represents the build container sending commands to the Docker socket and awaiting responses. I find it hard to believe real OCP image builds would consume less than 80 MB of memory.

This is incorrect: 3.11 clusters correctly attributed the child build to the pod cgroup. However, the naive Prometheus queries won't account for that under the container; you have to check the pod. You have to look at memory where {container_name=""}, not sum({container_name!=""}).
Here is the correct query for 3.11 to tell you max memory usage: https://prometheus-k8s-openshift-monitoring.svc.ci.openshift.org/graph?g0.range_input=1h&g0.expr=container_memory_max_usage_bytes%7Bcontainer%3D%22%22%2Cpod_name%3D~%22.*-build%22%7D&g0.tab=0

The console build uses 11G.
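Decoded, that expression is:

container_memory_max_usage_bytes{container="",pod_name=~".*-build"}

i.e. the pod-level cgroup entry (empty container label) for all *-build pods, which is what captures the child build's usage on 3.11.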
Looks like there's a 2-3 GB spike at the end of the 4.3 build (probably during the push) that is extremely suspicious. Just looking at the spikes, I would say it looks like buildah is reading the entire image into memory or something during a push, which would obviously not be good. That is likely to trigger an OOM right at the end, and should not be happening.
@Hongkai can you please attach the build YAML that is generated for the `artifacts-build` build? We need a clear apples-to-apples comparison to determine if memory consumption is in fact increasing with buildah, and if so, where in the build process it is happening.
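Something like this should capture it, assuming the Build object is named artifacts-build in the rehearsal's ci-op-* namespace:

$ oc --kubeconfig build01.config -n <ci-op-namespace> get build artifacts-build -o yaml > artifacts-build.yaml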
Created attachment 1658048 [details] 4.3 prometheus screen shot
Adam, I will bring the YAML tomorrow.
Buildah improvements landed in https://github.com/openshift/builder/pull/126 - we need to verify whether these move us closer to 3.11 parity with respect to memory consumption. Running a build of the openshift-tests image in origin is a good benchmark test case.
Clayton mentioned that the DPTP team saw a major regression in memory usage from 4.2 to 4.3 (they upgraded their cluster from 4.2 to 4.3 and it started using more memory). So while this bug was originally filed against 4.2, it sounds like there may be additional issues in 4.3.

Secondly, when we resolve this, please ensure that an e2e test is added that exercises our memory usage during builds so we don't regress this in the future. A simple test would be to create a build with a memory limit that's just enough to allow the build to succeed (see the sketch below); if that test ever starts failing, we know we've increased our memory usage unexpectedly.
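A minimal sketch of such a guard-rail build, assuming a docker-strategy build; the name, source repo, and the 1Gi limit are placeholders that would need to be tuned to just above today's known-good peak:

apiVersion: build.openshift.io/v1
kind: BuildConfig
metadata:
  name: build-memory-regression-check   # hypothetical name
spec:
  source:
    git:
      uri: https://github.com/openshift/cluster-samples-operator   # illustrative repo
  strategy:
    dockerStrategy: {}
  resources:
    limits:
      memory: 1Gi   # just above the current peak; an OOMKilled build flags a memory regression
  output:
    to:
      kind: ImageStreamTag
      name: build-memory-regression-check:latest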
Created attachment 1662268 [details]
samples-operator-builds

Interesting results using the Samples Operator as a test build repo [1]. When run without resource limits, the build pod consumed ~5.5Gi of RAM. With 4Gi resource limits (identical to what is in openshift/release [2]), the memory limit is respected and build duration was nearly identical.

[1] https://github.com/openshift/cluster-samples-operator
[2] https://github.com/openshift/release/blob/590adedec8ea113740918590a6c295fb3a0c5a8c/ci-operator/config/openshift/cluster-samples-operator/openshift-cluster-samples-operator-master.yaml
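One simple way to watch the build pod's memory while reproducing such a run (assuming the cluster monitoring/metrics stack is available; the namespace and pod name are placeholders):

$ oc -n <build-namespace> adm top pod <build-pod-name>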
Reassigning to Nalin, as he has been looking at this; the move to the Containers component auto-assigned Mrunal.
so we think at least some fixes have been delivered (in 4.3?) for this issue, right?
I'm going to add Nalin as a needinfo for Ben's query: https://bugzilla.redhat.com/show_bug.cgi?id=1784163#c36. I'm not sure Mrunal will have the answer.
(In reply to Ben Parees from comment #36)
> so we think at least some fixes have been delivered (in 4.3?) for this
> issue, right?

Compared to 4.2, 4.3 includes https://github.com/containers/storage/pull/435, which definitely affected memory usage, so yes.
So can we close this out? It's not clear whether there are still issues being seen in 4.3+.
If it's enough to fix the out-of-memory part in 4.3 and later, I think we can close this. If we need to backport the change to 4.2, then we can attach this one to https://github.com/openshift/builder/pull/125.
I don't think it's worth backporting to 4.2 at this point.
The needinfo request[s] on this closed bug have been removed as they have been unresolved for 1000 days