Bug 1744410
| Summary: | [Conformance] s2i build with a quota Building from a template should create an s2i build with a quota and run it | ||
|---|---|---|---|
| Product: | OpenShift Container Platform | Reporter: | ge liu <geliu> |
| Component: | Build | Assignee: | Adam Kaplan <adam.kaplan> |
| Status: | CLOSED DUPLICATE | QA Contact: | wewang <wewang> |
| Severity: | urgent | Docs Contact: | |
| Priority: | unspecified | ||
| Version: | 4.2.0 | CC: | aos-bugs, bparees, jforrest, piqin, wzheng |
| Target Milestone: | --- | Keywords: | Reopened |
| Target Release: | 4.2.0 | ||
| Hardware: | x86_64 | ||
| OS: | Unspecified | ||
| Whiteboard: | |||
| Fixed In Version: | | Doc Type: | If docs needed, set a value |
| Doc Text: | | Story Points: | --- |
| Clone Of: | | Environment: | |
| Last Closed: | 2019-09-19 01:36:34 UTC | Type: | Bug |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | | Category: | --- |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | |||
Description
ge liu
2019-08-22 06:18:46 UTC
Is this failing consistently? From the logs it appears that the build's node may be running out of memory.

This issue has happened in [ci][bare metal] a few times recently:
https://testgrid.k8s.io/redhat-openshift-release-informing#redhat-canary-openshift-ocp-installer-e2e-metal-4.2&sort-by-flakiness=

(In reply to Adam Kaplan from comment #1)
> Is this failing consistently? From the logs it appears that the build's node
> may be running out of memory.

Yes, I can see this frequently: blob:null/cbf1cc77-4343-4e73-b6ee-b09df51d3bfa

I also ran the test manually and got the same out-of-memory error:

    [wzheng@openshift-qe 4.2httpsproxy]$ oc describe builds s2i-build-quota-2
    Name:           s2i-build-quota-2
    Namespace:      wzheng1
    Created:        About a minute ago
    Labels:         buildconfig=s2i-build-quota
                    name=s2i-build-quota
                    openshift.io/build-config.name=s2i-build-quota
                    openshift.io/build.start-policy=Serial
    Annotations:    openshift.io/build-config.name=s2i-build-quota
                    openshift.io/build.number=2
                    openshift.io/build.pod-name=s2i-build-quota-2-build
    Status:         Failed (The build pod was killed due to an out of memory condition.)
    Started:        Tue, 03 Sep 2019 15:51:36 CST
    Duration:       1m28s
    Build Config:   s2i-build-quota
    Build Pod:      s2i-build-quota-2-build
    Strategy:       Source
    From Image:     DockerImage docker.io/openshift/test-build-simples2i:latest
    Binary:         provided on build
    Build trigger cause: <unknown>
    Log Tail:       time="2019-09-03T07:53:02Z" level=debug msg="setting supplemental groups"
                    time="2019-09-03T07:53:02Z" level=debug msg="setting gid"
                    time="2019-09-03T07:53:02Z" level=debug msg="setting capabilities"
                    time="2019-09-03T07:53:02Z" level=debug msg="setting uid"
                    time="2019-09-03T07:53:02Z" level=debug msg="Running &exec...local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin\")"
    Events:
      Type    Reason        Age  From                               Message
      ----    ------        ---  ----                               -------
      Normal  Scheduled     96s  default-scheduler                  Successfully assigned wzheng1/s2i-build-quota-2-build to wzheng93-tsxbq-compute-1
      Normal  Pulled        92s  kubelet, wzheng93-tsxbq-compute-1  Container image "quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:d5703ccf7e1c105ddd010e8ee85b06083d122015a2932d97be354ccc44ec3260" already present on machine
      Normal  Started       90s  kubelet, wzheng93-tsxbq-compute-1  Started container git-clone
      Normal  Created       90s  kubelet, wzheng93-tsxbq-compute-1  Created container git-clone
      Normal  BuildStarted  89s  build-controller                   Build wzheng1/s2i-build-quota-2 is now running
      Normal  Pulled        85s  kubelet, wzheng93-tsxbq-compute-1  Container image "quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:d5703ccf7e1c105ddd010e8ee85b06083d122015a2932d97be354ccc44ec3260" already present on machine
      Normal  Created       83s  kubelet, wzheng93-tsxbq-compute-1  Created container manage-dockerfile
      Normal  Started       83s  kubelet, wzheng93-tsxbq-compute-1  Started container manage-dockerfile
      Normal  Pulled        80s  kubelet, wzheng93-tsxbq-compute-1  Container image "quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:d5703ccf7e1c105ddd010e8ee85b06083d122015a2932d97be354ccc44ec3260" already present on machine
      Normal  Created       77s  kubelet, wzheng93-tsxbq-compute-1  Created container sti-build
      Normal  Started       77s  kubelet, wzheng93-tsxbq-compute-1  Started container sti-build
      Normal  Killing       10s  kubelet, wzheng93-tsxbq-compute-1  Stopping container sti-build
      Normal  BuildFailed   8s   build-controller                   Build wzheng1/s2i-build-quota-2 failed

    [wzheng@openshift-qe 4.2httpsproxy]$ oc get builds
    NAME                TYPE     FROM     STATUS                       STARTED              DURATION
    s2i-build-quota-1   Source            Complete                     24 minutes ago       2m32s
    s2i-build-quota-2   Source   Binary   Failed (OutOfMemoryKilled)   About a minute ago   1m28s

    [wzheng@openshift-qe 4.2httpsproxy]$ oc logs builds/s2i-build-quota-2

    [wzheng@openshift-qe 4.2httpsproxy]$ oc get pods
    NAME                      READY   STATUS      RESTARTS   AGE
    s2i-build-quota-1-build   0/1     Completed   0          25m
    s2i-build-quota-2-build   0/1     OOMKilled   0          2m4s

https://prow.svc.ci.openshift.org/view/gcs/origin-ci-test/logs/canary-openshift-ocp-installer-e2e-metal-4.2/38
https://prow.svc.ci.openshift.org/view/gcs/origin-ci-test/logs/canary-openshift-ocp-installer-e2e-metal-4.2/37

@Wenjing can you obtain the amount of RAM and CPU the worker nodes on these clusters have, and get an indication of how much load is being placed on them during the test runs? It seems to me that the nodes need more RAM, or that you need to add more worker nodes.
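One way to collect that information is sketched below; the node name `wzheng93-tsxbq-compute-1` and namespace `wzheng1` are taken from the output above, and `oc adm top` assumes the cluster metrics stack is available.

```sh
# Capacity, allocatable resources, and already-committed requests/limits on the
# worker node that ran the failed build pod (node name from the events above):
oc describe node wzheng93-tsxbq-compute-1

# Current CPU/memory usage of every node while the test suite is running
# (requires cluster metrics to be available):
oc adm top nodes

# Memory usage of the build pods themselves during a run:
oc adm top pods -n wzheng1
```

Comparing the node's allocatable memory against the sum of requests from the test pods would show whether the workers are simply over-committed during the conformance run.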
I cannot reproduce it now, so I'm closing this as NOT_A_BUG for now; feel free to re-open if anyone hits it again.

This is failing almost every test run on metal. I am re-opening this and making it urgent. For reference I'm looking at https://testgrid.k8s.io/redhat-openshift-release-informing#redhat-canary-openshift-ocp-installer-e2e-metal-4.2

I think this is a dupe of https://bugzilla.redhat.com/show_bug.cgi?id=1752557, but we can leave it open if we need to track a BZ per platform until this is resolved. The PR that will hopefully address it is here: https://github.com/openshift/origin/pull/23825

I'm fine with dup'ing it on the other BZ if you want. It is flaking on AWS too, just not as often. We should just make sure the fix actually fixes it for all the platforms.

*** This bug has been marked as a duplicate of bug 1752557 ***
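As a footnote for anyone triaging the same OOMKilled symptom: because this conformance test runs the build under a quota, the kill can come either from the build container's own memory limit or from node-level memory pressure. A minimal sketch for telling the two apart, reusing the namespace (`wzheng1`), pod name, and node name from the output above:

```sh
# Requests/limits actually applied to the build pod's containers; a memory
# limit here would point to the kill being enforced by that limit rather than
# by node memory pressure:
oc get pod s2i-build-quota-2-build -n wzheng1 \
  -o jsonpath='{range .spec.containers[*]}{.name}{"\t"}{.resources}{"\n"}{end}'

# Quota and LimitRange objects in the namespace that could impose such a limit:
oc get resourcequota,limitrange -n wzheng1 -o yaml

# If no container limit is set, look for system-level OOM events on the node:
oc describe node wzheng93-tsxbq-compute-1 | grep -i oom
```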