Bug 1415946
Summary: | Should show build failure reason for docker build when builder image is wrong | |
---|---|---|---
Product: | OpenShift Container Platform | Reporter: | Yadan Pei <yapei>
Component: | Build | Assignee: | Corey Daley <cdaley>
Status: | CLOSED ERRATA | QA Contact: | Wang Haoran <haowang>
Severity: | medium | Docs Contact: |
Priority: | medium | |
Version: | 3.5.0 | CC: | aos-bugs, bparees, ipalade, tdawson
Target Milestone: | --- | |
Target Release: | --- | |
Hardware: | Unspecified | |
OS: | Unspecified | |
Whiteboard: | | |
Fixed In Version: | | Doc Type: | Bug Fix
Doc Text: | Cause: The build failure reason was not being set/saved correctly.<br>Consequence: The build failure reason could not be shown in command output.<br>Fix: Updated the code to correctly save the build failure reason.<br>Result: The build failure reason now shows correctly in command output. | |
Story Points: | --- | |
Clone Of: | | Environment: |
Last Closed: | 2017-04-12 19:10:32 UTC | Type: | Bug
Regression: | --- | Mount Type: | ---
Documentation: | --- | CRM: |
Verified Versions: | | Category: | ---
oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: |
Cloudforms Team: | --- | Target Upstream Version: |
Embargoed: | | |
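The Doc Text above describes the fix at a high level: a status update must not wipe out a failure reason/message that has already been saved. A minimal sketch of that idea, using simplified stand-in types (`Build`, `BuildStatus`, and the helper `applyStatusUpdate` are hypothetical, not the actual openshift/origin API):

```go
package main

import "fmt"

// Simplified stand-ins for the build API types in openshift/origin.
type BuildStatus struct {
	Phase   string
	Reason  string
	Message string
}

type Build struct {
	Name   string
	Status BuildStatus
}

// applyStatusUpdate sketches the fix: if an incoming status update carries
// an empty Reason/Message, keep the values already stored on the build
// instead of blanking them out. Hypothetical helper, not origin's code.
func applyStatusUpdate(stored *Build, update BuildStatus) {
	if update.Reason == "" {
		update.Reason = stored.Status.Reason
	}
	if update.Message == "" {
		update.Message = stored.Status.Message
	}
	stored.Status = update
}

func main() {
	// The build pod already recorded why the build failed.
	b := &Build{
		Name: "ruby22-sample-build-1",
		Status: BuildStatus{
			Phase:   "Failed",
			Reason:  "PullBuilderImageFailed",
			Message: "Failed to pull builder image.",
		},
	}
	// A racing update carries the phase but empty reason/message; the buggy
	// code path overwrote Status wholesale, clearing both fields.
	applyStatusUpdate(b, BuildStatus{Phase: "Failed"})
	fmt.Printf("%s (%s)\n", b.Status.Phase, b.Status.Reason) // prints "Failed (PullBuilderImageFailed)"
}
```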
Description
Yadan Pei 2017-01-24 07:27:06 UTC
I hate to criticize, but the description of this issue makes very little sense. Anyway, there is an actual test case that covers this exact problem; see: https://github.com/openshift/origin/blob/master/test/extended/builds/failure_status.go#L92-L107 The problem is a race condition in updating the build object, not a missing implementation. Will update once I have a fix.

Victor, what's the status on this bug? Any theories on where/why the race is occurring?

I believe that somewhere in the controller we update the build object and overwrite whatever values we pushed to the API from the build pod. See: https://github.com/openshift/origin/blob/master/pkg/build/controller/controller.go#L134 Usually the fields are empty for actions involving the build process (s2i and docker strategies). This happens very rarely, and it shows up more often as a test flake than when running the build yourself. The controller update should either be changed to use something similar to this: https://github.com/openshift/origin/blob/master/pkg/build/builder/sti.go#L121 or the only updates made to the build object should come from the controller itself. But because the build pod is isolated, the controller doesn't know what happens in it unless it receives information from the API that is updated by the build pod.

The build status reason and build status message get cleared by: https://github.com/openshift/origin/blob/master/pkg/build/controller/controller.go#L229-L230 But I can't think my way through a path where we'd have a build enter the failed state and then still go through nextBuildPhase (which is normally only called for build objects in the New state; we wouldn't even have a pod yet if the build were in the New state. And if the build were being updated elsewhere, the controller's update would fail due to a version conflict, so it wouldn't be able to overwrite the reason/message fields.)

Submitted PR: https://github.com/openshift/origin/pull/12873

This has been merged into OCP and is in OCP v3.5.0.32 or newer.

Checked against oc v3.5.0.32-1+4f84c83 with the same steps from "Steps to Reproduce". After the build finishes, `oc get builds` returns the build failure reason:

```
$ oc get builds
NAME                    TYPE      FROM          STATUS                            STARTED              DURATION
ruby22-sample-build-1   Docker    Git@e79d887   Failed (PullBuilderImageFailed)   About a minute ago
```

Moving to VERIFIED.

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2017:0884
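The discussion above suggests the controller update should use a pattern "similar to" the one in sti.go: re-fetch the latest build on a version conflict, reapply the change, and retry, so the write is based on the current object rather than a stale copy. A minimal sketch of that retry-on-conflict idea, with a fake in-memory store standing in for the API server (all names here — `fakeBuildStore`, `retryBuildUpdate`, `errConflict` — are hypothetical, not the real client code):

```go
package main

import (
	"errors"
	"fmt"
)

// errConflict stands in for the API server's 409 on a stale resourceVersion.
var errConflict = errors.New("conflict: build was modified")

// Simplified build object with optimistic-concurrency versioning.
type Build struct {
	ResourceVersion int
	Reason          string
}

// fakeBuildStore simulates the API server; `conflicts` is how many update
// attempts a concurrent writer will win before ours lands.
type fakeBuildStore struct {
	current   Build
	conflicts int
}

func (s *fakeBuildStore) Get() Build { return s.current }

func (s *fakeBuildStore) Update(b Build) error {
	if s.conflicts > 0 {
		s.conflicts--
		s.current.ResourceVersion++ // a concurrent writer won the race
		return errConflict
	}
	if b.ResourceVersion != s.current.ResourceVersion {
		return errConflict // stale copy; caller must re-fetch
	}
	b.ResourceVersion++
	s.current = b
	return nil
}

// retryBuildUpdate re-fetches the latest build on each conflict, reapplies
// the mutation, and retries — so the write never clobbers newer state.
func retryBuildUpdate(s *fakeBuildStore, mutate func(*Build)) error {
	for attempt := 0; attempt < 5; attempt++ {
		b := s.Get()
		mutate(&b)
		err := s.Update(b)
		if err == nil {
			return nil
		}
		if !errors.Is(err, errConflict) {
			return err
		}
	}
	return errors.New("gave up after repeated conflicts")
}

func main() {
	// One conflicting writer beats us once; the retry loop still succeeds.
	store := &fakeBuildStore{conflicts: 1}
	err := retryBuildUpdate(store, func(b *Build) {
		b.Reason = "PullBuilderImageFailed"
	})
	fmt.Println(err, store.current.Reason) // prints "<nil> PullBuilderImageFailed"
}
```

This is the same shape as client-go's generic retry-on-conflict helper: the mutation is a closure so it can be safely reapplied to each freshly fetched copy.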