Bug 1415946 - Should show build failure reason for docker build when builder image is wrong
Summary: Should show build failure reason for docker build when builder image is wrong
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: Build
Version: 3.5.0
Hardware: Unspecified
OS: Unspecified
medium
medium
Target Milestone: ---
: ---
Assignee: Corey Daley
QA Contact: Wang Haoran
URL:
Whiteboard:
Depends On:
Blocks:
TreeView+ depends on / blocked
 
Reported: 2017-01-24 07:27 UTC by Yadan Pei
Modified: 2017-07-24 14:11 UTC (History)
4 users (show)

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Cause: Build failure reason was not getting set/saved correctly Consequence: Build failure reason could not be shown in command output Fix: Updated code to correctly save build failure reason Result: Build failure reason now shows correctly in command output
Clone Of:
Environment:
Last Closed: 2017-04-12 19:10:32 UTC
Target Upstream Version:


Attachments (Terms of Use)


Links
System ID Private Priority Status Summary Last Updated
Red Hat Product Errata RHBA-2017:0884 0 normal SHIPPED_LIVE Red Hat OpenShift Container Platform 3.5 RPM Release Advisory 2017-04-12 22:50:07 UTC

Description Yadan Pei 2017-01-24 07:27:06 UTC
Description of problem:
Should show build failure reason for docker build when builder image is wrong.
When creating docker build with Wrong Post Commit/Wrong Source, detail failure reason is shown in 'oc get builds' output as expected, this should also applies to docker build when wrong builder image is given

Version-Release number of selected component (if applicable):
oc v3.5.0.7+390ef18
kubernetes v1.5.2+43a9be4

How reproducible:
Always

Steps to Reproduce:
1. Create docker build with Wrong Builder Image
$ wget https://raw.githubusercontent.com/openshift-qe/v3-testfiles/master/build/ruby22rhel7-template-docker.json
$ vi ruby22rhel7-template-docker.json
.....
"strategy": {
"type": "Docker",
"dockerStrategy": {
"from": {
"kind": "DockerImage",
"name": "docker.io/openshift/rubyyyy-20-centos7:latest"
},
........
$ oc new-app -f ruby22rhel7-template-docker.json
2.Wait for #1 build finish, check build status via CLI
$ oc get builds


Actual results:
2. Only show "Failed" in Status, detail failure reason was not shown
$ oc get builds
NAME                    TYPE      FROM          STATUS    STARTED         DURATION
ruby22-sample-build-1   Docker    Git@e79d887   Failed    3 minutes ago   11s

Expected results:
2. Should also show failure reason in output like 
$ oc get builds
NAME                    TYPE      FROM          STATUS    STARTED         DURATION
ruby22-sample-build-1   Docker    Git@e79d887   Failed (PullBuilderImageFailed)     3 minutes ago   11s


Additional info:

Comment 1 Victor Ionut Palade 2017-01-24 15:15:50 UTC
i hate to criticize, but the description of this issue makes very little sense.

Anyway, there's an actual test case that covers this exact problem, see: https://github.com/openshift/origin/blob/master/test/extended/builds/failure_status.go#L92-L107

The problem is a race condition in updating the build object and not a missing implementation.
Will update once i have a fix.

Comment 2 Ben Parees 2017-01-30 15:05:25 UTC
Victor what's the status on this bug?  Any theories on where/why the race is occurring?

Comment 3 Victor Ionut Palade 2017-01-30 20:08:53 UTC
i believe somewhere in the controller, we update the build object and overwrite whatever values we pushed to the API from the build pod.
see:
https://github.com/openshift/origin/blob/master/pkg/build/controller/controller.go#L134

Usually the fields are empty for actions involving the build process. s2i and docker strategy.
This happens very rarely and usually happens more often as a test flake than by running the build yourself.
The controller update, should either be updated to use something similar to this https://github.com/openshift/origin/blob/master/pkg/build/builder/sti.go#L121 
Or the only updates that are done to the build object should be done only from the controller itself. But, because the build pod is isolated, the controller doesn't know what happens to it unless it receives information from the API that is updated by the build pod.

Comment 4 Ben Parees 2017-01-30 20:34:53 UTC
the build status reason and build status message get cleared by:
https://github.com/openshift/origin/blob/master/pkg/build/controller/controller.go#L229-L230

but i can't think my way through a path where we'd have a build enter the failed state, and then still go through nextBuildPhase (which is normally only called for build objects that are in the New state. We wouldn't even have a pod yet if the build were in the New state.  And if the build were being updated elsewhere, then the controller's update would fail due to a version conflict, so it wouldn't be able to overwrite the reason/message fields)

Comment 5 Corey Daley 2017-02-08 21:34:05 UTC
Submitted PR https://github.com/openshift/origin/pull/12873

Comment 6 Corey Daley 2017-02-17 21:00:14 UTC
https://github.com/openshift/origin/pull/12873

Comment 7 Troy Dawson 2017-02-21 22:37:37 UTC
This has been merged into ocp and is in OCP v3.5.0.32 or newer.

Comment 9 Yadan Pei 2017-02-23 05:06:39 UTC
Checked against oc v3.5.0.32-1+4f84c83 with the same step in "Steps to Reproduce", after build finishes, 'oc get builds' returns Build Failure reason

$ oc get builds
NAME                    TYPE      FROM          STATUS                            STARTED              DURATION
ruby22-sample-build-1   Docker    Git@e79d887   Failed (PullBuilderImageFailed)   About a minute ago   


Move to VERIFIED

Comment 11 errata-xmlrpc 2017-04-12 19:10:32 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2017:0884


Note You need to log in before you can comment on or make changes to this bug.