Bug 1596449 - Build update that sets phase status failed doesn't contain log snippet data
Status: CLOSED ERRATA
Product: OpenShift Container Platform
Classification: Red Hat
Component: Build
Version: 3.10.0
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: low
Target Milestone: ---
Target Release: 3.11.0
Assigned To: Ben Parees
QA Contact: wewang
Depends On:
Blocks:
 
Reported: 2018-06-28 23:17 EDT by Clayton Coleman
Modified: 2018-10-11 03:21 EDT
CC: 3 users

See Also:
Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Cause: The logic that updates the logSnippet field with the tail of the build log only ran after the build status had already been updated to the failed state. Consequence: The build would first transition to failed, then be updated again with the log snippet, so code watching for the build to enter a failed state would not see the logSnippet value populated initially. Fix: The code now populates the logSnippet field when the build transitions to failed, so the build update contains both the failed state and the logSnippet. Result: Code that watches the build for a transition to the failed state sees the logSnippet as part of the update that transitions the build to failed, instead of in a secondary update later.
Story Points: ---
Clone Of:
Environment:
Last Closed: 2018-10-11 03:20:43 EDT
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---




External Trackers
Red Hat Product Errata RHBA-2018:2652 (Last Updated: 2018-10-11 03:21 EDT)

Description Clayton Coleman 2018-06-28 23:17:41 EDT
I'm observing that when certain pod failures occur I see the build transition to failed status with an empty logSnippet, and then a fraction of a second later the logSnippet is set.

This makes trying to programmatically capture the build logSnippet painful, because I have to wait an arbitrary amount of time after seeing Failed for a logSnippet that may or may not get set.  The logSnippet is useful to my client (ci-operator) so that I can convey back to the user why a build failed.

Is this something in the build controller, or something in the pod?

Expected outcome: phase set to Failed at the same time logSnippet is set.

Actual outcome: phase set to failed, and then a subsequent update sets logSnippet.

Impact: Makes it hard to present a good UI to an OpenShift developer using our CI tools when a build fails.
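
For context, a minimal sketch of the watch loop a client like ci-operator has to run, assuming the typed build client from github.com/openshift/client-go and the external API types in github.com/openshift/api/build/v1 (Watch signatures vary by client-go release; the namespace and build name below are placeholders):
 -----------------

package main

import (
	"fmt"
	"log"

	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/client-go/tools/clientcmd"

	buildv1 "github.com/openshift/api/build/v1"
	buildclient "github.com/openshift/client-go/build/clientset/versioned"
)

func main() {
	cfg, err := clientcmd.BuildConfigFromFlags("", clientcmd.RecommendedHomeFile)
	if err != nil {
		log.Fatal(err)
	}
	client, err := buildclient.NewForConfig(cfg)
	if err != nil {
		log.Fatal(err)
	}

	ns, name := "myproject", "ruby-sample-build-1" // placeholders

	w, err := client.BuildV1().Builds(ns).Watch(metav1.ListOptions{
		FieldSelector: "metadata.name=" + name,
	})
	if err != nil {
		log.Fatal(err)
	}
	defer w.Stop()

	for event := range w.ResultChan() {
		b, ok := event.Object.(*buildv1.Build)
		if !ok {
			continue
		}
		if b.Status.Phase == buildv1.BuildPhaseFailed {
			// Because of this bug, LogSnippet is often still empty on the
			// update that flips the phase to Failed; it only shows up in a
			// later update, so the client has to keep watching for an
			// arbitrary amount of time.
			fmt.Printf("build failed: %q\n", b.Status.LogSnippet)
			return
		}
	}
}
--------------------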
Comment 1 Clayton Coleman 2018-06-28 23:19:21 EDT
I *think* it's because setBuildCompletionData checks:

	if build.Status.Phase == buildapi.BuildPhaseFailed && len(build.Status.LogSnippet) == 0 &&

but should instead check both the current and the new phase (with the phase test parenthesized, since && binds tighter than ||):

	if (build.Status.Phase == buildapi.BuildPhaseFailed || update.getPhase() == buildapi.BuildPhaseFailed) && len(build.Status.LogSnippet) == 0 && ...
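
For illustration, the same suggestion pulled out into a standalone predicate (a hypothetical helper, not a function in origin; the real check also includes the conditions elided by the trailing "..."):

	// shouldSetLogSnippet shows the grouping the suggested check needs:
	// capture the snippet when either the stored phase or the phase in the
	// pending update is Failed, and no snippet has been recorded yet.
	func shouldSetLogSnippet(current, pending buildapi.BuildPhase, snippet string) bool {
		failed := current == buildapi.BuildPhaseFailed || pending == buildapi.BuildPhaseFailed
		return failed && len(snippet) == 0
	}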
Comment 2 Ben Parees 2018-07-11 16:03:39 EDT
https://github.com/openshift/origin/pull/20290
Comment 3 wewang 2018-08-27 04:46:47 EDT
Verified in OpenShift v3.11.0-0.22.0

steps:
1. Download the file application-template-dockerbuild.json and edit the postCommit hook to run a bogus command ("rake1" is intentionally wrong, so the hook fails):
 -----------------

        "postCommit": {
          "args": ["bundle", "exec", "rake1", "test"]
        },
--------------------

2. Create apps
 $ oc new-app application-template-dockerbuild.json

3. Check the build
$ oc get builds --watch
NAME                  TYPE      FROM          STATUS    STARTED          DURATION
ruby-sample-build-1   Docker    Git@7ccd324   Running   13 seconds ago   
ruby-sample-build-1   Docker    Git@7ccd324   Failed (PostCommitHookFailed)   19 seconds ago 

4. Check the build controller log in master 
[root@qe-wewang-master-etcd-1 ~]# oc logs pod/master-controllers-qe-wewang-master-etcd-1 -n kube-system |grep build_controller.go

I0827 08:23:25.120689       1 build_controller.go:340] Handling build wewang/ruby-sample-build-1 (Failed)
I0827 08:23:25.121315       1 build_controller.go:1192] Patching build wewang/ruby-sample-build-1 (Failed) with buildUpdate(logSnippet: "Successfully built 14c7c1584c28\nRunning post commit hook ...\nbundler: command not found: rake1\nInstall missing gem executables with `bundle install`\nerror: build error: container \"openshift_docker-build_ruby...wang_post-commit_289cef79\" returned non-zero exit code: 127")

@Ben Parees I saw the PR, and it looks like the build controller now sets the status to Failed and populates logSnippet in the same update. Could you check whether my verification steps are correct?
Thanks.
Comment 4 Ben Parees 2018-08-27 10:15:37 EDT
Yes that looks fine, thanks!
Comment 6 errata-xmlrpc 2018-10-11 03:20:43 EDT
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2018:2652
