Bug 1575990 - Build-Logs aren't provided during oc start-build
Summary: Build-Logs aren't provided during oc start-build
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: Build
Version: 3.7.1
Hardware: x86_64
OS: Linux
Priority: unspecified
Severity: medium
Target Milestone: ---
Target Release: 3.10.0
Assignee: Adam Kaplan
QA Contact: XiuJuan Wang
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2018-05-08 13:38 UTC by Dmitry Zhukovski
Modified: 2021-06-10 16:06 UTC (History)

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Cause: Streaming of build logs failed due to a server-side timeout while waiting for the build pod to start.
Consequence: oc start-build could hang if the --wait and --follow flags were set.
Fix:
1) The server-side timeout for a build pod to start was increased from 10 to 30 seconds.
2) If the --follow flag is specified and log streaming fails, an error message is returned to the user.
3) If both --follow and --wait are specified, log streaming is retried.
Result:
1) Log stream failures due to build pod wait timeouts are less likely to occur.
2) If --follow fails, the user is presented with the message, "Failed to stream the build logs - to view the logs, run oc logs build/<build-name>"
3) If both the --follow and --wait flags are set, oc start-build retries fetching the build logs until successful.
Clone Of:
Environment:
Last Closed: 2018-08-27 18:25:25 UTC
Target Upstream Version:
Embargoed:


Description Dmitry Zhukovski 2018-05-08 13:38:01 UTC
Description of problem:

oc start-build --follow=true --wait=true httpd-latest --loglevel=4
build "xxx-yyy-zz" started
error getting logs (unable to wait for build xxx-yyy-zz to run: timed out waiting for the condition), waiting for build to complete

Version-Release number of selected component (if applicable):
3.7

How reproducible:
Every time

Steps to Reproduce:



Actual results:


Expected results:


Additional info:

Comment 2 Ben Parees 2018-05-08 13:50:36 UTC
Adam, 

I believe the current build log logic waits a bit for the build pod to be running before giving up and exiting.

The --wait logic will wait basically forever for the build pod to complete.

I do not recall why we don't also retry/wait forever for the build pod to be running so we can fetch the logs.  It is worth revisiting.

Comment 3 Ben Parees 2018-05-08 13:55:34 UTC
Currently we wait 10s for the build to start running when we go to retrieve the logs:

https://github.com/openshift/origin/blob/master/pkg/build/registry/buildlog/rest.go#L41

https://github.com/openshift/origin/blob/master/pkg/build/registry/buildlog/rest.go#L84-L91

I don't think we want to change the REST API behavior, but it would be reasonable to change the start-build logic to retry fetching the logs:

https://github.com/openshift/origin/blob/master/pkg/oc/cli/cmd/startbuild.go#L440-L449
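For illustration only, the client-side retry idea can be sketched in shell. This is not the code in startbuild.go and not the change that was eventually merged. The helper name retry_cmd and its attempt-count and delay arguments are hypothetical, chosen just for this sketch.

```shell
#!/bin/sh
# Hypothetical helper (not part of oc): run a command until it succeeds
# or the attempt budget is exhausted.
# Usage: retry_cmd MAX_ATTEMPTS DELAY_SECONDS COMMAND [ARGS...]
retry_cmd() {
    max_attempts=$1
    delay=$2
    shift 2
    attempt=1
    while :; do
        # Stop retrying as soon as the command succeeds.
        if "$@"; then
            return 0
        fi
        # Give up once the last allowed attempt has failed.
        if [ "$attempt" -ge "$max_attempts" ]; then
            echo "giving up after $attempt attempts: $*" >&2
            return 1
        fi
        attempt=$((attempt + 1))
        sleep "$delay"
    done
}
```

A caller could wrap the log fetch, for example retry_cmd 30 2 oc logs -f build/<build-name>, so that a "timed out waiting for the condition" failure is retried instead of being treated as final. The values 30 and 2 are arbitrary.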

Comment 4 Ben Parees 2018-05-08 13:56:31 UTC
I also think the severity should be lowered. The simple workaround is to run:

oc start-build --wait  # will wait until the build is complete
oc logs build/foo # retrieve the build logs once we know the build is complete

Comment 5 Adam Kaplan 2018-05-15 16:37:43 UTC
Pull Request: https://github.com/openshift/origin/pull/19695

Comment 6 openshift-github-bot 2018-05-23 07:05:31 UTC
Commit pushed to master at https://github.com/openshift/origin

https://github.com/openshift/origin/commit/7a1bf39a413ca5127b1ff09cff4dd27b883381f0
Improve resilience of oc start-build log streaming

* Add retry when attempting to stream build logs.
* Increase server-side build wait timeout to 30s.

Fixes bug 1575990

Comment 7 XiuJuan Wang 2018-06-07 09:24:41 UTC
Can't reproduce this bug with 
oc version
oc v3.10.0-0.63.0
kubernetes v1.10.0+b81c8f8
features: Basic-Auth GSSAPI Kerberos SPNEGO

Server https://**:8443
openshift v3.10.0-0.63.0
kubernetes v1.10.0+b81c8f8

# oc start-build dancer-mysql-example --follow --wait  -n install-test  --loglevel=4
build "dancer-mysql-example-5" started
Cloning "https://github.com/openshift/dancer-ex.git" ...
	Commit:	950d3f52355c0c6989908419219d2c3cfdf7b8ff (Merge pull request #71 from bparees/httpcookies)
	Author:	Ben Parees <bparees.github.com>
	Date:	Fri Mar 9 12:56:51 2018 -0500
---> Installing application source ...
---> Copying configuration files...

