Bug 1333030

Summary: build fails to push if secret is created after the build starts
Product: OpenShift Container Platform Reporter: Scott Creeley <screeley>
Component: Build Assignee: Ben Parees <bparees>
Status: CLOSED ERRATA QA Contact: Wenjing Zheng <wzheng>
Severity: low Docs Contact:
Priority: medium    
Version: 3.2.0 CC: abhgupta, aos-bugs, bingli, bmcelvee, bparees, ccoleman, dakini, jokerman, mmccomas, xtian
Target Milestone: --- Keywords: OnlineStarter
Target Release: 3.9.0   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version: Doc Type: Bug Fix
Doc Text:
Cause: Builds select the secret used for pushing the output image at the time they are started.
Consequence: When a build starts before the default service account secrets for a project are created, the build may not find a suitable secret for pushing the image, so the build fails when it tries to push the image.
Fix: The build is now held until the default service account secrets exist, ensuring that if the default secret is suitable for pushing the image, it will be used.
Result: Initial builds in a newly created project are no longer at risk of failing when the build is created before the default secrets are populated.
Story Points: ---
Clone Of: Environment:
Last Closed: 2018-03-28 14:05:01 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:

Description Scott Creeley 2016-05-04 14:33:19 UTC
Description of problem:
kicking the tires using this google doc: 
    https://docs.google.com/document/d/13j8AJlolcB61AmWhogoBRzW2q1Z3YgWO8e11-sI8E9I/edit#

After deploying the node-js-ex with mongo, my pods are not running...
[screeley@screeley ~]$ oc get pods
NAME                             READY     STATUS    RESTARTS   AGE
mongodb-1-4tgva                  1/1       Running   0          26m
nodejs-mongodb-example-1-build   0/1       Error     0          26m


I get this error in the log:
     F0504 10:01:16.982179       1 builder.go:204] Error: build error: Failed to push image. Response from registry is: Post https://172.30.94.234:5000/v2/screeley-kickingtires/nodejs-mongodb-example/blobs/uploads/: no basic auth credentials

Version-Release number of selected component (if applicable):


How reproducible:


Steps to Reproduce:
1.
2.
3.

Actual results:
After deploying the node-js-ex with mongo, my pods are not running...
[screeley@screeley ~]$ oc get pods
NAME                             READY     STATUS    RESTARTS   AGE
mongodb-1-4tgva                  1/1       Running   0          26m
nodejs-mongodb-example-1-build   0/1       Error     0          26m


I get this error in the log:
     F0504 10:01:16.982179       1 builder.go:204] Error: build error: Failed to push image. Response from registry is: Post https://172.30.94.234:5000/v2/screeley-kickingtires/nodejs-mongodb-example/blobs/uploads/: no basic auth credentials

Expected results:


Additional info:

Comment 1 Scott Creeley 2016-05-04 16:43:36 UTC
After talking with some developers, this is a known issue and is a result of timing (time between project creation, launch of first app and having all proper authentications/permissions/attributes created for the project).

After triggering a 2nd build on the app/image, everything worked as expected. We should make the build retry until it succeeds or until the background authentication/permissions etc. catch up with the deployment.

Comment 2 Ben Parees 2016-05-06 15:24:58 UTC
Clayton and I had some back and forth via email on this, but I still don't agree that it's sensible for a build to retry itself based on failed credentials, just like we wouldn't retry a build due to a code compilation failure.

Comment 3 Stefanie Forrester 2016-05-06 20:43:23 UTC
This bug is the same as one I filed here https://bugzilla.redhat.com/show_bug.cgi?id=1333510

Comment 4 Ben Parees 2016-07-13 13:36:05 UTC
To clarify: there are two bugs. This one (what should a build do when the push secret is not available?) and the one Stefanie referenced (why isn't the push secret available?).

Comment 5 Bing Li 2016-08-02 10:53:28 UTC
@Ben Parees
I'm not sure whether this is the same issue as the one I hit in the Online dev-preview-int environment. I started a build using the python:3.5 image, and it failed to push the built image to the registry, but the secrets seem fine.

F0802 06:21:06.517392       1 builder.go:204] Error: build error: Failed to push image. Response from registry is: Post https://172.30.113.38:5000/v2/binglipushtest06/binglipushtest06/blobs/uploads/: no basic auth credentials
$ oc get sa
NAME       SECRETS   AGE
builder    2         6m
default    2         6m
deployer   2         6m
$ oc get secrets
NAME                       TYPE                                  DATA      AGE
builder-dockercfg-qqir5    kubernetes.io/dockercfg               1         8m
builder-token-muvgs        kubernetes.io/service-account-token   3         8m
builder-token-rmup0        kubernetes.io/service-account-token   3         8m
default-dockercfg-pb3ax    kubernetes.io/dockercfg               1         8m
default-token-pztmq        kubernetes.io/service-account-token   3         8m
default-token-rpkap        kubernetes.io/service-account-token   3         8m
deployer-dockercfg-8gmkq   kubernetes.io/dockercfg               1         8m
deployer-token-fioui       kubernetes.io/service-account-token   3         8m
deployer-token-nvd42       kubernetes.io/service-account-token   3         8m

I have hit this issue many times in Online, and in each case the secrets had all been created successfully. If it's not the same issue, I will file a new bug. Thanks!

Comment 6 Ben Parees 2016-08-02 17:05:18 UTC
@Bing Li 

That is a separate issue. Please file it as a separate bug and provide loglevel 5 build logs.

Comment 7 Abhishek Gupta 2016-11-01 16:54:56 UTC
Can we move this bug to ON_QA? The issue with missing push secrets is resolved and I don't believe we want to do anything different (other than just letting the build fail and not retrying) in case the push secrets are missing.

Comment 8 Ben Parees 2016-11-01 17:00:43 UTC
Ultimately I'd like to see us rework how the build resolves the secrets so that it can do some retrying in cases where the secret is missing but we know it's likely to appear momentarily. That's why I've left this bug open. Unfortunately, fixing that requires a significant refactor of the build logic.

Comment 9 Ben Parees 2018-01-19 13:24:36 UTC
finding the internal registry:

defaultRegistry := env("OPENSHIFT_DEFAULT_REGISTRY", "${DOCKER_REGISTRY_SERVICE_HOST}:${DOCKER_REGISTRY_SERVICE_PORT}")

Comment 10 openshift-github-bot 2018-01-25 15:26:57 UTC
Commit pushed to master at https://github.com/openshift/origin

https://github.com/openshift/origin/commit/d939f0e2bd33167629d0e54bc928b2e16eb25c64
wait for default service account secrets before running build

bug 1333030

https://bugzilla.redhat.com/show_bug.cgi?id=1333030

Comment 11 Ben Parees 2018-01-25 15:46:37 UTC
https://github.com/openshift/origin/pull/18253

Comment 13 Wenjing Zheng 2018-01-30 09:45:36 UTC
@Ben, I followed the steps below to verify this bug. Could you please confirm whether this is OK?

1. Edit the builder service account to use a nonexistent secret;
2. Start a build (the build will be in the New state);
3. Change the service account to use the correct secret;
4. Watch the build status; it will go to Running and finally to Complete.
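
The verification steps above can be modeled as a simple gate: the build is held while any secret referenced by the service account is missing. The sketch below is a simplified illustration, not the code from the linked commit; serviceAccount and canStartBuild are hypothetical names, and the secret name is borrowed from the listing in comment 5 purely as sample data.

```go
package main

import "fmt"

// serviceAccount is a simplified, hypothetical stand-in for the real API type.
type serviceAccount struct {
	name    string
	secrets []string // names of secrets the account references
}

// canStartBuild reports whether the build may leave the New state: the
// account must reference at least one secret, and every referenced secret
// must already exist.
func canStartBuild(sa serviceAccount, existing map[string]bool) bool {
	if len(sa.secrets) == 0 {
		return false
	}
	for _, name := range sa.secrets {
		if !existing[name] {
			return false
		}
	}
	return true
}

func main() {
	existing := map[string]bool{"builder-dockercfg-qqir5": true}

	// Steps 1-2: the account points at a secret that does not exist, so the build is held.
	sa := serviceAccount{name: "builder", secrets: []string{"no-such-secret"}}
	fmt.Println(canStartBuild(sa, existing)) // prints "false"

	// Steps 3-4: the account is corrected, so the build may proceed.
	sa.secrets = []string{"builder-dockercfg-qqir5"}
	fmt.Println(canStartBuild(sa, existing)) // prints "true"
}
```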

Comment 14 Ben Parees 2018-01-30 13:50:00 UTC
Yes, that's a reasonable way to verify this.

Comment 15 Wenjing Zheng 2018-01-31 02:18:17 UTC
Thanks for your confirmation!

I will verify this bug with version v3.9.0-0.31.0 now.

Comment 21 errata-xmlrpc 2018-03-28 14:05:01 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2018:0489