Bug 1719792
| Summary: | Build fail with "Unable to look up the service account secrets for this build" because SA token generation is delayed | |||
|---|---|---|---|---|
| Product: | OpenShift Container Platform | Reporter: | Alberto Gonzalez de Dios <algonzal> | |
| Component: | Build | Assignee: | Gabe Montero <gmontero> | |
| Status: | CLOSED INSUFFICIENT_DATA | QA Contact: | wewang <wewang> | |
| Severity: | high | Docs Contact: | ||
| Priority: | high | |||
| Version: | 3.11.0 | CC: | adam.kaplan, ahoness, aos-bugs, apjagtap, bparees, gmontero, jokerman, mfojtik, mmccomas, rdiazgav, sparpate, wzheng | |
| Target Milestone: | --- | Keywords: | Reopened | |
| Target Release: | 3.11.z | |||
| Hardware: | Unspecified | |||
| OS: | Unspecified | |||
| Whiteboard: | ||||
| Fixed In Version: | Doc Type: | If docs needed, set a value | ||
| Doc Text: | Story Points: | --- | ||
| Clone Of: | ||||
| : | 1729508 1729509 (view as bug list) | Environment: | ||
| Last Closed: | 2019-11-13 13:55:11 UTC | Type: | Bug | |
| Regression: | --- | Mount Type: | --- | |
| Documentation: | --- | CRM: | ||
| Verified Versions: | Category: | --- | ||
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | ||
| Cloudforms Team: | --- | Target Upstream Version: | ||
| Embargoed: | ||||
| Bug Depends On: | ||||
| Bug Blocks: | 1729508, 1729509 | |||
Description
Alberto Gonzalez de Dios
2019-06-12 14:55:18 UTC
This is a bug in the build controller: we need to wait or retry fetching the builder service account secrets. Unfortunately, the best workaround in this situation is to cancel the first build, then start a new one for the given BuildConfig. If the build is started via a script (`oc start-build`, `oc new-app`, etc.), the script may need a loop that checks the status of the build and cancels and restarts it if the build does not move past the New state after a set period of time; 5 minutes is reasonable.

> This is a bug in the build controller - we need to wait or retry fetching the builder service account secrets.

The controller does retry this, so I'm not sure why this would be happening. Whenever we set that message we return an error:

https://github.com/openshift/openshift-controller-manager/blob/master/pkg/build/controller/build/build_controller.go#L1079
https://github.com/openshift/openshift-controller-manager/blob/master/pkg/build/controller/build/build_controller.go#L1109
https://github.com/openshift/openshift-controller-manager/blob/master/pkg/build/controller/build/build_controller.go#L1132

and that error bubbles up to the sync loop, which ultimately causes the key to be retried in the queue:

https://github.com/openshift/openshift-controller-manager/blob/master/pkg/build/controller/build/build_controller.go#L1675

If someone can reproduce this with level 4 logging enabled in the OpenShift controller manager, it might shed some light on what is happening here.

It does not look like those controller logs had loglevel 4 enabled.

*** Bug 1729509 has been marked as a duplicate of this bug. ***

Customer case has been closed.
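
For scripted builds, the cancel-and-restart workaround from the first comment could look roughly like the sketch below. It is only an illustration, not a tested fix: the function name `start_build_with_restart`, the `oc` helper, the polling interval, and the attempt limit are all assumptions, and it assumes that `oc start-build -o name` prints the new build as `build/<name>`. The 300-second default matches the 5 minutes suggested above.

```python
import subprocess
import time


def oc(*args):
    """Run an `oc` command and return its trimmed stdout (assumes `oc` is on PATH)."""
    result = subprocess.run(
        ["oc", *args], check=True, capture_output=True, text=True
    )
    return result.stdout.strip()


def start_build_with_restart(build_config, run=oc, timeout=300, poll=10,
                             max_attempts=3, sleep=time.sleep,
                             clock=time.monotonic):
    """Start a build; cancel and restart it if it stays in the New phase.

    Hypothetical helper: returns the name of the first build that leaves
    the New phase, or raises RuntimeError after `max_attempts` tries.
    """
    for _ in range(max_attempts):
        # Assumed output format: "build/myapp-1"; keep only the build name.
        build = run("start-build", build_config, "-o", "name").split("/")[-1]
        deadline = clock() + timeout
        while clock() < deadline:
            phase = run("get", "build", build, "-o", "jsonpath={.status.phase}")
            # An empty phase is treated like New: the build has not progressed yet.
            if phase not in ("", "New"):
                return build
            sleep(poll)
        # The build never left New within the timeout: cancel it and try again.
        run("cancel-build", build)
    raise RuntimeError(
        "builds for %s never left the New phase after %d attempts"
        % (build_config, max_attempts)
    )
```

Calling `start_build_with_restart("myapp")` would start a build for a BuildConfig named `myapp` (a placeholder name). The `run`, `sleep`, and `clock` parameters exist only so the loop can be exercised without a cluster.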