Bug 1937535
Summary: | Not all image pulls within OpenShift builds retry | ||
---|---|---|---|
Product: | OpenShift Container Platform | Reporter: | Hongkai Liu <hongkliu> |
Component: | Build | Assignee: | Nalin Dahyabhai <nalin> |
Status: | CLOSED ERRATA | QA Contact: | XiuJuan Wang <xiuwang> |
Severity: | medium | Docs Contact: | Rolfe Dlugy-Hegwer <rdlugyhe> |
Priority: | unspecified | ||
Version: | 4.7 | CC: | adam.kaplan, aos-bugs, gmontero, nalin, wking |
Target Milestone: | --- | ||
Target Release: | 4.8.0 | ||
Hardware: | Unspecified | ||
OS: | Unspecified | ||
Whiteboard: | |||
Fixed In Version: | Doc Type: | Bug Fix | |
Doc Text: |
When OpenShift builds interact with image registries, such as pulling base images, intermittent communications issues can produce build failures. The current release increases the number of retries to these interactions. Now, OpenShift builds are more resilient when they encounter intermittent communication issues with image registries.
|
Story Points: | --- |
Clone Of: | Environment: | ||
Last Closed: | 2021-07-27 22:52:37 UTC | Type: | Bug |
Regression: | --- | Mount Type: | --- |
Documentation: | --- | CRM: | |
Verified Versions: | Category: | --- | |
oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
Cloudforms Team: | --- | Target Upstream Version: | |
Embargoed: | |||
Bug Depends On: | |||
Bug Blocks: | 1940052 |
Description
Hongkai Liu
2021-03-10 21:43:06 UTC
Gabe, I could reproduce this bug on a 4.8.0-0.nightly-2021-03-17-014745 cluster with this scenario:

Step 1: Specify a private image as the source image and add a pull secret.

source:
  git:
    uri: https://github.com/openshift/ruby-hello-world.git
  images:
  - from:
      kind: DockerImage
      name: 172.30.128.188:5000/busybox
    paths:
    - destinationDir: openshiftqedir
      sourcePath: /opt/app-root
    pullSecret:
      name: test
  type: Git
strategy:
  sourceStrategy:
    env:
    - name: EXAMPLE
      value: sample-app
    from:
      kind: ImageStreamTag
      name: ruby:2.7
      namespace: openshift
  type: Source

Step 2: Trigger the build. The build failed while pulling the private source image, without a retry.

$ oc logs -f build/ruby-sample-build-1
Cloning "https://github.com/openshift/ruby-hello-world.git" ...
    Commit: f476e11e538445e76470b0c63252b49e294a51d2 (Merge pull request #121 from vrutkovs/ruby-2.7)
    Author: Ben Parees <bparees.github.com>
    Date: Wed Mar 10 09:52:09 2021 -0500
Caching blobs under "/var/cache/blobs".
error: error creating buildah builder: Error initializing source docker://172.30.128.188:5000/busybox:latest: error pinging docker registry 172.30.128.188:5000: Get "https://172.30.128.188:5000/v2/": http: server gave HTTP response to HTTPS client

Successful scenario: Trigger a build that pulls a private image; set an invalid secret at first, then quickly correct it.

source:
  git:
    uri: http://github.com/openshift/rails-ex.git
  type: Git
strategy:
  sourceStrategy:
    from:
      kind: ImageStreamTag
      name: mystream:latest
      namespace: rhf34
    pullSecret:
      name: test

$ oc logs -f build/rails-ex-8
Cloning "http://github.com/openshift/rails-ex.git" ...
    Commit: 9e6fe17f934b87b9a399e2623d6c7dfcebd4b530 (Merge pull request #130 from pvalena/bundler)
    Author: Pavel Valena <pvalena>
    Date: Wed Sep 16 16:23:12 2020 +0200
Caching blobs under "/var/cache/blobs".
error trying to parse file /var/run/secrets/openshift.io/pull/.dockerconfigjson: illegal base64 data at input byte 28
Warning: Pull failed, retrying in 5s ...
Getting image source signatures
Copying blob sha256:0669b0daf1fba90642d105f3bc2c94365c5282155a33cc65ac946347a90d90d1
Copying config sha256:83aa35aa1c79e4b6957e018da6e322bfca92bf3b4696a211b42502543c242d6f
Writing manifest to image destination
Storing signatures
Generating dockerfile with builder image 172.30.128.188:5000/busybox@sha256:afe605d272837ce1732f390966166c2afff5391208ddd57de10942748694049d

Hey XiuJuan,

So I dove into

error: error creating buildah builder: Error initializing source docker://172.30.128.188:5000/busybox:latest: error pinging docker registry 172.30.128.188:5000: Get "https://172.30.128.188:5000/v2/": http: server gave HTTP response to HTTPS client

The top-level error there corresponds to https://github.com/openshift/builder/blob/c910b5cd6c0e0a284c544d3fd98d1ddf8167cbc7/pkg/build/builder/source.go#L451-L458 which is where Nalin added retry.

If you then work off the "Error initializing source" part, you get into the retry copy logic of containers/image. The thing is, that logic does not retry on just any error. It distinguishes intermittent errors from ones that will persist. For reference: https://github.com/openshift/builder/blob/f9787dc13c7cff8ccbb6dd5d93a9bfddc2412ed0/vendor/github.com/containers/common/pkg/retry/retry.go#L45-L95

A server giving an "HTTP response to HTTPS client" is one of those persistent (permanent-failure) errors. So the lack of retry there is good and expected.

Based on that, and the retry you were able to identify, I'm marking this verified.
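
For readers who want the retry behavior described above in concrete terms, here is a minimal sketch of "retry only on intermittent errors". It is not the containers/common code linked above (that is Go, and its error classification is more detailed). The names `TransientError`, `PermanentError`, and `retry_if_transient` are hypothetical, and the default of 3 retries is an assumption chosen for illustration; the 5-second delay and warning text mirror the build log quoted earlier.

```python
import time


class TransientError(Exception):
    """Hypothetical marker: a failure that may clear up (e.g. a dropped connection)."""


class PermanentError(Exception):
    """Hypothetical marker: a failure that will persist (e.g. HTTP response to HTTPS client)."""


def retry_if_transient(operation, max_retries=3, delay=5.0, sleep=time.sleep):
    """Run operation(); retry only when it raises TransientError.

    Makes at most 1 + max_retries attempts. Any other exception,
    including PermanentError, propagates on the first failure.
    """
    attempt = 0
    while True:
        try:
            return operation()
        except TransientError as err:
            if attempt >= max_retries:
                raise  # retries exhausted, surface the last error
            attempt += 1
            print(f"Warning: Pull failed, retrying in {delay:g}s ... ({err})")
            sleep(delay)
```

With this shape, a pull that hits a malformed pull secret which is corrected a moment later would succeed on a later attempt, provided the failure is classified as transient, while a registry that answers HTTPS requests with plain HTTP fails immediately, which matches the two logs in this bug.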
Thanks.

Supporting information for release notes:

Cause: Intermittent communication issues can occur when interacting with image registries.
Consequence: Certain interactions between OpenShift builds and image registries, for example when pulling images as source, could result in build failure when those intermittent issues occurred.
Fix: Retry for pulling images was added for all permutations of interaction between OpenShift builds and image registries.
Result: OpenShift builds are now more resilient when they encounter intermittent communication issues with image registries.

I suggest changing "source images" to "base images" in the doc text, since we're talking about what's usually called the base image in a Dockerfile, that's what we use it for in "Docker" strategy builds, and it's how we use the s2i builder image in "Source" strategy builds.

Thanks, Nalin. Updated.

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory (Moderate: OpenShift Container Platform 4.8.2 bug fix and security update), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2021:2438