Bug 1818476
| Summary: | invalid must-gather istag imported for ppc64le setups | |||
|---|---|---|---|---|
| Product: | OpenShift Container Platform | Reporter: | Mark Hamzy <mhamzy> | |
| Component: | Samples | Assignee: | Gabe Montero <gmontero> | |
| Status: | CLOSED ERRATA | QA Contact: | XiuJuan Wang <xiuwang> | |
| Severity: | high | Docs Contact: | ||
| Priority: | unspecified | |||
| Version: | 4.5 | CC: | adam.kaplan, aos-bugs, gmontero, jokerman, mfojtik, psundara, pweil, ssadeghi, wzheng, xiuwang | |
| Target Milestone: | --- | Flags: | xiuwang: needinfo- |
| Target Release: | 4.5.0 | |||
| Hardware: | ppc64le | |||
| OS: | Linux | |||
| Whiteboard: | ||||
| Fixed In Version: | Doc Type: | Bug Fix | ||
| Doc Text: |
Cause: the samples operator was not copying the install pull secret when bootstrapped as removed
Consequence: even though the samples operator has not installed any imagestreams into the openshift namespace, the CVO installs a set of imagestreams in the openshift namespace to assist with customer problem determination, and those imagestreams were failing to import
Fix: samples operator was updated to copy the install pull secret to the openshift namespace even when marked as removed
Result: the CVO imagestreams in the openshift namespace import successfully
|
Story Points: | --- | |
| Clone Of: | ||||
| : | 1829073 1829083 (view as bug list) | Environment: | ||
| Last Closed: | 2020-08-04 18:07:32 UTC | Type: | Bug | |
| Regression: | --- | Mount Type: | --- | |
| Documentation: | --- | CRM: | ||
| Verified Versions: | Category: | --- | ||
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | ||
| Cloudforms Team: | --- | Target Upstream Version: | ||
| Embargoed: | ||||
| Bug Depends On: | ||||
| Bug Blocks: | 1829073 | |||
|
Description
Mark Hamzy
2020-03-28 13:14:00 UTC
Moving to devex team since they own cluster-samples-operator, which is responsible for this.

So Maciej's statement needs some clarification. The samples operator does not actually install imagestreams like the must-gather one, i.e. https://github.com/openshift/cluster-samples-operator/blob/master/manifests/08-openshift-imagestreams.yaml#L87-L100

The CVO does it as part of processing cluster operator manifests. It just so happens that Clayton chose the samples operator (granted, it was the most reasonable cluster operator to pick) as the repo to set up those manifests.

That said, this still ties into https://issues.redhat.com/browse/DEVEXP-465, which is on tap for this sprint, though it is blocked by a dependency, in that you need the cluster install secret in the openshift namespace to pull those images. The samples operator is charged with copying the installer pull secret into the openshift namespace. The PR for https://issues.redhat.com/browse/DEVEXP-465 includes doing this for both power and Z, as well as installing samples from openshift/library that have been earmarked for power and Z.

The aforementioned dependency: the PR is being held until the multi arch team can get their periodic e2e jobs updated to run our image ecosystem suite, to validate the openshift/library imagestreams (vs. the ones installed by the CVO via the cluster operator manifest).

So based on how the multi arch team work goes, and how badly resolution of the bugzilla is desired, we may have decisions to make around
1) merging the existing PR as is prior to the multi arch e2e's being ready
2) breaking up that PR to not include the openshift/library content, but enable enough of the samples operator to copy the install pull secret.

@Gabe or option 3 - change the `Removed` behavior so that the install pull secret is always copied into the `openshift` namespace? This is akin to what the Image Registry operator needs to do with the node-ca daemonset.
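Option 3 amounts to decoupling the pull secret copy from the management state. The following is a minimal sketch of that idea in Python, not the operator's actual code (the operator is written in Go). The `reconcile` function, the in-memory `cluster` dict, the placeholder sample name, and the source and target secret names are all assumptions made for illustration.

```python
from copy import deepcopy

# Assumed names, for illustration only.
SOURCE = ("openshift-config", "pull-secret")            # assumed install pull secret
TARGET = ("openshift", "samples-registry-credentials")  # assumed copy in openshift


def reconcile(cluster, management_state):
    """Toy reconcile pass over an in-memory stand-in for a cluster.

    cluster["secrets"] maps (namespace, name) to secret data;
    cluster["samples"] lists the sample imagestreams installed by the operator.
    """
    source = cluster["secrets"].get(SOURCE)
    if source is not None:
        # Option 3: copy the pull secret regardless of management state, so the
        # CVO-installed imagestreams in the openshift namespace can still import.
        cluster["secrets"][TARGET] = deepcopy(source)

    if management_state == "Managed":
        # Placeholder for installing the openshift/library content.
        cluster["samples"] = ["example-imagestream"]
    else:
        # Removed: no samples are installed, but the secret copy above stays.
        cluster["samples"] = []
    return cluster


cluster = {"secrets": {SOURCE: {".dockerconfigjson": "e30="}}, "samples": []}
reconcile(cluster, "Removed")
print(TARGET in cluster["secrets"], cluster["samples"])
```

The point of the sketch is only the ordering: the secret copy happens before, and independently of, the branch on the management state.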
Yes, that is a possibility @Adam, though I'm not crazy about that option. I'm more inclined, as I noted in the associated PR https://github.com/openshift/cluster-samples-operator/pull/225, to pick the change to not bootstrap as removed into a new PR, but not pick the commit that actually pulls in content to install, besides what the CVO installs, until the multi arch team gets the CI ready. But we can percolate on it more in either place.

So chalk up another one to my aging memory ;-) .... the code *ALREADY* keeps the pull secret around when set to removed, i.e. Adam's suggestion. Looking at the history, we have always been doing it. And I've got comments explaining why. Where the breakdown occurs is if we *BOOTSTRAP* as removed. Then the copy does not happen. Currently it only handles managed to removed transitions. There is a simple fix and test case to cover the bootstrap as removed use case. I'll have that up momentarily. Going to keep the course here vs. the more complicated change I previously suggested.

Hey Yaakov, could you help handle this bug? Thanks

https://bugzilla.redhat.com/show_bug.cgi?id=1827694 is the same as this, I believe. It does look like the bootstrap code does not kick in to copy the pull secret.

I would agree, Prashanth, reading https://bugzilla.redhat.com/show_bug.cgi?id=1827694 ... the same root cause. I've reached out to Jeremy in slack as well.

But if it helps the multi arch team out at all, Ben Parees and I have agreed that we could have openshift QE verify this fix in 4.5 using disconnected scenarios on x86, and once verified, it would accelerate the backporting process to 4.4.z and 4.3.z, assuming getting 4.4.z or 4.3.z versions of the fix is what multi arch currently needs. I'll wait a bit longer to hear if there is agreement on that approach from the multi arch side. But absent feedback in the next day or two I'll probably just pull the trigger on that approach.

Ok, talked to Jeremy and Prashanth on slack.
We are a go with the plan I noted in Comment 12.

@XiuJuan - I've assigned QA contact back to you. Use this to verify:
a) run a 4.5 x86 nightly
b) set up either a disconnected or x86 scenario so that samples operator bootstraps as removed (just like it currently bootstraps as removed for ppc/z)
c) ensure the "must-gather" imagestream in the openshift namespace is able to pull the images from the install payload

If we are good, mark verified, and we'll start cherrypicking, etc.

1. The must-gather imagestream is importing its image from the install payload.
$ oc adm release info --pullspecs registry.svc.ci.openshift.org/ocp/release:4.5.0-0.nightly-2020-04-29-231711 | grep must
must-gather quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:c7301145c5db9ff1d9bc430214e775d316181f1c392ec342ba1964f03e9dbc10
2. However, the scheduled import still does not work: one hour after I inserted the CA for the mirror registry, the must-gather imagestream had not retried the image import.
This issue can sometimes be reproduced in 4.4 as well, so it should be tracked in another bug. I am therefore marking this bug as verified so that it can be backported.
$ oc get is must-gather -n openshift -o yaml
apiVersion: image.openshift.io/v1
kind: ImageStream
metadata:
  creationTimestamp: "2020-04-30T04:23:01Z"
  generation: 2
  managedFields:
  - apiVersion: image.openshift.io/v1
    fieldsType: FieldsV1
    fieldsV1:
      f:spec:
        f:tags:
          .: {}
          k:{"name":"latest"}:
            .: {}
            f:annotations: {}
            f:from:
              .: {}
              f:kind: {}
              f:name: {}
            f:generation: {}
            f:importPolicy:
              .: {}
              f:scheduled: {}
            f:name: {}
            f:referencePolicy:
              .: {}
              f:type: {}
      f:status:
        f:publicDockerImageRepository: {}
    manager: cluster-version-operator
    operation: Update
    time: "2020-04-30T07:24:06Z"
  name: must-gather
  namespace: openshift
  resourceVersion: "88336"
  selfLink: /apis/image.openshift.io/v1/namespaces/openshift/imagestreams/must-gather
  uid: 9da66489-19aa-42e3-9cf0-8f4a8e8323c1
spec:
  lookupPolicy:
    local: false
  tags:
  - annotations: null
    from:
      kind: DockerImage
      name: quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:c7301145c5db9ff1d9bc430214e775d316181f1c392ec342ba1964f03e9dbc10
    generation: 2
    importPolicy:
      scheduled: true
    name: latest
    referencePolicy:
      type: Source
status:
  dockerImageRepository: image-registry.openshift-image-registry.svc:5000/openshift/must-gather
  publicDockerImageRepository: default-route-openshift-image-registry.apps.qe-hashadebug1.qe.gcp.devcluster.openshift.com/openshift/must-gather
  tags:
  - conditions:
    - generation: 2
      lastTransitionTime: "2020-04-30T04:23:31Z"
      message: 'Internal error occurred: [qe-hashadebug1.mirror-registry.qe.gcp.devcluster.openshift.com:5000/ocp/release@sha256:c7301145c5db9ff1d9bc430214e775d316181f1c392ec342ba1964f03e9dbc10:
        Get https://qe-hashadebug1.mirror-registry.qe.gcp.devcluster.openshift.com:5000/v2/:
        x509: certificate signed by unknown authority, quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:c7301145c5db9ff1d9bc430214e775d316181f1c392ec342ba1964f03e9dbc10:
        Get https://quay.io/v2/: net/http: request canceled while waiting for connection
        (Client.Timeout exceeded while awaiting headers)]'
      reason: InternalError
      status: "False"
      type: ImportSuccess
    items: null
    tag: latest
3. Importing the imagestream manually succeeds.
$ oc import-image must-gather:latest --from=quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:c7301145c5db9ff1d9bc430214e775d316181f1c392ec342ba1964f03e9dbc10
imagestream.image.openshift.io/must-gather imported
Name: must-gather
Namespace: openshift
Created: 4 hours ago
Labels: <none>
Annotations: openshift.io/image.dockerRepositoryCheck=2020-04-30T08:17:37Z
Image Repository: default-route-openshift-image-registry.apps.qe-hashadebug1.qe.gcp.devcluster.openshift.com/openshift/must-gather
Image Lookup: local=false
Unique Images: 1
Tags: 1
latest
updates automatically from registry quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:c7301145c5db9ff1d9bc430214e775d316181f1c392ec342ba1964f03e9dbc10
* quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:c7301145c5db9ff1d9bc430214e775d316181f1c392ec342ba1964f03e9dbc10
Less than a second ago
Image Name: must-gather:latest
Docker Image: quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:c7301145c5db9ff1d9bc430214e775d316181f1c392ec342ba1964f03e9dbc10
Name: sha256:c7301145c5db9ff1d9bc430214e775d316181f1c392ec342ba1964f03e9dbc10
Created: Less than a second ago
Annotations: image.openshift.io/dockerLayersOrder=ascending
Image Size: 112.3MB in 6 layers
Layers: 76.26MB sha256:23302e52b49d49a0a25da8ea870bc1973e7d51c9b306f3539cd397318bd8b0a5
1.62kB sha256:cf5693de4d3cdd6f352978b87c8f89ead294eff44938598f57a91cf7a02417d2
3.493MB sha256:0bdf979777916584a9b874502354ee8d4fed33d269f3f2ce1e6f314a28fa17f4
8.238MB sha256:50cca4081e146c0b51639e41dd082ecc93b980543ca57ad70ccf3a83ea96754b
24.31MB sha256:8b7a400dbb20ab2151fd7db5e269eeb753b16024884961646b9adf0dc3937e4f
4.909kB sha256:079124ff201fbee6535c8b546f6254847b1f0fe36a1d431039feedd80bcf0b3a
Image Created: 20 hours ago
Author: <none>
Arch: amd64
Command: /bin/bash
Working Dir: <none>
User: 0
Exposes Ports: <none>
Docker Labels: License=GPLv2+
architecture=x86_64
authoritative-source-url=registry.access.redhat.com
build-date=2020-04-29T12:37:35.217293
com.redhat.build-host=cpt-1008.osbs.prod.upshift.rdu2.redhat.com
com.redhat.component=ose-must-gather-container
com.redhat.license_terms=https://www.redhat.com/en/about/red-hat-end-user-license-agreements
description=OpenShift is a platform for developing, building, and deploying containerized applications.
distribution-scope=public
io.k8s.description=OpenShift is a platform for developing, building, and deploying containerized applications.
io.k8s.display-name=OpenShift Client
io.openshift.build.commit.id=b898b8c6355d1c649e8a4d2eafad31f2391bd050
io.openshift.build.commit.url=https://github.com/openshift/must-gather/commit/b898b8c6355d1c649e8a4d2eafad31f2391bd050
io.openshift.build.source-location=https://github.com/openshift/must-gather
io.openshift.maintainer.component=oc
io.openshift.maintainer.product=OpenShift Container Platform
io.openshift.tags=openshift,cli
name=openshift/ose-must-gather
release=202004291217
summary=Provides the latest release of the Red Hat Universal Base Image 7.
url=https://access.redhat.com/containers/#/registry.access.redhat.com/openshift/ose-must-gather/images/v4.5.0-202004291217
vcs-ref=5b0aa84fd447dc28b79f37894823ba99af0c9e80
vcs-type=git
vendor=Red Hat, Inc.
version=v4.5.0
Environment: __doozer=merge
BUILD_RELEASE=202004291217
BUILD_VERSION=v4.5.0
OS_GIT_MAJOR=4
OS_GIT_MINOR=5
OS_GIT_PATCH=0
OS_GIT_TREE_STATE=clean
OS_GIT_VERSION=4.5.0-202004291217-b898b8c
OS_GIT_COMMIT=b898b8c
SOURCE_DATE_EPOCH=1585787896
SOURCE_GIT_COMMIT=b898b8c6355d1c649e8a4d2eafad31f2391bd050
SOURCE_GIT_TAG=b898b8c
SOURCE_GIT_URL=https://github.com/openshift/must-gather
PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin
container=oci
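The failed import in step 2 shows up as an `ImportSuccess` condition with status `"False"` on the `latest` tag. As a rough illustration of how that check could be automated, the sketch below scans an ImageStream object (as a plain Python dict) for such conditions. The helper name `failed_imports` is hypothetical, and the sample data is a trimmed-down copy of the status shown above with a shortened message, not real command output.

```python
def failed_imports(imagestream):
    """Return (tag, reason, message) for every tag whose ImportSuccess
    condition has status "False"."""
    failures = []
    status = imagestream.get("status") or {}
    for tag in status.get("tags") or []:
        for cond in tag.get("conditions") or []:
            if cond.get("type") == "ImportSuccess" and cond.get("status") == "False":
                failures.append((tag.get("tag"), cond.get("reason"), cond.get("message")))
    return failures


# Trimmed-down stand-in for the must-gather imagestream status from step 2.
must_gather = {
    "status": {
        "tags": [
            {
                "tag": "latest",
                "items": None,
                "conditions": [
                    {
                        "type": "ImportSuccess",
                        "status": "False",
                        "reason": "InternalError",
                        "message": "x509: certificate signed by unknown authority",
                    }
                ],
            }
        ]
    }
}

for tag, reason, message in failed_imports(must_gather):
    print(f"{tag}: {reason}: {message}")
```

In practice the dict could come from loading the output of `oc get is must-gather -n openshift -o json` with `json.load`; an empty result would mean no tag currently reports a failed import.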
Reported a new bug about the import failure: https://bugzilla.redhat.com/show_bug.cgi?id=1829786

thanks @XiuJuan

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory (OpenShift Container Platform 4.5 image release advisory), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHBA-2020:2409