Bug 1818476 - invalid must-gather istag imported for ppc64le setups
Summary: invalid must-gather istag imported for ppc64le setups
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: Samples
Version: 4.5
Hardware: ppc64le
OS: Linux
Priority: unspecified
Severity: high
Target Milestone: ---
Target Release: 4.5.0
Assignee: Gabe Montero
QA Contact: XiuJuan Wang
URL:
Whiteboard:
Depends On:
Blocks: 1829073
Reported: 2020-03-28 13:14 UTC by Mark Hamzy
Modified: 2020-08-04 18:07 UTC (History)

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Cause: The samples operator was not copying the install pull secret when bootstrapped as removed.
Consequence: Even though the samples operator had not installed any imagestreams into the openshift namespace, the CVO installs a set of imagestreams in the openshift namespace to assist with customer problem determination, and those imagestreams were failing to import.
Fix: The samples operator was updated to copy the install pull secret to the openshift namespace even when marked as removed.
Result: The CVO imagestreams in the openshift namespace import successfully.
Clone Of:
: 1829073 1829083 (view as bug list)
Environment:
Last Closed: 2020-08-04 18:07:32 UTC
Target Upstream Version:
Embargoed:
xiuwang: needinfo-




Links
System ID Private Priority Status Summary Last Updated
Github openshift cluster-samples-operator pull 256 0 None closed Bug 1818476: copy install pull secret when boostrapped as removed 2020-07-29 06:19:40 UTC
Red Hat Product Errata RHBA-2020:2409 0 None None None 2020-08-04 18:07:35 UTC

Description Mark Hamzy 2020-03-28 13:14:00 UTC
Description of problem:

Changing the must-gather image stream succeeds but does not stick.

Version-Release number of selected component (if applicable):

[root@ci-test-instance ~]# oc status
In project default on server https://api.test.tt.testing:6443

svc/openshift - kubernetes.default.svc.cluster.local
svc/kubernetes - 172.30.0.1:443 -> 6443

View details with 'oc describe <resource>/<name>' or list everything with 'oc get all'.
[root@ci-test-instance ~]# oc version
Client Version: 4.3.9-202003252046-6a90d0a
Server Version: 4.3.0-0.nightly-ppc64le-2020-03-27-120622
Kubernetes Version: v1.16.2

How reproducible:

Every time

Steps to Reproduce:

[root@ci-test-instance ~]# oc get -n openshift is/must-gather -o yaml
apiVersion: image.openshift.io/v1
kind: ImageStream
metadata:
  annotations:
    openshift.io/image.dockerRepositoryCheck: "2020-03-28T13:09:03Z"
  creationTimestamp: "2020-03-27T21:32:21Z"
  generation: 14
  name: must-gather
  namespace: openshift
  resourceVersion: "485217"
  selfLink: /apis/image.openshift.io/v1/namespaces/openshift/imagestreams/must-gather
  uid: 019d7600-835d-4321-a9bd-09176e1e5439
spec:
  lookupPolicy:
    local: false
  tags:
  - annotations: null
    from:
      kind: DockerImage
      name: quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:219fe826dcbace73e259f2ffadc2ff2f975557345719b8eb221c81351f4a7705
    generation: 14
    importPolicy:
      scheduled: true
    name: latest
    referencePolicy:
      type: Source
status:
  dockerImageRepository: image-registry.openshift-image-registry.svc:5000/openshift/must-gather
  tags:
  - conditions:
    - generation: 2
      lastTransitionTime: "2020-03-27T21:32:22Z"
      message: you may not have access to the container image "quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:219fe826dcbace73e259f2ffadc2ff2f975557345719b8eb221c81351f4a7705"
      reason: Unauthorized
      status: "False"
      type: ImportSuccess
    items: null
    tag: latest


[root@ci-test-instance ~]# oc patch -n openshift is/must-gather --type='json' --patch='[{"op": "replace", "path": "/spec/tags/0/from/name", "value": "quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:7b6da6f4cf11175ad3440ac3e2659373bcea4f3a92289d536c5750bd820e7d77"}]'
imagestream.image.openshift.io/must-gather patched


[root@ci-test-instance ~]# oc get -n openshift is/must-gather -o yaml
apiVersion: image.openshift.io/v1
kind: ImageStream
metadata:
  annotations:
    openshift.io/image.dockerRepositoryCheck: "2020-03-28T13:09:03Z"
  creationTimestamp: "2020-03-27T21:32:21Z"
  generation: 15
  name: must-gather
  namespace: openshift
  resourceVersion: "486383"
  selfLink: /apis/image.openshift.io/v1/namespaces/openshift/imagestreams/must-gather
  uid: 019d7600-835d-4321-a9bd-09176e1e5439
spec:
  lookupPolicy:
    local: false
  tags:
  - annotations: null
    from:
      kind: DockerImage
      name: quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:7b6da6f4cf11175ad3440ac3e2659373bcea4f3a92289d536c5750bd820e7d77
    generation: 15
    importPolicy:
      scheduled: true
    name: latest
    referencePolicy:
      type: Source
status:
  dockerImageRepository: image-registry.openshift-image-registry.svc:5000/openshift/must-gather
  tags:
  - conditions:
    - generation: 2
      lastTransitionTime: "2020-03-27T21:32:22Z"
      message: you may not have access to the container image "quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:219fe826dcbace73e259f2ffadc2ff2f975557345719b8eb221c81351f4a7705"
      reason: Unauthorized
      status: "False"
      type: ImportSuccess
    items: null
    tag: latest


[root@ci-test-instance ~]# oc get -n openshift is/must-gather -o yaml
apiVersion: image.openshift.io/v1
kind: ImageStream
metadata:
  annotations:
    openshift.io/image.dockerRepositoryCheck: "2020-03-28T13:12:15Z"
  creationTimestamp: "2020-03-27T21:32:21Z"
  generation: 16
  name: must-gather
  namespace: openshift
  resourceVersion: "486521"
  selfLink: /apis/image.openshift.io/v1/namespaces/openshift/imagestreams/must-gather
  uid: 019d7600-835d-4321-a9bd-09176e1e5439
spec:
  lookupPolicy:
    local: false
  tags:
  - annotations: null
    from:
      kind: DockerImage
      name: quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:219fe826dcbace73e259f2ffadc2ff2f975557345719b8eb221c81351f4a7705
    generation: 16
    importPolicy:
      scheduled: true
    name: latest
    referencePolicy:
      type: Source
status:
  dockerImageRepository: image-registry.openshift-image-registry.svc:5000/openshift/must-gather
  tags:
  - conditions:
    - generation: 2
      lastTransitionTime: "2020-03-27T21:32:22Z"
      message: you may not have access to the container image "quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:219fe826dcbace73e259f2ffadc2ff2f975557345719b8eb221c81351f4a7705"
      reason: Unauthorized
      status: "False"
      type: ImportSuccess
    items: null
    tag: latest
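
To check whether a patched pull spec sticks, the tag's `from.name` can be read back and compared with the value that was just written. This is a minimal sketch, assuming the `latest` tag is the first entry in `spec.tags` (as in the `oc patch` path above). `current_pullspec` and `patch_stuck` are hypothetical helper names, not part of `oc`.

```shell
# Hypothetical helpers; only the `oc get` call itself is a real command.

# Print the pull spec that the first tag of openshift/must-gather points at.
current_pullspec() {
  oc get -n openshift is/must-gather -o jsonpath='{.spec.tags[0].from.name}'
}

# Return 0 if the spec still matches the expected pull spec, 1 if it changed.
patch_stuck() {
  local expected="$1"
  [ "$(current_pullspec)" = "$expected" ]
}
```

Running `patch_stuck "<the pull spec you patched in>"` a few minutes after the `oc patch` would be expected to return non-zero in the scenario above, because the spec is back at the original digest by generation 16.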


Actual results:


Expected results:


Additional info:

Comment 1 Maciej Szulik 2020-03-30 14:00:24 UTC
Moving to the devex team since they own cluster-samples-operator, which is responsible for this.

Comment 2 Gabe Montero 2020-03-30 14:27:56 UTC
So Maciej's statement needs some clarification.  The samples operator does not actually create imagestreams like the must-gather one, i.e. https://github.com/openshift/cluster-samples-operator/blob/master/manifests/08-openshift-imagestreams.yaml#L87-L100
The CVO does it as part of processing cluster operator manifests.  It just so happens that Clayton chose the samples operator (granted it was the most reasonable cluster operator to pick) as the repo to set up those manifests.

That said, this still ties into https://issues.redhat.com/browse/DEVEXP-465, which is on tap for this sprint, though it is blocked by a dependency, in that you need the cluster install secret in the openshift namespace to pull those images.

The samples operator is charged with copying the installer pull secret into the openshift namespace.
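
A quick way to see whether that copy has happened is to look for the copied secret in the `openshift` namespace. This is a minimal sketch, assuming the copy is named `samples-registry-credentials`. That name is not stated in this bug and should be checked against your cluster. `pull_secret_copied` is a hypothetical helper.

```shell
# Assumption: the samples operator stores its copy of the install pull secret
# under this name; set SECRET_NAME beforehand if your cluster differs.
SECRET_NAME="${SECRET_NAME:-samples-registry-credentials}"

# Return 0 if the copied pull secret exists in the openshift namespace.
pull_secret_copied() {
  oc get secret "$SECRET_NAME" -n openshift >/dev/null 2>&1
}
```

For example, `pull_secret_copied && echo present || echo missing` prints `missing` on a cluster where the copy never took place.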

The PR for https://issues.redhat.com/browse/DEVEXP-465 includes doing this for both power and Z, as well as installing samples from openshift/library that have been earmarked for power and Z.

The aforementioned dependency:  the PR is being held until the multi arch team can get their periodic e2e jobs updated to run our image ecosystem suite, to validate the openshift/library imagestreams (vs. the ones installed by the CVO via the cluster operator manifest).

So based on how the multi arch team work goes, and how badly resolution of the bugzilla is desired, we may have decisions to make around 

1) merging the existing PR as is prior to the multi arch e2e's being ready
2) breaking up that PR to not include the openshift/library content, but enable enough of the samples operator to copy the install pull secret.

Comment 3 Adam Kaplan 2020-03-31 15:24:39 UTC
@Gabe or option 3 - change the `Removed` behavior so that the install pull secret is always copied into the `openshift` namespace? This is akin to what the Image Registry operator needs to do with the node-ca daemonset.

Comment 4 Gabe Montero 2020-03-31 15:44:40 UTC
Yes that is a possibility @Adam, though I'm not crazy about that option.

I'm more inclined, as I noted in the associated PR https://github.com/openshift/cluster-samples-operator/pull/225, to pick the change to not bootstrap as removed into a new PR, but not pick the commit that actually pulls in content to install, besides what the CVO installs, until the multi arch team gets the CI ready.

But we can percolate on it more in either place.

Comment 5 Gabe Montero 2020-04-05 23:31:03 UTC
So chalk up another one to my aging memory ;-) .... the code *ALREADY* keeps the pull secret around when set to removed, i.e. Adam's suggestion.

Looking at the history, we have always been doing it.  And I've got comments explaining why.

Where the breakdown occurs is if we *BOOTSTRAP* as removed.  Then the copy does not happen.  Currently it only handles managed to removed transitions.

There is a simple fix and test case to cover the bootstrap as removed use case.  I'll have that up momentarily.  Going to keep the course here vs. the more complicated change I previously suggested.
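
The failing case can be recognised from outside the operator: the management state is `Removed` and the copied secret is absent. This is a rough sketch under assumptions: the state is read from the `cluster` object of `configs.samples.operator.openshift.io`, the copied secret is assumed to be named `samples-registry-credentials`, and `diagnose` is a hypothetical helper rather than the operator's own logic.

```shell
# Print the samples operator management state (for example Managed or Removed).
management_state() {
  oc get configs.samples.operator.openshift.io cluster \
    -o jsonpath='{.spec.managementState}'
}

# Assumed secret name; it is not taken from this bug report.
secret_present() {
  oc get secret samples-registry-credentials -n openshift >/dev/null 2>&1
}

# Classify the cluster with respect to the bootstrap-as-removed case.
diagnose() {
  local state
  state="$(management_state)"
  if secret_present; then
    echo "ok: state=${state}, pull secret copied"
  elif [ "$state" = "Removed" ]; then
    echo "affected: state=Removed, pull secret not copied"
  else
    echo "unknown: state=${state}, pull secret not copied"
  fi
}
```

The `affected` branch corresponds to the situation described here. The check cannot tell a bootstrap as removed apart from other ways the secret might be missing, so treat it as a hint only.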

Comment 8 XiuJuan Wang 2020-04-10 03:17:28 UTC
Hey Yaakov,
Could you help handle this bug?
Thanks

Comment 11 Prashanth Sundararaman 2020-04-29 19:16:07 UTC
https://bugzilla.redhat.com/show_bug.cgi?id=1827694 is the same as this I believe. It does look like the bootstrap code does not kick in to copy the pull secret.

Comment 12 Gabe Montero 2020-04-29 20:01:39 UTC
I would agree Prashanth reading https://bugzilla.redhat.com/show_bug.cgi?id=1827694 ... the same root cause 

I've reached out to Jeremy in slack as well

But if it helps the multi arch team out at all, Ben Parees and I have agreed that we could have openshift QE verify this fix in 4.5 
using disconnected scenarios on x86, and once verified, it would accelerate the backporting process to 4.4.z and 4.3.z, assuming 
getting 4.4.z or 4.3.z versions of the fix is what multi arch currently needs.

I'll wait a bit longer to hear if there is agreement on that approach from the multi arch side.  But absent feedback in the next 
day or two I'll probably just pull the trigger on that approach.

Comment 13 Gabe Montero 2020-04-29 21:36:37 UTC
Ok talked to Jeremy and Prashanth on slack.

We are a go with the plan I noted in Comment 12

@XiuJuan - I've assigned QA contact back to you.

Use this to verify
a) run a 4.5 x86 nightly
b) set up either a disconnected or x86 scenario so that samples operator bootstraps as removed (just like it currently bootstraps as removed for ppc/z)
c) ensure the "must-gather" imagestream in the openshift namespace is able to pull the images from the install payload

if we are good, mark verify, and we'll start cherrypicking, etc.
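
Step c) can be checked from the imagestream status. This is a minimal sketch, assuming that a healthy tag has at least one entry under `status.tags[].items` and no recorded condition. `import_failure_reasons`, `imported_images` and `import_ok` are hypothetical helper names.

```shell
# Print the reasons of any import conditions recorded on the imagestream tags.
import_failure_reasons() {
  oc get is must-gather -n openshift \
    -o jsonpath='{.status.tags[*].conditions[*].reason}'
}

# Print the image IDs that have been imported for the imagestream tags.
imported_images() {
  oc get is must-gather -n openshift \
    -o jsonpath='{.status.tags[*].items[*].image}'
}

# Return 0 only when an image was imported and no failure condition is recorded.
import_ok() {
  [ -n "$(imported_images)" ] && [ -z "$(import_failure_reasons)" ]
}
```

On the cluster from the original description, `import_failure_reasons` would print `Unauthorized`, so `import_ok` would return non-zero.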

Comment 14 XiuJuan Wang 2020-04-30 08:20:09 UTC
1. The must-gather imagestream imports its image from the install payload.
$oc adm release info --pullspecs registry.svc.ci.openshift.org/ocp/release:4.5.0-0.nightly-2020-04-29-231711  | grep must
  must-gather                                    quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:c7301145c5db9ff1d9bc430214e775d316181f1c392ec342ba1964f03e9dbc10

2. But the scheduled import still does not work: one hour after I inserted the CA for the mirror registry, the must-gather imagestream had not retried the import.
This issue can sometimes be reproduced in 4.4 as well and should be tracked in a separate bug. Therefore I am marking this bug as verified so that it can be backported.

$ oc get is must-gather -n openshift -o yaml 
apiVersion: image.openshift.io/v1
kind: ImageStream
metadata:
  creationTimestamp: "2020-04-30T04:23:01Z"
  generation: 2
  managedFields:
  - apiVersion: image.openshift.io/v1
    fieldsType: FieldsV1
    fieldsV1:
      f:spec:
        f:tags:
          .: {}
          k:{"name":"latest"}:
            .: {}
            f:annotations: {}
            f:from:
              .: {}
              f:kind: {}
              f:name: {}
            f:generation: {}
            f:importPolicy:
              .: {}
              f:scheduled: {}
            f:name: {}
            f:referencePolicy:
              .: {}
              f:type: {}
      f:status:
        f:publicDockerImageRepository: {}
    manager: cluster-version-operator
    operation: Update
    time: "2020-04-30T07:24:06Z"
  name: must-gather
  namespace: openshift
  resourceVersion: "88336"
  selfLink: /apis/image.openshift.io/v1/namespaces/openshift/imagestreams/must-gather
  uid: 9da66489-19aa-42e3-9cf0-8f4a8e8323c1
spec:
  lookupPolicy:
    local: false
  tags:
  - annotations: null
    from:
      kind: DockerImage
      name: quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:c7301145c5db9ff1d9bc430214e775d316181f1c392ec342ba1964f03e9dbc10
    generation: 2
    importPolicy:
      scheduled: true
    name: latest
    referencePolicy:
      type: Source
status:
  dockerImageRepository: image-registry.openshift-image-registry.svc:5000/openshift/must-gather
  publicDockerImageRepository: default-route-openshift-image-registry.apps.qe-hashadebug1.qe.gcp.devcluster.openshift.com/openshift/must-gather
  tags:
  - conditions:
    - generation: 2
      lastTransitionTime: "2020-04-30T04:23:31Z"
      message: 'Internal error occurred: [qe-hashadebug1.mirror-registry.qe.gcp.devcluster.openshift.com:5000/ocp/release@sha256:c7301145c5db9ff1d9bc430214e775d316181f1c392ec342ba1964f03e9dbc10:
        Get https://qe-hashadebug1.mirror-registry.qe.gcp.devcluster.openshift.com:5000/v2/:
        x509: certificate signed by unknown authority, quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:c7301145c5db9ff1d9bc430214e775d316181f1c392ec342ba1964f03e9dbc10:
        Get https://quay.io/v2/: net/http: request canceled while waiting for connection
        (Client.Timeout exceeded while awaiting headers)]'
      reason: InternalError
      status: "False"
      type: ImportSuccess
    items: null
    tag: latest

3. Importing the imagestream manually succeeds.
$ oc import-image must-gather:latest --from=quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:c7301145c5db9ff1d9bc430214e775d316181f1c392ec342ba1964f03e9dbc10 
imagestream.image.openshift.io/must-gather imported

Name:			must-gather
Namespace:		openshift
Created:		4 hours ago
Labels:			<none>
Annotations:		openshift.io/image.dockerRepositoryCheck=2020-04-30T08:17:37Z
Image Repository:	default-route-openshift-image-registry.apps.qe-hashadebug1.qe.gcp.devcluster.openshift.com/openshift/must-gather
Image Lookup:		local=false
Unique Images:		1
Tags:			1

latest
  updates automatically from registry quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:c7301145c5db9ff1d9bc430214e775d316181f1c392ec342ba1964f03e9dbc10

  * quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:c7301145c5db9ff1d9bc430214e775d316181f1c392ec342ba1964f03e9dbc10
      Less than a second ago

Image Name:	must-gather:latest
Docker Image:	quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:c7301145c5db9ff1d9bc430214e775d316181f1c392ec342ba1964f03e9dbc10
Name:		sha256:c7301145c5db9ff1d9bc430214e775d316181f1c392ec342ba1964f03e9dbc10
Created:	Less than a second ago
Annotations:	image.openshift.io/dockerLayersOrder=ascending
Image Size:	112.3MB in 6 layers
Layers:		76.26MB	sha256:23302e52b49d49a0a25da8ea870bc1973e7d51c9b306f3539cd397318bd8b0a5
		1.62kB	sha256:cf5693de4d3cdd6f352978b87c8f89ead294eff44938598f57a91cf7a02417d2
		3.493MB	sha256:0bdf979777916584a9b874502354ee8d4fed33d269f3f2ce1e6f314a28fa17f4
		8.238MB	sha256:50cca4081e146c0b51639e41dd082ecc93b980543ca57ad70ccf3a83ea96754b
		24.31MB	sha256:8b7a400dbb20ab2151fd7db5e269eeb753b16024884961646b9adf0dc3937e4f
		4.909kB	sha256:079124ff201fbee6535c8b546f6254847b1f0fe36a1d431039feedd80bcf0b3a
Image Created:	20 hours ago
Author:		<none>
Arch:		amd64
Command:	/bin/bash
Working Dir:	<none>
User:		0
Exposes Ports:	<none>
Docker Labels:	License=GPLv2+
		architecture=x86_64
		authoritative-source-url=registry.access.redhat.com
		build-date=2020-04-29T12:37:35.217293
		com.redhat.build-host=cpt-1008.osbs.prod.upshift.rdu2.redhat.com
		com.redhat.component=ose-must-gather-container
		com.redhat.license_terms=https://www.redhat.com/en/about/red-hat-end-user-license-agreements
		description=OpenShift is a platform for developing, building, and deploying containerized applications.
		distribution-scope=public
		io.k8s.description=OpenShift is a platform for developing, building, and deploying containerized applications.
		io.k8s.display-name=OpenShift Client
		io.openshift.build.commit.id=b898b8c6355d1c649e8a4d2eafad31f2391bd050
		io.openshift.build.commit.url=https://github.com/openshift/must-gather/commit/b898b8c6355d1c649e8a4d2eafad31f2391bd050
		io.openshift.build.source-location=https://github.com/openshift/must-gather
		io.openshift.maintainer.component=oc
		io.openshift.maintainer.product=OpenShift Container Platform
		io.openshift.tags=openshift,cli
		name=openshift/ose-must-gather
		release=202004291217
		summary=Provides the latest release of the Red Hat Universal Base Image 7.
		url=https://access.redhat.com/containers/#/registry.access.redhat.com/openshift/ose-must-gather/images/v4.5.0-202004291217
		vcs-ref=5b0aa84fd447dc28b79f37894823ba99af0c9e80
		vcs-type=git
		vendor=Red Hat, Inc.
		version=v4.5.0
Environment:	__doozer=merge
		BUILD_RELEASE=202004291217
		BUILD_VERSION=v4.5.0
		OS_GIT_MAJOR=4
		OS_GIT_MINOR=5
		OS_GIT_PATCH=0
		OS_GIT_TREE_STATE=clean
		OS_GIT_VERSION=4.5.0-202004291217-b898b8c
		OS_GIT_COMMIT=b898b8c
		SOURCE_DATE_EPOCH=1585787896
		SOURCE_GIT_COMMIT=b898b8c6355d1c649e8a4d2eafad31f2391bd050
		SOURCE_GIT_TAG=b898b8c
		SOURCE_GIT_URL=https://github.com/openshift/must-gather
		PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin
		container=oci

Comment 15 XiuJuan Wang 2020-04-30 11:21:13 UTC
Reported a new bug about the import failure: https://bugzilla.redhat.com/show_bug.cgi?id=1829786

Comment 16 Gabe Montero 2020-04-30 13:06:14 UTC
thanks @XiuJuan

Comment 18 errata-xmlrpc 2020-08-04 18:07:32 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (OpenShift Container Platform 4.5 image release advisory), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2020:2409

