+++ This bug was initially created as a clone of Bug #1833207 +++

Description of problem:

Like bug 1821783, but since that one is already CLOSED ERRATA, filing a new bug. I'm still seeing CI failures in the update-chain jobs due to:

fail [github.com/openshift/origin/test/extended/operators/images.go:154]: May 7 07:06:24.907: Pods found with invalid container images not present in release payload:
openshift-marketplace/certified-operators-847568b454-nsqsv/certified-operators image=quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:0bbb83b21f43ec97f52d10fc9d088cd57e8a6c970ad8a699e718081734b415d1
openshift-marketplace/community-operators-ddbdbc749-jz75c/community-operators image=quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:0bbb83b21f43ec97f52d10fc9d088cd57e8a6c970ad8a699e718081734b415d1

and similar. See the CI search for the past 2d [1] and example jobs [2,3,4]. This mechanism is breaking at least the following test:

[sig-arch] Managed cluster should ensure pods use downstream images from our release image with proper ImagePullPolicy [Suite:openshift/conformance/parallel]

I dunno what happened with bug 1821783. Seems like it went straight from NEW -> ON_QA without any code changes or stated motivation? Can someone who understands OLM explain what's going on with these failures?
[1]: https://search.apps.build01.ci.devcluster.openshift.com/?name=upgrade&search=Pods%20found%20with%20invalid%20container%20images%20not%20present%20in%20release%20payload
[2]: https://prow.svc.ci.openshift.org/view/gcs/origin-ci-test/logs/release-openshift-origin-installer-e2e-aws-upgrade-4.2-to-4.3-to-4.4-to-4.5-ci/59
[3]: https://prow.svc.ci.openshift.org/view/gcs/origin-ci-test/logs/release-openshift-origin-installer-e2e-aws-upgrade-4.1-to-4.2-to-4.3-to-4.4-nightly/74
[4]: https://prow.svc.ci.openshift.org/view/gcs/origin-ci-test/logs/release-openshift-origin-installer-e2e-aws-upgrade-4.3-to-4.4-to-4.5-to-4.6-ci/38

--- Additional comment from Evan Cordell on 2020-05-08 13:06:33 UTC ---

> I dunno what happened with bug 1821783. Seems like it went straight from NEW -> ON_QA without any code changes or stated motivation? Can someone who understands OLM explain what's going on with these failures?

It looks like it was verified that, on a new install, the images are correct. I agree that this is an issue that needs to be investigated and addressed.

The image is specified here:

https://github.com/operator-framework/operator-marketplace/blob/97ae9930ea7cfaa27248b2cecaf312d177b16629/manifests/09_operator.yaml#L45

and should be replaced during release because of:

https://github.com/operator-framework/operator-marketplace/blob/97ae9930ea7cfaa27248b2cecaf312d177b16629/manifests/image-references#L9-L12

OperatorSource pods are reconciled on a timer, so I suspect that just waiting a bit longer would force marketplace to roll out an update with a new image. The fix should be that we explicitly check whether the OperatorSource pod has the `image` that marketplace is configured with.

--- Additional comment from Eric Paris on 2020-05-08 14:31:30 UTC ---

This bug has been set to target the 4.5.0 release without specifying a severity. As part of triage, when determining the importance of bugs, a severity should be specified.
Since these bugs have not been properly triaged, we are removing the target release. Teams will need to add a severity before deferring these bugs again.

--- Additional comment from Evan Cordell on 2020-05-18 20:21:22 UTC ---

Since this issue should reconcile itself in real clusters, it is not a blocker. Moving to 4.6.

--- Additional comment from Nick Hale on 2020-06-02 17:31:13 UTC ---

Marking for upcoming sprint.
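The fix Evan describes (explicitly comparing the image a running OperatorSource pod uses against the image marketplace is configured with) can be sketched roughly as below. This is an illustrative sketch, not the actual patch: the deployment/pod names, jsonpath queries, and the stand-in digest values are assumptions, and the real `oc` lookups (shown in comments) need a live cluster.

```shell
#!/bin/sh
# Hedged sketch: does the running OperatorSource pod use the image that
# marketplace is configured to deploy? On a live cluster the two values
# would come from oc, e.g. (names illustrative):
#   configured=$(oc -n openshift-marketplace get deployment certified-operators \
#       -o jsonpath='{.spec.template.spec.containers[0].image}')
#   running=$(oc -n openshift-marketplace get pod -l name=certified-operators \
#       -o jsonpath='{.items[0].spec.containers[0].image}')
# Stand-in values so the comparison logic can run without a cluster:
configured="quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:0bbb83b2"
running="quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:0bbb83b2"

if [ "$configured" = "$running" ]; then
    echo "pod image matches configured image"
else
    # A mismatch means the pod predates the last rollout and is serving a
    # stale (pre-upgrade) image; the operator should redeploy it.
    echo "stale pod image: $running (expected $configured)" >&2
    exit 1
fi
```

In the stale-image case the reconciler would trigger a new rollout instead of waiting for the periodic timer, which is why the pre-fix behavior eventually self-corrects but fails the CI test in the window between upgrade and the next reconcile.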
Cloning this to 4.5 and marking high severity; it is breaking our 4.2->4.5 upgrade CI testing.

https://deck-ci.apps.ci.l2s4.p1.openshiftapps.com/view/gcs/origin-ci-test/logs/release-openshift-origin-installer-e2e-aws-upgrade-4.2-to-4.3-to-4.4-to-4.5-ci/91

[sig-arch] Managed cluster should ensure pods use downstream images from our release image with proper ImagePullPolicy [Suite:openshift/conformance/parallel] (4m22s)

fail [github.com/openshift/origin/test/extended/operators/images.go:154]: Jun 8 07:55:57.781: Pods found with invalid container images not present in release payload:
openshift-marketplace/certified-operators-745b8b4946-mbv8d/certified-operators image=quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:eb18cb01725f9a26a5bf851bb2e7a91588e70876b18f73854eaaa953aab7f6ad
openshift-marketplace/community-operators-fd8b68876-kkclh/community-operators image=quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:eb18cb01725f9a26a5bf851bb2e7a91588e70876b18f73854eaaa953aab7f6ad
openshift-marketplace/redhat-marketplace-6b9b964456-b657m/redhat-marketplace image=quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:2569a3f79728be9174d720159e3dc0d2c6127b6961c6c8ea00f7187310bf8348
openshift-marketplace/redhat-operators-86dbd57d46-c6h4f/redhat-operators image=quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:eb18cb01725f9a26a5bf851bb2e7a91588e70876b18f73854eaaa953aab7f6ad
1. Cluster version is 4.5.0-0.nightly-2020-06-15-194331:

[scolange@scolange ~]$ oc get clusterversion
NAME      VERSION                             AVAILABLE   PROGRESSING   SINCE   STATUS
version   4.5.0-0.nightly-2020-06-15-194331   True        False         6h26m   Cluster version is 4.5.0-0.nightly-2020-06-15-194331

The marketplace-operator version contains the fix PR:

[scolange@scolange ~]$ oc exec marketplace-operator-7b46798c6c-smw7v -- marketplace-operator --version
time="2020-06-16T09:20:48Z" level=info msg="Go Version: go1.13.4"
time="2020-06-16T09:20:48Z" level=info msg="Go OS/Arch: linux/amd64"
time="2020-06-16T09:20:48Z" level=info msg="operator-sdk Version: v0.8.0"
Marketplace source git commit: b1ba97ff4cf3999fd8fcdc2c97700d5291dca1f0

2. Check whether these OperatorSource images are downstream:

[scolange@scolange ~]$ oc get pod
NAME                                    READY   STATUS    RESTARTS   AGE
certified-operators-7bc7ccb7f8-hmllm    1/1     Running   0          6h37m
community-operators-87bbf6587-sx7wj     1/1     Running   0          6h37m
marketplace-operator-7b46798c6c-smw7v   1/1     Running   0          6h37m
qe-app-registry-6cb6579b66-56dpp        1/1     Running   0          6h27m
redhat-marketplace-9587c6c97-pgpss      1/1     Running   0          6h37m
redhat-operators-76895bb4bc-6prxh       1/1     Running   0          6h37m

[scolange@scolange ~]$ oc exec certified-operators-7bc7ccb7f8-hmllm -- cat /etc/redhat-release
Red Hat Enterprise Linux Server release 7.8 (Maipo)
[scolange@scolange ~]$ oc exec community-operators-87bbf6587-sx7wj -- cat /etc/redhat-release
Red Hat Enterprise Linux Server release 7.8 (Maipo)
[scolange@scolange ~]$ oc exec marketplace-operator-7b46798c6c-smw7v -- cat /etc/redhat-release
Red Hat Enterprise Linux Server release 7.8 (Maipo)
[scolange@scolange ~]$ oc exec redhat-marketplace-9587c6c97-pgpss -- cat /etc/redhat-release
Red Hat Enterprise Linux Server release 7.8 (Maipo)
[scolange@scolange ~]$ oc exec redhat-operators-76895bb4bc-6prxh -- cat /etc/redhat-release
Red Hat Enterprise Linux Server release 7.8 (Maipo)

All of them are downstream. LGTM; verified.
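The check the failing CI test performs (every container image in openshift-marketplace must come from the release payload repository) can be approximated with a small filter. This is a hedged sketch, not the origin test itself: the image list and the payload repository prefix are stand-ins, and the real list would come from the `oc` query shown in the comments.

```shell
#!/bin/sh
# Hedged sketch of the payload check: flag any openshift-marketplace image
# that is not pinned by digest to the release payload repository.
# On a live cluster the list would come from (illustrative):
#   oc -n openshift-marketplace get pods \
#       -o jsonpath='{range .items[*].spec.containers[*]}{.image}{"\n"}{end}'
# Stand-in list so the filter can run without a cluster:
images="quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:eb18cb01
quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:2569a3f7"

payload_repo="quay.io/openshift-release-dev/ocp-v4.0-art-dev"
bad=0
for img in $images; do
    case "$img" in
        "$payload_repo"@sha256:*) ;;   # digest-pinned payload image: OK
        *) echo "not from release payload: $img"; bad=1 ;;
    esac
done
[ "$bad" -eq 0 ] && echo "all images come from the release payload"
```

This mirrors the intent of the `/etc/redhat-release` spot checks above: both are ways of confirming the marketplace pods are running downstream payload images rather than stale upstream ones left over from before the upgrade.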
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA.

For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2020:2409