1833207 – openshift-marketplace: Pods found with invalid container images not present in release payload (v2)

Keywords:
Status:	CLOSED ERRATA
Alias:	None
Product:	OpenShift Container Platform
Classification:	Red Hat
Component:	OLM
Sub Component:
Version:	4.5
Hardware:	Unspecified
OS:	Unspecified
Priority:	high
Severity:	high
Target Milestone:	---
Target Release:	4.6.0
Assignee:	Evan Cordell
QA Contact:	Jian Zhang
Docs Contact:
URL:
Whiteboard:
Depends On:
Blocks:	1845644

Reported:	2020-05-08 04:52 UTC by W. Trevor King
Modified:	2023-12-15 17:51 UTC
CC List:	3 users
Fixed In Version:
Doc Type:	If docs needed, set a value
Doc Text:
Clone Of:
Clones:	1845644
Environment:
Last Closed:	2020-10-27 15:58:53 UTC
Target Upstream Version:
Embargoed:

Links
System	ID	Private	Priority	Status	Summary	Last Updated
Github	operator-framework operator-marketplace pull 313	0	None	closed	Bug 1833207: Ensure correct registry image	2021-02-02 22:11:50 UTC
Red Hat Product Errata	RHBA-2020:4196	0	None	None	None	2020-10-27 15:59:14 UTC

Description W. Trevor King 2020-05-08 04:52:26 UTC

Description of problem:

Like bug 1821783, but since that bug is already CLOSED ERRATA, I'm filing a new one.  I'm still seeing CI failures in the update-chain jobs due to:

  fail [github.com/openshift/origin/test/extended/operators/images.go:154]: May  7 07:06:24.907: Pods found with invalid container images not present in release payload: openshift-marketplace/certified-operators-847568b454-nsqsv/certified-operators image=quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:0bbb83b21f43ec97f52d10fc9d088cd57e8a6c970ad8a699e718081734b415d1
openshift-marketplace/community-operators-ddbdbc749-jz75c/community-operators image=quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:0bbb83b21f43ec97f52d10fc9d088cd57e8a6c970ad8a699e718081734b415d1

and similar.  CI search for the past 2d [1].  Example jobs [2,3,4].  This mechanism is breaking at least the following test:

[sig-arch] Managed cluster should ensure pods use downstream images from our release image with proper ImagePullPolicy [Suite:openshift/conformance/parallel]

I dunno what happened with bug 1821783.  Seems like it went straight from NEW -> ON_QA without any code changes or stated motivation?  Can someone who understands OLM explain what's going on with these failures?

[1]: https://search.apps.build01.ci.devcluster.openshift.com/?name=upgrade&search=Pods%20found%20with%20invalid%20container%20images%20not%20present%20in%20release%20payload
[2]: https://prow.svc.ci.openshift.org/view/gcs/origin-ci-test/logs/release-openshift-origin-installer-e2e-aws-upgrade-4.2-to-4.3-to-4.4-to-4.5-ci/59
[3]: https://prow.svc.ci.openshift.org/view/gcs/origin-ci-test/logs/release-openshift-origin-installer-e2e-aws-upgrade-4.1-to-4.2-to-4.3-to-4.4-nightly/74
[4]: https://prow.svc.ci.openshift.org/view/gcs/origin-ci-test/logs/release-openshift-origin-installer-e2e-aws-upgrade-4.3-to-4.4-to-4.5-to-4.6-ci/38
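
The failing check quoted in the description can be approximated as a set-membership test: every container image in a pod must appear in the set of images referenced by the release payload. The Go sketch below is illustrative only and is not the code in test/extended/operators/images.go; the `container` type, the `invalidImages` function, and the sample image references are all hypothetical.

```go
package main

import (
	"fmt"
	"sort"
)

// container is a hypothetical, simplified view of one running container.
type container struct {
	Namespace, Pod, Name, Image string
}

// invalidImages describes every container whose image is not in the payload
// set. The result is sorted so the output is stable.
func invalidImages(payload map[string]bool, containers []container) []string {
	var invalid []string
	for _, c := range containers {
		if !payload[c.Image] {
			invalid = append(invalid,
				fmt.Sprintf("%s/%s/%s image=%s", c.Namespace, c.Pod, c.Name, c.Image))
		}
	}
	sort.Strings(invalid)
	return invalid
}

func main() {
	// Assumed sample data: one image from the current payload, one stale image.
	payload := map[string]bool{"example.test/release@sha256:new": true}
	containers := []container{
		{"openshift-marketplace", "marketplace-operator-0", "marketplace-operator", "example.test/release@sha256:new"},
		{"openshift-marketplace", "certified-operators-0", "certified-operators", "example.test/release@sha256:old"},
	}
	for _, line := range invalidImages(payload, containers) {
		fmt.Println(line)
	}
}
```

In this sketch a pod that still runs an image from the previous release, as in an update chain, is reported even though the operator itself has already been updated.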

Comment 1 Evan Cordell 2020-05-08 13:06:33 UTC

> I dunno what happened with bug 1821783.  Seems like it went straight from NEW -> ON_QA without any code changes or stated motivation?  Can someone who understands OLM explain what's going on with these failures?

It looks like it was verified that, on a new install, the images are correct. I agree that this is an issue that needs to be investigated / addressed.


The image is specified here: https://github.com/operator-framework/operator-marketplace/blob/97ae9930ea7cfaa27248b2cecaf312d177b16629/manifests/09_operator.yaml#L45

and should be replaced during release because of: https://github.com/operator-framework/operator-marketplace/blob/97ae9930ea7cfaa27248b2cecaf312d177b16629/manifests/image-references#L9-L12

OperatorSource pods are reconciled on a timer, so I suspect that simply waiting a bit longer would cause marketplace to roll out an update with the new image. The fix should be to explicitly check whether the OperatorSource pod is running the `image` that marketplace is configured with.
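
The explicit check proposed in this comment could look roughly like the Go sketch below. It is a minimal illustration under stated assumptions, not the change made in the linked pull request: `registryDeployment` and `ensureImage` are hypothetical names, and the real operator would work with Kubernetes Deployment objects rather than this simplified struct.

```go
package main

import "fmt"

// registryDeployment is a hypothetical stand-in for the workload that backs
// an OperatorSource. Images maps container name to image reference.
type registryDeployment struct {
	Name   string
	Images map[string]string
}

// ensureImage rewrites every container image that differs from the image the
// operator is configured with, and reports whether anything changed.
func ensureImage(d *registryDeployment, configured string) bool {
	changed := false
	for name, image := range d.Images {
		if image != configured {
			d.Images[name] = configured
			changed = true
		}
	}
	return changed
}

func main() {
	// Assumed sample data: the pod still runs the image from the old release.
	d := &registryDeployment{
		Name:   "certified-operators",
		Images: map[string]string{"certified-operators": "example.test/release@sha256:old"},
	}
	fmt.Println("updated:", ensureImage(d, "example.test/release@sha256:new")) // updated: true
	fmt.Println("updated:", ensureImage(d, "example.test/release@sha256:new")) // updated: false
}
```

Running the comparison on every reconcile, rather than relying on the timer alone, means a stale image is corrected the first time the operator looks at the workload after an upgrade.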

Comment 8 Jian Zhang 2020-06-11 08:44:41 UTC

1. The cluster version is 4.6.0-0.nightly-2020-06-11-041445.
The marketplace-operator build contains the fix PR:
[root@preserve-olm-env data]# oc exec marketplace-operator-5cf4488cfc-6b8t8 -- marketplace-operator --version
Marketplace source git commit: a00763fa951dad170d671eb6ddc69f8dcab13c6e
time="2020-06-11T08:40:20Z" level=info msg="Go Version: go1.13.4"
time="2020-06-11T08:40:20Z" level=info msg="Go OS/Arch: linux/amd64"
time="2020-06-11T08:40:20Z" level=info msg="operator-sdk Version: v0.8.0"

2. Check whether the OperatorSource images are downstream.

[root@preserve-olm-env data]# oc get pods
NAME                                    READY   STATUS    RESTARTS   AGE
certified-operators-68487fcd8d-5vl58    1/1     Running   0          45m
community-operators-6b8c84d5c5-bl54c    1/1     Running   0          44m
marketplace-operator-5cf4488cfc-6b8t8   1/1     Running   0          45m
redhat-marketplace-fb6b559c5-n4wh6      1/1     Running   0          44m
redhat-operators-78645b7dbb-pb4t2       1/1     Running   0          45m

[root@preserve-olm-env data]# oc exec certified-operators-68487fcd8d-5vl58  -- cat /etc/redhat-release
Red Hat Enterprise Linux Server release 7.8 (Maipo)

[root@preserve-olm-env data]# oc exec community-operators-6b8c84d5c5-bl54c -- cat /etc/redhat-release
Red Hat Enterprise Linux Server release 7.8 (Maipo)

[root@preserve-olm-env data]# oc exec redhat-operators-78645b7dbb-pb4t2  -- cat /etc/redhat-release
Red Hat Enterprise Linux Server release 7.8 (Maipo)

[root@preserve-olm-env data]# oc exec redhat-marketplace-fb6b559c5-n4wh6   -- cat /etc/redhat-release
Red Hat Enterprise Linux Server release 7.8 (Maipo)

[root@preserve-olm-env data]# oc exec marketplace-operator-5cf4488cfc-6b8t8    -- cat /etc/redhat-release
Red Hat Enterprise Linux Server release 7.8 (Maipo)

All of them are downstream. LGTM, marking as verified.
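
The verification above inspects the base OS inside each pod. The image references themselves can also be listed for comparison against the release payload. The Go sketch below is one hedged way to do that, assuming the input is the JSON produced by `oc get pods -n openshift-marketplace -o json`; the `podList` type and `podImages` function are hypothetical, and the embedded sample is invented data, not output from this cluster.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// podList mirrors only the PodList fields this sketch reads.
type podList struct {
	Items []struct {
		Metadata struct {
			Name string `json:"name"`
		} `json:"metadata"`
		Spec struct {
			Containers []struct {
				Image string `json:"image"`
			} `json:"containers"`
		} `json:"spec"`
	} `json:"items"`
}

// podImages maps each pod name to the images its containers are configured with.
func podImages(data []byte) (map[string][]string, error) {
	var list podList
	if err := json.Unmarshal(data, &list); err != nil {
		return nil, err
	}
	images := make(map[string][]string)
	for _, item := range list.Items {
		for _, c := range item.Spec.Containers {
			images[item.Metadata.Name] = append(images[item.Metadata.Name], c.Image)
		}
	}
	return images, nil
}

// sample is invented input in the shape of a PodList.
const sample = `{"items":[{"metadata":{"name":"certified-operators-0"},
"spec":{"containers":[{"name":"certified-operators","image":"example.test/release@sha256:new"}]}}]}`

func main() {
	images, err := podImages([]byte(sample))
	if err != nil {
		panic(err)
	}
	for pod, refs := range images {
		fmt.Println(pod, refs)
	}
}
```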

Comment 11 errata-xmlrpc 2020-10-27 15:58:53 UTC

Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (OpenShift Container Platform 4.6 GA Images), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2020:4196
