Description of problem: If a bundle image reference in an InstallPlan is unpullable, the corresponding bundle unpack job will wait indefinitely because its managed pod never advances beyond Pending. The job should eventually fail if it does not succeed within a limited time period, and the InstallPlan's lookup status should reflect the job failure. How reproducible: Always Steps to Reproduce: 1. Create an InstallPlan whose status contains a BundleLookup that references an unpullable image. Actual results: The unpack job's pod never starts because it is unable to pull the image, and the job never terminates. Expected results: After a reasonable period of time (five or ten minutes?), the unpack job terminates and the InstallPlan status indicates that the bundle lookup has failed.
Marking this with upcoming sprint and removing from 4.7.0. This will be addressed in a future sprint.
Marking as triaged and reviewed in sprint. We just need to assign this one out.
Marking as reviewed in sprint: We already have a PR in progress that adds a timeout and fails the InstallPlan on bundle unpack failures. We're currently working on covering the case of non-existent bundle images after which the PR should be ready to go.
The linked PR: https://github.com/operator-framework/operator-lifecycle-manager/pull/2093 is only upstream for now so I've unlinked the BZ bot from that so this doesn't go to MODIFIED/ON_QA when that PR merges. That should happen once I have a corresponding PR on the downstream repo: https://github.com/openshift/operator-framework-olm
Changing to assigned since only the upstream PR was merged. https://github.com/operator-framework/operator-lifecycle-manager/pull/2093 The BZ should move to POST once the downstream PR is opened.
Verify: zhaoxia@xzha-mac etcd % oc get clusterversion NAME VERSION AVAILABLE PROGRESSING SINCE STATUS version 4.8.0-0.nightly-2021-05-15-141455 True False 5h6m Cluster version is 4.8.0-0.nightly-2021-05-15-141455 zhaoxia@xzha-mac etcd % oc adm release info registry.ci.openshift.org/ocp/release:4.8.0-0.nightly-2021-05-15-141455 |grep operator-lifecycle-manager operator-lifecycle-manager sha256:d03207b3ed3d3713bdd5e3f8cd912f61d91e676d3916da0c3e1957366069dc4d 1) prepare a index image which bundle is unpullable zhaoxia@xzha-mac etcd % opm alpha bundle build -c 4.8 -e 4.8 -d ./0.9.2/ -p etcd -t quay.io/olmqe/etcd-bundle:bug-0.9.2 -o=true INFO[0000] Building annotations.yaml ... => => naming to quay.io/olmqe/etcd-bundle:bug-0.9.2 zhaoxia@xzha-mac etcd % docker push quay.io/olmqe/etcd-bundle:bug-0.9.2 The push refers to repository [quay.io/olmqe/etcd-bundle] f0911355df3b: Pushed e5a10c489bac: Pushed bug-0.9.2: digest: sha256:916850868de1b58d9b5ea72e571b071f4e618389395d1fbd2edf1709b36bfd9f size: 732 zhaoxia@xzha-mac etcd % opm index add -b quay.io/olmqe/etcd-bundle:bug-0.9.2 -t quay.io/olmqe/etcd-index:bug-0.9.2 -c docker --mode semver INFO[0000] building the index bundles="[quay.io/olmqe/etcd-bundle:bug-0.9.2]" INFO[0000] running /usr/local/bin/docker pull quay.io/olmqe/etcd-bundle:bug-0.9.2 bundles="[quay.io/olmqe/etcd-bundle:bug-0.9.2]" INFO[0005] running docker create bundles="[quay.io/olmqe/etcd-bundle:bug-0.9.2]" INFO[0005] running docker cp bundles="[quay.io/olmqe/etcd-bundle:bug-0.9.2]" INFO[0005] running docker rm bundles="[quay.io/olmqe/etcd-bundle:bug-0.9.2]" INFO[0005] Could not find optional dependencies file dir=bundle_tmp064132418 file=bundle_tmp064132418/metadata load=annotations INFO[0005] found csv, loading bundle dir=bundle_tmp064132418 file=bundle_tmp064132418/manifests load=bundle INFO[0005] loading bundle file dir=bundle_tmp064132418/manifests file=etcdbackups.etcd.database.coreos.com.crd.yaml load=bundle INFO[0005] loading bundle file dir=bundle_tmp064132418/manifests file=etcdclusters.etcd.database.coreos.com.crd.yaml load=bundle INFO[0005] loading bundle file dir=bundle_tmp064132418/manifests file=etcdoperator.v0.9.2.clusterserviceversion.yaml load=bundle INFO[0005] loading bundle file dir=bundle_tmp064132418/manifests file=etcdrestores.etcd.database.coreos.com.crd.yaml load=bundle INFO[0005] Generating dockerfile bundles="[quay.io/olmqe/etcd-bundle:bug-0.9.2]" INFO[0005] writing dockerfile: index.Dockerfile695761296 bundles="[quay.io/olmqe/etcd-bundle:bug-0.9.2]" INFO[0005] running docker build bundles="[quay.io/olmqe/etcd-bundle:bug-0.9.2]" INFO[0005] [docker build -f index.Dockerfile695761296 -t quay.io/olmqe/etcd-index:bug-0.9.2 .] bundles="[quay.io/olmqe/etcd-bundle:bug-0.9.2]" zhaoxia@xzha-mac etcd % docker push quay.io/olmqe/etcd-index:bug-0.9.2 The push refers to repository [quay.io/olmqe/etcd-index] 3c9814547744: Pushed 87d838f78e87: Mounted from olmqe/ditto-index e9033f602913: Mounted from olmqe/ditto-index 479bfeeb0448: Layer already exists 71b397855bc4: Layer already exists b2d5eeeaba3a: Layer already exists bug-0.9.2: digest: sha256:b8fe75aae745d3d69e76bcef1eb1e7e27181006d217cad46d81e3df907f17d1c size: 1578 Delete quay.io/olmqe/etcd-bundle:bug-0.9.2 using quay.io GUI 2) create catsrc using this index image, create sub zhaoxia@xzha-mac etcd % cat catsrc.yaml apiVersion: operators.coreos.com/v1alpha1 kind: CatalogSource metadata: name: etcd-index-1 namespace: openshift-marketplace spec: displayName: etcd-cr publisher: testqe sourceType: grpc image: quay.io/olmqe/etcd-index:bug-0.9.2 updateStrategy: registryPoll: interval: 2m zhaoxia@xzha-mac etcd % cat sub.yaml apiVersion: operators.coreos.com/v1alpha1 kind: Subscription metadata: name: etcd namespace: test-bug spec: channel: "4.8" installPlanApproval: Automatic name: etcd source: etcd-index-1 sourceNamespace: openshift-marketplace zhaoxia@xzha-mac etcd % oc apply -f catsrc.yaml catalogsource.operators.coreos.com/etcd-index-1 created zhaoxia@xzha-mac etcd % oc get packagemanifests | grep etcd etcd Community Operators 5h13m etcd etcd-cr 46s zhaoxia@xzha-mac etcd % date; oc apply -f sub.yaml Mon May 17 16:11:27 CST 2021 subscription.operators.coreos.com/etcd created 3) monitor installplan status zhaoxia@xzha-mac etcd % date; oc get ip -o=jsonpath={..status} Mon May 17 16:12:01 CST 2021 {"bundleLookups":[{"catalogSourceRef":{"name":"etcd-index-1","namespace":"openshift-marketplace"},"conditions":[{"message":"bundle contents have not yet been persisted to installplan status","reason":"BundleNotUnpacked","status":"True","type":"BundleLookupNotPersisted"},{"lastTransitionTime":"2021-05-17T08:11:35Z","message":"unpack job not completed: Unpack pod(openshift-marketplace/4ac4da32fae9695941e732bb422cdc9c4db9dcaf24ee477ad1681f72624zmgw) container(pull) is pending. Reason: ImagePullBackOff, Message: Back-off pulling image \"quay.io/olmqe/etcd-bundle:bug-0.9.2\"","reason":"JobIncomplete","status":"True","type":"BundleLookupPending"}],"identifier":"etcdoperator.v0.9.2","path":"quay.io/olmqe/etcd-bundle:bug-0.9.2","properties":"{\"properties\":[{\"type\":\"olm.gvk\",\"value\":{\"group\":\"etcd.database.coreos.com\",\"kind\":\"EtcdBackup\",\"version\":\"v1beta2\"}},{\"type\":\"olm.gvk\",\"value\":{\"group\":\"etcd.database.coreos.com\",\"kind\":\"EtcdCluster\",\"version\":\"v1beta2\"}},{\"type\":\"olm.gvk\",\"value\":{\"group\":\"etcd.database.coreos.com\",\"kind\":\"EtcdRestore\",\"version\":\"v1beta2\"}},{\"type\":\"olm.package\",\"value\":{\"packageName\":\"etcd\",\"version\":\"0.9.2\"}}]}","replaces":""}],"catalogSources":[],"phase":"Installing"} True True% zhaoxia@xzha-mac etcd % date; oc get ip -o=jsonpath={..status} -n test-bug Mon May 17 16:22:07 CST 2021 {"bundleLookups":[{"catalogSourceRef":{"name":"etcd-index-1","namespace":"openshift-marketplace"},"conditions":[{"message":"bundle contents have not yet been persisted to installplan status","reason":"BundleNotUnpacked","status":"True","type":"BundleLookupNotPersisted"},{"lastTransitionTime":"2021-05-17T08:11:35Z","message":"unpack job not completed: Unpack pod(openshift-marketplace/4ac4da32fae9695941e732bb422cdc9c4db9dcaf24ee477ad1681f72624zmgw) container(pull) is pending. Reason: ImagePullBackOff, Message: Back-off pulling image \"quay.io/olmqe/etcd-bundle:bug-0.9.2\"","reason":"JobIncomplete","status":"True","type":"BundleLookupPending"},{"lastTransitionTime":"2021-05-17T08:21:38Z","message":"Job was active longer than specified deadline","reason":"DeadlineExceeded","status":"True","type":"BundleLookupFailed"}],"identifier":"etcdoperator.v0.9.2","path":"quay.io/olmqe/etcd-bundle:bug-0.9.2","properties":"{\"properties\":[{\"type\":\"olm.gvk\",\"value\":{\"group\":\"etcd.database.coreos.com\",\"kind\":\"EtcdBackup\",\"version\":\"v1beta2\"}},{\"type\":\"olm.gvk\",\"value\":{\"group\":\"etcd.database.coreos.com\",\"kind\":\"EtcdCluster\",\"version\":\"v1beta2\"}},{\"type\":\"olm.gvk\",\"value\":{\"group\":\"etcd.database.coreos.com\",\"kind\":\"EtcdRestore\",\"version\":\"v1beta2\"}},{\"type\":\"olm.package\",\"value\":{\"packageName\":\"etcd\",\"version\":\"0.9.2\"}}]}","replaces":""}],"catalogSources":[],"conditions":[{"lastTransitionTime":"2021-05-17T08:21:38Z","lastUpdateTime":"2021-05-17T08:21:38Z","message":"Bundle unpacking failed. Reason: DeadlineExceeded, and Message: Job was active longer than specified deadline","reason":"InstallCheckFailed","status":"False","type":"Installed"}],"phase":"Failed"} True True True False% 4) check logs 1660 time="2021-05-17T08:20:22Z" level=info msg=syncing id=fgodz ip=install-cns6w namespace=test-bug phase=Installing 1661 time="2021-05-17T08:20:27Z" level=info msg=syncing id=z6vQR ip=install-cns6w namespace=test-bug phase=Installing 1662 time="2021-05-17T08:20:32Z" level=info msg=syncing id=kv014 ip=install-cns6w namespace=test-bug phase=Installing 1663 time="2021-05-17T08:20:37Z" level=info msg=syncing id=yuhel ip=install-cns6w namespace=test-bug phase=Installing 1664 time="2021-05-17T08:20:42Z" level=info msg=syncing id=zvtO2 ip=install-cns6w namespace=test-bug phase=Installing 1665 time="2021-05-17T08:20:47Z" level=info msg=syncing id=pUWDX ip=install-cns6w namespace=test-bug phase=Installing 1666 time="2021-05-17T08:20:53Z" level=info msg=syncing id=MQfiH ip=install-cns6w namespace=test-bug phase=Installing 1667 time="2021-05-17T08:20:58Z" level=info msg=syncing id=HXytH ip=install-cns6w namespace=test-bug phase=Installing 1668 time="2021-05-17T08:21:03Z" level=info msg=syncing id=uDYZn ip=install-cns6w namespace=test-bug phase=Installing 1669 time="2021-05-17T08:21:08Z" level=info msg=syncing id=uXa15 ip=install-cns6w namespace=test-bug phase=Installing 1670 time="2021-05-17T08:21:13Z" level=info msg=syncing id=Z/6cu ip=install-cns6w namespace=test-bug phase=Installing 1671 time="2021-05-17T08:21:18Z" level=info msg=syncing id=NIRpY ip=install-cns6w namespace=test-bug phase=Installing 1672 time="2021-05-17T08:21:23Z" level=info msg=syncing id=IEeXY ip=install-cns6w namespace=test-bug phase=Installing 1673 time="2021-05-17T08:21:28Z" level=info msg=syncing id=2uqS2 ip=install-cns6w namespace=test-bug phase=Installing 1674 time="2021-05-17T08:21:33Z" level=info msg=syncing id=AE9hO ip=install-cns6w namespace=test-bug phase=Installing 1675 time="2021-05-17T08:21:35Z" level=info msg="state.Key.Namespace=openshift-marketplace state.Key.Name=qe-app-registry state.State=CONNECTING" 1676 time="2021-05-17T08:21:38Z" level=info msg=syncing id=2mipO ip=install-cns6w namespace=test-bug phase=Installing 1677 time="2021-05-17T08:21:38Z" level=warning msg="status not equal, updating..." id=2mipO ip=install-cns6w namespace=test-bug phase=Installing 1678 time="2021-05-17T08:21:38Z" level=info msg=syncing event=update reconciling="*v1alpha1.Subscription" selflink= 1679 time="2021-05-17T08:21:38Z" level=info msg=syncing id=U/5h8 ip=install-cns6w namespace=test-bug phase=Installing 1680 time="2021-05-17T08:21:38Z" level=info msg="Bundle unpacking failed. Reason: DeadlineExceeded, and Message: Job was active longer than specified deadline" id=U/5h8 ip=install-cns6w namespace=test-b ug phase=Installing 1681 time="2021-05-17T08:21:38Z" level=info msg="transitioning InstallPlan to failed" id=U/5h8 ip=install-cns6w namespace=test-bug phase=Installing LGTM, verified
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory (Moderate: OpenShift Container Platform 4.8.2 bug fix and security update), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHSA-2021:2438