Bug 1968700 - catalog-operator crashes when status.initContainerStatuses[].state.waiting is nil
Summary: catalog-operator crashes when status.initContainerStatuses[].state.waiting is...
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: OLM
Version: 4.8
Hardware: Unspecified
OS: Unspecified
Priority: high
Severity: high
Target Milestone: ---
Target Release: 4.8.0
Assignee: Ben Luddy
QA Contact: kuiwang
URL:
Whiteboard:
Depends On:
Blocks:
 
Reported: 2021-06-07 20:29 UTC by Ben Luddy
Modified: 2021-07-27 23:12 UTC (History)

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2021-07-27 23:11:53 UTC
Target Upstream Version:




Links
System                  ID                                        Private  Priority  Status  Summary                                                                         Last Updated
Github                  openshift operator-framework-olm pull 88  0        None      open    Bug 1968700: Fix nil pointer dereference while reporting bundle unpack status.  2021-06-07 20:31:39 UTC
Red Hat Product Errata  RHSA-2021:2438                            0        None      None    None                                                                            2021-07-27 23:12:11 UTC

Description Ben Luddy 2021-06-07 20:29:16 UTC
Description of problem:

catalog-operator can crash while installing an operator with the following panic:

panic: runtime error: invalid memory address or nil pointer dereference
[signal SIGSEGV: segmentation violation code=0x1 addr=0x0 pc=0x90d19eb]

goroutine 329 [running]:
github.com/operator-framework/operator-lifecycle-manager/pkg/controller/bundle.(*ConfigMapUnpacker).pendingContainerStatusMessages(0xc042d20, 0xc75a000, 0x8, 0x0, 0xa2aec00, 0x0)
	/home/bluddy/w/operator-lifecycle-manager/pkg/controller/bundle/bundle_unpacker.go:531 +0x29b
github.com/operator-framework/operator-lifecycle-manager/pkg/controller/bundle.(*ConfigMapUnpacker).UnpackBundle(0xc042d20, 0xcf32900, 0x7b8a800, 0xfffffff2, 0xa2b0520, 0x0, 0x9511100)
	/home/bluddy/w/operator-lifecycle-manager/pkg/controller/bundle/bundle_unpacker.go:461 +0x80a
github.com/operator-framework/operator-lifecycle-manager/pkg/controller/operators/catalog.(*Operator).unpackBundles(0xc17e480, 0xc1870e0, 0x0, 0xceb4c90, 0x1, 0x0)
	/home/bluddy/w/operator-lifecycle-manager/pkg/controller/operators/catalog/operator.go:1216 +0x16c
github.com/operator-framework/operator-lifecycle-manager/pkg/controller/operators/catalog.(*Operator).syncInstallPlans(0xc17e480, 0x950c520, 0xc1870e0, 0x0, 0x0)
	/home/bluddy/w/operator-lifecycle-manager/pkg/controller/operators/catalog/operator.go:1425 +0xaa9
github.com/operator-framework/operator-lifecycle-manager/pkg/lib/queueinformer.LegacySyncHandler.ToSyncerWithDelete.func1(0x979763c, 0xc4efc80, 0x9789d40, 0xceb4c70, 0xceb4c70, 0x9415f20)
	/home/bluddy/w/operator-lifecycle-manager/pkg/lib/queueinformer/queueinformer.go:183 +0x1f5
github.com/operator-framework/operator-lifecycle-manager/pkg/lib/kubestate.SyncFunc.Sync(0xc689600, 0x979763c, 0xc4efc80, 0x9789d40, 0xceb4c70, 0xcf97701, 0x0)
	/home/bluddy/w/operator-lifecycle-manager/pkg/lib/kubestate/kubestate.go:184 +0x3c
github.com/operator-framework/operator-lifecycle-manager/pkg/lib/queueinformer.(*QueueInformer).Sync(...)
	/home/bluddy/w/operator-lifecycle-manager/pkg/lib/queueinformer/queueinformer.go:36
github.com/operator-framework/operator-lifecycle-manager/pkg/lib/queueinformer.(*operator).processNextWorkItem(0xc52c1c0, 0x979763c, 0xc4efc80, 0xc11d950, 0x0)
	/home/bluddy/w/operator-lifecycle-manager/pkg/lib/queueinformer/queueinformer_operator.go:287 +0x297
github.com/operator-framework/operator-lifecycle-manager/pkg/lib/queueinformer.(*operator).worker(0xc52c1c0, 0x979763c, 0xc4efc80, 0xc11d950)
	/home/bluddy/w/operator-lifecycle-manager/pkg/lib/queueinformer/queueinformer_operator.go:231 +0x39
created by github.com/operator-framework/operator-lifecycle-manager/pkg/lib/queueinformer.(*operator).start
	/home/bluddy/w/operator-lifecycle-manager/pkg/lib/queueinformer/queueinformer_operator.go:221 +0x3b9
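
For illustration only, here is a minimal sketch of the kind of nil guard that avoids this class of panic. It is not the patch from the linked pull request. The types are simplified, hypothetical stand-ins for the Kubernetes container status types, and the function name pendingMessages and the container names are assumptions made for the example. The idea is that a container status whose state.waiting is nil (for example, because the container is already running or terminated) is skipped rather than dereferenced.

```go
package main

import (
	"fmt"
	"strings"
)

// Simplified, hypothetical stand-ins for the Kubernetes container status
// types; only the fields needed for this illustration are included.
type ContainerStateWaiting struct {
	Reason  string
	Message string
}

type ContainerState struct {
	// Waiting is nil when the container is not in the waiting state.
	Waiting *ContainerStateWaiting
}

type ContainerStatus struct {
	Name  string
	State ContainerState
}

// pendingMessages (hypothetical name) collects the message and reason of
// every container that is still waiting. Statuses with a nil Waiting field
// are skipped instead of being dereferenced.
func pendingMessages(statuses []ContainerStatus) string {
	var msgs []string
	for _, s := range statuses {
		if s.State.Waiting == nil {
			continue // not waiting: nothing to report
		}
		msgs = append(msgs, fmt.Sprintf("%s: %s (%s)", s.Name, s.State.Waiting.Message, s.State.Waiting.Reason))
	}
	return strings.Join(msgs, "; ")
}

func main() {
	statuses := []ContainerStatus{
		{Name: "util"}, // Waiting is nil; an unguarded dereference would panic here
		{Name: "pull", State: ContainerState{Waiting: &ContainerStateWaiting{
			Reason:  "ImagePullBackOff",
			Message: "back-off pulling image",
		}}},
	}
	fmt.Println(pendingMessages(statuses))
}
```

With the guard in place, a mix of waiting and non-waiting init containers produces a message for the waiting ones only, and an empty or nil status list yields an empty string.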


Version-Release number of selected component (if applicable): 4.8


How reproducible: 


Steps to Reproduce:

1. Install any operator via a Subscription

Actual results:

catalog-operator crashes


Expected results:

catalog-operator does not crash

Comment 2 kuiwang 2021-06-09 05:58:57 UTC
Verified on 4.8. LGTM.

--
[root@preserve-olm-env 1968700]# oc get clusterversion
NAME      VERSION                             AVAILABLE   PROGRESSING   SINCE   STATUS
version   4.8.0-0.nightly-2021-06-09-000526   True        False         8m50s   Cluster version is 4.8.0-0.nightly-2021-06-09-000526
[root@preserve-olm-env 1968700]# oc get pod -n openshift-operator-lifecycle-manager
NAME                                READY   STATUS    RESTARTS   AGE
catalog-operator-59b74994d5-bmtp4   1/1     Running   0          45m
olm-operator-6f97647bb6-zbdhx       1/1     Running   0          45m
packageserver-fbf5b7757-nfp9n       1/1     Running   0          35m
packageserver-fbf5b7757-swrfm       1/1     Running   0          35m
[root@preserve-olm-env 1968700]# oc exec catalog-operator-59b74994d5-bmtp4 -n openshift-operator-lifecycle-manager -- olm --version
OLM version: 0.17.0
git commit: 41991c626f4aa90edf3bdead54f97a2bf8dc4af5
[root@preserve-olm-env 1968700]# cat og-single.yaml 
kind: OperatorGroup
apiVersion: operators.coreos.com/v1
metadata:
  name: og-single1
  namespace: default
spec:
  targetNamespaces:
  - default
[root@preserve-olm-env 1968700]# oc apply -f og-single.yaml 
operatorgroup.operators.coreos.com/og-single1 created
[root@preserve-olm-env 1968700]# cat catsrc.yaml 
apiVersion: operators.coreos.com/v1alpha1
kind: CatalogSource
metadata:
  name: cockroachdb-catalog
  namespace: default
spec:
  displayName: cockroachdb Operator Catalog
  image: quay.io/kuiwang/cockroachdb-index:2.1.1-30860
  publisher: QE
  sourceType: grpc
[root@preserve-olm-env 1968700]# oc apply -f catsrc.yaml 
catalogsource.operators.coreos.com/cockroachdb-catalog created
[root@preserve-olm-env 1968700]# 
[root@preserve-olm-env 1968700]# oc projects|grep life
openshift-operator-lifecycle-manager
[root@preserve-olm-env 1968700]# oc get pod -n openshift-operator-lifecycle-manager
NAME                                READY   STATUS    RESTARTS   AGE
catalog-operator-59b74994d5-bmtp4   1/1     Running   0          47m
olm-operator-6f97647bb6-zbdhx       1/1     Running   0          47m
packageserver-fbf5b7757-nfp9n       1/1     Running   0          37m
packageserver-fbf5b7757-swrfm       1/1     Running   0          37m
[root@preserve-olm-env 1968700]# oc logs catalog-operator-59b74994d5-bmtp4 -n openshift-operator-lifecycle-manager --tail=30
time="2021-06-09T05:34:58Z" level=info msg="catalog polling result: no update" CatalogSource=redhat-operators
time="2021-06-09T05:34:58Z" level=info msg="catalog polling result: no update" CatalogSource=redhat-operators
time="2021-06-09T05:36:19Z" level=info msg="catalog update required at 2021-06-09 05:36:19.995892103 +0000 UTC m=+1450.022644825" CatalogSource=community-operators
time="2021-06-09T05:36:19Z" level=info msg="catalog update required at 2021-06-09 05:36:19.995892313 +0000 UTC m=+1450.022645022" CatalogSource=certified-operators
time="2021-06-09T05:36:30Z" level=info msg="catalog polling result: no update" CatalogSource=community-operators
time="2021-06-09T05:36:30Z" level=info msg="catalog polling result: no update" CatalogSource=certified-operators
time="2021-06-09T05:36:30Z" level=info msg="catalog polling result: no update" CatalogSource=community-operators
time="2021-06-09T05:36:30Z" level=info msg="catalog polling result: no update" CatalogSource=certified-operators
time="2021-06-09T05:37:37Z" level=info msg="Adding related objects for operator-lifecycle-manager-catalog"
time="2021-06-09T05:42:42Z" level=info msg="Adding related objects for operator-lifecycle-manager-catalog"
time="2021-06-09T05:44:55Z" level=info msg="catalog update required at 2021-06-09 05:44:55.150235989 +0000 UTC m=+1965.176988711" CatalogSource=redhat-marketplace
time="2021-06-09T05:45:05Z" level=info msg="catalog polling result: no update" CatalogSource=redhat-marketplace
time="2021-06-09T05:45:05Z" level=info msg="catalog polling result: no update" CatalogSource=redhat-marketplace
time="2021-06-09T05:47:47Z" level=info msg="Adding related objects for operator-lifecycle-manager-catalog"
time="2021-06-09T05:48:03Z" level=info msg="catalog update required at 2021-06-09 05:48:03.330679602 +0000 UTC m=+2153.357432325" CatalogSource=redhat-operators
time="2021-06-09T05:48:03Z" level=info msg="catalog update required at 2021-06-09 05:48:03.331719559 +0000 UTC m=+2153.358472286" CatalogSource=certified-operators
time="2021-06-09T05:48:03Z" level=info msg="catalog update required at 2021-06-09 05:48:03.401530083 +0000 UTC m=+2153.428282793" CatalogSource=community-operators
time="2021-06-09T05:48:13Z" level=info msg="catalog polling result: no update" CatalogSource=certified-operators
time="2021-06-09T05:48:13Z" level=info msg="catalog polling result: no update" CatalogSource=redhat-operators
time="2021-06-09T05:48:13Z" level=info msg="catalog polling result: no update" CatalogSource=certified-operators
time="2021-06-09T05:48:13Z" level=info msg="catalog polling result: no update" CatalogSource=redhat-operators
time="2021-06-09T05:48:13Z" level=info msg="catalog polling result: no update" CatalogSource=community-operators
time="2021-06-09T05:48:13Z" level=info msg="catalog polling result: no update" CatalogSource=community-operators
time="2021-06-09T05:48:58Z" level=warning msg="couldn't find service in cache" service=cockroachdb-catalog
time="2021-06-09T05:48:59Z" level=info msg="state.Key.Namespace=default state.Key.Name=cockroachdb-catalog state.State=CONNECTING"
time="2021-06-09T05:49:02Z" level=info msg="state.Key.Namespace=default state.Key.Name=cockroachdb-catalog state.State=TRANSIENT_FAILURE"
time="2021-06-09T05:49:03Z" level=info msg="state.Key.Namespace=default state.Key.Name=cockroachdb-catalog state.State=CONNECTING"
time="2021-06-09T05:49:23Z" level=info msg="state.Key.Namespace=default state.Key.Name=cockroachdb-catalog state.State=TRANSIENT_FAILURE"
time="2021-06-09T05:49:24Z" level=info msg="state.Key.Namespace=default state.Key.Name=cockroachdb-catalog state.State=CONNECTING"
time="2021-06-09T05:49:24Z" level=info msg="state.Key.Namespace=default state.Key.Name=cockroachdb-catalog state.State=READY"
[root@preserve-olm-env 1968700]# 

[root@preserve-olm-env 1968700]# cat sub.yaml 
apiVersion: operators.coreos.com/v1alpha1
kind: Subscription
metadata:
  name: cockroachdb
  namespace: default
spec:
  channel: alpha
  installPlanApproval: Automatic
  name: cockroachdb
  source: cockroachdb-catalog
  sourceNamespace: default
  startingCSV: cockroachdb.v2.0.9
[root@preserve-olm-env 1968700]# oc apply -f sub.yaml 
subscription.operators.coreos.com/cockroachdb created
[root@preserve-olm-env 1968700]# 

[root@preserve-olm-env 1968700]# oc get ip;oc get csv
NAME            CSV                  APPROVAL    APPROVED
install-7pngp   cockroachdb.v2.0.9   Automatic   true
install-jd96s   cockroachdb.v2.1.1   Automatic   true
NAME                 DISPLAY       VERSION   REPLACES             PHASE
cockroachdb.v2.0.9   CockroachDB   2.0.9                          Replacing
cockroachdb.v2.1.1   CockroachDB   2.1.1     cockroachdb.v2.0.9   Installing
[root@preserve-olm-env 1968700]# oc get ip;oc get csv
NAME            CSV                  APPROVAL    APPROVED
install-7pngp   cockroachdb.v2.0.9   Automatic   true
install-jd96s   cockroachdb.v2.1.1   Automatic   true
NAME                 DISPLAY       VERSION   REPLACES             PHASE
cockroachdb.v2.1.1   CockroachDB   2.1.1     cockroachdb.v2.0.9   Succeeded
[root@preserve-olm-env 1968700]# 

[root@preserve-olm-env 1968700]# oc get pod -n openshift-operator-lifecycle-manager
NAME                                READY   STATUS    RESTARTS   AGE
catalog-operator-59b74994d5-bmtp4   1/1     Running   0          53m
olm-operator-6f97647bb6-zbdhx       1/1     Running   0          53m
packageserver-fbf5b7757-nfp9n       1/1     Running   0          43m
packageserver-fbf5b7757-swrfm       1/1     Running   0          43m
[root@preserve-olm-env 1968700]# oc logs catalog-operator-59b74994d5-bmtp4 -n openshift-operator-lifecycle-manager --tail=30
time="2021-06-09T05:53:02Z" level=info msg=syncing event=update reconciling="*v1alpha1.Subscription" selflink=
time="2021-06-09T05:53:02Z" level=info msg=syncing event=update reconciling="*v1alpha1.Subscription" selflink=
time="2021-06-09T05:53:02Z" level=info msg=syncing event=update reconciling="*v1alpha1.Subscription" selflink=
time="2021-06-09T05:53:02Z" level=info msg=syncing id=pIUlK ip=install-jd96s namespace=default phase=Installing
time="2021-06-09T05:53:03Z" level=info msg=syncing id=jhqPj ip=install-7pngp namespace=default phase=Complete
time="2021-06-09T05:53:07Z" level=info msg=syncing id=WjEJW ip=install-jd96s namespace=default phase=Installing
time="2021-06-09T05:53:07Z" level=warning msg="status not equal, updating..." id=WjEJW ip=install-jd96s namespace=default phase=Installing
time="2021-06-09T05:53:07Z" level=info msg=syncing event=update reconciling="*v1alpha1.Subscription" selflink=
time="2021-06-09T05:53:07Z" level=info msg=syncing id=GE/l/ ip=install-jd96s namespace=default phase=Installing
time="2021-06-09T05:53:12Z" level=info msg=syncing id=NVTlF ip=install-jd96s namespace=default phase=Installing
time="2021-06-09T05:53:12Z" level=info msg="added to bundle, Kind=ClusterServiceVersion" configmap=default/945ac25af8d283ef724cca78ed406290dae58016d0cfe3909299b9fe336d87e key=cockroachdb.v2.1.1.clusterserviceversion.yaml
time="2021-06-09T05:53:12Z" level=info msg="added to bundle, Kind=CustomResourceDefinition" configmap=default/945ac25af8d283ef724cca78ed406290dae58016d0cfe3909299b9fe336d87e key=cockroachdbs.charts.helm.k8s.io.crd.yaml
time="2021-06-09T05:53:12Z" level=warning msg="status not equal, updating..." id=NVTlF ip=install-jd96s namespace=default phase=Installing
time="2021-06-09T05:53:12Z" level=info msg=syncing event=update reconciling="*v1alpha1.Subscription" selflink=
time="2021-06-09T05:53:12Z" level=info msg=syncing id=TWF7V ip=install-jd96s namespace=default phase=Installing
time="2021-06-09T05:53:12Z" level=info msg="added to bundle, Kind=ClusterServiceVersion" configmap=default/945ac25af8d283ef724cca78ed406290dae58016d0cfe3909299b9fe336d87e key=cockroachdb.v2.1.1.clusterserviceversion.yaml
time="2021-06-09T05:53:12Z" level=info msg="added to bundle, Kind=CustomResourceDefinition" configmap=default/945ac25af8d283ef724cca78ed406290dae58016d0cfe3909299b9fe336d87e key=cockroachdbs.charts.helm.k8s.io.crd.yaml
W0609 05:53:12.753345       1 warnings.go:70] apiextensions.k8s.io/v1beta1 CustomResourceDefinition is deprecated in v1.16+, unavailable in v1.22+; use apiextensions.k8s.io/v1 CustomResourceDefinition
W0609 05:53:12.758016       1 warnings.go:70] apiextensions.k8s.io/v1beta1 CustomResourceDefinition is deprecated in v1.16+, unavailable in v1.22+; use apiextensions.k8s.io/v1 CustomResourceDefinition
W0609 05:53:12.781081       1 warnings.go:70] apiextensions.k8s.io/v1beta1 CustomResourceDefinition is deprecated in v1.16+, unavailable in v1.22+; use apiextensions.k8s.io/v1 CustomResourceDefinition
time="2021-06-09T05:53:12Z" level=info msg=syncing event=update reconciling="*v1alpha1.Subscription" selflink=
time="2021-06-09T05:53:12Z" level=info msg=syncing id=KK0cx ip=install-jd96s namespace=default phase=Complete
time="2021-06-09T05:53:12Z" level=info msg=syncing event=update reconciling="*v1alpha1.Subscription" selflink=
time="2021-06-09T05:53:13Z" level=warning msg="an error was encountered during reconciliation" error="Operation cannot be fulfilled on subscriptions.operators.coreos.com \"cockroachdb\": the object has been modified; please apply your changes to the latest version and try again" event=update reconciling="*v1alpha1.Subscription" selflink=
E0609 05:53:13.023371       1 queueinformer_operator.go:290] sync {"update" "default/cockroachdb"} failed: Operation cannot be fulfilled on subscriptions.operators.coreos.com "cockroachdb": the object has been modified; please apply your changes to the latest version and try again
time="2021-06-09T05:53:13Z" level=info msg=syncing event=update reconciling="*v1alpha1.Subscription" selflink=
time="2021-06-09T05:53:13Z" level=info msg=syncing event=update reconciling="*v1alpha1.Subscription" selflink=
time="2021-06-09T05:53:13Z" level=info msg=syncing event=update reconciling="*v1alpha1.Subscription" selflink=
I0609 05:53:14.083885       1 event.go:282] Event(v1.ObjectReference{Kind:"Namespace", Namespace:"", Name:"default", UID:"7b609a85-e0d3-40d3-845c-8414cbce2897", APIVersion:"v1", ResourceVersion:"35884", FieldPath:""}): type: 'Warning' reason: 'ResolutionFailed' constraints not satisfiable: subscription cockroachdb requires at least one of cockroachdb-catalog/default/alpha/cockroachdb.v2.1.11 or @existing/default//cockroachdb.v2.1.1, subscription cockroachdb exists, @existing/default//cockroachdb.v2.0.9 is mandatory, cockroachdb-catalog/default/alpha/cockroachdb.v2.1.11, @existing/default//cockroachdb.v2.0.9 and @existing/default//cockroachdb.v2.1.1 originate from package cockroachdb
time="2021-06-09T05:55:19Z" level=info msg="catalog update required at 2021-06-09 05:55:19.834744515 +0000 UTC m=+2589.861497224" CatalogSource=redhat-marketplace
[root@preserve-olm-env 1968700]# 

[root@preserve-olm-env 1968700]# oc delete sub cockroachdb
subscription.operators.coreos.com "cockroachdb" deleted
[root@preserve-olm-env 1968700]#  oc delete catsrc cockroachdb-catalog
catalogsource.operators.coreos.com "cockroachdb-catalog" deleted
[root@preserve-olm-env 1968700]#  oc delete og og-single1
operatorgroup.operators.coreos.com "og-single1" deleted
[root@preserve-olm-env 1968700]#  oc delete csv cockroachdb.v2.1.1
clusterserviceversion.operators.coreos.com "cockroachdb.v2.1.1" deleted
[root@preserve-olm-env 1968700]# 

--

Comment 5 errata-xmlrpc 2021-07-27 23:11:53 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Moderate: OpenShift Container Platform 4.8.2 bug fix and security update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2021:2438

