Bug 1732302
Summary: | catalog-operator will panic when the installing operator's ClusterRole/ClusterRoleBinding exist
---|---
Product: | OpenShift Container Platform
Component: | OLM
OLM sub component: | OLM
Reporter: | Jian Zhang <jiazha>
Assignee: | Evan Cordell <ecordell>
QA Contact: | Jian Zhang <jiazha>
Status: | CLOSED ERRATA
Severity: | medium
Priority: | medium
CC: | bandrade, cblecker, chezhang, chuo, ecordell, jfan, scolange
Version: | 4.1.z
Target Release: | 4.2.0
Hardware: | Unspecified
OS: | Unspecified
: | 1732911, 1733324
Last Closed: | 2019-10-16 06:30:44 UTC
Type: | Bug
Bug Blocks: | 1732911, 1733324
Description
Jian Zhang
2019-07-23 07:00:12 UTC
*** Bug 1732911 has been marked as a duplicate of this bug. ***

Hi, Evan
Sorry, I didn't find the fix PR merged in the release-4.1 branch, or am I missing something? Changing the status to `ASSIGNED` first since there is no fix PR.

Making this the bug for 4.2 and will duplicate for 4.1.z

@Evan,
> Making this the bug for 4.2 and will duplicate for 4.1.z
OK, so I changed the `Target Release` of bug 1732214 to 4.1.z since this one is for 4.2 now.

Hi, Christoph
> Steps to Reproduce:
> 1. Deploy ClusterRole/ClusterRoleBinding to the cluster first manually
> 2. Deploy operator that attempts to create the same ClusterRole/ClusterRoleBinding
Based on my understanding, OLM creates the `ClusterRole/ClusterRoleBinding` objects with a randomly generated suffix in their names, such as:
ClusterRole: etcdoperator.v0.9.4-clusterwide-4t9p5
ClusterRoleBinding: etcdoperator.v0.9.4-clusterwide-4t9p5-etcd-operator-zslsf
So, my question is how can we deploy the operator with the same ClusterRole/ClusterRoleBinding? Thanks!
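The precondition in the quoted reproduction steps (a ClusterRole or ClusterRoleBinding already present under the name the operator will try to create) can be checked before installing. A minimal sketch, assuming the object names an operator ships have been collected into one plain-text file and the names present in the cluster into another, one name per line; the function name `find_collisions` and both files are hypothetical and not part of OLM:

```shell
#!/bin/sh
# find_collisions BUNDLE_NAMES CLUSTER_NAMES
# Prints every name that appears in both files (one name per line).
# Both arguments are illustrative plain-text files, not OLM artifacts.
# Exit status is non-zero when there is no collision (grep found nothing).
find_collisions() {
    bundle_names=$1
    cluster_names=$2
    # -F: treat patterns as fixed strings (dots in names stay literal)
    # -x: match whole lines only
    # -f: read the patterns from the bundle name list
    grep -Fx -f "$bundle_names" "$cluster_names"
}
```

The cluster-side list could be produced with something like `oc get clusterrole -o custom-columns=NAME:.metadata.name --no-headers`. Names that OLM generates with a random suffix will not collide with a fixed list, so any hit points at an object with a fixed name.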
It's possible to include additional ClusterRole/ClusterRoleBinding objects in the operator bundle with static names. This isn't optimal, but this was the scenario we saw this bug trigger in.

Christoph,
Yeah, thanks! Below are the test steps, please let me know if any more steps are needed. Thanks!

1) Add ClusterRole/ClusterRoleBinding files to the operator bundle. The static ClusterRole/ClusterRoleBinding names are etcdoperator.v0.9.4-clusterwide-test and etcdoperator.v0.9.4-clusterrolebinding-test, see below:

mac:etcd jianzhang$ pwd
/Users/jianzhang/goproject/src/github.com/operator-framework/operator-registry/manifests/etcd
mac:etcd jianzhang$ ls
etcd.package.yaml       etcdclusterrolebinding.yaml
etcdbackup.crd.yaml     etcdoperator.v0.9.4-clusterwide.clusterserviceversion.yaml
etcdcluster.crd.yaml    etcdrestore.crd.yaml
etcdclusterrole.yaml
mac:etcd jianzhang$ cat etcdclusterrole.yaml
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: etcdoperator.v0.9.4-clusterwide-test
rules:
- apiGroups:
  - etcd.database.coreos.com
  resources:
  - etcdclusters
  - etcdbackups
  - etcdrestores
  verbs:
  - '*'
- apiGroups:
  - ""
  resources:
  - pods
  - services
  - endpoints
  - persistentvolumeclaims
  - events
  verbs:
  - '*'
- apiGroups:
  - apps
  resources:
  - deployments
  verbs:
  - '*'
- apiGroups:
  - ""
  resources:
  - secrets
  verbs:
  - get
mac:etcd jianzhang$ cat etcdclusterrolebinding.yaml
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: etcdoperator.v0.9.4-clusterrolebinding-test
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: etcdoperator.v0.9.4-clusterwide-test
subjects:
- kind: ServiceAccount
  name: etcd-operator
  namespace: openshift-operators

2) Build a test registry image and push it to Quay.

mac:operator-registry jianzhang$ docker build -f upstream-example.Dockerfile -t quay.io/jiazha/etcd-operator:bug-1732302 .
...
Successfully built b25276cabf1e
Successfully tagged quay.io/jiazha/etcd-operator:bug-1732302
mac:operator-registry jianzhang$ docker push quay.io/jiazha/etcd-operator:bug-1732302
...

3) Create a CatalogSource to consume this test image.

mac:~ jianzhang$ cat cs-bug.yaml
apiVersion: operators.coreos.com/v1alpha1
kind: CatalogSource
metadata:
  name: etcd-bug-operator
  namespace: openshift-marketplace
spec:
  sourceType: grpc
  image: quay.io/jiazha/etcd-operator:bug-1732302
  displayName: ETCD Bug Operators
  publisher: jian
mac:~ jianzhang$ oc get catalogsource -n openshift-marketplace
NAME                  NAME                  TYPE   PUBLISHER   AGE
certified-operators   Certified Operators   grpc   Red Hat     3h43m
community-operators   Community Operators   grpc   Red Hat     3h43m
etcd-bug-operator     ETCD Bug Operators    grpc   jian        22s
redhat-operators      Red Hat Operators     grpc   Red Hat     3h43m

3) Create the static ClusterRole/ClusterRoleBinding objects.

mac:operator-registry jianzhang$ oc create -f manifests/etcd/etcdclusterrole.yaml
clusterrole.rbac.authorization.k8s.io/etcdoperator.v0.9.4-clusterwide-test created
mac:operator-registry jianzhang$ oc create -f manifests/etcd/etcdclusterrolebinding.yaml
clusterrolebinding.rbac.authorization.k8s.io/etcdoperator.v0.9.4-clusterrolebinding-test created
mac:~ jianzhang$ oc get clusterrolebinding |grep etcd
etcdoperator.v0.9.4-clusterrolebinding-test   8s
mac:~ jianzhang$ oc get clusterrole |grep etcd
etcdoperator.v0.9.4-clusterwide-test   24s

4) Create this test operator.
mac:~ jianzhang$ cat sub-bug.yaml
apiVersion: operators.coreos.com/v1alpha1
kind: Subscription
metadata:
  generateName: etcd-bug-
  namespace: openshift-operators
spec:
  source: etcd-bug-operator
  sourceNamespace: openshift-marketplace
  name: etcd
  startingCSV: etcdoperator.v0.9.4-clusterwide
  channel: clusterwide-alpha
mac:~ jianzhang$ oc get sub -n openshift-operators
NAME             PACKAGE   SOURCE              CHANNEL
etcd-bug-kjtv2   etcd      etcd-bug-operator   clusterwide-alpha
mac:~ jianzhang$ oc get csv -n openshift-operators
NAME                              DISPLAY   VERSION             REPLACES   PHASE
etcdoperator.v0.9.4-clusterwide   etcd      0.9.4-clusterwide              Succeeded

5) Check the OLM pods status.

mac:~ jianzhang$ oc get pods -n openshift-operator-lifecycle-manager
NAME                                READY   STATUS    RESTARTS   AGE
catalog-operator-7d78f889bf-85vlx   1/1     Running   0          4h6m
olm-operator-5c744884f9-q8l4n       1/1     Running   0          4h6m
packageserver-578f95779-288kf       1/1     Running   0          4h3m
packageserver-578f95779-mjhz6       1/1     Running   0          4h3m

6) Re-run above steps 1,2,4,5,6 with a new registry image (quay.io/jiazha/etcd-operator:bug2-1732302) which has no `clusterPermissions` configured in the CSV.

mac:~ jianzhang$ cat cs-bug.yaml
apiVersion: operators.coreos.com/v1alpha1
kind: CatalogSource
metadata:
  name: etcd-bug-operator
  namespace: openshift-marketplace
spec:
  sourceType: grpc
  image: quay.io/jiazha/etcd-operator:bug2-1732302
  displayName: ETCD Bug Operators
  publisher: jian
mac:~ jianzhang$ oc get pods -n openshift-operator-lifecycle-manager
NAME                                READY   STATUS    RESTARTS   AGE
catalog-operator-7d78f889bf-85vlx   1/1     Running   0          4h23m
olm-operator-5c744884f9-q8l4n       1/1     Running   0          4h23m
packageserver-578f95779-288kf       1/1     Running   0          4h20m
packageserver-578f95779-mjhz6       1/1     Running   0          4h20m

The OLM pods worked well, no panic. LGTM, verifying it.
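The pod check in step 5 is done by eye: a catalog-operator that panics would typically show a non-zero RESTARTS count or a STATUS other than Running. A minimal sketch that automates this reading, assuming the default `oc get pods` column layout shown above (NAME, READY, STATUS, RESTARTS, AGE); the function name `unhealthy_pods` is hypothetical:

```shell
#!/bin/sh
# unhealthy_pods: reads `oc get pods` table output on stdin and prints the
# name of every pod whose STATUS is not "Running" or whose RESTARTS is not 0.
# Assumes the default five-column layout: NAME READY STATUS RESTARTS AGE.
unhealthy_pods() {
    # NR > 1 skips the header row; $3 is STATUS, $4 is RESTARTS.
    awk 'NR > 1 && ($3 != "Running" || $4 != "0") { print $1 }'
}
```

It could be used as `oc get pods -n openshift-operator-lifecycle-manager | unhealthy_pods`; empty output corresponds to the "no panic" result reported in this comment.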
Cluster and OLM versions:

mac:~ jianzhang$ oc get clusterversion
NAME      VERSION                             AVAILABLE   PROGRESSING   SINCE   STATUS
version   4.2.0-0.nightly-2019-07-31-162901   True        False         4h12m   Cluster version is 4.2.0-0.nightly-2019-07-31-162901
mac:~ jianzhang$ oc -n openshift-operator-lifecycle-manager exec catalog-operator-7d78f889bf-85vlx -- olm --version
OLM version: 0.11.0
git commit: d2209c409b35f1db4669c474044decc6995f624d

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA.

For information on the advisory, and where to find the updated files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2019:2922