Description of problem:
When a user tries to install the descheduler operator after having uninstalled it once, the operator can no longer be installed and is always stuck in the Pending state. The events below are seen:

openshift-marketplace 8m49s Normal Pulling pod/b0778e7e41e078154ec9bf38a589b15369569659b4f8942b0e98cf614e29b8s Pulling image "quay.io/openshift-qe-optional-operators/ose-ose-cluster-kube-descheduler-operator-bundle@sha256:c641783a95c2b6a53ea03309df2c042f64c1040918181a79f968e595e2c9c06c"
openshift-marketplace 8m47s Normal Pulled pod/b0778e7e41e078154ec9bf38a589b15369569659b4f8942b0e98cf614e29b8s Successfully pulled image "quay.io/openshift-qe-optional-operators/ose-ose-cluster-kube-descheduler-operator-bundle@sha256:c641783a95c2b6a53ea03309df2c042f64c1040918181a79f968e595e2c9c06c" in 2.495931079s
openshift-kube-descheduler-operator 8m22s Normal SuccessfulCreate replicaset/descheduler-operator-5d88f758f6 Created pod: descheduler-operator-5d88f758f6-lhcb7
openshift-kube-descheduler-operator 8m22s Normal AllRequirementsMet clusterserviceversion/clusterkubedescheduleroperator.4.6.0-202010061132.p0 all requirements found, attempting install
openshift-kube-descheduler-operator 8m22s Normal InstallWaiting clusterserviceversion/clusterkubedescheduleroperator.4.6.0-202010061132.p0 installing: waiting for deployment descheduler-operator to become ready: Waiting for deployment spec update to be observed...
openshift-kube-descheduler-operator 8m22s Normal RequirementsUnknown clusterserviceversion/clusterkubedescheduleroperator.4.6.0-202010061132.p0 requirements not yet checked
openshift-kube-descheduler-operator 8m22s Normal ScalingReplicaSet deployment/descheduler-operator Scaled up replica set descheduler-operator-5d88f758f6 to 1
openshift-kube-descheduler-operator 8m22s Normal InstallSucceeded clusterserviceversion/clusterkubedescheduleroperator.4.6.0-202010061132.p0 waiting for install components to report healthy
openshift-kube-descheduler-operator 8m21s Normal InstallWaiting clusterserviceversion/clusterkubedescheduleroperator.4.6.0-202010061132.p0 installing: waiting for deployment descheduler-operator to become ready: Waiting for rollout to finish: 0 of 1 updated replicas are available...
openshift-kube-descheduler-operator 8m16s Normal AddedInterface pod/descheduler-operator-5d88f758f6-lhcb7 Add eth0 [10.129.2.111/23]
openshift-kube-descheduler-operator 8m16s Normal Pulling pod/descheduler-operator-5d88f758f6-lhcb7 Pulling image "registry.redhat.io/openshift4/ose-cluster-kube-descheduler-operator@sha256:47f4331c5aa1a973a10383e4bdbea61055e7e7223509b019b54d83e79710d0c0"
openshift-kube-descheduler-operator 8m12s Normal Created pod/descheduler-operator-5d88f758f6-lhcb7 Created container descheduler-operator
openshift-kube-descheduler-operator 8m12s Normal Pulled pod/descheduler-operator-5d88f758f6-lhcb7 Successfully pulled image "registry.redhat.io/openshift4/ose-cluster-kube-descheduler-operator@sha256:47f4331c5aa1a973a10383e4bdbea61055e7e7223509b019b54d83e79710d0c0" in 4.388792663s
openshift-kube-descheduler-operator 8m12s Normal Started pod/descheduler-operator-5d88f758f6-lhcb7 Started container descheduler-operator
openshift-kube-descheduler-operator 8m11s Normal InstallSucceeded clusterserviceversion/clusterkubedescheduleroperator.4.6.0-202010061132.p0 install strategy completed with no errors
openshift-kube-descheduler-operator 8m10s Normal LeaderElection configmap/openshift-cluster-kube-descheduler-operator-lock descheduler-operator-5d88f758f6-lhcb7_1c12ae0b-211e-4952-88c5-84156a11892e became leader
openshift-kube-descheduler-operator 7m13s Normal ScalingReplicaSet deployment/cluster Scaled up replica set cluster-8d9684bf6 to 1
openshift-kube-descheduler-operator 7m13s Normal ConfigMapCreated deployment/descheduler-operator Created ConfigMap/cluster -n openshift-kube-descheduler-operator because it was missing
openshift-kube-descheduler-operator 7m13s Normal DeploymentCreated deployment/descheduler-operator Created Deployment.apps/cluster -n openshift-kube-descheduler-operator because it was missing
openshift-kube-descheduler-operator 7m13s Normal SuccessfulCreate replicaset/cluster-8d9684bf6 Created pod: cluster-8d9684bf6-qdr4w
openshift-kube-descheduler-operator 7m11s Normal AddedInterface pod/cluster-8d9684bf6-qdr4w Add eth0 [10.131.1.234/23]
openshift-kube-descheduler-operator 7m7s Normal Pulling pod/cluster-8d9684bf6-qdr4w Pulling image "registry.redhat.io/openshift4/ose-descheduler@sha256:85d3a4805c16b6ba8515ec3b1109bda676f8b92d7c352c4ae9dc7866a312ac99"
openshift-kube-descheduler-operator 7m3s Normal Pulled pod/cluster-8d9684bf6-qdr4w Successfully pulled image "registry.redhat.io/openshift4/ose-descheduler@sha256:85d3a4805c16b6ba8515ec3b1109bda676f8b92d7c352c4ae9dc7866a312ac99" in 3.756431287s
openshift-kube-descheduler-operator 7m2s Normal Started pod/cluster-8d9684bf6-qdr4w Started container openshift-descheduler
openshift-kube-descheduler-operator 7m2s Normal Created pod/cluster-8d9684bf6-qdr4w Created container openshift-descheduler
openshift-kube-descheduler-operator 5m56s Normal Killing pod/cluster-8d9684bf6-qdr4w Stopping container openshift-descheduler
openshift-kube-descheduler-operator 5m49s Warning RequirementsNotMet clusterserviceversion/clusterkubedescheduleroperator.4.6.0-202010061132.p0 requirements no longer met
openshift-kube-descheduler-operator 5m49s Normal Killing pod/descheduler-operator-5d88f758f6-lhcb7 Stopping container descheduler-operator
openshift-kube-descheduler-operator 4m37s Normal RequirementsUnknown clusterserviceversion/clusterkubedescheduleroperator.4.6.0-202010061132.p0 requirements not yet checked
openshift-kube-descheduler-operator 4m37s Normal RequirementsNotMet clusterserviceversion/clusterkubedescheduleroperator.4.6.0-202010061132.p0 one or more requirements couldn't be found

Version-Release number of selected component (if applicable):
[knarra@knarra Openshift]$ oc get csv
NAME                                                   DISPLAY                     VERSION                 REPLACES   PHASE
clusterkubedescheduleroperator.4.6.0-202010061132.p0   Kube Descheduler Operator   4.6.0-202010061132.p0

How Reproducible:
Always

Steps to Reproduce:
1. Install a 4.6 cluster
2. Install the descheduler operator from the UI
3. Create a cluster KubeDescheduler object from the UI; everything installs successfully
4. Delete the cluster KubeDescheduler object from the UI
5. Uninstall the cluster-kube-descheduler-operator from the UI
6. Try installing the operator again from the UI

Actual Results:
The operator never gets installed and is always stuck in the Pending state.

Expected Results:
The operator should install successfully.
Created attachment 1721149 [details] catalog operator logs
Created attachment 1721150 [details] OLM operator logs
Created attachment 1721152 [details] descheduler operator install plan
Restarting the catalog-operator or the olm-operator pod does not help. Removing the InstallPlan object does not help either (the InstallPlan object is not recreated after it is deleted).
Created attachment 1721156 [details] catalog operator logs (delete followed by re-creation of descheduler operator)
There's a bunch of:

```
time="2020-10-13T09:30:07Z" level=info msg="error updating InstallPlan status" id=S4nLU ip=install-vdjqm namespace=openshift-kube-descheduler-operator phase=Installing updateError="Operation cannot be fulfilled on installplans.operators.coreos.com \"install-vdjqm\": the object has been modified; please apply your changes to the latest version and try again"
E1013 09:30:07.681290       1 queueinformer_operator.go:290] sync {"update" "openshift-kube-descheduler-operator/install-vdjqm"} failed: error updating InstallPlan status: Operation cannot be fulfilled on installplans.operators.coreos.com "install-vdjqm": the object has been modified; please apply your changes to the latest version and try again
```

Aren't there some read/write conflicts in OLM itself?
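For context on that error message: it is the Kubernetes API server's standard optimistic-concurrency rejection, not necessarily a bug by itself. Every update must carry the resourceVersion the client read; if another writer updated the object first, the write is refused and the client is expected to re-read and retry. The sketch below is a toy illustration of that pattern (the FakeApiServer class and helper names are hypothetical, not OLM's actual code), assuming a single status field for simplicity:

```python
class Conflict(Exception):
    """Stand-in for a Kubernetes 409 Conflict response."""
    pass

class FakeApiServer:
    """Toy model of the API server's resourceVersion check on updates."""
    def __init__(self):
        self.resource_version = 1
        self.status = "Installing"

    def get(self):
        # A read always returns the current resourceVersion.
        return {"resourceVersion": self.resource_version, "status": self.status}

    def update(self, obj):
        # Reject writes based on a stale read -- this produces the
        # "the object has been modified" message seen in the OLM logs.
        if obj["resourceVersion"] != self.resource_version:
            raise Conflict("the object has been modified; please apply "
                           "your changes to the latest version and try again")
        self.resource_version += 1
        self.status = obj["status"]

def update_status_with_retry(server, new_status, retries=3):
    """Re-read on conflict instead of failing the sync (the usual client pattern)."""
    for _ in range(retries):
        obj = server.get()          # fresh read -> fresh resourceVersion
        obj["status"] = new_status
        try:
            server.update(obj)
            return True
        except Conflict:
            continue                # someone else wrote first; re-read and retry
    return False

server = FakeApiServer()
stale = server.get()                # our copy, resourceVersion 1
server.update(server.get())         # a concurrent writer bumps the version
try:
    server.update(stale)            # our stale write now fails, as in the logs
except Conflict as err:
    print(err)
print(update_status_with_retry(server, "Complete"))  # True
```

So the conflict lines alone only show two controllers racing on the same InstallPlan status; the real question is why the retry loop never converges in this case.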
Hi Rama,

I couldn't reproduce it in my cluster: https://mastern-jenkins-csb-openshift-qe.cloud.paas.psi.redhat.com/job/Launch%20Environment%20Flexy/117614/artifact/workdir/install-dir/auth/kubeconfig/*view*/

[root@preserve-olm-env data]# oc get clusterversion
NAME      VERSION                             AVAILABLE   PROGRESSING   SINCE   STATUS
version   4.6.0-0.nightly-2020-10-12-223649   True        False         6h41m   Cluster version is 4.6.0-0.nightly-2020-10-12-223649
[root@preserve-olm-env data]# oc -n openshift-operator-lifecycle-manager exec catalog-operator-5f654b87f-mnc74 -- olm --version
OLM version: 0.16.1
git commit: 6f59080264afd89fa786ca872f759470d8764b22

1) Install it from the UI; it works well.
[root@preserve-olm-env data]# oc get sub -A
NAMESPACE                             NAME                                PACKAGE                             SOURCE            CHANNEL
openshift-kube-descheduler-operator   cluster-kube-descheduler-operator   cluster-kube-descheduler-operator   qe-app-registry   4.6
[root@preserve-olm-env data]# oc get ip -n openshift-kube-descheduler-operator
NAME            CSV                                                    APPROVAL    APPROVED
install-5k5tj   clusterkubedescheduleroperator.4.6.0-202010061132.p0   Automatic   true
[root@preserve-olm-env data]# oc get csv -n openshift-kube-descheduler-operator
NAME                                                   DISPLAY                     VERSION                 REPLACES   PHASE
clusterkubedescheduleroperator.4.6.0-202010061132.p0   Kube Descheduler Operator   4.6.0-202010061132.p0              Succeeded

2) Uninstall it; it works well.
[root@preserve-olm-env data]# oc get sub -A
No resources found
[root@preserve-olm-env data]# oc get ip -n openshift-kube-descheduler-operator
No resources found in openshift-kube-descheduler-operator namespace.
[root@preserve-olm-env data]# oc get sa -n openshift-kube-descheduler-operator
NAME       SECRETS   AGE
builder    2         2m11s
default    2         2m11s
deployer   2         2m11s

3) Reinstall it; it works well.
[root@preserve-olm-env data]# oc get sub -A
NAMESPACE                             NAME                                PACKAGE                             SOURCE            CHANNEL
openshift-kube-descheduler-operator   cluster-kube-descheduler-operator   cluster-kube-descheduler-operator   qe-app-registry   4.6
[root@preserve-olm-env data]# oc get ip -n openshift-kube-descheduler-operator
NAME            CSV                                                    APPROVAL    APPROVED
install-pjsg6   clusterkubedescheduleroperator.4.6.0-202010061132.p0   Automatic   true
[root@preserve-olm-env data]# oc get csv -n openshift-kube-descheduler-operator
NAME                                                   DISPLAY                     VERSION                 REPLACES   PHASE
clusterkubedescheduleroperator.4.6.0-202010061132.p0   Kube Descheduler Operator   4.6.0-202010061132.p0              Succeeded
[root@preserve-olm-env data]# oc get pods -n openshift-kube-descheduler-operator
NAME                                   READY   STATUS    RESTARTS   AGE
descheduler-operator-ccd58fcb7-lh2zd   1/1     Running   0          38s
[root@preserve-olm-env data]# oc get sa -n openshift-kube-descheduler-operator
NAME                    SECRETS   AGE
builder                 2         4m59s
default                 2         4m59s
deployer                2         4m59s
openshift-descheduler   2         46s

In your cluster: https://mastern-jenkins-csb-openshift-qe.cloud.paas.psi.redhat.com/job/Launch%20Environment%20Flexy/117438/artifact/workdir/install-dir/auth/kubeconfig/*view*/

[root@preserve-olm-env data]# oc get clusterversion
NAME      VERSION      AVAILABLE   PROGRESSING   SINCE   STATUS
version   4.6.0-rc.2   True        False         27h     Cluster version is 4.6.0-rc.2
[root@preserve-olm-env data]# oc -n openshift-operator-lifecycle-manager exec catalog-operator-59546d8c85-dj5qf -- olm --version
OLM version: 0.16.1
git commit: 6f59080264afd89fa786ca872f759470d8764b22
[root@preserve-olm-env data]# oc get sub -A
NAMESPACE                             NAME                                PACKAGE                             SOURCE            CHANNEL
openshift-kube-descheduler-operator   cluster-kube-descheduler-operator   cluster-kube-descheduler-operator   qe-app-registry   4.6
[root@preserve-olm-env data]# oc get ip -n openshift-kube-descheduler-operator
NAME            CSV                                                    APPROVAL    APPROVED
install-rctv7   clusterkubedescheduleroperator.4.6.0-202010061132.p0   Automatic   true

It's weird: the CSV's REPLACES column here points at the CSV itself. Could you help give more details? Thanks!

[root@preserve-olm-env data]# oc get csv -n openshift-kube-descheduler-operator
NAME                                                   DISPLAY                     VERSION                 REPLACES                                               PHASE
clusterkubedescheduleroperator.4.6.0-202010061132.p0   Kube Descheduler Operator   4.6.0-202010061132.p0   clusterkubedescheduleroperator.4.6.0-202010061132.p0   Pending
[root@preserve-olm-env data]# oc describe csv -n openshift-kube-descheduler-operator
Name:         clusterkubedescheduleroperator.4.6.0-202010061132.p0
Namespace:    openshift-kube-descheduler-operator
Labels:       olm.api.623e59b3c80e3376=provided
              operators.coreos.com/cluster-kube-descheduler-operator.openshift-kube-descheduler-op=
Annotations:  alm-examples: ...
Status:
  Conditions:
    Last Transition Time:  2020-10-13T07:15:17Z
    Last Update Time:      2020-10-13T07:15:17Z
    Message:               requirements not yet checked
    Phase:                 Pending
    Reason:                RequirementsUnknown
    Last Transition Time:  2020-10-13T07:15:17Z
    Last Update Time:      2020-10-13T07:15:17Z
    Message:               one or more requirements couldn't be found
    Phase:                 Pending
    Reason:                RequirementsNotMet
    Last Transition Time:  2020-10-13T07:15:17Z
    Last Update Time:      2020-10-13T07:15:17Z
    Message:               one or more requirements couldn't be found
    Phase:                 Pending
    Reason:                RequirementsNotMet
  Requirement Status:
    Group:    operators.coreos.com
    Kind:     ClusterServiceVersion
    Message:  CSV minKubeVersion (1.19.0) less than server version (v1.19.0+d59ce34)
    Name:     clusterkubedescheduleroperator.4.6.0-202010061132.p0
    Status:   Present
    Version:  v1alpha1
    Group:    apiextensions.k8s.io
    Kind:     CustomResourceDefinition
    Message:  CRD is present and Established condition is true
    Name:     kubedeschedulers.operator.openshift.io
    Status:   Present
    Uuid:     f05e9e93-73cb-484e-ad77-32f28e8f6190
    Version:  v1
    Dependents:
      Group:    rbac.authorization.k8s.io
      Kind:     PolicyRule
      Message:  cluster rule:{"verbs":["*"],"apiGroups":["operator.openshift.io"],"resources":["*"]}
      Status:   NotSatisfied
      Version:  v1
      Group:    rbac.authorization.k8s.io
      Kind:     PolicyRule
      Message:  cluster rule:{"verbs":["*"],"apiGroups":["kubedeschedulers.operator.openshift.io"],"resources":["*"]}
      Status:   NotSatisfied
      Version:  v1
      Group:    rbac.authorization.k8s.io
      Kind:     PolicyRule
      Message:  cluster rule:{"verbs":["*"],"apiGroups":[""],"resources":["services","pods","configmaps","secrets","names","nodes","pods/eviction","events"]}
      Status:   NotSatisfied
      Version:  v1
      Group:    rbac.authorization.k8s.io
      Kind:     PolicyRule
      Message:  cluster rule:{"verbs":["get","watch","list"],"apiGroups":["scheduling.k8s.io"],"resources":["priorityclasses"]}
      Status:   NotSatisfied
      Version:  v1
      Group:    rbac.authorization.k8s.io
      Kind:     PolicyRule
      Message:  cluster rule:{"verbs":["*"],"apiGroups":["apps"],"resources":["deployments","replicasets"]}
      Status:   NotSatisfied
      Version:  v1
    Group:
    Kind:     ServiceAccount
    Message:  Policy rule not satisfied for service account
    Name:     openshift-descheduler
    Status:   PresentNotSatisfied
    Version:  v1
Events:
  Type    Reason               Age                  From                        Message
  ----    ------               ----                 ----                        -------
  Normal  RequirementsUnknown  115m (x2 over 115m)  operator-lifecycle-manager  requirements not yet checked
  Normal  RequirementsNotMet   115m (x2 over 115m)  operator-lifecycle-manager  one or more requirements couldn't be found

The SA exists:
[root@preserve-olm-env data]# oc get sa -n openshift-kube-descheduler-operator
NAME                    SECRETS   AGE
builder                 2         134m
default                 2         134m
deployer                2         134m
openshift-descheduler   2         130m

Anyway, workaround:

1) Uninstall it.
[root@preserve-olm-env data]# oc get sa -n openshift-kube-descheduler-operator
NAME       SECRETS   AGE
builder    2         165m
default    2         165m
deployer   2         165m
[root@preserve-olm-env data]# oc get sub -n openshift-kube-descheduler-operator
No resources found in openshift-kube-descheduler-operator namespace.
[root@preserve-olm-env data]# oc get ip -n openshift-kube-descheduler-operator
No resources found in openshift-kube-descheduler-operator namespace.
[root@preserve-olm-env data]# oc get csv -n openshift-kube-descheduler-operator
No resources found in openshift-kube-descheduler-operator namespace.
2) Delete the Job and ConfigMap in the openshift-marketplace project.
[root@preserve-olm-env data]# oc get job
No resources found in openshift-marketplace namespace.
[root@preserve-olm-env data]# oc get cm
NAME                        DATA   AGE
marketplace-operator-lock   0      27h
marketplace-trusted-ca      1      28h

3) Delete the OLM pods.
[root@preserve-olm-env data]# oc delete pods --all -n openshift-operator-lifecycle-manager
pod "catalog-operator-59546d8c85-dj5qf" deleted
pod "olm-operator-6984b748cf-bbs28" deleted
pod "packageserver-9b65c8b76-wl7lf" deleted
pod "packageserver-9b65c8b76-xx9pv" deleted
[root@preserve-olm-env data]# oc get pods -n openshift-operator-lifecycle-manager
NAME                                READY   STATUS    RESTARTS   AGE
catalog-operator-59546d8c85-hgqs4   1/1     Running   0          22s
olm-operator-6984b748cf-m26vz       1/1     Running   0          22s
packageserver-9b65c8b76-djzx7       1/1     Running   0          21s
packageserver-9b65c8b76-mdl78       1/1     Running   0          21s

4) Reinstall it; it works well.
[root@preserve-olm-env data]# oc get sub -A
NAMESPACE                             NAME                                PACKAGE                             SOURCE            CHANNEL
openshift-kube-descheduler-operator   cluster-kube-descheduler-operator   cluster-kube-descheduler-operator   qe-app-registry   4.6
[root@preserve-olm-env data]# oc get ip -n openshift-kube-descheduler-operator
NAME            CSV                                                    APPROVAL    APPROVED
install-jwvr5   clusterkubedescheduleroperator.4.6.0-202010061132.p0   Automatic   true
[root@preserve-olm-env data]# oc get csv -n openshift-kube-descheduler-operator
NAME                                                   DISPLAY                     VERSION                 REPLACES   PHASE
clusterkubedescheduleroperator.4.6.0-202010061132.p0   Kube Descheduler Operator   4.6.0-202010061132.p0              Succeeded
[root@preserve-olm-env data]# oc get pods -n openshift-kube-descheduler-operator
NAME                                    READY   STATUS    RESTARTS   AGE
descheduler-operator-5d88f758f6-lhvbs   1/1     Running   0          49s
[root@preserve-olm-env data]# oc get sa -n openshift-kube-descheduler-operator
NAME                    SECRETS   AGE
builder                 2         3h8m
default                 2         3h8m
deployer                2         3h8m
openshift-descheduler   2         59s
Just tried the workaround suggested by Jian Zhang, and I could successfully deploy the operator. But we still do not know what is causing this issue. Thanks!
Setting target release to the active development branch (4.7.0). For any fixes, where required and requested, cloned BZs will be created for those release maintenance streams where appropriate once they are identified.
Hi,

The main issue here is the skipRange specified in the CSV, which I believe is ">=4.3.0-0 < 4.6.0". The version of the installed CSV is 4.6.0-202010061132.p0, which falls into that range. Essentially, when major, minor, and patch are equal, a pre-release version has lower precedence than the normal version, so 4.6.0-202010061132.p0 < 4.6.0. Because of this, the solver is confused and puts this same version into the `replaces` field, so we end up with a version that replaces itself. As a result, the CSV is stuck in the Pending state.

You can fix this by changing the skipRange to ">=4.3.0-0 < 4.6.0-0".

Thanks,
Vu
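The precedence rule behind this can be sketched directly from the SemVer spec (a minimal illustration, not OLM's actual resolver code; the helper names `parse`, `ident_key`, and `precedes` are my own). It shows why 4.6.0-202010061132.p0 sorts below 4.6.0, so it satisfies "< 4.6.0", while "< 4.6.0-0" excludes it:

```python
def parse(version):
    """Split 'X.Y.Z[-pre]' into (core_tuple, pre_release_identifiers)."""
    core, _, pre = version.partition("-")
    major, minor, patch = (int(p) for p in core.split("."))
    return (major, minor, patch), (pre.split(".") if pre else [])

def ident_key(ident):
    # SemVer: numeric identifiers compare numerically and rank below
    # alphanumeric identifiers.
    return (0, int(ident), "") if ident.isdigit() else (1, 0, ident)

def precedes(a, b):
    """True if version a has lower SemVer precedence than version b."""
    core_a, pre_a = parse(a)
    core_b, pre_b = parse(b)
    if core_a != core_b:
        return core_a < core_b
    # Equal cores: a pre-release precedes the plain release.
    if pre_a and not pre_b:
        return True
    if not pre_a and pre_b:
        return False
    # Otherwise compare pre-release identifiers left to right; a shorter
    # list that is a prefix of the other has lower precedence.
    return [ident_key(i) for i in pre_a] < [ident_key(i) for i in pre_b]

# The installed CSV version is a pre-release of 4.6.0 ...
print(precedes("4.6.0-202010061132.p0", "4.6.0"))    # True  -> inside "< 4.6.0"
# ... but not below the lowest 4.6.0 pre-release:
print(precedes("4.6.0-202010061132.p0", "4.6.0-0"))  # False -> outside "< 4.6.0-0"
```

This is why the upper bound "< 4.6.0-0" in the corrected skipRange no longer matches the installed version, breaking the self-replacement loop.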
Thank you very much for debugging and providing the fix!!! Another thing we have learned about OLM.
Verified the bug with the payload below: uninstalling and installing the descheduler operator works fine.

[knarra@knarra openshift-client-linux-4.7.0-0.nightly-2020-11-10-023606]$ ./oc version
Client Version: 4.7.0-0.nightly-2020-11-10-023606
Server Version: 4.7.0-0.nightly-2020-11-10-023606
Kubernetes Version: v1.19.2+7e80e12
[knarra@knarra openshift-client-linux-4.7.0-0.nightly-2020-11-10-023606]$ ./oc get csv
NAME                                                   DISPLAY                     VERSION                 REPLACES   PHASE
clusterkubedescheduleroperator.4.7.0-202011031553.p0   Kube Descheduler Operator   4.7.0-202011031553.p0              Succeeded

Also, as per the PR, olm.skipRange has been set to '>=4.3.0-0 < 4.7.0-0'. I uninstalled and reinstalled about four times and did not see any issue, so I am moving the bug to the verified state.
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory (Moderate: OpenShift Container Platform 4.7.0 security, bug fix, and enhancement update), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHSA-2020:5633