Description of problem:

ocs-operator.v4.5.0-463 is in Pending state.

Version-Release number of selected component (if applicable):

openshift installer (4.5.0-0.nightly-2020-06-26-023641)
ocs-operator.v4.5.0-463

How reproducible:

1/1

Steps to Reproduce:
1. Install OCS 4.5 using ocs-ci
2. Verify whether the operator is installed

Actual results:

$ oc -n openshift-storage get csv
NAME                         DISPLAY                       VERSION        REPLACES              PHASE
awss3operator.1.0.1          AWS S3 Operator               1.0.1          awss3operator.1.0.0   Succeeded
ocs-operator.v4.5.0-463.ci   OpenShift Container Storage   4.5.0-463.ci                         Pending
[vavuthu@localhost rem]$

Expected results:

The operator should be in Succeeded state.

Additional info:

Jenkins job: https://ocs4-jenkins.rhev-ci-vms.eng.rdu2.redhat.com/job/qe-deploy-ocs-cluster/9166/consoleFull
Must gather: http://magna002.ceph.redhat.com/ocsci-jenkins/openshift-clusters/jnk-ai3c33-t1/jnk-ai3c33-t1_20200626T052528/logs/failed_testcase_ocs_logs_1593149791/deployment_ocs_logs/
Status conditions in describe of the operator:

$ oc describe csv ocs-operator.v4.5.0-463.ci
Status:
  Conditions:
    Last Transition Time:  2020-06-26T06:06:25Z
    Last Update Time:      2020-06-26T06:06:25Z
    Message:               requirements not yet checked
    Phase:                 Pending
    Reason:                RequirementsUnknown
    Last Transition Time:  2020-06-26T06:06:25Z
    Last Update Time:      2020-06-26T06:06:26Z
    Message:               one or more requirements couldn't be found
    Phase:                 Pending
    Reason:                RequirementsNotMet
    Last Transition Time:  2020-06-26T06:06:25Z
    Last Update Time:      2020-06-26T06:06:26Z
    Message:               one or more requirements couldn't be found
    Phase:                 Pending
    Reason:                RequirementsNotMet
  Requirement Status:
    Group:      apiextensions.k8s.io
    Kind:       CustomResourceDefinition
    Message:    CRD is present and Established condition is true
    Name:       backingstores.noobaa.io
    Status:     Present
    Uuid:       e89cbf42-75c4-419c-ad8a-505e384b91e9
    Version:    v1
    Dependents:
      Group:    rbac.authorization.k8s.io
      Kind:     PolicyRule
      Message:  namespaced rule:{"verbs":["get","list","watch"],"apiGroups":[""],"resources":["services","endpoints","pods"]}
      Status:   NotSatisfied
      Version:  v1
    Group:
    Kind:       ServiceAccount
    Message:    Policy rule not satisfied for service account
    Name:       noobaa-metrics
    Status:     PresentNotSatisfied
    Version:    v1
    Dependents:
      Group:    rbac.authorization.k8s.io
      Kind:     PolicyRule
      Message:  namespaced rule:{"verbs":["get","watch","list","delete","update","create"],"apiGroups":[""],"resources":["endpoints"]}
      Status:   Satisfied
      Version:  v1
      Group:    rbac.authorization.k8s.io
      Kind:     PolicyRule
      Message:  namespaced rule:{"verbs":["get","list","create","delete"],"apiGroups":[""],"resources":["configmaps"]}
      Status:   Satisfied
      Version:  v1
      Group:    rbac.authorization.k8s.io
      Kind:     PolicyRule
      Message:  namespaced rule:{"verbs":["get","watch","list","delete","update","create"],"apiGroups":["coordination.k8s.io"],"resources":["leases"]}
      Status:   Satisfied
      Version:  v1
      Group:    rbac.authorization.k8s.io
      Kind:     PolicyRule
      Message:  cluster rule:{"verbs":["get","list"],"apiGroups":[""],"resources":["secrets"]}
      Status:   Satisfied
      Version:  v1
      Group:    rbac.authorization.k8s.io
      Kind:     PolicyRule
      Message:  cluster rule:{"verbs":["get","list","watch","create","delete","update","patch"],"apiGroups":[""],"resources":["persistentvolumes"]}
      Status:   Satisfied
      Version:  v1
      Group:    rbac.authorization.k8s.io
      Kind:     PolicyRule
      Message:  cluster rule:{"verbs":["get","list","watch","update"],"apiGroups":[""],"resources":["persistentvolumeclaims"]}
      Status:   Satisfied
      Version:  v1
      Group:    rbac.authorization.k8s.io
      Kind:     PolicyRule
      Message:  cluster rule:{"verbs":["get","list","watch"],"apiGroups":["storage.k8s.io"],"resources":["storageclasses"]}
      Status:   Satisfied
      Version:  v1
      Group:    rbac.authorization.k8s.io
      Kind:     PolicyRule
      Message:  cluster rule:{"verbs":["list","watch","create","update","patch"],"apiGroups":[""],"resources":["events"]}
      Status:   Satisfied
      Version:  v1
      Group:    rbac.authorization.k8s.io
      Kind:     PolicyRule
      Message:  cluster rule:{"verbs":["get","list","watch","update","patch"],"apiGroups":["storage.k8s.io"],"resources":["volumeattachments"]}
      Status:   Satisfied
      Version:  v1
      Group:    rbac.authorization.k8s.io
      Kind:     PolicyRule
      Message:  cluster rule:{"verbs":["get","list","watch"],"apiGroups":[""],"resources":["nodes"]}
      Status:   Satisfied
      Version:  v1
      Group:    rbac.authorization.k8s.io
      Kind:     PolicyRule
      Message:  cluster rule:{"verbs":["update","patch"],"apiGroups":[""],"resources":["persistentvolumeclaims/status"]}
      Status:   Satisfied
      Version:  v1
Events:
  Type    Reason               Age                From                        Message
  ----    ------               ---                ----                        -------
  Normal  RequirementsUnknown  98m (x2 over 98m)  operator-lifecycle-manager  requirements not yet checked
  Normal  RequirementsNotMet   98m                operator-lifecycle-manager  one or more requirements couldn't be found
$
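For quick triage, the unsatisfied requirements can be filtered out of the CSV status directly. A minimal sketch, assuming jq is available on the workstation (the requirementStatus field names match the describe output above):

# Show only the CSV requirements that OLM reports as unsatisfied
$ oc -n openshift-storage get csv ocs-operator.v4.5.0-463.ci -o json \
    | jq '.status.requirementStatus[]
          | select(.status != "Present" and .status != "Satisfied")
          | {kind, name, status, message}'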
Build 466 still has the same issue: https://ceph-jenkins.rhev-ci-vms.eng.rdu2.redhat.com/job/ocs-ci/480/console
Looking through this, I can't find anything meaningful about what's going on. The only logs I'm finding are these lines in the olm-operator logs:

2020-06-30T02:34:03.193147487Z time="2020-06-30T02:34:03Z" level=info msg="csv in operatorgroup" csv=ocs-operator.v4.5.0-467.ci id=brBbF namespace=openshift-storage opgroup=openshift-storage-operatorgroup phase=Pending
2020-06-30T02:34:04.624762784Z time="2020-06-30T02:34:04Z" level=info msg="requirements were not met" csv=ocs-operator.v4.5.0-467.ci id=brBbF namespace=openshift-storage phase=Pending
2020-06-30T02:34:04.69350063Z E0630 02:34:04.693431       1 queueinformer_operator.go:290] sync {"update" "openshift-storage/ocs-operator.v4.5.0-467.ci"} failed: requirements were not met

Can we try running the last known-good build again, just to make sure it still works? If not, then this may be a problem with OCP. Can we try deploying OCS 4.5 on OCP 4.4 as well? Nonetheless, this looks like a genuine problem and should be taken care of; giving devel_ack+.
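For reference, a sketch of how those logs can be pulled, assuming the default OLM deployment name and namespace on OCP 4.x:

# Fetch olm-operator logs and narrow them down to the affected CSV
$ oc -n openshift-operator-lifecycle-manager logs deployment/olm-operator \
    | grep ocs-operator.v4.5.0-467.ci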
As noted in comment #5, even older versions that are known to work don't work anymore. I didn't test any version other than 462. However, I agree that this might point to the problem being in OCP rather than OCS (or somewhere in the middle). If the issue were entirely in OCS, 462 should still have been deployable.
*** Bug 1852607 has been marked as a duplicate of this bug. ***
Looking at the CSV and all other resources, it seems like the requirements were met, but for some reason the CSV does not register that. This looks like an issue in OLM dependency resolution. I also noticed that the OCS CSV is constantly getting refreshed, which could be the reason the CSV is not getting updated correctly.
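One way to observe that churn from the CLI; a minimal sketch, assuming the same openshift-storage namespace as above:

# Watch CSVs; a resourceVersion that keeps climbing while PHASE stays
# Pending matches the constant-refresh behaviour described above
$ oc -n openshift-storage get csv -w \
    -o custom-columns=NAME:.metadata.name,RV:.metadata.resourceVersion,PHASE:.status.phase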
Trying to run verification of OCS 4.5 with OCP 4.4 here: https://ocs4-jenkins.rhev-ci-vms.eng.rdu2.redhat.com/job/qe-deploy-ocs-cluster/9353/
So it sounds like this is an OCP-level bug. Cloning this into OCP.
Please let us know if someone is looking at the cluster and how long you need it; otherwise we will destroy it in a few hours.
From what I remember and tested:

        | OCP 4.4 | OCP 4.5
OCS 4.4 | works   | works
OCS 4.5 | broken  | broken
(In reply to Petr Balogh from comment #21)
> From what I remember and tested:
>
>         | OCP 4.4 | OCP 4.5
> OCS 4.4 | works   | works
> OCS 4.5 | broken  | broken

Ugh, seriously?!

From all our discussions, I *thought* it was:

        | OCP 4.4 | OCP 4.5 old | OCP 4.5 new
--------+---------+-------------+-------------
OCS 4.4 | works   | works       | broken(?)
OCS 4.5 | works   | works       | broken

@Petr, please double check
@Michael:

OCS 4.4.1 on OCP 4.4 nightly (3 days back) - Deployment OK:
https://ocs4-jenkins.rhev-ci-vms.eng.rdu2.redhat.com/job/qe-deploy-ocs-cluster/9443/

OCS 4.4.1 on OCP 4.5 nightly (3 days back) - Deployment OK:
https://ocs4-jenkins.rhev-ci-vms.eng.rdu2.redhat.com/job/qe-deploy-ocs-cluster/9453/

OCS 4.5 (4.5.0-470.ci) on OCP 4.4 nightly (5 days back) - Deployment is failing:
https://ocs4-jenkins.rhev-ci-vms.eng.rdu2.redhat.com/job/qe-deploy-ocs-cluster/9357/
14:01:16 - MainThread - ocs_ci.ocs.ocp - INFO - Resource ocs-operator.v4.5.0-470.ci is in phase: Pending!
14:01:16 - MainThread - ocs_ci.utility.utils - ERROR - (check_phase) return incorrect status after 720 second timeout

OCS 4.5 (4.5.0-470.ci) on OCP 4.5 nightly (6 days back, engineering job) - Deployment is failing:
https://ceph-jenkins.rhev-ci-vms.eng.rdu2.redhat.com/job/ocs-ci/482/console
22:36:48 - MainThread - ocs_ci.ocs.ocp - INFO - Resource ocs-operator.v4.5.0-470.ci is in phase: Pending!
22:36:48 - MainThread - ocs_ci.utility.utils - ERROR - (check_phase) return incorrect status after 720 second timeout

I think this matches what I said in https://bugzilla.redhat.com/show_bug.cgi?id=1851328#c21. Do you want me to trigger the jobs again?

Petr
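The check_phase wait from ocs-ci can be approximated from the CLI; a minimal sketch, assuming the same 720-second budget and the CSV name from the log lines above:

# Rough equivalent of the ocs-ci check_phase wait: poll the CSV phase
# with a 720s budget (name and timeout taken from the logs above)
timeout=720; interval=10; elapsed=0
while [ "$elapsed" -lt "$timeout" ]; do
  phase=$(oc -n openshift-storage get csv ocs-operator.v4.5.0-470.ci \
            -o jsonpath='{.status.phase}' 2>/dev/null)
  [ "$phase" = "Succeeded" ] && echo "CSV reached Succeeded" && exit 0
  echo "phase=$phase, waiting..."
  sleep "$interval"; elapsed=$((elapsed + interval))
done
echo "timed out after ${timeout}s in phase: $phase"; exit 1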
+1 to Petr.

(In reply to Michael Adam from comment #22)
> (In reply to Petr Balogh from comment #21)
> > From what I remember and tested:
> >
> >         | OCP 4.4 | OCP 4.5
> > OCS 4.4 | works   | works
> > OCS 4.5 | broken  | broken
>
> Ugh, seriously?!
>
> From all our discussions, I *thought* it was:
>
>         | OCP 4.4 | OCP 4.5 old | OCP 4.5 new
> --------+---------+-------------+-------------
> OCS 4.4 | works   | works       | broken(?)
> OCS 4.5 | works   | works       | broken
>
> @Petr, please double check

@Michael, if we look at comment #17 and comment #19, OCS 4.5 is broken on the latest nightlies of OCP 4.4, OCP 4.5, and OCP 4.6 too.

What works for OCS 4.5: OCS 4.5 on OCP 4.5 builds older than or equal to the Jun 17th nightlies.
What works for OCS 4.4: OCP 4.4 and OCP 4.5 (even the latest nightlies).

So to summarize, from Petr's comment #23:

        | OCP 4.4 new | OCP 4.5 old | OCP 4.5 new | OCP 4.6 new
--------+-------------+-------------+-------------+------------------
OCS 4.4 | works       | works       | works       | not tested (n+2)
OCS 4.5 | broken      | works       | broken      | broken
Look at https://bugzilla.redhat.com/show_bug.cgi?id=1852865#c13 for more details.
https://github.com/openshift/ocs-operator/pull/613 - master patch, merged.
https://github.com/openshift/ocs-operator/pull/617 - backport PR, merged.
Contained in 4.5.0-479.ci / https://ceph-jenkins.rhev-ci-vms.eng.rdu2.redhat.com/job/OCS%20Build%20Pipeline%204.5/58/
Verified the below combinations:

1) OCP 4.4 + OCS 4.5 - vSphere
ocs-operator.v4.5.0-482.ci
openshift installer (4.4.0-0.nightly-2020-07-09-063156)
Job: https://ocs4-jenkins.rhev-ci-vms.eng.rdu2.redhat.com/job/qe-deploy-ocs-cluster/9641/console

2) OCP 4.5 + OCS 4.5 - vSphere
ocs-operator.v4.5.0-484.ci
openshift installer (4.5.0-0.nightly-2020-07-07-210042)
Job: https://ocs4-jenkins.rhev-ci-vms.eng.rdu2.redhat.com/job/qe-deploy-ocs-cluster/9696/console

Marking as Verified.
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory (Red Hat OpenShift Container Storage 4.5.0 bug fix and enhancement update), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHBA-2020:3754