Bug 2016324
| Summary: | Manual approval strategy deployment is problematic | ||
|---|---|---|---|
| Product: | [Red Hat Storage] Red Hat OpenShift Data Foundation | Reporter: | Petr Balogh <pbalogh> |
| Component: | odf-operator | Assignee: | Nitin Goyal <nigoyal> |
| Status: | CLOSED CURRENTRELEASE | QA Contact: | Petr Balogh <pbalogh> |
| Severity: | urgent | Docs Contact: | |
| Priority: | unspecified | ||
| Version: | 4.9 | CC: | jrivera, muagarwa, ocs-bugs, odf-bz-bot, ratamir, rperiyas |
| Target Milestone: | --- | Keywords: | Regression |
| Target Release: | ODF 4.9.0 | ||
| Hardware: | Unspecified | ||
| OS: | Unspecified | ||
| Whiteboard: | |||
| Fixed In Version: | v4.9.0-228.ci | Doc Type: | No Doc Update |
| Doc Text: | | Story Points: | --- |
| Clone Of: | | Environment: | |
| Last Closed: | 2022-01-07 17:46:31 UTC | Type: | Bug |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | | Category: | --- |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | |||
The install worked very well for me with Manual approval. But I also had to approve the other plan, for noobaa and ocs, which I was not expecting to do, as their Subscriptions were created with installPlanApproval Automatic. So we do need to figure out why the plan was created as Manual for OCS and NooBaa.

$ oc get subscriptions,installplans,csvs,pods
NAME   PACKAGE   SOURCE   CHANNEL
subscription.operators.coreos.com/noobaa-operator   noobaa-operator   odf-catalogsource   alpha
subscription.operators.coreos.com/ocs-operator   ocs-operator   odf-catalogsource   alpha
subscription.operators.coreos.com/odf-operator   odf-operator   odf-catalogsource   alpha

NAME   CSV   APPROVAL   APPROVED
installplan.operators.coreos.com/install-4dzhc   noobaa-operator.v5.9.0   Manual   true
installplan.operators.coreos.com/install-6nqsw   odf-operator.v4.9.0   Manual   true

NAME   DISPLAY   VERSION   REPLACES   PHASE
clusterserviceversion.operators.coreos.com/noobaa-operator.v5.9.0   NooBaa Operator   5.9.0   Succeeded
clusterserviceversion.operators.coreos.com/ocs-operator.v4.9.0   OpenShift Container Storage   4.9.0   Succeeded
clusterserviceversion.operators.coreos.com/odf-operator.v4.9.0   OpenShift Data Foundation   4.9.0   Succeeded

NAME   READY   STATUS   RESTARTS   AGE
pod/5cb2b16ec2b11bf63dbe691d44a63535dc026bb5315d5075dc6c398b3c4lxkv   0/1   Completed   0   3m19s
pod/c4b05566c04876677a22d39fc9c02512401d0962109610e85c8fb900d3hgswz   0/1   Completed   0   3m19s
pod/c5d1376974666727b02bf25b3a4828241612186744ef417a668b4bc175xs922   0/1   Completed   0   4m11s
pod/noobaa-operator-58cd5c744-rtw88   1/1   Running   0   2m36s
pod/ocs-metrics-exporter-5d78547546-wkdzs   1/1   Running   0   2m14s
pod/ocs-operator-7c9f547494-c2fsk   1/1   Running   0   2m14s
pod/odf-catalogsource-58jk5   1/1   Running   0   4m22s
pod/odf-console-5cb444d8c6-9fxlz   1/1   Running   0   3m26s
pod/odf-operator-controller-manager-86dc54d559-75xzb   2/2   Running   1   3m26s
pod/rook-ceph-operator-7cdbdb4678-4mkcw   1/1   Running   0   2m14s

$ oc get subscriptions.operators.coreos.com -o custom-columns=NAME:.metadata.name,APPROVAL:.spec.installPlanApproval
NAME   APPROVAL
noobaa-operator   Automatic
ocs-operator   Automatic
odf-operator   Manual

$ oc get installplans.operators.coreos.com -o custom-columns=NAME:.metadata.name,CSV:.spec.clusterServiceVersionNames
NAME   CSV
install-4dzhc   [noobaa-operator.v5.9.0 ocs-operator.v4.9.0]
install-6nqsw   [odf-operator.v4.9.0]

After talking to the OLM Dev I got to know that `this is by design, all operators installed in the same namespace share the approval / auto-update behavior`. So to fix this we need to align with the odf-operator subscription: if it is Automatic, we need to create both the ocs and noobaa subscriptions as Automatic, otherwise as Manual.

The main point is that the OCS and noobaa operators disappear from the UI - at least they did after I approved the OCS operator. And as I pointed out in the UI flow, after I approved the OCS install plan the noobaa operator disappeared from the UI, and then I was not able to approve it easily from the UI. And in the CLI I saw this:

$ oc get installPlan -n openshift-storage
NAME   CSV   APPROVAL   APPROVED
install-4pkrh   noobaa-operator.v4.9.0   Manual   true
install-lgcww   odf-operator.v4.9.0   Manual   true
install-m52ct   noobaa-operator.v4.9.0   Manual   false

I guess you did the approval from the CLI, right? What flow will we tell the customer to follow? Basically, as we are hiding the OCS and Noobaa operators from the installed operators list, I think it might also be problematic for z-stream upgrades. I am not sure whether they will appear again when there is a pending upgrade. Also, when deploying via the UI it asks me only to approve the install plan for ODF - there is no mention of doing it for OCS and noobaa.

This problem still persists, even with the latest patches to odf-operator for dependency management. The actual solution isn't too complicated, and I do agree the UX problem needs to be resolved ASAP, so giving devel_ack+. We'll try to get something merged and backported this week, but it may leak into early next week.
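
The mismatch discussed in this thread (dependent subscriptions declared Automatic while odf-operator is Manual) can be spotted mechanically. The following is only a minimal sketch, not part of the actual fix: `mismatched_approvals` is a hypothetical helper name, and it assumes its input is the two-column NAME/APPROVAL table produced by the `custom-columns` query quoted earlier in this report.

```shell
# Hypothetical helper (not part of odf-operator or oc).
# Reads a "NAME APPROVAL" table on stdin and prints every subscription
# whose approval strategy differs from the reference subscription
# (first argument, default: odf-operator).
# Exits non-zero if the reference subscription is not in the input.
mismatched_approvals() {
  awk -v ref="${1:-odf-operator}" '
    NR > 1 { approval[$1] = $2; order[++n] = $1 }   # skip the header row
    END {
      if (!(ref in approval)) exit 1
      for (i = 1; i <= n; i++)
        if (order[i] != ref && approval[order[i]] != approval[ref])
          print order[i], approval[order[i]]
    }'
}
```

Against a live cluster it could be fed with `oc get subscriptions.operators.coreos.com -n openshift-storage -o custom-columns=NAME:.metadata.name,APPROVAL:.spec.installPlanApproval | mismatched_approvals`. Note that it only compares the declared `installPlanApproval` values; per the OLM statement quoted in this thread, the effective behavior is shared across the namespace regardless.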
I don't see that this bug has been moved to ON_QE yet, but out of curiosity I tested an OCS 4.8 installation with manual approval strategy followed by an upgrade to ODF, done by installing the ODF operator also with manual approval strategy, and I was surprised that it actually worked.

Before upgrade:

$ oc get pod -n openshift-storage
NAME   READY   STATUS   RESTARTS   AGE
csi-cephfsplugin-62ccx   3/3   Running   0   35m
csi-cephfsplugin-82x88   3/3   Running   0   35m
csi-cephfsplugin-9sx96   3/3   Running   0   35m
csi-cephfsplugin-provisioner-86d5fc595d-pnt97   6/6   Running   0   35m
csi-cephfsplugin-provisioner-86d5fc595d-z7vvp   6/6   Running   0   35m
csi-rbdplugin-gssgq   3/3   Running   0   35m
csi-rbdplugin-provisioner-7486d85bff-s7tj9   6/6   Running   0   35m
csi-rbdplugin-provisioner-7486d85bff-wpz84   6/6   Running   0   35m
csi-rbdplugin-v5xmf   3/3   Running   0   35m
csi-rbdplugin-xhwng   3/3   Running   0   35m
noobaa-core-0   1/1   Running   0   30m
noobaa-db-pg-0   1/1   Running   0   30m
noobaa-endpoint-57ddf94d55-6g9rz   1/1   Running   0   29m
noobaa-operator-7766c78c4b-4j8hx   1/1   Running   0   35m
ocs-metrics-exporter-768d9f7f75-628sz   1/1   Running   0   35m
ocs-operator-69f768695c-wpqjq   1/1   Running   0   35m
rook-ceph-crashcollector-0f6f7cfc9eea81b8e3a94fec0fb04a46-7s9g8   1/1   Running   0   31m
rook-ceph-crashcollector-2e8e07ed3d05b303d820a8923d57b3d3-nzs47   1/1   Running   0   31m
rook-ceph-crashcollector-77418e86d7c39af2e164c06b73c226ba-pv47p   1/1   Running   0   31m
rook-ceph-mds-ocs-storagecluster-cephfilesystem-a-75c5ffc5hv99k   2/2   Running   0   30m
rook-ceph-mds-ocs-storagecluster-cephfilesystem-b-56ff8897tq2xc   2/2   Running   0   30m
rook-ceph-mgr-a-65b4b67f85-c2z28   2/2   Running   0   31m
rook-ceph-mon-a-8cf547d7f-7htmk   2/2   Running   0   35m
rook-ceph-mon-b-658fffc787-t62xs   2/2   Running   0   34m
rook-ceph-mon-c-7c8cdb998c-9rh9z   2/2   Running   0   33m
rook-ceph-operator-595488fd9d-vf2cq   1/1   Running   0   35m
rook-ceph-osd-0-645cbb7587-zs5c4   2/2   Running   0   30m
rook-ceph-osd-1-576dff9768-w8xg9   2/2   Running   0   30m
rook-ceph-osd-2-7d7747b9f8-gjr7b   2/2   Running   0   30m
rook-ceph-osd-prepare-ocs-deviceset-0-data-07c9lk--1-dtcrb   0/1   Completed   0   31m
rook-ceph-osd-prepare-ocs-deviceset-1-data-0vj8pn--1-t6lhf   0/1   Completed   0   31m
rook-ceph-osd-prepare-ocs-deviceset-2-data-09n6fm--1-fzkrl   0/1   Completed   0   31m
rook-ceph-tools-cf7458d47-wp52w   1/1   Running   0   30m

$ oc get installPlan -n openshift-storage
NAME   CSV   APPROVAL   APPROVED
install-7wqdf   ocs-operator.v4.8.4   Manual   true

After upgrade:

$ oc get installPlan -n openshift-storage
NAME   CSV   APPROVAL   APPROVED
install-7wqdf   ocs-operator.v4.8.4   Manual   true
install-hvb9b   ocs-operator.v4.9.0   Manual   true
install-rnkmx   odf-operator.v4.9.0   Manual   true

$ oc get csv -n openshift-storage
NAME   DISPLAY   VERSION   REPLACES   PHASE
noobaa-operator.v4.9.0   NooBaa Operator   4.9.0   Succeeded
ocs-operator.v4.9.0   OpenShift Container Storage   4.9.0   ocs-operator.v4.8.4   Succeeded
odf-operator.v4.9.0   OpenShift Data Foundation   4.9.0   Succeeded

$ oc get pod -n openshift-storage
NAME   READY   STATUS   RESTARTS   AGE
csi-cephfsplugin-9dhxl   3/3   Running   0   5m55s
csi-cephfsplugin-hpkx8   3/3   Running   0   5m30s
csi-cephfsplugin-provisioner-74476fd8bd-jv7kk   6/6   Running   0   5m53s
csi-cephfsplugin-provisioner-74476fd8bd-mh86p   6/6   Running   0   5m53s
csi-cephfsplugin-zn6bq   3/3   Running   0   5m10s
csi-rbdplugin-hpdfl   3/3   Running   0   5m48s
csi-rbdplugin-provisioner-586b89b84-27v4p   6/6   Running   0   5m56s
csi-rbdplugin-provisioner-586b89b84-ghxjv   6/6   Running   0   5m56s
csi-rbdplugin-qgdf5   3/3   Running   0   5m9s
csi-rbdplugin-wjk97   3/3   Running   0   5m57s
noobaa-core-0   1/1   Running   0   4m32s
noobaa-db-pg-0   1/1   Running   0   5m3s
noobaa-endpoint-849c48d977-vptk8   1/1   Running   0   4m5s
noobaa-operator-6f55976cbd-2hnf9   1/1   Running   0   6m7s
ocs-metrics-exporter-57fd7dfd44-ds82n   1/1   Running   0   6m32s
ocs-operator-f86569488-rwqmb   1/1   Running   0   6m31s
odf-console-7dc4779787-sgb66   1/1   Running   0   7m31s
odf-operator-controller-manager-84d99999f4-fwkdc   2/2   Running   0   7m31s
rook-ceph-crashcollector-0f6f7cfc9eea81b8e3a94fec0fb04a46-xllfh   1/1   Running   0   4m58s
rook-ceph-crashcollector-2e8e07ed3d05b303d820a8923d57b3d3-5zkvv   1/1   Running   0   2m13s
rook-ceph-crashcollector-77418e86d7c39af2e164c06b73c226ba-cjsgm   1/1   Running   0   5m
rook-ceph-mds-ocs-storagecluster-cephfilesystem-a-5cfbc654m2xv8   2/2   Running   0   5m10s
rook-ceph-mds-ocs-storagecluster-cephfilesystem-b-bfc5d4dcxz7dv   2/2   Running   0   5m
rook-ceph-mgr-a-7745c49bbd-ws7jb   2/2   Running   0   86s
rook-ceph-mon-a-9ddb8844f-qlxjm   2/2   Running   0   4m58s
rook-ceph-mon-b-644cf5785f-lnvzb   2/2   Running   0   3m40s
rook-ceph-mon-c-5b8b6d6877-zmww9   2/2   Running   0   2m13s
rook-ceph-operator-b74f6645f-k6bkf   1/1   Running   0   6m32s
rook-ceph-osd-0-7458c85978-bvwgn   2/2   Running   0   79s
rook-ceph-osd-1-797ddbbc7-h6lg6   2/2   Running   0   50s
rook-ceph-osd-2-5d785d5c75-7j72l   2/2   Running   0   33s
rook-ceph-osd-prepare-ocs-deviceset-0-data-07c9lk--1-dtcrb   0/1   Completed   0   44m
rook-ceph-osd-prepare-ocs-deviceset-1-data-0vj8pn--1-t6lhf   0/1   Completed   0   44m
rook-ceph-osd-prepare-ocs-deviceset-2-data-09n6fm--1-fzkrl   0/1   Completed   0   44m
rook-ceph-tools-699fc595dc-wpdzj   1/1   Running   0   6m3s

So once it's moved to ON_QE we can probably verify this. I will also try to upgrade to a 4.9.1 build.
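
The before/after pod listings in this comment were compared by eye. As a rough sketch of how that check could be made mechanical, the filter below prints pods whose READY count is incomplete, skipping `Completed` job pods. `pods_not_ready` is a hypothetical helper name, and the sketch assumes the default `oc get pod` column order (NAME, READY, STATUS, ...).

```shell
# Hypothetical helper (not an oc subcommand).
# Reads the default `oc get pod` table on stdin and prints the names of
# pods that are not fully ready. Pods in the Completed state are skipped,
# since 0/1 is their normal end state.
pods_not_ready() {
  awk 'NR > 1 && $3 != "Completed" {
    split($2, ready, "/")              # READY column, e.g. "1/2"
    if (ready[1] != ready[2]) print $1
  }'
}
```

It would typically be used as `oc get pod -n openshift-storage | pods_not_ready`; empty output means every non-completed pod reports all of its containers ready.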
$ oc get csv -n openshift-storage
NAME   DISPLAY   VERSION   REPLACES   PHASE
noobaa-operator.v4.9.0   NooBaa Operator   4.9.0   Succeeded
ocs-operator.v4.9.0   OpenShift Container Storage   4.9.0   Succeeded
odf-operator.v4.9.0   OpenShift Data Foundation   4.9.0   Succeeded

$ oc get installplan -n openshift-storage
NAME   CSV   APPROVAL   APPROVED
install-t757p   ocs-operator.v4.9.0   Automatic   true
install-wmwk7   noobaa-operator.v4.9.1   Manual   false

$ oc get installplan -n openshift-storage
NAME   CSV   APPROVAL   APPROVED
install-t757p   ocs-operator.v4.9.0   Automatic   true
install-wmwk7   noobaa-operator.v4.9.1   Manual   true

$ oc get csv -n openshift-storage
NAME   DISPLAY   VERSION   REPLACES   PHASE
noobaa-operator.v4.9.1   NooBaa Operator   4.9.1   noobaa-operator.v4.9.0   Succeeded
ocs-operator.v4.9.1   OpenShift Container Storage   4.9.1   ocs-operator.v4.9.0   Installing
odf-operator.v4.9.1   OpenShift Data Foundation   4.9.1   odf-operator.v4.9.0   Succeeded

$ oc get csv -n openshift-storage
NAME   DISPLAY   VERSION   REPLACES   PHASE
noobaa-operator.v4.9.1   NooBaa Operator   4.9.1   noobaa-operator.v4.9.0   Succeeded
ocs-operator.v4.9.1   OpenShift Container Storage   4.9.1   ocs-operator.v4.9.0   Succeeded
odf-operator.v4.9.1   OpenShift Data Foundation   4.9.1   odf-operator.v4.9.0   Succeeded

So it looks like the scenario where I installed 4.9.0 with automatic approval strategy, then changed to manual and upgraded to the 4.9.1 build 4.9.1-227.ci, worked well. There was only one install plan to approve, and it was listed for the noobaa operator, as you can see. I think I should also test the scenario of a fresh manual approval strategy deployment of 4.9.0 followed by this upgrade to the 4.9.1 build.
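
The upgrade check in this comment amounts to re-running `oc get csv` until every CSV reports `Succeeded`. A minimal sketch of a filter for that follows; `csvs_not_succeeded` is a hypothetical helper name, and it assumes PHASE is the last column of the default table and is non-empty.

```shell
# Hypothetical helper (not an oc subcommand).
# Reads the default `oc get csv` table on stdin and prints the name and
# phase of every CSV whose PHASE (last column) is not "Succeeded".
csvs_not_succeeded() {
  awk 'NR > 1 && $NF != "Succeeded" { print $1, $NF }'
}
```

For example, `oc get csv -n openshift-storage | csvs_not_succeeded` would list the CSVs still in `Installing` or `Replacing`, and print nothing once the upgrade has settled.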
For this scenario I am scheduling this job: https://ocs4-jenkins-csb-ocsqe.apps.ocp4.prod.psi.redhat.com/job/qe-deploy-ocs-cluster/7533/

The manual approval strategy installation of 4.9.0 and the upgrade to 4.9.1 also worked as expected:

$ oc get installplan -n openshift-storage
NAME   CSV   APPROVAL   APPROVED
install-fxzw9   ocs-operator.v4.9.0   Manual   true

$ oc get subscription -n openshift-storage
NAME   PACKAGE   SOURCE   CHANNEL
noobaa-operator-stable-4.9-redhat-operators-openshift-marketplace   noobaa-operator   redhat-operators   stable-4.9
ocs-operator-stable-4.9-redhat-operators-openshift-marketplace   ocs-operator   redhat-operators   stable-4.9
odf-operator   odf-operator   redhat-operators   stable-4.9

$ oc get csv -n openshift-storage
NAME   DISPLAY   VERSION   REPLACES   PHASE
noobaa-operator.v4.9.0   NooBaa Operator   4.9.0   Succeeded
ocs-operator.v4.9.0   OpenShift Container Storage   4.9.0   Succeeded
odf-operator.v4.9.0   OpenShift Data Foundation   4.9.0   Succeeded

$ oc get installplan -n openshift-storage
NAME   CSV   APPROVAL   APPROVED
install-fxzw9   ocs-operator.v4.9.0   Manual   true
install-vq8rk   ocs-operator.v4.9.1   Manual   false

$ oc get csv -n openshift-storage
NAME   DISPLAY   VERSION   REPLACES   PHASE
noobaa-operator.v4.9.1   NooBaa Operator   4.9.1   noobaa-operator.v4.9.0   Succeeded
ocs-operator.v4.9.0   OpenShift Container Storage   4.9.0   Replacing
ocs-operator.v4.9.1   OpenShift Container Storage   4.9.1   ocs-operator.v4.9.0   Installing
odf-operator.v4.9.0   OpenShift Data Foundation   4.9.0   Replacing
odf-operator.v4.9.1   OpenShift Data Foundation   4.9.1   odf-operator.v4.9.0   Installing

$ oc get csv -n openshift-storage
NAME   DISPLAY   VERSION   REPLACES   PHASE
noobaa-operator.v4.9.1   NooBaa Operator   4.9.1   noobaa-operator.v4.9.0   Succeeded
ocs-operator.v4.9.1   OpenShift Container Storage   4.9.1   ocs-operator.v4.9.0   Succeeded
odf-operator.v4.9.1   OpenShift Data Foundation   4.9.1   odf-operator.v4.9.0   Succeeded

$ oc get installplan -n openshift-storage
NAME   CSV   APPROVAL   APPROVED
install-fxzw9   ocs-operator.v4.9.0   Manual   true
install-vq8rk   noobaa-operator.v4.9.1   Manual   true

Hence marking as verified.
Description of problem (please be detailed as possible and provide log snippets):

When trying the manual approval strategy from the UI I am hitting serious issues. When you deploy, it asks you to approve the install plan for ODF. But after I approved it, the other two operators, OCS and NooBaa, were still unapproved.

NAME   CSV   APPROVAL   APPROVED
install-4pkrh   noobaa-operator.v4.9.0   Manual   false
install-lgcww   odf-operator.v4.9.0   Manual   true
install-m52ct   noobaa-operator.v4.9.0   Manual   false

Even though I see that the subscriptions are created with automatic approval strategy:

$ oc describe subscription -n openshift-storage |grep Approval:
f:installPlanApproval:
Install Plan Approval: Automatic
f:installPlanApproval:
Install Plan Approval: Automatic
f:installPlanApproval:
Install Plan Approval: Manual

But one of the screenshots which I will attach mentions that all operators under the openshift-storage namespace will be created with manual approval strategy.

Pods are in this shape:

NAME   READY   STATUS   RESTARTS   AGE
odf-console-55c5d75cc5-k7gcz   0/1   ContainerCreating   0   4m9s
odf-operator-controller-manager-84b8d6d9cc-5sgqz   1/2   Running   3 (48s ago)   4m9s

In OperatorHub under installed operators I see an upgrade available for the OCS operator and the noobaa operator. Once I clicked on OCS and approved the install plan under it, after some time I got to this shape:

$ oc get pod -n openshift-storage
NAME   READY   STATUS   RESTARTS   AGE
noobaa-operator-67cdd488bb-k9x7m   1/1   Running   0   10m
ocs-metrics-exporter-57bc56c9d4-kvblj   1/1   Running   0   9m56s
ocs-operator-669b7d4fbd-nkxdx   1/1   Running   0   9m57s
odf-console-55c5d75cc5-k7gcz   1/1   Running   0   16m
odf-operator-controller-manager-84b8d6d9cc-5sgqz   2/2   Running   6 (9m48s ago)   16m
rook-ceph-operator-59b66c6dfc-htvrn   1/1   Running   0   9m56s

but now no other operator is visible from the UI. Only:

OpenShift Data Foundation 4.9.0 provided by Red Hat

So I cannot easily go from the UI to approve the install plan for noobaa as well, and I see from the CLI that it's still unapproved:

$ oc get installPlan -n openshift-storage
NAME   CSV   APPROVAL   APPROVED
install-4pkrh   noobaa-operator.v4.9.0   Manual   true
install-lgcww   odf-operator.v4.9.0   Manual   true
install-m52ct   noobaa-operator.v4.9.0   Manual   false

It was probably brought in as a dependency, but the install plan is still not approved, which I think might cause trouble.

Version of all relevant components (if applicable):
OCP 4.9 nightly
ODF 4.9.0-195.ci

Does this issue impact your ability to continue to work with the product (please explain in detail what is the user impact)?

Is there any workaround available to the best of your knowledge?

Rate from 1 - 5 the complexity of the scenario you performed that caused this bug (1 - very simple, 5 - very complex)?
1

Can this issue reproducible?
Yes

Can this issue reproduce from the UI?
Yes

If this is a regression, please provide more details to justify this:
Yes, as we didn't have any issue with manual approval strategy in OCS 4.8 and before.

Steps to Reproduce:
1. Install ODF from UI with manual approval strategy
2.
3.

Actual results:
Described above

Expected results:
Approve the install plan for ODF and do not have to care about the other operators.

Additional info:
Will attach some output from CLI and screenshots.
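
Since the description notes that the remaining noobaa install plan could not easily be approved from the UI, a CLI fallback may help. The sketch below only identifies the pending plans; `pending_installplans` is a hypothetical helper name, and it assumes the default four-column `oc get installplan` table (NAME, CSV, APPROVAL, APPROVED) shown in this report.

```shell
# Hypothetical helper (not an oc subcommand).
# Reads the default `oc get installplan` table on stdin and prints the
# names of install plans whose APPROVED column is "false".
pending_installplans() {
  awk 'NR > 1 && $4 == "false" { print $1 }'
}
```

Each reported plan could then be approved with `oc patch installplan <name> -n openshift-storage --type merge --patch '{"spec":{"approved":true}}'`, for instance by piping `oc get installplan -n openshift-storage | pending_installplans` into a loop. This is an illustrative workaround flow, not the fix delivered for this bug.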