https://bugzilla.redhat.com/show_bug.cgi?id=1885676

Description of problem (please be detailed as possible and provide log snippets):
---------------------------------------------------------------------
External Mode: OCS uninstall is stuck with the following error message, and the storage cluster deletion does not complete.

rook-operator log snip
=====================
2020-10-09 10:50:06.611358 E | ceph-object-controller: failed to delete object store. users for objectstore "ocs-external-storagecluster-cephobjectstore" in namespace "openshift-storage" are not cleaned up. remaining users: [noobaa-ceph-objectstore-user]

ocs-operator log snip
=========================
2020-10-09T10:40:31.859672296Z {"level":"info","ts":"2020-10-09T10:40:31.859Z","logger":"controller_storagecluster","msg":"Uninstall: CephObjectStoreUser not found","Request.Namespace":"openshift-storage","Request.Name":"ocs-external-storagecluster","CephObjectStoreUser Name":"ocs-external-storagecluster-cephobjectstoreuser"}
2020-10-09T10:40:31.859679762Z {"level":"info","ts":"2020-10-09T10:40:31.859Z","logger":"controller_storagecluster","msg":"Uninstall: Deleting cephObjectStore","Request.Namespace":"openshift-storage","Request.Name":"ocs-external-storagecluster","CephObjectStore Name":"ocs-external-storagecluster-cephobjectstore"}
2020-10-09T10:40:31.878084801Z {"level":"error","ts":"2020-10-09T10:40:31.878Z","logger":"controller-runtime.controller","msg":"Reconciler error","controller":"storagecluster-controller","request":"openshift-storage/ocs-external-storagecluster","error":"Uninstall: Waiting for cephObjectStore ocs-external-storagecluster-cephobjectstore to be deleted","stacktrace":"github.com/go-logr/zapr.(*zapLogger).Error\n\t/remote-source/app/vendor/github.com/go-logr/zapr/zapr.go:128\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).reconcileHandler\n\t/remote-source/app/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller/controller.go:258\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).processNextWorkItem\n\t/remote-source/app/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller/controller.go:232\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).worker\n\t/remote-source/app/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller/controller.go:211\nk8s.io/apimachinery/pkg/util/wait.JitterUntil.func1\n\t/remote-source/app/vendor/k8s.io/apimachinery/pkg/util/wait/wait.go:152\nk8s.io/apimachinery/pkg/util/wait.JitterUntil\n\t/remote-source/app/vendor/k8s.io/apimachinery/pkg/util/wait/wait.go:153\nk8s.io/apimachinery/pkg/util/wait.Until\n\t/remote-source/app/vendor/k8s.io/apimachinery/pkg/util/wait/wait.go:88"}

Note: The cluster was in good shape before the uninstall was triggered.

Version of all relevant components (if applicable):
-------------------------------------------------------
OCP = 4.7.0-0.ci-2020-10-09-055453
OCS = ocs-operator.v4.6.0-590.ci (ocs-registry:4.6.0-119.ci) - last build which passed OCS-CI acceptance tests

Does this issue impact your ability to continue to work with the product (please explain in detail what is the user impact)?
-------------------------------------------------------------
Yes. Unable to proceed with uninstall, which blocks re-install.

Is there any workaround available to the best of your knowledge?
-----------------------------------------------------------------
Not sure

Rate from 1 - 5 the complexity of the scenario you performed that caused this bug (1 - very simple, 5 - very complex)?
------------------------------------------------------------------
3

Is this issue reproducible?
------------------------------
Tested once on this OCS build.

Can this issue reproduce from the UI?
--------------------------------------
NA

If this is a regression, please provide more details to justify this:
---------------------------------------------------------------------
The uninstall feature has undergone changes in OCS 4.6.

Steps to Reproduce:
---------------------------
1. Create an OCS external mode cluster. The cluster is in Connected state.
2. Trigger OCS uninstall:
   a) Delete all PVCs/OBCs.
   b) Trigger OCS uninstall by deleting the storage cluster from the UI or the CLI (a hedged CLI sketch is included at the end of this comment). The default annotations were not changed.
      UI -> Installed Operators -> OCS -> Storage Cluster -> ocs-external-storagecluster -> Delete Storage cluster

Actual results:
--------------------
Storage cluster deletion is stuck because the cephobjectstore deletion is not succeeding:

2020-10-09 10:50:06.611358 E | ceph-object-controller: failed to delete object store. users for objectstore "ocs-external-storagecluster-cephobjectstore" in namespace "openshift-storage" are not cleaned up. remaining users: [noobaa-ceph-objectstore-user]

Expected results:
--------------------
Uninstall should clean up all resources.

Additional info:
--------------------
Fri Oct 9 10:49:59 UTC 2020
--------------
========CSV ======
NAME                         DISPLAY                       VERSION        REPLACES                     PHASE
ocs-operator.v4.6.0-593.ci   OpenShift Container Storage   4.6.0-593.ci   ocs-operator.v4.6.0-590.ci   Succeeded
--------------
=======PODS ======
NAME                                            READY   STATUS    RESTARTS   AGE   IP             NODE        NOMINATED NODE   READINESS GATES
csi-cephfsplugin-fflxv                          3/3     Running   0          63m   10.1.160.96    compute-0   <none>           <none>
csi-cephfsplugin-provisioner-7d5fb4d7cd-kv8mb   6/6     Running   0          31m   10.128.2.10    compute-1   <none>           <none>
csi-cephfsplugin-provisioner-7d5fb4d7cd-qxnpj   6/6     Running   0          38m   10.131.0.6     compute-0   <none>           <none>
csi-cephfsplugin-td4jk                          3/3     Running   0          63m   10.1.160.149   compute-2   <none>           <none>
csi-cephfsplugin-wk72s                          3/3     Running   0          63m   10.1.160.76    compute-1   <none>           <none>
csi-rbdplugin-2zpf6                             3/3     Running   0          63m   10.1.160.96    compute-0   <none>           <none>
csi-rbdplugin-krpbg                             3/3     Running   0          63m   10.1.160.76    compute-1   <none>           <none>
csi-rbdplugin-provisioner-54ff9fbd95-mljks      6/6     Running   0          38m   10.131.0.7     compute-0   <none>           <none>
csi-rbdplugin-provisioner-54ff9fbd95-z2wkw      6/6     Running   0          31m   10.128.2.9     compute-1   <none>           <none>
csi-rbdplugin-tjw8n                             3/3     Running   0          63m   10.1.160.149   compute-2   <none>           <none>
noobaa-operator-7b9d89779f-xp42l                1/1     Running   0          31m   10.128.2.7     compute-1   <none>           <none>
ocs-metrics-exporter-79cbfc99d9-p4kfr           1/1     Running   0          31m   10.131.0.17    compute-0   <none>           <none>
ocs-operator-68db4bfc8d-zzwmw                   1/1     Running   0          31m   10.128.2.5     compute-1   <none>           <none>
rook-ceph-operator-59fcc7f5cc-x6q4h             1/1     Running   0          38m   10.131.0.5     compute-0   <none>           <none>
--------------
======= PVC ==========
No resources found in openshift-storage namespace.
--------------
======= storagecluster ==========
NAME                          AGE   PHASE      EXTERNAL   CREATED AT             VERSION
ocs-external-storagecluster   19h   Deleting   true       2020-10-08T15:23:57Z   4.6.0
--------------
======= cephcluster ==========
NAME                                      DATADIRHOSTPATH   MONCOUNT   AGE   PHASE       MESSAGE                          HEALTH
ocs-external-storagecluster-cephcluster                                19h   Connected   Cluster connected successfully   HEALTH_OK
======= PV ====
No resources found
======= backingstore ==========
No resources found in openshift-storage namespace.
======= bucketclass ==========
No resources found in openshift-storage namespace.
======= obc ==========
No resources found in openshift-storage namespace.
Storagecluster.yaml = ./quay-io-rhceph-dev-ocs-must-gather-sha256-6d8aab40e985fb3e08836349018e833fe489397a27a1fd4e9326a81e2cc54373/namespaces/openshift-storage/oc_output/storagecluster.yaml
Noobaa operator log = quay-io-rhceph-dev-ocs-must-gather-sha256-6d8aab40e985fb3e08836349018e833fe489397a27a1fd4e9326a81e2cc54373/namespaces/openshift-storage/pods/noobaa-operator-7b9d89779f-xp42l/noobaa-operator/noobaa-operator/logs/current.log
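For reference, a minimal CLI sketch of the uninstall trigger from step 2b above, assuming the default external-mode StorageCluster name used in this report (the same delete command appears in the verification comments below):

```
# Hedged sketch of step 2b: trigger the uninstall from the CLI instead of the UI.
# Assumes the default external-mode StorageCluster name from this report.
oc -n openshift-storage delete storagecluster ocs-external-storagecluster --wait=true

# Watch the remaining Ceph resources while the deletion progresses.
oc -n openshift-storage get storagecluster,cephcluster,cephobjectstore,cephobjectstoreuser
```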
Seeing the same behavior with `internal-attached devices`. Deleted the user from the toolbox pod using the `--purge-data` flag; the user got deleted after a few minutes and the uninstall completed.

ToolBox:
```
$ oc exec -it rook-ceph-tools-78cdfd976c-q5lhv bash
kubectl exec [POD] [COMMAND] is DEPRECATED and will be removed in a future version. Use kubectl exec [POD] -- [COMMAND] instead.
bash-4.4$ radosgw-admin
radosgw-admin: -h or --help for usage
bash-4.4$ radosgw-admin user rm --uid=noobaa-ceph-objectstore-user
could not remove user: unable to remove user, must specify purge data to remove user with buckets
bash-4.4$ radosgw-admin user rm --uid=noobaa-ceph-objectstore-user --purge-data
bash-4.4$ radosgw-admin user rm --uid=noobaa-ceph-objectstore-user --purge-data
could not remove user: unable to remove user, user does not exist
bash-4.4$
```

Snippet from operator logs:
```
2020-10-12 10:04:41.286377 I | op-mon: parsing mon endpoints: a=172.30.14.78:6789,b=172.30.38.79:6789,c=172.30.169.120:6789
2020-10-12 10:04:41.288213 I | op-k8sutil: ROOK_OBC_WATCH_OPERATOR_NAMESPACE="true" (env var)
2020-10-12 10:04:41.289973 I | ceph-object-controller: no buckets found for objectstore "ocs-storagecluster-cephobjectstore" in namespace "openshift-storage"
2020-10-12 10:04:41.291770 E | ceph-object-controller: failed to delete object store. users for objectstore "ocs-storagecluster-cephobjectstore" in namespace "openshift-storage" are not cleaned up. remaining users: [noobaa-ceph-objectstore-user]
2020-10-12 10:04:48.676171 I | op-mon: parsing mon endpoints: a=172.30.14.78:6789,b=172.30.38.79:6789,c=172.30.169.120:6789
2020-10-12 10:04:48.676242 I | ceph-object-store-user-controller: CephObjectStore "ocs-storagecluster-cephobjectstore" found
2020-10-12 10:04:48.676461 I | ceph-object-store-user-controller: CephObjectStore "ocs-storagecluster-cephobjectstore" found
2020-10-12 10:04:48.761842 I | ceph-object-store-user-controller: ceph object user "noobaa-ceph-objectstore-user" deleted successfully
2020-10-12 10:04:48.761860 I | ceph-spec: removing finalizer "cephobjectstoreuser.ceph.rook.io" on "noobaa-ceph-objectstore-user"
2020-10-12 10:04:48.789423 I | ceph-spec: object "rook-ceph-object-user-ocs-storagecluster-cephobjectstore-noobaa-ceph-objectstore-user" matched on delete, reconciling
2020-10-12 10:04:51.297127 I | op-mon: parsing mon endpoints: a=172.30.14.78:6789,b=172.30.38.79:6789,c=172.30.169.120:6789
```
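For reference, the same workaround can be run non-interactively. This is a hedged sketch, assuming the toolbox is exposed through the usual rook-ceph-tools deployment (the deployment name is an assumption; the pod name above is what was actually used). If your oc version does not support exec on deployments, exec into the toolbox pod directly as shown above.

```
# Hedged sketch of the workaround above, run without an interactive shell.
# "deploy/rook-ceph-tools" is an assumption; substitute the actual toolbox pod name if it differs.
oc -n openshift-storage exec deploy/rook-ceph-tools -- \
  radosgw-admin user rm --uid=noobaa-ceph-objectstore-user --purge-data
```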
@Travis, a couple of questions:
1. Is it safe to call --purge-data for deleting object users in Rook? Say, if the cleanup policy is set, then delete the object user with the --purge-data flag.
2. Instead, should NooBaa be deleting these user buckets if it created them in the first place?

@Romy: Please see question 2 above.
Rook has the expected behavior of blocking the user removal if there are any buckets associated with the user. How was the bucket created? Was it not with an OBC? If the OBC was deleted, the bucket should be deleted, and then the user would be deleted. But if the bucket hasn't been deleted, it's dangerous to always purge the bucket when deleting the user.

If the yes-really-destroy-data policy is set on the cluster CR, agreed that we can go ahead and purge the user.
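For reference, a minimal sketch of setting that policy on the CephCluster CR from the CLI. The field path follows Rook's cleanupPolicy API and should be verified against the Rook version shipped with OCS; the resource name is the external-mode default from this report.

```
# Hedged sketch: mark the cluster for destructive cleanup before deleting it.
# Field path per Rook's cleanupPolicy API; verify against the Rook version in use.
oc -n openshift-storage patch cephcluster ocs-external-storagecluster-cephcluster \
  --type merge -p '{"spec":{"cleanupPolicy":{"confirmation":"yes-really-destroy-data"}}}'
```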
I believe there might be a sequencing issue here. Talur, what's the deletion sequence in the ocs-operator? Which resources get deleted first? If we want to remove everything, the CephCluster CR should be deleted first. Thanks @talur
@leseb Deletion sequence in ocs-operator: https://github.com/openshift/ocs-operator/blob/master/pkg/controller/storagecluster/uninstall_reconciler.go#L328
1. Set uninstall policy on Rook
2. Set uninstall policy on NooBaa
3. Delete NooBaa systems
4. Delete Ceph object store users
5. Delete Ceph object stores
6. Delete Ceph filesystems
7. Delete Ceph block pools
8. Delete Ceph cluster
9. Delete snapshot classes
10. Delete storage classes
11. Delete node taints
(In reply to Travis Nielsen from comment #8)
> Rook has the expected behavior of blocking the user removal if there are any
> buckets associated with the user. How was the bucket created? Was it not
> with an OBC? If the OBC was deleted, the bucket should be deleted, and then
> the user would be deleted. But if the bucket hasn't been deleted, it's dangerous
> to always purge the bucket when deleting the user.

It doesn't look like the bucket was created by an OBC. There were no OBCs present during the uninstall.

bash-4.4$ radosgw-admin user list
[
    "noobaa-ceph-objectstore-user",
    "rook-ceph-internal-s3-user-checker-7dfb9ca5-1f97-4421-b5c6-d77f20c7fa05"
]
bash-4.4$ radosgw-admin bucket list
[
    "nb.1602567598484.origin-ci-int-aws.dev.rhcloud.com",
    "rook-ceph-bucket-checker-7dfb9ca5-1f97-4421-b5c6-d77f20c7fa05"
]

>
> If the yes-really-destroy-data policy is set on the cluster CR, agreed that
> we can go ahead and purge the user.
Thanks Santosh, the order should change as explained in my previous comment: the CephCluster CR must be deleted first, then all the other Ceph resources. This is not a Rook issue, moving to the OCS operator.
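For anyone hitting this before the fix lands, a rough manual sketch of the corrected order described above (CephCluster CR first, then the remaining Ceph custom resources). This is not the ocs-operator code path; the resource names are the external-mode defaults from this report.

```
# Hedged sketch: manual cleanup in the corrected order (CephCluster CR first).
# Names are the external-mode defaults from this BZ; adjust for internal mode.
oc -n openshift-storage delete cephcluster ocs-external-storagecluster-cephcluster --wait=true
oc -n openshift-storage delete cephobjectstoreuser,cephobjectstore,cephfilesystem,cephblockpool --all
```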
Providing dev ack as the fix for this should be the same as https://bugzilla.redhat.com/show_bug.cgi?id=1886859. The fix is under test and we should have a PR soon.
Backport PR is not yet merged.
Verified the fix on an OCS 4.6 (4.6.0-144.ci) external mode cluster. Will test in internal mode too, before moving the BZ to verified state.

1. Created an OCS external mode cluster. The cluster is in Connected state.
2. Triggered OCS uninstall.

Observation: The storage cluster deletion is no longer stuck on a CephObjectStoreUser that still exists.

OCP = 4.6.0-0.nightly-2020-10-22-034051
OCS = ocs-operator.v4.6.0-144.ci
_________________________________________________________________________________________________

Before triggering uninstall
=========================
Wed Oct 28 16:45:23 UTC 2020
--------------
========CSV ======
NAME                         DISPLAY                       VERSION        REPLACES   PHASE
ocs-operator.v4.6.0-144.ci   OpenShift Container Storage   4.6.0-144.ci              Succeeded
--------------
=======PODS ======
NAME                                            READY   STATUS    RESTARTS   AGE   IP             NODE        NOMINATED NODE   READINESS GATES
csi-cephfsplugin-hh85d                          3/3     Running   0          35m   10.1.160.165   compute-0   <none>           <none>
csi-cephfsplugin-n7rgp                          3/3     Running   0          35m   10.1.160.180   compute-2   <none>           <none>
csi-cephfsplugin-nvnmn                          3/3     Running   0          35m   10.1.160.161   compute-1   <none>           <none>
csi-cephfsplugin-provisioner-56455449bd-6cmhn   6/6     Running   0          35m   10.131.0.205   compute-1   <none>           <none>
csi-cephfsplugin-provisioner-56455449bd-bnnvk   6/6     Running   0          35m   10.129.2.94    compute-2   <none>           <none>
csi-rbdplugin-68wgt                             3/3     Running   0          35m   10.1.160.165   compute-0   <none>           <none>
csi-rbdplugin-6xfvz                             3/3     Running   0          35m   10.1.160.180   compute-2   <none>           <none>
csi-rbdplugin-7wjdv                             3/3     Running   0          35m   10.1.160.161   compute-1   <none>           <none>
csi-rbdplugin-provisioner-586fc6cfc-d55ds       6/6     Running   0          35m   10.128.2.68    compute-0   <none>           <none>
csi-rbdplugin-provisioner-586fc6cfc-nh2br       6/6     Running   0          35m   10.131.0.204   compute-1   <none>           <none>
noobaa-core-0                                   1/1     Running   0          35m   10.128.2.69    compute-0   <none>           <none>
noobaa-db-0                                     1/1     Running   0          35m   10.131.0.206   compute-1   <none>           <none>
noobaa-endpoint-58dc95697d-4gnzc                1/1     Running   0          34m   10.131.0.207   compute-1   <none>           <none>
noobaa-operator-7bcf846c94-h722m                1/1     Running   0          36m   10.131.0.203   compute-1   <none>           <none>
ocs-metrics-exporter-777dc7b97f-4v4hm           1/1     Running   0          36m   10.129.2.93    compute-2   <none>           <none>
ocs-operator-86846df567-gmp25                   1/1     Running   0          36m   10.129.2.91    compute-2   <none>           <none>
rook-ceph-operator-f44db9fbf-4bkrh              1/1     Running   0          36m   10.129.2.92    compute-2   <none>           <none>
--------------
======= PVC ==========
NAME             STATUS   VOLUME                                     CAPACITY   ACCESS MODES   STORAGECLASS                           AGE
db-noobaa-db-0   Bound    pvc-4c1a12e0-d866-4fe0-842d-95061698db86   50Gi       RWO            ocs-external-storagecluster-ceph-rbd   35m
--------------
======= storagecluster ==========
NAME                          AGE   PHASE   EXTERNAL   CREATED AT             VERSION
ocs-external-storagecluster   35m   Ready   true       2020-10-28T16:10:00Z   4.6.0

>> while true; do oc get cephobjectstore -n openshift-storage ; oc get cephobjectstoreuser; sleep 5; done
NAME                                          AGE
ocs-external-storagecluster-cephobjectstore   35m
NAME                           AGE
noobaa-ceph-objectstore-user   35m

2. Deleted the storage cluster:

$ date --utc; oc delete -n openshift-storage storagecluster --all --wait=true
Wed Oct 28 16:45:42 UTC 2020
storagecluster.ocs.openshift.io "ocs-external-storagecluster" deleted

>> rook-log snip
2020-10-28 16:46:01.516215 E | ceph-object-store-user-controller: failed to reconcile failed to delete ceph object user "noobaa-ceph-objectstore-user": failed to delete ceph object user "noobaa-ceph-objectstore-user". could not remove user: unable to remove user, must specify purge data to remove user with buckets: failed to delete s3 user: exit status 17
2020-10-28 16:46:02.575081 I | ceph-spec: object "rook-ceph-config" matched on delete, reconciling
2020-10-28 16:46:02.575201 I | ceph-spec: removing finalizer "cephcluster.ceph.rook.io" on "ocs-external-storagecluster-cephcluster"
2020-10-28 16:46:02.591833 E | clusterdisruption-controller: cephcluster "openshift-storage/ocs-external-storagecluster-cephcluster" seems to be deleted, not requeuing until triggered again
2020-10-28 16:46:02.639919 I | ceph-spec: object "rook-ceph-mgr-external" matched on delete, reconciling
2020-10-28 16:46:02.711974 E | clusterdisruption-controller: cephcluster "openshift-storage/" seems to be deleted, not requeuing until triggered again
2020-10-28 16:46:02.712153 I | ceph-spec: removing finalizer "cephobjectstore.ceph.rook.io" on "ocs-external-storagecluster-cephobjectstore"
2020-10-28 16:46:02.739777 E | clusterdisruption-controller: cephcluster "openshift-storage/" seems to be deleted, not requeuing until triggered again
2020-10-28 16:46:02.755733 I | ceph-spec: object "rook-ceph-rgw-ocs-external-storagecluster-cephobjectstore" matched on delete, reconciling
2020-10-28 16:46:02.795772 E | ceph-object-store-user-controller: failed to reconcile failed to populate cluster info: not expected to create new cluster info and did not find existing secret
2020-10-28 16:46:03.796028 I | ceph-spec: removing finalizer "cephobjectstoreuser.ceph.rook.io" on "noobaa-ceph-objectstore-user"
2020-10-28 16:46:03.825505 I | ceph-spec: object "rook-ceph-object-user-ocs-external-storagecluster-cephobjectstore-noobaa-ceph-objectstore-user" matched on delete, reconciling

>> ocs-op snip
{"level":"info","ts":"2020-10-28T16:46:02.712Z","logger":"controller_storagecluster","msg":"Uninstall in progress","Request.Namespace":"openshift-storage","Request.Name":"ocs-external-storagecluster","Status":"Uninstall: Waiting for cephObjectStore ocs-external-storagecluster-cephobjectstore to be deleted"}
{"level":"info","ts":"2020-10-28T16:46:02.756Z","logger":"controller_storagecluster","msg":"Reconciling external StorageCluster","Request.Namespace":"openshift-storage","Request.Name":"ocs-external-storagecluster"}
{"level":"info","ts":"2020-10-28T16:46:02.798Z","logger":"controller_storagecluster","msg":"Uninstall: CephCluster not found, can't set the cleanup policy and uninstall mode","Request.Namespace":"openshift-storage","Request.Name":"ocs-external-storagecluster"}
{"level":"info","ts":"2020-10-28T16:46:02.798Z","logger":"controller_storagecluster","msg":"Uninstall: NooBaa not found, can't set UninstallModeForced","Request.Namespace":"openshift-storage","Request.Name":"ocs-external-storagecluster"}
{"level":"info","ts":"2020-10-28T16:46:02.798Z","logger":"controller_storagecluster","msg":"NooBaa and noobaa-core PVC not found.","Request.Namespace":"openshift-storage","Request.Name":"ocs-external-storagecluster"}
{"level":"info","ts":"2020-10-28T16:46:02.798Z","logger":"controller_storagecluster","msg":"Uninstall: CephCluster not found","Request.Namespace":"openshift-storage","Request.Name":"ocs-external-storagecluster"}
{"level":"info","ts":"2020-10-28T16:46:02.798Z","logger":"controller_storagecluster","msg":"Uninstall: CephObjectStoreUser not found","Request.Namespace":"openshift-storage","Request.Name":"ocs-external-storagecluster","CephObjectStoreUser Name":"ocs-external-storagecluster-cephobjectstoreuser"}
{"level":"info","ts":"2020-10-28T16:46:02.798Z","logger":"controller_storagecluster","msg":"Uninstall: CephObjectStore not found","Request.Namespace":"openshift-storage","Request.Name":"ocs-external-storagecluster","CephObjectStore Name":"ocs-external-storagecluster-cephobjectstore"}
{"level":"info","ts":"2020-10-28T16:46:02.898Z","logger":"controller_storagecluster","msg":"Uninstall: CephFilesystem not found","Request.Namespace":"openshift-storage","Request.Name":"ocs-external-storagecluster","CephFilesystem Name":"ocs-external-storagecluster-cephfilesystem"}
{"level":"info","ts":"2020-10-28T16:46:02.999Z","logger":"controller_storagecluster","msg":"Uninstall: CephBlockPool not found","Request.Namespace":"openshift-storage","Request.Name":"ocs-external-storagecluster","CephBlockPool Name":"ocs-external-storagecluster-cephblockpool"}

>> while true; do oc get cephobjectstore -n openshift-storage ; oc get cephobjectstoreuser; sleep 5; done
No resources found in openshift-storage namespace.
No resources found in openshift-storage namespace.
Created attachment 1724898 [details]
rook-logs

Verified the same on an internal mode cluster on VMware, version = ocs-operator.v4.6.0-147.ci.

Steps performed:
1. Created 1 OBC, 2 PVCs, and 2 VolumeSnapshots.
2. Deleted the storagecluster, but it was stuck as there were OBCs/PVCs still existing (see the pre-uninstall check sketch below).

======= storagecluster ==========
NAME                 AGE   PHASE      EXTERNAL   CREATED AT             VERSION
ocs-storagecluster   91m   Deleting              2020-10-28T17:01:52Z   4.6.0
--------------
======= cephcluster ==========
NAME                             DATADIRHOSTPATH   MONCOUNT   AGE   PHASE      MESSAGE                    HEALTH
ocs-storagecluster-cephcluster   /var/lib/rook     3          91m   Deleting   Failed to delete cluster   HEALTH_OK

3. Deleted the OBC and PVCs, and the storagecluster deletion progressed.
4. The cleanup pods were created and went to Completed state.

Wed Oct 28 18:38:55 UTC 2020
--------------
========CSV ======
--------------
======= storagecluster ==========
--------------
======= cephcluster ==========

$ date --utc ; time oc delete -n openshift-storage storagecluster --all --wait=true; date --utc
Wed Oct 28 18:20:14 UTC 2020
storagecluster.ocs.openshift.io "ocs-storagecluster" deleted

real    18m39.408s
user    0m0.506s
sys     0m0.125s
Wed Oct 28 18:38:54 UTC 2020
[nberry@localhost before]$

5. The cephobjectstoreuser (noobaa-ceph-objectstore-user) ultimately got deleted, roughly 6 minutes after the storagecluster deletion.

>> rook-op-log snip
2020-10-28 18:39:30.800540 I | ceph-cluster-controller: all ceph daemons are cleaned up
2020-10-28 18:39:30.800544 I | ceph-cluster-controller: starting clean up job on node "compute-0"
2020-10-28 18:39:30.838485 I | ceph-cluster-controller: starting clean up job on node "compute-2"
2020-10-28 18:39:30.857747 I | ceph-cluster-controller: starting clean up job on node "compute-1"
2020-10-28 18:44:08.026977 I | ceph-spec: removing finalizer "cephobjectstoreuser.ceph.rook.io" on "noobaa-ceph-objectstore-user"
2020-10-28 18:44:08.104127 I | ceph-spec: object "rook-ceph-object-user-ocs-storagecluster-cephobjectstore-noobaa-ceph-objectstore-user" matched on delete, reconciling

>> ocs-op logs
{"level":"info","ts":"2020-10-28T18:38:54.178Z","logger":"controller_storagecluster","msg":"Uninstall: CephObjectStore not found","Request.Namespace":"openshift-storage","Request.Name":"ocs-storagecluster","CephObjectStore Name":"ocs-storagecluster-cephobjectstore"}
{"level":"info","ts":"2020-10-28T18:38:54.178Z","logger":"controller_storagecluster","msg":"Uninstall: CephFilesystem not found","Request.Namespace":"openshift-storage","Request.Name":"ocs-storagecluster","CephFilesystem Name":"ocs-storagecluster-cephfilesystem"}
{"level":"info","ts":"2020-10-28T18:38:54.178Z","logger":"controller_storagecluster","msg":"Uninstall: CephBlockPool not found","Request.Namespace":"openshift-storage","Request.Name":"ocs-storagecluster","CephBlockPool Name":"ocs-storagecluster-cephblockpool"}
{"level":"info","ts":"2020-10-28T18:38:54.178Z","logger":"controller_storagecluster","msg":"Uninstall: Deleting SnapshotClass ocs-storagecluster-rbdplugin-snapclass","Request.Namespace":"openshift-storage","Request.Name":"ocs-storagecluster"}
{"level":"info","ts":"2020-10-28T18:38:54.188Z","logger":"controller_storagecluster","msg":"Uninstall: Deleting SnapshotClass ocs-storagecluster-cephfsplugin-snapclass","Request.Namespace":"openshift-storage","Request.Name":"ocs-storagecluster"}
{"level":"info","ts":"2020-10-28T18:38:54.194Z","logger":"controller_storagecluster","msg":"Uninstall: Deleting StorageClass ocs-storagecluster-cephfs","Request.Namespace":"openshift-storage","Request.Name":"ocs-storagecluster"}
{"level":"info","ts":"2020-10-28T18:38:54.200Z","logger":"controller_storagecluster","msg":"Uninstall: Deleting StorageClass ocs-storagecluster-ceph-rbd","Request.Namespace":"openshift-storage","Request.Name":"ocs-storagecluster"}
{"level":"info","ts":"2020-10-28T18:38:54.207Z","logger":"controller_storagecluster","msg":"Uninstall: Deleting StorageClass ocs-storagecluster-ceph-rgw","Request.Namespace":"openshift-storage","Request.Name":"ocs-storagecluster"}
{"level":"info","ts":"2020-10-28T18:38:54.216Z","logger":"controller_storagecluster","msg":"Uninstall: Deleting OCS NodeTolerationKey from the node compute-2","Request.Namespace":"openshift-storage","Request.Name":"ocs-storagecluster"}
{"level":"info","ts":"2020-10-28T18:38:54.227Z","logger":"controller_storagecluster","msg":"Uninstall: Deleting OCS NodeTolerationKey from the node compute-0","Request.Namespace":"openshift-storage","Request.Name":"ocs-storagecluster"}
{"level":"info","ts":"2020-10-28T18:38:54.236Z","logger":"controller_storagecluster","msg":"Uninstall: Deleting OCS NodeTolerationKey from the node compute-1","Request.Namespace":"openshift-storage","Request.Name":"ocs-storagecluster"}
{"level":"info","ts":"2020-10-28T18:38:54.245Z","logger":"controller_storagecluster","msg":"Removing finalizer","Request.Namespace":"openshift-storage","Request.Name":"ocs-storagecluster"}
{"level":"info","ts":"2020-10-28T18:38:54.262Z","logger":"controller_storagecluster","msg":"Object is terminated, skipping reconciliation","Request.Namespace":"openshift-storage","Request.Name":"ocs-storagecluster"}
{"level":"info","ts":"2020-10-28T18:38:54.274Z","logger":"controller_storagecluster","msg":"No StorageCluster resource","Request.Namespace":"openshift-storage","Request.Name":"ocs-storagecluster"}

6. Deleted the namespace, and the deletion succeeded.

$ oc delete namespace openshift-storage
namespace "openshift-storage" deleted
[nberry@localhost oct28-147.ci]$ oc get project openshift-storage -o yaml ; date --utc
Error from server (NotFound): namespaces "openshift-storage" not found

Hence, moving the BZ to verified state.

___________________________________________________________
Pods after deletion of storagecluster
=====================================
Wed Oct 28 18:53:16 UTC 2020
--------------
========CSV ======
NAME                         DISPLAY                       VERSION        REPLACES   PHASE
ocs-operator.v4.6.0-147.ci   OpenShift Container Storage   4.6.0-147.ci              Succeeded
--------------
=======PODS ======
NAME                                           READY   STATUS      RESTARTS   AGE     IP             NODE        NOMINATED NODE   READINESS GATES
cluster-cleanup-job-compute-0-vrgw5            0/1     Completed   0          13m     10.128.2.120   compute-0   <none>           <none>
cluster-cleanup-job-compute-1-rpd25            0/1     Completed   0          13m     10.131.0.222   compute-1   <none>           <none>
cluster-cleanup-job-compute-2-vc6q5            0/1     Completed   0          13m     10.129.2.121   compute-2   <none>           <none>
compute-0-debug                                1/1     Running     0          3m15s   10.1.160.165   compute-0   <none>           <none>
csi-cephfsplugin-hzpqm                         3/3     Running     0          111m    10.1.160.161   compute-1   <none>           <none>
csi-cephfsplugin-provisioner-98d99f679-c7kvm   6/6     Running     0          111m    10.131.0.213   compute-1   <none>           <none>
csi-cephfsplugin-provisioner-98d99f679-krjjm   6/6     Running     0          111m    10.129.2.109   compute-2   <none>           <none>
csi-cephfsplugin-tlpm5                         3/3     Running     0          111m    10.1.160.180   compute-2   <none>           <none>
csi-cephfsplugin-z8lt4                         3/3     Running     0          111m    10.1.160.165   compute-0   <none>           <none>
csi-rbdplugin-bmwl4                            3/3     Running     0          111m    10.1.160.180   compute-2   <none>           <none>
csi-rbdplugin-cmb8l                            3/3     Running     0          111m    10.1.160.165   compute-0   <none>           <none>
csi-rbdplugin-m7gv6                            3/3     Running     0          111m    10.1.160.161   compute-1   <none>           <none>
csi-rbdplugin-provisioner-7d5fc5cf64-f65tb     6/6     Running     0          111m    10.128.2.74    compute-0   <none>           <none>
csi-rbdplugin-provisioner-7d5fc5cf64-jh7kt     6/6     Running     0          111m    10.131.0.212   compute-1   <none>           <none>
noobaa-operator-549d7c6f56-vvlwj               1/1     Running     0          112m    10.129.2.107   compute-2   <none>           <none>
ocs-metrics-exporter-674fccb975-pkdld          1/1     Running     0          112m    10.129.2.108   compute-2   <none>           <none>
ocs-operator-67d7b745bd-h5k2n                  1/1     Running     0          112m    10.129.2.106   compute-2   <none>           <none>
rook-ceph-operator-6994879bbf-n9qvf            1/1     Running     0          112m    10.131.0.210   compute-1   <none>           <none>
--------------
======= PVC ==========
NAME                                STATUS   VOLUME                                     CAPACITY   ACCESS MODES   STORAGECLASS   AGE
ocs-deviceset-thin-0-data-0-njqt8   Bound    pvc-534a030a-3b71-4c72-8b3c-f20f1dcc117a   512Gi      RWO            thin           108m
ocs-deviceset-thin-1-data-0-g7v7n   Bound    pvc-c8d1aa8d-bb6c-45a3-99e0-e1e078c31e72   512Gi      RWO            thin           108m
ocs-deviceset-thin-2-data-0-p6wkq   Bound    pvc-315fcb81-ffbd-4398-a67e-66b7e43f6251   512Gi      RWO            thin           108m
--------------
======= storagecluster ==========
No resources found in openshift-storage namespace.
--------------
======= cephcluster ==========
No resources found in openshift-storage namespace.
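Since the internal-mode deletion only progressed after the leftover OBC and PVCs were removed (steps 2-3 above), a quick pre-uninstall check along these lines can save time. This is a hedged sketch; the grep pattern is an assumption based on the default ocs-storagecluster-* storage class names.

```
# Hedged sketch: look for consumers that will hold up StorageCluster deletion.
# The grep pattern is an assumption based on the default OCS storage class names.
oc get obc --all-namespaces
oc get pvc --all-namespaces | grep ocs-storagecluster
oc get volumesnapshot --all-namespaces
```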
Moving the BZ to verified state based on the outputs and observations in comment #21 and comment #22.
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory (Moderate: Red Hat OpenShift Container Storage 4.6.0 security, bug fix, enhancement update), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHSA-2020:5605