Bug 1886859 - OCS 4.6: Uninstall stuck indefinitely if any Ceph pods are in Pending state before uninstall
Summary: OCS 4.6: Uninstall stuck indefinitely if any Ceph pods are in Pending state before uninstall
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat OpenShift Container Storage
Classification: Red Hat Storage
Component: rook
Version: 4.6
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: medium
Target Milestone: ---
Target Release: OCS 4.6.0
Assignee: Santosh Pillai
QA Contact: Oded
URL:
Whiteboard:
Depends On:
Blocks:
 
Reported: 2020-10-09 14:30 UTC by Neha Berry
Modified: 2020-12-17 06:25 UTC (History)
12 users

Fixed In Version: 4.6.0-144.ci
Doc Type: No Doc Update
Doc Text:
Clone Of:
Environment:
Last Closed: 2020-12-17 06:24:47 UTC
Embargoed:



Links
System ID Private Priority Status Summary Last Updated
Github openshift ocs-operator pull 855 0 None closed Bug 1886873: fix ceph cluster delete ordering 2021-02-21 09:20:16 UTC
Github rook rook pull 6719 0 None closed ceph: cleanup should ignore ceph daemon pods that are not scheduled on any node. 2021-02-21 09:20:16 UTC
Red Hat Product Errata RHSA-2020:5605 0 None None None 2020-12-17 06:25:07 UTC

Description Neha Berry 2020-10-09 14:30:44 UTC
https://bugzilla.redhat.com/show_bug.cgi?id=1885676


Description of problem (please be as detailed as possible and provide log snippets):
---------------------------------------------------------------------
Uninstall gets stuck if attempted on an OCS cluster with some Pending ceph pods.

Scenario: To test UI Bug 1885676, I intentionally created an AWS cluster whose worker nodes have only 4 CPUs each, and created a StorageCluster on it. Even though a minimal deployment was triggered because of the low total CPU and memory, some pods (both MDS pods) were still in Pending state due to insufficient CPU and memory.

Attempted uninstall on the same cluster by deleting the storage cluster, but the storagecluster deletion is stuck while attempting to delete the CephFilesystem.

Observation: When the OCS cluster is not properly installed, uninstall gets stuck while deleting ceph resources. But this is exactly the scenario where we most want uninstall to work (e.g. to remove failed/incomplete deployments).

Logs from ocs-operator
=======================

{"level":"info","ts":"2020-10-09T14:09:44.576Z","logger":"controller_storagecluster","msg":"Uninstall: Deleting cephFilesystem","Request.Namespace":"openshift-storage","Request.Name":"ocs-storagecluster","CephFilesystem Name":"ocs-storagecluster-cephfilesystem"}
{"level":"error","ts":"2020-10-09T14:09:44.587Z","logger":"controller-runtime.controller","msg":"Reconciler error","controller":"storagecluster-controller","request":"openshift-storage/ocs-storagecluster","error":"Uninstall: Waiting for cephFilesystem ocs-storagecluster-cephfilesystem to be deleted","stacktrace":"github.com/go-logr/zapr.(*zapLogger).Error\n\t/remote-source/app/vendor/github.com/go-logr/zapr/zapr.go:128\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).reconcileHandler\n\t/remote-source/app/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller/controller.go:258\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).processNextWorkItem\n\t/remote-source/app/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller/controller.go:232\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).worker\n\t/remote-source/app/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller/controller.go:211\nk8s.io/apimachinery/pkg/util/wait.JitterUntil.func1\n\t/remote-source/app/vendor/k8s.io/apimachinery/pkg/util/wait/wait.go:152\nk8s.io/apimachinery/pkg/util/wait.JitterUntil\n\t/remote-source/app/vendor/k8s.io/apimachinery/pkg/util/wait/wait.go:153\nk8s.io/apimachinery/pkg/util/wait.Until\n\t/remote-source/app/vendor/k8s.io/apimachinery/pkg/util/wait/wait.go:88"}



--------------
======= storagecluster ==========
NAME                 AGE    PHASE      EXTERNAL   CREATED AT             VERSION
ocs-storagecluster   127m   Deleting              2020-10-09T12:05:39Z   4.6.0
--------------
======= cephcluster ==========
NAME                             DATADIRHOSTPATH   MONCOUNT   AGE    PHASE   MESSAGE                        HEALTH
ocs-storagecluster-cephcluster   /var/lib/rook     3          128m   Ready   Cluster created successfully   HEALTH_ERR






Version of all relevant components (if applicable):
-------------------------------------------------------

OCP = 4.7.0-0.ci-2020-10-09-055453

OCS = ocs-operator.v4.6.0-590.ci (ocs-registry:4.6.0-119.ci)


Does this issue impact your ability to continue to work with the product
(please explain in detail what is the user impact)?
-------------------------------------------------------------

Yes. Unable to proceed with uninstall, which blocks re-install.

Is there any workaround available to the best of your knowledge?
-----------------------------------------------------------------
Not sure


Rate from 1 - 5 the complexity of the scenario you performed that caused this
bug (1 - very simple, 5 - very complex)?
------------------------------------------------------------------
3

Is this issue reproducible?
------------------------------
Tested once, but probably yes; the same problem would be seen even if OSDs or other ceph resources are not installed correctly.

Can this issue be reproduced from the UI?
--------------------------------------
NA

If this is a regression, please provide more details to justify this:
---------------------------------------------------------------------
The uninstall feature has undergone changes in OCS 4.6.

Steps to Reproduce:
---------------------------
1. Create a cluster with m4.xlarge worker nodes, i.e. 4 CPUs and 16 GB of memory per instance
2. Attempt to install OCS by creating a Storage cluster (which falls back to a minimal deployment)
3. Due to insufficient memory and CPU, some pods will be in Pending state, especially the MDS pods (a pre-uninstall check for such unscheduled pods is sketched after this list)
4. To recover the cluster, start the uninstall:
   UI -> Installed Operators -> OCS -> Storage Cluster -> Delete Storage Cluster
5. Check the ocs-operator logs and watch whether the storagecluster deletion succeeds.
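
For illustration only, a hypothetical pre-uninstall check (not part of the product or of this test procedure) that lists the unscheduled Pending pods mentioned in step 3; a minimal client-go sketch in which the program itself and the kubeconfig path are assumptions:

// check_pending.go: list pods in openshift-storage that are Pending and were
// never scheduled onto a node, i.e. the condition that triggered this bug.
package main

import (
	"context"
	"fmt"

	corev1 "k8s.io/api/core/v1"
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/client-go/kubernetes"
	"k8s.io/client-go/tools/clientcmd"
)

func main() {
	// Assumption: adjust the kubeconfig path for your environment.
	cfg, err := clientcmd.BuildConfigFromFlags("", "/path/to/kubeconfig")
	if err != nil {
		panic(err)
	}
	cs, err := kubernetes.NewForConfig(cfg)
	if err != nil {
		panic(err)
	}
	pods, err := cs.CoreV1().Pods("openshift-storage").List(context.TODO(), metav1.ListOptions{})
	if err != nil {
		panic(err)
	}
	for _, p := range pods.Items {
		// A Pending pod with an empty NodeName was never scheduled; these are
		// the pods the Rook cleanup originally tripped over.
		if p.Status.Phase == corev1.PodPending && p.Spec.NodeName == "" {
			fmt.Printf("unscheduled pending pod: %s\n", p.Name)
		}
	}
}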


Actual results:
--------------------
Storage cluster deletion is stuck because the attempt to delete the CephFilesystem is stuck.

Expected results:
--------------------

Even a cluster with Pending pods or an incomplete installation should uninstall successfully.


Additional info:
--------------------

Pods already Pending before the uninstall was started
--------------------------------

rook-ceph-mds-ocs-storagecluster-cephfilesystem-a-898765765brpl   0/1     Pending     0          123m   <none>         <none>                         <none>           <none>
rook-ceph-mds-ocs-storagecluster-cephfilesystem-b-5df979496vppt   0/1     Pending     0          123m   <none>         <none>                         <none>           <none>


Events:
  Type     Reason            Age        From  Message
  ----     ------            ----       ----  -------
  Warning  FailedScheduling  <unknown>        0/6 nodes are available: 2 node(s) didn't match node selector, 3 Insufficient cpu, 4 Insufficient memory.
  Warning  FailedScheduling  <unknown>        0/6 nodes are available: 2 node(s) didn't match node selector, 3 Insufficient cpu, 4 Insufficient memory.
  Warning  FailedScheduling  <unknown>        0/7 nodes are available: 3 Insufficient cpu, 3 node(s) didn't match node selector, 4 Insufficient memory.
  Warning  FailedScheduling  <unknown>        0/8 nodes are available: 3 Insufficient cpu, 4 Insufficient memory, 4 node(s) didn't match node selector.
  Warning  FailedScheduling  <unknown>        0/9 nodes are available: 3 Insufficient cpu, 4 Insufficient memory, 5 node(s) didn't match node selector.



$ oc get machines -o wide -A
NAMESPACE               NAME                                                PHASE     TYPE        REGION      ZONE         AGE   NODE                           PROVIDERID                              STATE
openshift-machine-api   ci-ln-t9nhm3k-d5d6b-kwbgx-master-0                  Running   m5.xlarge   us-east-1   us-east-1b   49m   ip-10-0-167-14.ec2.internal    aws:///us-east-1b/i-0ebf1da2260559b6e   running
openshift-machine-api   ci-ln-t9nhm3k-d5d6b-kwbgx-master-1                  Running   m5.xlarge   us-east-1   us-east-1c   49m   ip-10-0-202-74.ec2.internal    aws:///us-east-1c/i-0a772f1ec88e85af9   running
openshift-machine-api   ci-ln-t9nhm3k-d5d6b-kwbgx-master-2                  Running   m5.xlarge   us-east-1   us-east-1b   49m   ip-10-0-129-147.ec2.internal   aws:///us-east-1b/i-0543254acee6876e7   running
openshift-machine-api   ci-ln-t9nhm3k-d5d6b-kwbgx-worker-us-east-1b-wd5mj   Running   m4.xlarge   us-east-1   us-east-1b   44m   ip-10-0-160-43.ec2.internal    aws:///us-east-1b/i-00f26906bdbbf25e4   running
openshift-machine-api   ci-ln-t9nhm3k-d5d6b-kwbgx-worker-us-east-1b-xddvh   Running   m4.xlarge   us-east-1   us-east-1b   44m   ip-10-0-162-122.ec2.internal   aws:///us-east-1b/i-0685bc3136c35f4e8   running
openshift-machine-api   ci-ln-t9nhm3k-d5d6b-kwbgx-worker-us-east-1c-ms2xr   Running   m4.xlarge   us-east-1   us-east-1c   44m   ip-10-0-225-47.ec2.internal    aws:///us-east-1c/i-0a62c91acd1d463a6   running





From rook operator logs
===========================

2020-10-09T13:05:24.562416987Z 2020-10-09 13:05:24.562344 I | ceph-spec: ceph-file-controller: CephCluster "ocs-storagecluster-cephcluster" found but skipping reconcile since ceph health is &{"HEALTH_ERR" map["MDS_ALL_DOWN":{"HEALTH_ERR" "1 filesystem is offline"} "MDS_UP_LESS_THAN_MAX":{"HEALTH_WARN" "1 filesystem is online with fewer MDS than max_mds"}] "2020-10-09T13:05:09Z" "2020-10-09T12:10:40Z" "HEALTH_OK"}

Comment 2 Neha Berry 2020-10-09 14:33:52 UTC
Proposing as a blocker since uninstall is getting stuck in a situation where it is most definitely needed to work.

Comment 4 Raghavendra Talur 2020-10-12 14:39:43 UTC
We have a root cause.

When MDS pods are not ready, the cephCluster status is set to HEALTH_ERR and rook stops reconciling.

We have a couple of solutions but are still debating which approach to use. We do think this needs to be fixed in 4.6 because failed installs are one of the important scenarios for uninstall to handle.

Moving this to assigned and giving devel ack, please consider this a blocker for 4.6.
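
For illustration, a minimal sketch (not Rook's actual code) of the health gate described above: a dependent controller such as the filesystem controller checks the parent CephCluster's reported health and skips reconciling, including deletions, while the cluster is in HEALTH_ERR. This corresponds to the "skipping reconcile since ceph health is ..." line in the rook operator log. The function and package names are hypothetical:

package gate

import cephv1 "github.com/rook/rook/pkg/apis/ceph.rook.io/v1"

// okToReconcile returns true only when the parent CephCluster reports a
// tolerable health state.
func okToReconcile(cluster *cephv1.CephCluster) bool {
	if cluster.Status.CephStatus == nil {
		return false
	}
	switch cluster.Status.CephStatus.Health {
	case "HEALTH_OK", "HEALTH_WARN":
		return true
	default:
		// MDS pods stuck in Pending drive the cluster to HEALTH_ERR
		// (MDS_ALL_DOWN), so the CephFilesystem deletion is never processed
		// and the uninstall hangs.
		return false
	}
}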

Comment 10 Oded 2020-11-26 19:43:18 UTC
SetUp:
Provider: AWS_IPI
Instance type: m4.xlarge
OCP Version: 4.6.0-0.nightly-2020-10-20-172149


Test Process:
1.Check OCP status via UI.
2.Install OCS via UI (4.6.0-169.ci)
3.Check pod status; some pods are in Pending state (including the rook-ceph-osd-0 pod):
$ oc get pods -n openshift-storage | grep -i Pending
noobaa-core-0                                                     0/1     Pending     0          28m
rook-ceph-mds-ocs-storagecluster-cephfilesystem-a-56447bf8bxngt   0/1     Pending     0          28m
rook-ceph-mds-ocs-storagecluster-cephfilesystem-b-645f6488584rm   0/1     Pending     0          28m
rook-ceph-osd-0-5fb754b877-f8tsf                                  0/1     Pending     0          28m

4.Uninstall OCS:
a.Check PVC and OBC status:
$ oc get pvc -n openshift-image-registry 
No resources found in openshift-image-registry namespace.
$ oc get pvc -n -n openshift-monitoring
Error from server (NotFound): namespaces "-n" not found
$ oc get pvc -n openshift-logging
No resources found in openshift-logging namespace.
$ oc get obc -A
No resources found


b.Delete storagecluster [stuck]
$ oc delete -n openshift-storage storagecluster --all --wait=true
storagecluster.ocs.openshift.io "ocs-storagecluster" deleted
[stuck]

test procedure detailed:
https://docs.google.com/document/d/1MFRQ3j65uBm3CirM6M6uL4-ybVoFcMmRGI2SFtqvbzI/edit


must-gather:
http://rhsqe-repo.lab.eng.blr.redhat.com/OCS/ocs-qe-bugs/bz-1886859/

Comment 11 Oded 2020-11-27 10:23:25 UTC
Uninstall is stuck because the MDS pods and the osd-0 pod were in Pending state before the uninstall.

Comment 15 Yaniv Kaul 2020-11-30 11:52:42 UTC
This doesn't look like a high severity issue to me:
1. It's uninstall.
2. It's a negative scenario in that flow.

Perhaps I'm missing the severity here?

Comment 17 Travis Nielsen 2020-11-30 18:50:02 UTC
Talur's fix to change the ordering of the cephcluster first has fixed the original issue. Now a new issue is showing up where the uninstall of the cephcluster is failing if any of the ceph pods are in pending state (and not assigned to a node). This is observed by the following entry in the operator log [1].

2020-11-26T18:37:59.510065295Z 2020-11-26 18:37:59.510012 E | ceph-cluster-controller: failed to reconcile. failed to find valid ceph hosts in the cluster "openshift-storage": failed to get hostname from node "": resource name may not be empty

This would only affect clusters with ceph pods stuck pending. Uninstall of a normal cluster would proceed as expected. Since this is an uninstall issue in a failed scenario as mentioned by Yaniv, agreed it's not a blocker.

Rook should ignore the error of a pod being in pending state and continue with the uninstall for any pods that are not in pending state. Moving to 4.6.z since this would be a simple and low risk fix.


[1] http://rhsqe-repo.lab.eng.blr.redhat.com/OCS/ocs-qe-bugs/bz-1886859/must-gather.local.676632254385058491/quay-io-rhceph-dev-ocs-must-gather-sha256-e25ccd49c5519f2fc4a4c3c1a57f31737a6539a0e5afbd0ea927341c500d9e1b/namespaces/openshift-storage/pods/rook-ceph-operator-776564f669-zkhn4/rook-ceph-operator/rook-ceph-operator/logs/current.log
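
A minimal sketch of the kind of filtering the proposed fix describes (not the actual Rook patch): when collecting the nodes that hosted ceph daemons so that cleanup jobs can be scheduled on them, skip pods that were never scheduled (empty Spec.NodeName) instead of failing on the empty hostname as in the log entry above. The function and package names are hypothetical:

package cleanup

import corev1 "k8s.io/api/core/v1"

// hostsForCleanup returns the distinct node names that ran ceph daemon pods,
// ignoring pods that never landed on a node.
func hostsForCleanup(cephDaemonPods []corev1.Pod) []string {
	seen := map[string]bool{}
	var nodes []string
	for _, p := range cephDaemonPods {
		if p.Spec.NodeName == "" {
			// Pending pod that was never scheduled: there is nothing to clean
			// up for it, so ignore it and keep going with the other pods.
			continue
		}
		if !seen[p.Spec.NodeName] {
			seen[p.Spec.NodeName] = true
			nodes = append(nodes, p.Spec.NodeName)
		}
	}
	return nodes
}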

Comment 18 Michael Adam 2020-12-01 10:01:55 UTC
(In reply to Travis Nielsen from comment #17)
> Talur's fix to change the ordering of the cephcluster first has fixed the
> original issue. Now a new issue is showing up where the uninstall of the
> cephcluster is failing if any of the ceph pods are in pending state (and not
> assigned to a node). This is observed by the following entry in the operator
> log [1].
> 
> 2020-11-26T18:37:59.510065295Z 2020-11-26 18:37:59.510012 E |
> ceph-cluster-controller: failed to reconcile. failed to find valid ceph
> hosts in the cluster "openshift-storage": failed to get hostname from node
> "": resource name may not be empty
> 
> This would only affect clusters with ceph pods stuck pending. Uninstall of a
> normal cluster would proceed as expected. Since this is an uninstall issue
> in a failed scenario as mentioned by Yaniv, agreed it's not a blocker.
> 
> Rook should ignore the error of a pod being in pending state and continue
> with the uninstall for any pods that are not in pending state. Moving to
> 4.6.z since this would be a simple and low risk fix.

What we discussed yesterday was that *if* we need another RC for 4.6.0 anyway
and *if* the fix is simple and can be provided quickly, *then* we can consider
adding it to 4.6.0 itself.

@Santosh - is a fix in sight or will it take a few more days?

Comment 19 Santosh Pillai 2020-12-01 10:17:06 UTC
> @Santosh - is a fix in sight or will it take a few more days?

PR with a fix should be ready today.

Comment 20 Santosh Pillai 2020-12-01 10:50:01 UTC
Rook PR to ignore pending ceph daemon pods - https://github.com/rook/rook/pull/6719

Comment 21 Mudit Agarwal 2020-12-08 16:11:08 UTC
Moving it back to 4.6.0 as discussed in the program meeting.

Santosh, please merge the backport PR to 4.6

Comment 25 Oded 2020-12-14 19:37:36 UTC
Bug fixed.

Install/Uninstall doc:
https://access.redhat.com/documentation/en-us/red_hat_openshift_container_storage/4.6/html-single/deploying_openshift_container_storage_using_amazon_web_services/index?lb_target=preview#assembly_uninstalling-openshift-container-storage_rhocs

SetUp:
Provider: AWS_IPI
Instance type: m4.xlarge
OCP Version: 4.6.0-0.nightly-2020-12-14-082246


Test Process:
1.Check OCP status via UI.
2.Deploy OLM 4.6.0-195.ci (on OCP 4.6, the OCS Operator is not shown in OperatorHub without this command):
$ oc create -f install_olm.yaml
namespace/openshift-storage created
operatorgroup.operators.coreos.com/openshift-storage-operatorgroup created
catalogsource.operators.coreos.com/ocs-catalogsource created
3.Install OCS 4.6.
4.Check pod status:
$ oc get pods -n openshift-storage | grep -i Pending
noobaa-core-0                                                     0/1     Pending     0          6m12s
rook-ceph-mds-ocs-storagecluster-cephfilesystem-a-68b94b44wsjbj   0/1     Pending     0          5m54s
rook-ceph-mds-ocs-storagecluster-cephfilesystem-b-6bbd878bwcgvq   0/1     Pending     0          5m53s
rook-ceph-osd-1-d67f9486f-sv85f                                   0/1     Pending     0          7m14s

$ oc get pods -n openshift-storage
NAME                                                              READY   STATUS      RESTARTS   AGE
csi-cephfsplugin-66r7n                                            3/3     Running     0          11m
csi-cephfsplugin-g6jmn                                            3/3     Running     0          11m
csi-cephfsplugin-provisioner-56d4c79c8-54rhh                      6/6     Running     0          11m
csi-cephfsplugin-provisioner-56d4c79c8-f9mlx                      6/6     Running     0          11m
csi-cephfsplugin-t6dbc                                            3/3     Running     0          11m
csi-rbdplugin-5xmzk                                               3/3     Running     0          11m
csi-rbdplugin-p42km                                               3/3     Running     0          11m
csi-rbdplugin-provisioner-85c448cfc-fnmzq                         6/6     Running     0          11m
csi-rbdplugin-provisioner-85c448cfc-wpmqx                         6/6     Running     0          11m
csi-rbdplugin-rr68d                                               3/3     Running     0          11m
noobaa-core-0                                                     0/1     Pending     0          6m48s
noobaa-db-0                                                       1/1     Running     0          6m48s
noobaa-operator-75b79c46d7-qlnls                                  1/1     Running     0          14m
ocs-metrics-exporter-d47cd54ff-mr6jf                              1/1     Running     0          14m
ocs-operator-7cdfc88b6d-ttszl                                     0/1     Running     0          14m
rook-ceph-crashcollector-ip-10-0-159-51-5576c4dc59-hd7jg          1/1     Running     0          9m38s
rook-ceph-crashcollector-ip-10-0-185-92-75dcdc855c-fbnrn          1/1     Running     0          9m15s
rook-ceph-crashcollector-ip-10-0-197-98-94c8b747b-gpb88           1/1     Running     0          8m45s
rook-ceph-drain-canary-2b829a34cbf20b0892bae76197d5649e-57qjm4n   1/1     Running     0          6m50s
rook-ceph-drain-canary-f473cf35a5491c9aceb4df0f63881604-76v7l2j   1/1     Running     0          7m58s
rook-ceph-mds-ocs-storagecluster-cephfilesystem-a-68b94b44wsjbj   0/1     Pending     0          6m30s
rook-ceph-mds-ocs-storagecluster-cephfilesystem-b-6bbd878bwcgvq   0/1     Pending     0          6m29s
rook-ceph-mgr-a-7f5d849db7-j9qlc                                  1/1     Running     0          8m24s
rook-ceph-mon-a-6f6bd7dd67-vv5jp                                  1/1     Running     0          9m39s
rook-ceph-mon-b-78998fb464-8hfsd                                  1/1     Running     0          9m16s
rook-ceph-mon-c-7f8547897c-vbl48                                  1/1     Running     0          8m45s
rook-ceph-operator-5df7cd94d6-grwlb                               1/1     Running     0          14m
rook-ceph-osd-0-7d4fdbd474-2vgnl                                  1/1     Running     0          7m58s
rook-ceph-osd-1-d67f9486f-sv85f                                   0/1     Pending     0          7m50s
rook-ceph-osd-2-5d8c4c4fbc-xv5g9                                  1/1     Running     0          6m50s
rook-ceph-osd-prepare-ocs-deviceset-gp2-0-data-0-mtnj6-zkdbg      0/1     Completed   0          8m23s
rook-ceph-osd-prepare-ocs-deviceset-gp2-1-data-0-c9fcb-mzgkp      0/1     Completed   0          8m23s
rook-ceph-osd-prepare-ocs-deviceset-gp2-2-data-0-dhhwk-xdjrd      0/1     Completed   0          8m22s


$ oc get storagecluster -n openshift-storage
NAME                 AGE   PHASE         EXTERNAL   CREATED AT             VERSION
ocs-storagecluster   12m   Progressing              2020-12-14T18:56:43Z   4.6.0


Uninstall:
1.Get PVC
$ oc get pvc -n openshift-image-registry 
No resources found in openshift-image-registry namespace.
$ oc get pvc -n openshift-monitoring
No resources found in openshift-monitoring namespace.
$ oc get pvc -n openshift-logging
No resources found in openshift-logging namespace.

2.Delete storagecluster
$ oc delete -n openshift-storage storagecluster --all --wait=true
storagecluster.ocs.openshift.io "ocs-storagecluster" deleted
[takes about 2 minutes]

$ oc get storagecluster -n openshift-storage
No resources found in openshift-storage namespace.

3.Check for cleanup pods:
$ oc get pods -n openshift-storage | grep -i cleanup
cluster-cleanup-job-ip-10-0-159-51-zbm9q       0/1     Completed   0          4m18s
cluster-cleanup-job-ip-10-0-185-92-4npmh       0/1     Completed   0          4m18s
cluster-cleanup-job-ip-10-0-197-98-vjw9q       0/1     Completed   0          4m18s

4.Confirm that the directory /var/lib/rook is now empty.

5.Unlabel the storage nodes.
$ oc label nodes  --all cluster.ocs.openshift.io/openshift-storage-
label "cluster.ocs.openshift.io/openshift-storage" not found.
node/ip-10-0-144-130.us-east-2.compute.internal not labeled
node/ip-10-0-159-51.us-east-2.compute.internal labeled
node/ip-10-0-185-92.us-east-2.compute.internal labeled
label "cluster.ocs.openshift.io/openshift-storage" not found.
node/ip-10-0-187-77.us-east-2.compute.internal not labeled
label "cluster.ocs.openshift.io/openshift-storage" not found.
node/ip-10-0-193-88.us-east-2.compute.internal not labeled
node/ip-10-0-197-98.us-east-2.compute.internal labeled

$ oc label nodes  --all topology.rook.io/rack-
label "topology.rook.io/rack" not found.
node/ip-10-0-144-130.us-east-2.compute.internal not labeled
label "topology.rook.io/rack" not found.
node/ip-10-0-159-51.us-east-2.compute.internal not labeled
label "topology.rook.io/rack" not found.
node/ip-10-0-185-92.us-east-2.compute.internal not labeled
label "topology.rook.io/rack" not found.
node/ip-10-0-187-77.us-east-2.compute.internal not labeled
label "topology.rook.io/rack" not found.
node/ip-10-0-193-88.us-east-2.compute.internal not labeled
label "topology.rook.io/rack" not found.
node/ip-10-0-197-98.us-east-2.compute.internal not labeled


6.Remove the OpenShift Container Storage taint if the nodes were tainted.

7.Confirm that all PVs provisioned using OpenShift Container Storage are deleted:
$ oc get pv
NAME                                       CAPACITY   ACCESS MODES   RECLAIM POLICY   STATUS   CLAIM                                                STORAGECLASS   REASON   AGE
pvc-2e6c0aaa-959b-4b22-bf5e-fb1902c2b57e   512Gi      RWO            Delete           Bound    openshift-storage/ocs-deviceset-gp2-0-data-0-mtnj6   gp2                     24m
pvc-51771e31-6098-4931-80dc-0ed96401904d   512Gi      RWO            Delete           Bound    openshift-storage/ocs-deviceset-gp2-2-data-0-dhhwk   gp2                     24m
pvc-e3d72c50-d478-40ac-b1d7-6e064c733577   512Gi      RWO            Delete           Bound    openshift-storage/ocs-deviceset-gp2-1-data-0-c9fcb   gp2                     24m

8.Delete the Multicloud Object Gateway storageclass.

9.Remove CustomResourceDefinitions.
$ oc delete crd backingstores.noobaa.io bucketclasses.noobaa.io cephblockpools.ceph.rook.io cephclusters.ceph.rook.io cephfilesystems.ceph.rook.io cephnfses.ceph.rook.io cephobjectstores.ceph.rook.io cephobjectstoreusers.ceph.rook.io noobaas.noobaa.io ocsinitializations.ocs.openshift.io  storageclusterinitializations.ocs.openshift.io storageclusters.ocs.openshift.io cephclients.ceph.rook.io cephobjectrealms.ceph.rook.io, cephobjectzonegroups.ceph.rook.io cephobjectzones.ceph.rook.io cephrbdmirrors.ceph.rook.io --wait=true --timeout=5m
customresourcedefinition.apiextensions.k8s.io "backingstores.noobaa.io" deleted
customresourcedefinition.apiextensions.k8s.io "bucketclasses.noobaa.io" deleted
customresourcedefinition.apiextensions.k8s.io "cephblockpools.ceph.rook.io" deleted
customresourcedefinition.apiextensions.k8s.io "cephclusters.ceph.rook.io" deleted
customresourcedefinition.apiextensions.k8s.io "cephfilesystems.ceph.rook.io" deleted
customresourcedefinition.apiextensions.k8s.io "cephnfses.ceph.rook.io" deleted
customresourcedefinition.apiextensions.k8s.io "cephobjectstores.ceph.rook.io" deleted
customresourcedefinition.apiextensions.k8s.io "cephobjectstoreusers.ceph.rook.io" deleted
customresourcedefinition.apiextensions.k8s.io "noobaas.noobaa.io" deleted
customresourcedefinition.apiextensions.k8s.io "ocsinitializations.ocs.openshift.io" deleted
customresourcedefinition.apiextensions.k8s.io "storageclusters.ocs.openshift.io" deleted
customresourcedefinition.apiextensions.k8s.io "cephclients.ceph.rook.io" deleted
customresourcedefinition.apiextensions.k8s.io "cephobjectzonegroups.ceph.rook.io" deleted
customresourcedefinition.apiextensions.k8s.io "cephobjectzones.ceph.rook.io" deleted
customresourcedefinition.apiextensions.k8s.io "cephrbdmirrors.ceph.rook.io" deleted
Error from server (NotFound): customresourcedefinitions.apiextensions.k8s.io "storageclusterinitializations.ocs.openshift.io" not found
Error from server (NotFound): customresourcedefinitions.apiextensions.k8s.io "cephobjectrealms.ceph.rook.io," not found

10.Delete the namespace and wait until the deletion is complete:
$ oc project default
Now using project "default" on server "https://api.oviner-awsbug14.qe.rh-ocs.com:6443".
$ oc delete project openshift-storage --wait=true --timeout=5m
project.project.openshift.io "openshift-storage" deleted
$ oc get project openshift-storage
Error from server (NotFound): namespaces "openshift-storage" not found

11.Ensure that OpenShift Container Storage is uninstalled completely (via UI)

test procedure detailed:
https://docs.google.com/document/d/1MFRQ3j65uBm3CirM6M6uL4-ybVoFcMmRGI2SFtqvbzI/edit

Comment 27 errata-xmlrpc 2020-12-17 06:24:47 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Moderate: Red Hat OpenShift Container Storage 4.6.0 security, bug fix, enhancement update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2020:5605

