Description of problem (please be as detailed as possible and provide log snippets):
----------------------------------------------------------------------
The noobaa-db PV stays behind in the Released state after deleting the StorageCluster and then the openshift-storage namespace.

Reason: With Bug 1849105 and Bug 1849532#c5, the StorageClasses are expected to be deleted automatically on StorageCluster deletion. But in the absence of the StorageClass, the noobaa-db PV stays behind in the Released state, with the event shown below:

Events:
  Type     Reason              Age                 From                                                                                                                  Message
  ----     ------              ----                ----                                                                                                                  -------
  Warning  VolumeFailedDelete  20s (x12 over 13m)  openshift-storage.rbd.csi.ceph.com_csi-rbdplugin-provisioner-bfd6f845d-mh7vd_751011e3-143e-477b-9389-91ca947f8313  rpc error: code = Internal desc = provided secret is empty

$ oc get pv
   NAME                                       CAPACITY   ACCESS MODES   RECLAIM POLICY   STATUS     CLAIM                                            STORAGECLASS                  REASON   AGE
   pvc-36be30b5-a90a-42a7-93ed-144b7ecc31e9   512Gi      RWO            Delete           Bound      openshift-storage/ocs-deviceset-2-data-0-2x4w2   thin                                   2d2h
   pvc-acee0891-6f46-4996-874b-c68922e5e804   512Gi      RWO            Delete           Bound      openshift-storage/ocs-deviceset-1-data-0-gg4hr   thin                                   2d2h
   pvc-b41c4457-ace7-4749-bae8-3e05c02c43f5   512Gi      RWO            Delete           Bound      openshift-storage/ocs-deviceset-0-data-0-5hjlx   thin                                   2d2h
>> pvc-c57b8874-4869-4861-a104-b050c90ceec0   60Gi       RWO            Delete           Released   openshift-storage/db-noobaa-db-0                 ocs-storagecluster-ceph-rbd            2d2h

Version of all relevant components (if applicable):
----------------------------------------------------------------------
OCS = 4.5.0-494.ci
OCP = 4.5.0-0.nightly-2020-07-22-074214
Ceph = RHCS 4.1.z1 (14.2.8-81.el8cp)

Does this issue impact your ability to continue to work with the product (please explain in detail what is the user impact)?
----------------------------------------------------------------------
No

Is there any workaround available to the best of your knowledge?
----------------------------------------------------------------------
We can delete the PV manually, but this would need to be documented.
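The manual workaround could be scripted roughly as follows. This is a minimal sketch, not a documented procedure: the helper name `released_pvs` is hypothetical, and it assumes the default `oc get pv --no-headers` column order shown in the listing above (STATUS in column 5, CLAIM in column 6). It only selects PV names; it does not check whether the backing RBD image still exists.

```shell
#!/bin/sh
# released_pvs NAMESPACE
# Hypothetical helper: reads `oc get pv --no-headers` output on stdin and
# prints the names of PVs that are in the Released state and whose claim
# belonged to NAMESPACE.
released_pvs() {
  awk -v ns="$1" '
    # $1 = NAME, $5 = STATUS, $6 = CLAIM (namespace/pvc-name)
    $5 == "Released" && index($6, ns "/") == 1 { print $1 }
  '
}

# Possible usage against a live cluster (review the list before deleting):
#   oc get pv --no-headers | released_pvs openshift-storage
#   oc get pv --no-headers | released_pvs openshift-storage | xargs -r oc delete pv
# Note: `xargs -r` (skip when input is empty) is a GNU extension.
```

Keeping the filter separate from the delete lets the list of PVs be inspected first, which seems prudent since only the noobaa-db PV is expected to match.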

Rate from 1 - 5 the complexity of the scenario you performed that caused this bug (1 - very simple, 5 - very complex)?
----------------------------------------------------------------------
3

Is this issue reproducible?
----------------------------------------------------------------------
Yes

Can this issue reproduce from the UI?
----------------------------------------------------------------------
Yes

If this is a regression, please provide more details to justify this:
----------------------------------------------------------------------
Uninstall has undergone changes. Earlier, we used to delete the namespace (which deleted the PVC and PV) before deleting the RBD StorageClass.

Steps to Reproduce:
----------------------------------------------------------------------
Doc link for reference: [1], as OCS 4.5 Uninstall is not yet ready

1. Labelled the StorageCluster with cleanup.ocs.openshift.io=yes-really-destroy-data

   $ oc label -n openshift-storage storagecluster --all cleanup.ocs.openshift.io=yes-really-destroy-data
   storagecluster.ocs.openshift.io/ocs-storagecluster labeled

2. Deleted all PVCs and OBCs as per the current Step #3 in the OCS 4.4 docs.
3. Followed the Uninstall as per https://bugzilla.redhat.com/show_bug.cgi?id=1849532#c5 and deleted the StorageCluster from the UI (the StorageCluster was already labelled with cleanup.ocs.openshift.io=yes-really-destroy-data).
4. Deleted the StorageCluster from the UI: Installed Operators -> OCS Operator -> Storage Cluster -> Delete StorageCluster Service -> OK
5. Checked the state of the items deleted due to the StorageCluster deletion. The PVC db-noobaa-db-0 gets deleted, but the corresponding PV fails to get deleted due to the absence of the StorageClass (secret is empty).

[1] - https://access.redhat.com/documentation/en-us/red_hat_openshift_container_storage/4.4/html-single/deploying_openshift_container_storage/index?lb_target=preview#assembly_uninstalling-openshift-container-storage_aws-vmware

Actual results:
----------------------------------------------------------------------
The noobaa-db PVC is deleted (along with the noobaa-db pod), but the PV stays behind in the Released state.

Expected results:
----------------------------------------------------------------------
With these new changes in the Uninstall process, we need to handle the deletion of the noobaa PVC and PV more gracefully.

Additional info:
----------------------------------------------------------------------
The following things are expected to be removed on deleting a StorageCluster:

1. Default StorageClass
2. OCS node labels
3. OCS node taints
4. Clean up the cluster namespace on the dataDirHostPath.
5. Delete all the ceph monitor directories on the dataDirHostPath. For example mon-a, mon-b, etc.
6. Clean up the devices on each node.
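
The condition behind this bug (a PV that outlives its StorageClass) could be checked for after StorageCluster deletion with something like the sketch below. The helper name `pvs_missing_storageclass` is hypothetical, and the two-column input format (PV name, StorageClass name) is an assumption made for this example, not output taken from this report.

```shell
#!/bin/sh
# pvs_missing_storageclass SC_NAME...
# Hypothetical helper: reads "PV_NAME STORAGECLASS" lines on stdin and prints
# the PV names whose StorageClass is not among the SC_NAME arguments.
pvs_missing_storageclass() {
  awk -v known="$*" '
    BEGIN {
      # Build a lookup table of the StorageClass names that still exist.
      n = split(known, names, " ")
      for (i = 1; i <= n; i++) have[names[i]] = 1
    }
    !($2 in have) { print $1 }
  '
}

# Possible usage against a live cluster:
#   oc get pv --no-headers \
#     -o custom-columns=NAME:.metadata.name,SC:.spec.storageClassName |
#     pvs_missing_storageclass $(oc get sc --no-headers \
#       -o custom-columns=NAME:.metadata.name)
```

In the scenario reported here, with only `thin` left as a StorageClass, this would be expected to flag the noobaa-db PV, since it still references ocs-storagecluster-ceph-rbd.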
Talur, if I am not wrong, one of the commits in https://github.com/openshift/ocs-operator/pull/645 will fix this issue as well.
(In reply to Mudit Agarwal from comment #4)
> Talur, if I am not wrong one of the commits in
> https://github.com/openshift/ocs-operator/pull/645 will fix this issue as
> well.

Not yet. I did attempt to fix it but the PV is still seen in the released state. I am still debugging.
As discussed in a meeting today between engineering and QE, moving this to OCS 4.6. We will document a workaround for OCS 4.5.
jrivera, rtalur, per Comment 7 and Comment 8, it seems that this BZ is already fixed in OCS 4.5. Is this correct? Was it fixed by Bug 1849105?
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory (Red Hat OpenShift Container Storage 4.5.0 bug fix and enhancement update), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHBA-2020:3754
The needinfo request[s] on this closed bug have been removed as they have been unresolved for 1000 days