Bug 2263468

Summary: Manual deletion of storageconsumer deletes the storageclassrequests but not the underlying pool and subvolumegroup (due to volumes present under them)
Product: [Red Hat Storage] Red Hat OpenShift Data Foundation Reporter: Neha Berry <nberry>
Component: ocs-operator Assignee: Leela Venkaiah Gangavarapu <lgangava>
Status: CLOSED ERRATA QA Contact: Jilju Joy <jijoy>
Severity: high Docs Contact:
Priority: medium    
Version: 4.15 CC: jijoy, lgangava, mrajanna, muagarwa, odf-bz-bot, rohgupta
Target Milestone: ---   
Target Release: ODF 4.16.0   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard: isf-provider
Fixed In Version: 4.16.0-84 Doc Type: No Doc Update
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2024-07-17 13:13:46 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:

Description Neha Berry 2024-02-09 06:20:41 UTC
Description of problem (please be detailed as possible and provide log snippets):
=======================================================
On a provider client setup, deleted an entire HCP cluster (which had ocs-storageclient operator installed along with multiple RBD and CephFS PVCs and PODS).

The deletion of the HCP cluster succeeded. However, as expected due to the ungraceful deletion of the ocs-storageclient operator, the following resources were left over on the provider side for that particular HCP cluster_ID: bd97ae8e-57e8-4082-9f4a-889f7f464788


1. Storageconsumer
2. Storageclassrequests
3. Cephclients
4. Cephblockpool
5. cephFilesystemsubvolumegroup


As per the suggested steps, I deleted the storage consumer manually @Thu Feb  8 15:38:40 UTC 2024

# oc delete storageconsumer -n openshift-storage storageconsumer-bd97ae8e-57e8-4082-9f4a-889f7f464788
storageconsumer.ocs.openshift.io "storageconsumer-bd97ae8e-57e8-4082-9f4a-889f7f464788" deleted
After this step, the following resources for cluster_ID bd97ae8e-57e8-4082-9f4a-889f7f464788 were deleted:
1. Storageconsumer
2. Storageclassrequests
3. Cephclients

However, the following resources remained on the provider side and in the Ceph backend:
—————————————————————————

4. Cephblockpool
5. cephFilesystemsubvolumegroup

=====Cephfilesystemsubvolumegroup=======
NAMESPACE           NAME                                                                                         PHASE
openshift-storage   cephfilesystemsubvolumegroup-storageconsumer-71eea242-935a-49f5-be03-084d2843a95e-3af5665d   Ready
openshift-storage   cephfilesystemsubvolumegroup-storageconsumer-a696155d-fc47-49a1-8cee-a4462cc21e88-bfc15763   Ready

>>openshift-storage   cephfilesystemsubvolumegroup-storageconsumer-bd97ae8e-57e8-4082-9f4a-889f7f464788-5341db75   Ready

openshift-storage   cephfilesystemsubvolumegroup-storageconsumer-f0ee5b2a-e786-44e4-b299-3344d5fb7941-f7f9b3e5   Ready
=====cephblockpool====
NAMESPACE           NAME                                                                          PHASE
openshift-storage   cephblockpool-storageconsumer-71eea242-935a-49f5-be03-084d2843a95e-78f6850c   Ready
openshift-storage   cephblockpool-storageconsumer-a696155d-fc47-49a1-8cee-a4462cc21e88-6fbcf432   Ready
>> openshift-storage   cephblockpool-storageconsumer-bd97ae8e-57e8-4082-9f4a-889f7f464788-3c7b4b5a   Ready
openshift-storage   cephblockpool-storageconsumer-f0ee5b2a-e786-44e4-b299-3344d5fb7941-530a656d   Ready




Version of all relevant components (if applicable):
=======================================================
ODF 4.15; needs to be backported to 4.14.5 and above

Does this issue impact your ability to continue to work with the product
(please explain in detail what is the user impact)?
=======================================================
No. A workaround exists, but it requires the customer to have provider access.

Is there any workaround available to the best of your knowledge?
=======================================================
Yes.

Delete the cephfilesystemsubvolumegroup and cephblockpool resources manually after cleaning up the subvolumes and RBD images from the Ceph side.
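The workaround above can be sketched as the following dry-run shell fragment. RUN=echo makes each command print instead of execute; all resource names are examples taken from this bug's environment and must be replaced with your own, and the csi-vol names shown stand in for iterating over every subvolume/image listed in the group/pool:

```shell
RUN=echo   # dry run: prints each command; set RUN='' to execute for real
FS=ocs-storagecluster-cephfilesystem
SVG=cephfilesystemsubvolumegroup-storageconsumer-bd97ae8e-57e8-4082-9f4a-889f7f464788-5341db75
POOL=cephblockpool-storageconsumer-bd97ae8e-57e8-4082-9f4a-889f7f464788-3c7b4b5a

# 1. Remove each leftover CephFS subvolume (repeat per subvolume), then the group
$RUN ceph fs subvolume rm "$FS" csi-vol-f70c949a-ba5a-4fbc-85ed-148a7fd58d1a --group_name "$SVG"
$RUN ceph fs subvolumegroup rm "$FS" "$SVG"

# 2. Remove each leftover RBD image (repeat per image)
$RUN rbd rm "$POOL/csi-vol-24b9359c-8b69-4288-98d9-634a726c8ab6"

# 3. Delete the now-empty Kubernetes resources on the provider
$RUN oc delete cephfilesystemsubvolumegroup -n openshift-storage "$SVG"
$RUN oc delete cephblockpool -n openshift-storage "$POOL"
```

The Ceph-side cleanup must come first: Rook refuses to remove a pool or subvolume group that still contains volumes, which is the behavior this bug describes.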

Rate from 1 - 5 the complexity of the scenario you performed that caused this
bug (1 - very simple, 5 - very complex)?
=======================================================
4

Is this issue reproducible?
=======================================================
Yes

Can this issue reproduce from the UI?
=======================================================
Yes

If this is a regression, please provide more details to justify this:
=======================================================
No.

Steps to Reproduce:
=======================================================
1. Create a provider/client configuration following the docs [1] and [2]
2. Install the ocs-client-operator from the CLI (YAML at [3]) as it is not listed under OperatorHub
3. Check that the operator is listed in the client namespace we created (or under all-namespaces)
4. Create PVCs of both cephFS and RBD within the HCP client which we plan to delete
5. Delete the HCP server from UI
Hosting cluster-> All Clusters-> Select the HCP cluster (here hcp414-cc) -> Actions-> Delete
6. Wait a few minutes for the deletion to complete

[1] https://docs.google.com/document/d/1uVjzJQFv_t1Fymb-hgMhNd6pve8fkbrxp1TadU5wv9I/edit?pli=1

[2] https://docs.google.com/document/d/1RfDNQi4B3x4kv9PXx2lqGSD2V2UGideTCIXqLYJFg0Y/edit?pli=1#heading=h.bebinah79b8e

Observations

1. The HCP cluster is deleted
2. Resources are left behind on the provider side due to the non-graceful uninstall
3. Deleted the storageconsumer
 > storageclassrequests and cephclients are automatically deleted for that cluster ID
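To confirm which resources remain for a given consumer after the StorageConsumer is deleted, a check along these lines can be run on the provider cluster (a dry-run sketch; the cluster ID is the example from this bug, and RUN=echo only prints the commands):

```shell
RUN=echo   # dry run; set RUN='' to execute on a real provider cluster
CLUSTER_ID=bd97ae8e-57e8-4082-9f4a-889f7f464788   # example ID from this bug

# These should come back empty for the deleted consumer's cluster ID
$RUN oc get storageconsumer,storageclassrequest,cephclient -n openshift-storage

# These are the resources observed to be left behind in this bug
$RUN oc get cephblockpool,cephfilesystemsubvolumegroup -n openshift-storage
```

Filtering the output by $CLUSTER_ID (e.g. piping through grep) isolates the leftovers for the one deleted cluster.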



Actual results:
=======================================================
The cephblockpool and subvolumegroup are not deleted even on deletion of the storageclassrequests; currently, while volumes are present within them, Rook does not delete them.

OUTPUTS ADDED HERE https://ibm.ent.box.com/notes/1439310719724

Expected results:
=======================================================
As per the discussion here [3] and [4], we need to add functionality to delete the leftover cephblockpool and cephfilesystemsubvolumegroup resources as part of storageconsumer deletion.

[3] https://ibm-systems-storage.slack.com/archives/C05RJB6H0LQ/p1707407485799189

[4] https://ibm-systems-storage.slack.com/archives/C05RJB6H0LQ/p1707407662255889

Additional info:
=======================================================

sh-5.1$ ceph fs subvolume ls ocs-storagecluster-cephfilesystem --group_name cephfilesystemsubvolumegroup-storageconsumer-bd97ae8e-57e8-4082-9f4a-889f7f464788-5341db75
[
    {
        "name": "csi-vol-f70c949a-ba5a-4fbc-85ed-148a7fd58d1a"
    },
    {
        "name": "csi-vol-445e8941-67b1-4b3a-83c0-b74460ec1d09"
    },
    {
        "name": "csi-vol-85bfbf79-cfc4-465b-a718-f0a58a1259e2"
    },
    {
        "name": "csi-vol-3f06b704-31ef-4500-b0a5-60d698ed3d4f"
    },
    {
        "name": "csi-vol-7f1a7136-d584-45a5-a159-1c08c9b405bd"
    },
    {
        "name": "csi-vol-48b312ac-9ff1-4b09-8dd5-efd814d6663c"
    },
    {
        "name": "csi-vol-7199fecc-1430-431d-bdf5-063c15509a3f"
    }
]
sh-5.1$ ceph fs  subvolumegroup ls  ocs-storagecluster-cephfilesystem
[
    {
        "name": "cephfilesystemsubvolumegroup-storageconsumer-f0ee5b2a-e786-44e4-b299-3344d5fb7941-f7f9b3e5"
    },
    {
        "name": "cephfilesystemsubvolumegroup-storageconsumer-bd97ae8e-57e8-4082-9f4a-889f7f464788-5341db75"
    },
    {
        "name": "cephfilesystemsubvolumegroup-storageconsumer-71eea242-935a-49f5-be03-084d2843a95e-3af5665d"
    },
    {
        "name": "csi"
    },
    {
        "name": "cephfilesystemsubvolumegroup-storageconsumer-a696155d-fc47-49a1-8cee-a4462cc21e88-bfc15763"
    }
]
sh-5.1$ rbd ls -p cephblockpool-storageconsumer-bd97ae8e-57e8-4082-9f4a-889f7f464788-3c7b4b5a
csi-vol-24b9359c-8b69-4288-98d9-634a726c8ab6
csi-vol-40661b24-3e29-43c7-b3ce-13a063ccbdb8
csi-vol-523c9033-30a0-4bc4-aff9-8d78bc1b014a
csi-vol-6cf31ba7-33e7-4f76-bd9d-bf4c8e58c720
csi-vol-ec47c4d6-a780-442e-87c8-d9e419f212c7
csi-vol-f9c287a0-86be-4e2e-af94-39702950a0c7
sh-5.1$
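The listings above show why deletion is blocked: the group and pool still hold volumes. A minimal sketch of how a cleanup routine could parse the JSON form of the listing (`ceph fs subvolume ls ... --format json`) to decide whether the group is empty before attempting removal; the sample data is truncated from the output above:

```python
import json

# Sample of `ceph fs subvolume ls <fs> --group_name <group> --format json`,
# truncated to two entries from the transcript above for brevity.
listing = json.loads("""
[
    {"name": "csi-vol-f70c949a-ba5a-4fbc-85ed-148a7fd58d1a"},
    {"name": "csi-vol-445e8941-67b1-4b3a-83c0-b74460ec1d09"}
]
""")

def group_is_empty(subvolumes):
    """A subvolume group can only be removed once it holds no subvolumes."""
    return len(subvolumes) == 0

names = [entry["name"] for entry in listing]
print(group_is_empty(listing))  # prints False: volumes present, removal would fail
```

The same emptiness check applies to the RBD pool (`rbd ls -p <pool>` must return nothing before the pool resource can be safely deleted).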

Comment 5 Rohan Gupta 2024-04-08 10:48:59 UTC
@Leela does this bug need to be a tracker bug?

Comment 6 Leela Venkaiah Gangavarapu 2024-04-12 06:48:10 UTC
yes, this can be a tracker.

Comment 11 Jilju Joy 2024-06-24 07:08:39 UTC
This is verified in https://bugzilla.redhat.com/show_bug.cgi?id=2280813#c6. The link has the bug verification steps and outcomes. Deletion of cephblockpoolradosnamespace is verified instead of cephblockpool because there is no cephblockpool per consumer now.
Verified in version:
OCP 4.16.0-ec.6
ODF 4.16.0-110

Comment 14 errata-xmlrpc 2024-07-17 13:13:46 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Important: Red Hat OpenShift Data Foundation 4.16.0 security, enhancement & bug fix update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2024:4591