Bug 2069389
| Summary: | Consumer cluster deletion succeeds (even with existing PVCs) but provider still lists the storageconsumer and related resources | | |
|---|---|---|---|
| Product: | [Red Hat Storage] Red Hat OpenShift Data Foundation | Reporter: | Neha Berry <nberry> |
| Component: | odf-managed-service | Assignee: | Dhruv Bindra <dbindra> |
| Status: | VERIFIED | QA Contact: | suchita <sgatfane> |
| Severity: | high | Priority: | unspecified |
| Version: | 4.10 | CC: | aeyal, dbindra, kbader, nigoyal, odf-bz-bot, sgatfane, srai |
| Keywords: | Tracking | Flags: | nberry: needinfo? (kbader) |
| Hardware: | Unspecified | OS: | Unspecified |
| Type: | Bug | | |

Description
Neha Berry
2022-03-28 20:09:05 UTC
I tried to reproduce the bug but was not able to. A few things I found:

Your provider cluster is using an older image of the deployer, quay.io/osd-addons/ocs-osd-deployer:2.0.0-2, and your consumer cluster is using a newer image, quay.io/osd-addons/ocs-osd-deployer:2.0.0-5. The doc that you are using has some steps that no longer need to be followed now that the deployer has been updated; I have added comments to the doc.

The behavior I observed while reproducing the bug:

The deployer doesn't allow uninstallation if PVCs are using OCS storage classes. As soon as I delete the PVCs using OCS storage classes, the consumer offboarding starts, and the complete openshift-storage namespace is deleted after some time. When I checked the provider cluster for the storageConsumer resource, the resource still existed for the consumer that was offboarded.

After debugging, I found that a PV was still utilizing storage on the consumer cluster through the cephfs storage class (this PV was created when I created PVCs with the OCS storage classes, i.e. cephrbd and cephfs). When I manually deleted the PV on the consumer and deleted the corresponding subvolume on the provider, the storageConsumer resource was removed.

So the deployer needs to uninstall only when there is no PV using OCS StorageClasses, instead of checking PVCs. I have raised a PR for that: https://github.com/red-hat-storage/ocs-osd-deployer/pull/152

If I understand correctly, the problem is that when we delete the entire consumer instead of following the offboarding process, the consumer still exists on the provider side. IIRC, there will be stale blockPools, filesystem and cephClients (confirmed with Dhruv). We need to delete:

1. blockPool, in this case `cephblockpool-storageconsumer-326dfd52-773c-4c72-ac1c-6576380bfe37 10d`. Notice that the block pool name ends with the storageconsumer name.
2. filesystem, `cephfilesystemsubvolumegroup-storageconsumer-326dfd52-773c-4c72-ac1c-6576380bfe37 10d`. Notice that the subvolume group name also ends with the storageconsumer name.
3. cephClients. To delete the cephClients linked to a particular consumer, list the cephClients and check the `annotation` of each one for the key `StorageConsumerAnnotation`, whose value is the consumer name.

Moving the BZ to ON_QA as the tracker issue and the fix in the deployer were merged.

This is verified on the earlier Deployer version (v2.0.10). No storageconsumer was observed after we deleted the PVCs from the consumer, and the openshift-storage project was deleted successfully from the consumer.
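
For anyone scripting the checks discussed above, here is a rough Python sketch, not the deployer's actual code. It assumes the PV and cephClient objects have already been loaded as dictionaries, for example from the `items` list of `oc get ... -o json`. The storage class names in `OCS_STORAGE_CLASSES`, the helper names, and the `annotation_key` argument are assumptions for illustration. The real annotation key behind `StorageConsumerAnnotation` is not stated in this bug, so it has to be supplied by the caller.

```python
from typing import Dict, Iterable, List

# Assumed OCS storage class names; adjust to whatever the cluster really uses.
OCS_STORAGE_CLASSES = frozenset(
    {"ocs-storagecluster-ceph-rbd", "ocs-storagecluster-cephfs"}
)


def pvs_using_ocs(
    pvs: Iterable[dict], storage_classes: Iterable[str] = OCS_STORAGE_CLASSES
) -> List[str]:
    """Return the names of PVs whose spec.storageClassName is an OCS class.

    Uninstall should only proceed when this list is empty, which mirrors
    the PV-based (rather than PVC-based) check proposed in this bug.
    """
    classes = set(storage_classes)
    return [
        pv["metadata"]["name"]
        for pv in pvs
        if pv.get("spec", {}).get("storageClassName") in classes
    ]


def stale_resource_names(consumer_name: str) -> Dict[str, str]:
    """Build the provider-side resource names that end with the consumer name.

    The "<kind>-<consumer name>" pattern is taken from the two names quoted
    in this bug; it is not verified against other deployer versions.
    """
    return {
        "cephblockpool": f"cephblockpool-{consumer_name}",
        "cephfilesystemsubvolumegroup": (
            f"cephfilesystemsubvolumegroup-{consumer_name}"
        ),
    }


def ceph_clients_for_consumer(
    ceph_clients: Iterable[dict], consumer_name: str, annotation_key: str
) -> List[str]:
    """Return names of cephClients annotated with the given consumer name.

    annotation_key is a placeholder for the real key behind
    StorageConsumerAnnotation, which this bug does not spell out.
    """
    return [
        client["metadata"]["name"]
        for client in ceph_clients
        if (client["metadata"].get("annotations") or {}).get(annotation_key)
        == consumer_name
    ]
```

The helpers only report names and delete nothing, so the output can be reviewed before any manual cleanup on the provider.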