Created attachment 1701576
Backingstore-creation-UI

POD and PVC stay back in Terminating state on deleting a UI-based Backingstore (Provider:PVC)

Description of problem (please be as detailed as possible and provide log snippets):
----------------------------------------------------------------------
Created 2 Backingstores with Provider:PVC and SC = ocs-storagecluster-ceph-rbd, one from the UI and one from the CLI, then deleted both backingstores.

Observations:

1. On deletion of the BS created from the CLI: Backingstore, PVC and POD are successfully removed from the cluster.
2. On deletion of the BS created from the UI: the Backingstore gets deleted, but the POD and PVC stay back in Terminating state.

Snip of the outputs
**************************

Created 2 BS:
==================
Fri Jul 17 17:25:02 UTC 2020
========PVC============
nb-bs1-ui-noobaa-pvc-633bce14    Bound   pvc-75c6a31b-2960-4caa-8046-9448996e407b   50Gi   RWO   ocs-storagecluster-ceph-rbd   2m43s
nb-bs2-cli-noobaa-pvc-756bd564   Bound   pvc-814ed9cf-4f69-4b88-be94-5e6c2d3304a7   16Gi   RWO   ocs-storagecluster-ceph-rbd   3m23s
======POD=========
nb-bs1-ui-noobaa-pod-633bce14    1/1   Running   0   2m45s   10.129.2.12   compute-1   <none>   <none>
nb-bs2-cli-noobaa-pod-756bd564   1/1   Running   0   3m25s   10.129.2.11   compute-1   <none>   <none>
=======BS=====
NAME                           TYPE            PHASE   AGE
nb-bs1-ui                      pv-pool         Ready   2m49s
nb-bs2-cli                     pv-pool         Ready   3m31s
noobaa-default-backing-store   s3-compatible   Ready   11h
=====bucketclass==========
NAME                          PLACEMENT                                                        PHASE   AGE
noobaa-default-bucket-class   map[tiers:[map[backingStores:[noobaa-default-backing-store]]]]   Ready   11h

-----------------------------------
>> Deleted both the BS using the oc command. The pod and PVC for the UI-based BS are in Terminating state.
---------------------------------------

AFAIU, oc describe does not show any significant reason for this.
Fri Jul 17 17:43:03 UTC 2020
========PVC============
nb-bs1-ui-noobaa-pvc-633bce14   Terminating   pvc-75c6a31b-2960-4caa-8046-9448996e407b   50Gi   RWO   ocs-storagecluster-ceph-rbd   20m
======POD=========
nb-bs1-ui-noobaa-pod-633bce14   0/1   Terminating   0   20m   10.129.2.12   compute-1   <none>   <none>
=======BS=====
NAME                           TYPE            PHASE   AGE
noobaa-default-backing-store   s3-compatible   Ready   11h
=====bucketclass==========
NAME                          PLACEMENT                                                        PHASE   AGE
noobaa-default-bucket-class   map[tiers:[map[backingStores:[noobaa-default-backing-store]]]]   Ready   11h

Version of all relevant components (if applicable):
----------------------------------------------------------------------
Tested on 2 setups

OCS : ocs-operator.v4.5.0-493.ci
OCP : 4.5.0-0.nightly-2020-07-17-032241

INFO[0000] CLI version: 2.3.0
INFO[0000] noobaa-image: noobaa/noobaa-core:5.5.0-rc3
INFO[0000] operator-image: noobaa/noobaa-operator:2.3.0
INFO[0000] Namespace: openshift-storage

Ceph = 14.2.8-59.el8cp

Does this issue impact your ability to continue to work with the product (please explain in detail what is the user impact)?
----------------------------------------------------------------------
No.

Is there any workaround available to the best of your knowledge?
----------------------------------------------------------------------
Yes. Patch the pod to set its finalizers to null, e.g.:

$ oc patch pod/nb-bs1-ui-noobaa-pod-633bce14 -n openshift-storage --type=merge -p '{"metadata": {"finalizers":null}}'
pod/nb-bs1-ui-noobaa-pod-633bce14 patched

Rate from 1 - 5 the complexity of the scenario you performed that caused this bug (1 - very simple, 5 - very complex)?
----------------------------------------------------------------------
3

Is this issue reproducible?
----------------------------------------------------------------------
Yes. Tested it multiple times with different flows of events. Same outcome.

Can this issue reproduce from the UI?
----------------------------------------------------------------------
Yes

If this is a regression, please provide more details to justify this:
----------------------------------------------------------------------
Not sure

Steps to Reproduce:
----------------------------------------------------------------------
1. Create a BS with Provider:PVC from the CLI, using the noobaa CLI:

   </usr/local/bin/nooba-cli> backingstore create pv-pool nb-bs2-cli --num-volumes 1 --pv-size-gb 16 --storage-class ocs-storagecluster-ceph-rbd |tee cli-BS-creation.txt

2. Create a BS with Provider:PVC via the UI. Screenshot attached.
3. Check that both backingstores and their corresponding PVCs and PODs are created.
4. Delete the above 2 Backingstores, either from the CLI or the UI. (I tested both and observed the same behavior.)

   To delete from CLI: oc delete backingstore <BS name>
   To delete from UI: Installed Operators -> OCS Operator -> Backing Store -> click on the 3 dots against the Backingstore -> Delete backingstore

5. Check whether both backingstores and their corresponding PODs and PVCs are deleted successfully.

Actual results:
----------------------------------------------------------------------
The PODs and PVCs for the deleted UI-created BS stay back in Terminating state. Even force deletion of the pod does not work. One has to patch the pod's finalizers to null.

Expected results:
----------------------------------------------------------------------
Deletion behavior should be the same for backingstores created via UI and CLI. BS, pods and PVCs should be successfully removed upon deletion.

Additional info:
----------------------------------------------------------------------
The following combinations were tested and the outcome was the same: the PODs and PVCs stay stuck in Terminating state for the deleted UI-created BS.

1. Created 1 BS from CLI and 1 from UI. Deleted both from CLI.

Observation:
-------------------
a) Backingstores get deleted.
b) The PODs and PVCs belonging to the BS created from UI were stuck in Terminating state.

Versions:
OCS : ocs-operator.v4.5.0-493.ci
OCP : 4.5.0-0.nightly-2020-07-17-032241

2. Created 1 BS from CLI and 1 from UI. Deleted both Backingstores from the UI: Installed Operators -> BackingStore -> Delete Backing Store

Observation:
-------------------
a) Backingstores get deleted.
b) The PODs and PVCs belonging to the BS created from UI were stuck in Terminating state.

Versions:
OCS - 4.5.0-487.ci
OCP - 4.5.0-0.nightly-2020-07-14-213353
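The workaround in the description passes --type=merge with "finalizers": null. In a JSON merge patch (RFC 7386), a null value removes the key from the target instead of storing a null. The Python sketch below is a minimal illustration of that rule, not the code that oc or the API server runs. The merge_patch function is written here for illustration, and the finalizer value on the pod is a placeholder, since the report does not show which finalizer the stuck pod carried.

```python
def merge_patch(target, patch):
    """Apply a JSON merge patch (RFC 7386 semantics) and return a new object."""
    if not isinstance(patch, dict):
        # A non-object patch replaces the target wholesale.
        return patch
    result = dict(target) if isinstance(target, dict) else {}
    for key, value in patch.items():
        if value is None:
            # null in the patch means "delete this key".
            result.pop(key, None)
        else:
            result[key] = merge_patch(result.get(key), value)
    return result


# Placeholder pod metadata; the finalizer name is hypothetical.
pod = {
    "metadata": {
        "name": "nb-bs1-ui-noobaa-pod-633bce14",
        "finalizers": ["example.com/placeholder-finalizer"],
    }
}

patched = merge_patch(pod, {"metadata": {"finalizers": None}})
print(patched)
```

With the finalizers key gone, nothing holds the already-deleted pod any longer, which matches the "patched" result reported for the workaround.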
Proposing as a blocker for 4.5, since the PV backingstore is the default fallback and is going to be fully supported in this version.
It seems I found the reason for the problem.

The backing store created from the UI is missing some metadata: the noobaa finalizer, the noobaa label and an ownerRef. For the sake of this bug, the problem is the missing finalizer. Without it, deleting the backing store takes effect immediately, and the noobaa operator does not get the opportunity to run the steps needed to delete the resources behind the backing store.

Usually we add the finalizer during our reconcile loop, but it seems we are not handling the edge case that manifests in this bug. I will update the code to handle this edge case, issue an upstream PR, and update here.
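To make the mechanism above concrete, here is a minimal Python sketch of how Kubernetes treats finalizers on delete. It is a toy in-memory model, not the noobaa operator code: the Resource and ApiServer classes, the reconcile function and the cleaned_up list are hypothetical names chosen for illustration. Only the finalizer string noobaa.io/finalizer is taken from the BackingStore YAML posted later in this bug.

```python
from dataclasses import dataclass, field
from typing import Optional

FINALIZER = "noobaa.io/finalizer"  # name as it appears on the fixed BackingStore


@dataclass
class Resource:
    name: str
    finalizers: list = field(default_factory=list)
    deletion_timestamp: Optional[str] = None


class ApiServer:
    """Toy stand-in for the Kubernetes API server (an assumption, not client code)."""

    def __init__(self):
        self.objects = {}

    def create(self, res):
        self.objects[res.name] = res

    def delete(self, name):
        res = self.objects[name]
        if res.finalizers:
            # Finalizers present: only mark for deletion and wait for controllers.
            res.deletion_timestamp = "now"
        else:
            # No finalizers: the object is removed at once, no cleanup hook runs.
            del self.objects[name]

    def remove_finalizer(self, name, finalizer):
        res = self.objects[name]
        res.finalizers.remove(finalizer)
        if res.deletion_timestamp and not res.finalizers:
            del self.objects[name]


def reconcile(api, name, cleaned_up):
    """Hypothetical reconcile step: add the finalizer, or clean up on delete."""
    res = api.objects.get(name)
    if res is None:
        return  # object already gone, nothing left to act on
    if res.deletion_timestamp is None:
        if FINALIZER not in res.finalizers:
            res.finalizers.append(FINALIZER)
        return
    cleaned_up.append(name)  # stands in for releasing the pod and PVC
    api.remove_finalizer(name, FINALIZER)


def run(reconciled_before_delete):
    api, cleaned_up = ApiServer(), []
    api.create(Resource("nb-bs1-ui"))
    if reconciled_before_delete:
        reconcile(api, "nb-bs1-ui", cleaned_up)
    api.delete("nb-bs1-ui")
    reconcile(api, "nb-bs1-ui", cleaned_up)
    return cleaned_up


print(run(False), run(True))
```

In the first run the object has no finalizer when it is deleted, so the reconcile step never sees it and no cleanup happens. In the second run the finalizer holds the object until cleanup has finished.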
A PR with a fix was issued on the upstream project (see links section)
Created attachment 1703083
backingstore yamls

Verified that deletion of a Backingstore created from the UI is successful in:

OCS build - ocs-operator.v4.5.0-508.ci
OCP build - 4.5.0-0.nightly-2020-07-30-213620

$ ../nooba508 version
INFO[0000] CLI version: 2.3.0
INFO[0000] noobaa-image: noobaa/noobaa-core:5.5.0
INFO[0000] operator-image: noobaa/noobaa-operator:2.3.0

Tested on an External Mode cluster (OCP on VMware) and an Internal Mode cluster (AWS).

1. Created a BS with Provider:PVC from the CLI, using the noobaa CLI:

   </usr/local/bin/nooba-cli> backingstore create pv-pool nb-bs2-cli --num-volumes 1 --pv-size-gb 16 --storage-class ocs-storagecluster-ceph-rbd |tee cli-BS-creation.txt

2. Created a BS with Provider:PVC via the UI.
3. Checked that both backingstores and their corresponding PVCs and PODs are created.
4. Deleted the above 2 Backingstores, either from the CLI or the UI. (I tested both and observed the same behavior.)

Observation:
Backingstore deletion - success
POD and PVC deletion - success
___________________________________________________________________

Checked the BS created from the UI and it had the following new things (attached in the BZ):

1. Labels
2. Finalizers

@Ohad But the UI-based backingstore still doesn't have any OwnerReference set (Comment#4). Is this expected?
____________________________________________________________________________

After creation of 2 BS
--------------------------
========CSV ======
NAME                         DISPLAY                       VERSION        REPLACES   PHASE
ocs-operator.v4.5.0-508.ci   OpenShift Container Storage   4.5.0-508.ci              Succeeded
--------------
=======PODS ======
nb-bs2-cli-noobaa-pod-eb3d5c27   1/1   Running   0   14m   10.129.3.21    ip-10-0-135-115.us-east-2.compute.internal   <none>   <none>
nb-ui-bs1-noobaa-pod-e51a27c7    1/1   Running   0   14m   10.131.0.251   ip-10-0-180-109.us-east-2.compute.internal   <none>   <none>
--------------
======= PVC ==========
db-noobaa-db-0                   Bound   pvc-b3cecc87-2537-4d7d-93d2-276e4f21df7d   50Gi   RWO   ocs-storagecluster-ceph-rbd   4h16m
nb-bs2-cli-noobaa-pvc-eb3d5c27   Bound   pvc-e3530a31-a2f9-4e87-b202-33b422ef2c93   16Gi   RWO   ocs-storagecluster-ceph-rbd   14m
nb-ui-bs1-noobaa-pvc-e51a27c7    Bound   pvc-99c20328-0dfa-415f-aba1-c0f37eb8f532   50Gi   RWO   ocs-storagecluster-ceph-rbd   14m
--------------
======= backingstore ==========
NAME         TYPE      PHASE   AGE
nb-bs2-cli   pv-pool   Ready   14m
nb-ui-bs1    pv-pool   Ready   14m

After deletion of the 2 BS
---------------------------
$ oc delete backingstore nb-ui-bs1
backingstore.noobaa.io "nb-ui-bs1" deleted
$ oc delete backingstore nb-bs2-cli
backingstore.noobaa.io "nb-bs2-cli" deleted

$ oc get pods -o wide -n openshift-storage|grep nobbaa-pod
[nberry@localhost akrai]$ oc get pvc -o wide -n openshift-storage|grep nobbaa-pvc
[nberry@localhost akrai]$ oc get backingstore -o wide -n openshift-storage
NAME                           TYPE     PHASE   AGE
noobaa-default-backing-store   aws-s3   Ready   4h34m
__________________________________________________________________

---
apiVersion: noobaa.io/v1alpha1
kind: BackingStore
metadata:
  creationTimestamp: "2020-07-31T09:13:38Z"
  finalizers:
  - noobaa.io/finalizer
  generation: 2
  labels:
    app: noobaa
  managedFields:
  - apiVersion: noobaa.io/v1alpha1
    fieldsType: FieldsV1
    fieldsV1:
      f:spec:
        .: {}
        f:pvPool:
          .: {}
          f:numVolumes: {}
          f:resources:
            .: {}
            f:requests:
              .: {}
              f:storage: {}
          f:storageClass: {}
        f:type: {}
    manager: Mozilla
    operation: Update
    time: "2020-07-31T09:13:38Z"
  - apiVersion: noobaa.io/v1alpha1
    fieldsType: FieldsV1
    fieldsV1:
      f:metadata:
        f:finalizers: {}
        f:labels:
          .: {}
          f:app: {}
      f:spec:
        f:pvPool:
          f:secret: {}
      f:status:
        .: {}
        f:conditions: {}
        f:mode:
          .: {}
          f:modeCode: {}
          f:timeStamp: {}
        f:phase: {}
    manager: noobaa-operator
    operation: Update
    time: "2020-07-31T09:20:27Z"
  name: nb-ui-bs1
  namespace: openshift-storage
  resourceVersion: "225075"
  selfLink: /apis/noobaa.io/v1alpha1/namespaces/openshift-storage/backingstores/nb-ui-bs1
  uid: 0f3a3cb9-a622-49f9-922e-872be1a6c8d8
spec:
  pvPool:
    numVolumes: 1
    resources:
      requests:
        storage: 50Gi
    secret: {}
___________________________________________________________________________________________________________________
One small query for Ohad.

(In reply to Ohad from comment #4)
> It seems I found the reason for the problem
>
> The Backing store created from the UI is missing some metadata, the noobaa
> finalizer, the noobaa label and an ownerRef.

Hi Ohad, is there a plan to add an OwnerReference for the BS created from the UI in the next release? That is still a difference we can see between the CLI-based and UI-based yamls, hence wanted to confirm.
Verified based on Comment#9. If a new bug for OwnerReference is needed, I shall raise one based on Ohad's reply in Comment#10.
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory (Red Hat OpenShift Container Storage 4.5.0 bug fix and enhancement update), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHBA-2020:3754