Bug 1858385 - POD and PVC stay back in terminating state on deleting a UI-based-Backingstore (Provider:PVC)
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat OpenShift Container Storage
Classification: Red Hat Storage
Component: Multi-Cloud Object Gateway
Version: 4.5
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: high
Target Milestone: ---
Target Release: OCS 4.5.0
Assignee: Ohad
QA Contact: Neha Berry
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2020-07-17 18:11 UTC by Neha Berry
Modified: 2020-09-23 09:06 UTC

Fixed In Version: v4.5.0-30.ci
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2020-09-15 10:18:18 UTC
Embargoed:


Attachments
Backingstore-creation-UI (118.40 KB, image/png)
2020-07-17 18:11 UTC, Neha Berry
backingstore yamls (3.03 KB, application/zip)
2020-07-31 09:52 UTC, Neha Berry


Links
System ID Private Priority Status Summary Last Updated
Github noobaa noobaa-operator pull 372 0 None closed Remove app label and finalizer from CR templates and add them in code 2020-09-08 05:21:06 UTC
Github noobaa noobaa-operator pull 373 0 None closed Remove app label and finalizer from CR templates and add them in code 2020-09-08 05:21:06 UTC
Red Hat Product Errata RHBA-2020:3754 0 None None None 2020-09-15 10:18:47 UTC

Description Neha Berry 2020-07-17 18:11:51 UTC
Created attachment 1701576 [details]
Backingstore-creation-UI

POD and PVC stay back in terminating state on deleting a UI-based-Backingstore (Provider:PVC) 

Description of problem (please be as detailed as possible and provide log
snippets):
----------------------------------------------------------------------
Created 2 Backingstores with Provider:PVC and SC = ocs-storagecluster-ceph-rbd , one from UI and one from CLI

Then deleted both the backingstores. Following are some of the observations:

1. On deletion of BS created from CLI : Backingstore, PVC and POD are successfully removed from the cluster

2. On deletion of BS created from UI: Backingstore gets deleted. But the POD and PVC stay back in Terminating state.


Snip of the outputs
**************************

Created 2 BS:
==================
Fri Jul 17 17:25:02 UTC 2020
========PVC============
nb-bs1-ui-noobaa-pvc-633bce14    Bound    pvc-75c6a31b-2960-4caa-8046-9448996e407b   50Gi       RWO            ocs-storagecluster-ceph-rbd   2m43s
nb-bs2-cli-noobaa-pvc-756bd564   Bound    pvc-814ed9cf-4f69-4b88-be94-5e6c2d3304a7   16Gi       RWO            ocs-storagecluster-ceph-rbd   3m23s

======POD=========
nb-bs1-ui-noobaa-pod-633bce14                                     1/1     Running   0          2m45s   10.129.2.12   compute-1   <none>           <none>
nb-bs2-cli-noobaa-pod-756bd564                                    1/1     Running   0          3m25s   10.129.2.11   compute-1   <none>           <none>

=======BS=====
NAME                           TYPE            PHASE   AGE
nb-bs1-ui                      pv-pool         Ready   2m49s
nb-bs2-cli                     pv-pool         Ready   3m31s
noobaa-default-backing-store   s3-compatible   Ready   11h

=====bucketclass==========
NAME                          PLACEMENT                                                        PHASE   AGE
noobaa-default-bucket-class   map[tiers:[map[backingStores:[noobaa-default-backing-store]]]]   Ready   11h

-----------------------------------


>> Deleted both the BS using the oc command. The pod and PVC for the UI-based BS are in Terminating state.
---------------------------------------
AFAIU, oc describe does not show any significant reason for this.


Fri Jul 17 17:43:03 UTC 2020
========PVC============
nb-bs1-ui-noobaa-pvc-633bce14   Terminating   pvc-75c6a31b-2960-4caa-8046-9448996e407b   50Gi       RWO            ocs-storagecluster-ceph-rbd   20m

======POD=========
nb-bs1-ui-noobaa-pod-633bce14                                     0/1     Terminating   0          20m   10.129.2.12   compute-1   <none>           <none>

=======BS=====

NAME                           TYPE            PHASE   AGE
noobaa-default-backing-store   s3-compatible   Ready   11h


=====bucketclass==========
NAME                          PLACEMENT                                                        PHASE   AGE
noobaa-default-bucket-class   map[tiers:[map[backingStores:[noobaa-default-backing-store]]]]   Ready   11h


Version of all relevant components (if applicable):
----------------------------------------------------------------------
Tested on 2 setups

OCS : ocs-operator.v4.5.0-493.ci    
OCP : 4.5.0-0.nightly-2020-07-17-032241 

INFO[0000] CLI version: 2.3.0                           
INFO[0000] noobaa-image: noobaa/noobaa-core:5.5.0-rc3   
INFO[0000] operator-image: noobaa/noobaa-operator:2.3.0 
INFO[0000] Namespace: openshift-storage  

Ceph = 14.2.8-59.el8cp

Does this issue impact your ability to continue to work with the product
(please explain in detail what is the user impact)?
----------------------------------------------------------------------
No.

Is there any workaround available to the best of your knowledge?
----------------------------------------------------------------------
Yes. Patch the pod to set its finalizers to null, e.g.:
$ oc patch pod/nb-bs1-ui-noobaa-pod-633bce14 -n openshift-storage  --type=merge -p '{"metadata": {"finalizers":null}}'
pod/nb-bs1-ui-noobaa-pod-633bce14 patched
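
Before clearing finalizers, it may help to see what is actually holding the pod. The sketch below is a hypothetical helper (the name show_finalizers is an assumption, not part of oc or the noobaa CLI) that prints the pod's deletionTimestamp and finalizers through oc's jsonpath output. The default namespace is the one used in this report.

```shell
# Hypothetical helper: show why a pod may be stuck in Terminating.
# $1 = pod name, $2 = namespace (defaults to openshift-storage, as in this report)
show_finalizers() {
  # A set deletionTimestamp plus a non-empty finalizers list means the API
  # server is waiting for some controller to remove those finalizers.
  oc get pod "$1" -n "${2:-openshift-storage}" \
    -o jsonpath='{.metadata.deletionTimestamp}{"\n"}{.metadata.finalizers}{"\n"}'
}
```

Example call: show_finalizers nb-bs1-ui-noobaa-pod-633bce14. If a finalizer is listed and no controller is going to remove it, the merge patch above is what clears it.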




Rate from 1 - 5 the complexity of the scenario you performed that caused this
bug (1 - very simple, 5 - very complex)?
----------------------------------------------------------------------
3

Is this issue reproducible?
----------------------------------------------------------------------
Yes. Tested it multiple times with different flows of events, with the same outcome.

Can this issue reproduce from the UI?
----------------------------------------------------------------------
Yes

If this is a regression, please provide more details to justify this:
----------------------------------------------------------------------
Not sure

Steps to Reproduce:
----------------------------------------------------------------------
1. Create a BS with Provider:PVC from CLI with noobaa CLI
 </usr/local/bin/nooba-cli> backingstore create pv-pool nb-bs2-cli --num-volumes 1 --pv-size-gb 16 --storage-class ocs-storagecluster-ceph-rbd |tee cli-BS-creation.txt

2. Create a BS with Provider:PVC via UI (see attached screenshot).

3. Check both the backingstores and their corresponding PVC and PODs are created.

4. Delete the above 2 Backingstores, either from CLI or UI. (I tested both and observed the same behavior.)

To delete from CLI : oc delete backingstore <BS name>

To delete from UI: Installed Operators->OCS Operator-> Backing Store-> Click on 3 dots against the Backingstores -> Delete backingstore

5. Check if both the backingstore and their corresponding PODs and PVCs are deleted successfully.
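
The check in step 5 can be narrowed to the resources that are stuck. The following is a minimal sketch, assuming the default column layout of oc get (STATUS is the third column for pods and the second for PVCs). The function name stuck_terminating is hypothetical.

```shell
# Hypothetical filter: read `oc get` rows on stdin and print the names of
# resources whose STATUS column is Terminating.
# Pods:  NAME READY STATUS ...   (STATUS is column 3)
# PVCs:  NAME STATUS VOLUME ...  (STATUS is column 2)
stuck_terminating() {
  awk '$2 == "Terminating" || $3 == "Terminating" { print $1 }'
}
```

Example call: oc get pod,pvc -n openshift-storage --no-headers | stuck_terminating. An empty result after the backingstores are deleted is the expected outcome.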

Actual results:
----------------------------------------------------------------------
The PODs and PVCs for the deleted UI-created BS stay in Terminating state. Even force deletion of the pod does not work. One has to patch the pod's finalizers to null.


Expected results:
----------------------------------------------------------------------
Deletion Behavior should be the same for backingstores created via UI and CLI. BS, pods and PVCs should be successfully removed upon deletion.

Additional info:
----------------------------------------------------------------------
The following combinations were tested and the outcome was the same: the PODs and PVCs were stuck in Terminating state for the deleted UI-created BS.

1. Created 1 BS from CLI and 1 from UI . Deleted both from CLI. 

Observation:
-------------------
a) Backingstores get deleted.
b) The PODs and PVCs belonging to the BS created from UI were stuck in terminating state.

Versions:

OCS : ocs-operator.v4.5.0-493.ci    
OCP : 4.5.0-0.nightly-2020-07-17-032241 



2. Created 1 BS from CLI and 1 from UI. Deleted both the Backingstores from UI: Installed Operators->BackingStore->Delete Backing Store

Observation:
-------------------

a) Backingstores get deleted.
b) The PODs and PVCs belonging to the BS created from UI were stuck in terminating state.

Versions:
OCS -4.5.0-487.ci
OCP - 4.5.0-0.nightly-2020-07-14-213353

Comment 2 Elad 2020-07-17 18:16:45 UTC
Proposing as a blocker for 4.5 as PV backingstore is the default fallback and as it is going to be fully supported in this version.

Comment 4 Ohad 2020-07-19 12:12:32 UTC
It seems I found the reason for the problem

The backing store created from the UI is missing some metadata: the noobaa finalizer, the noobaa label, and an ownerRef.
For the sake of this bug, the problem is the missing finalizer. Without it, deleting the backing store takes effect immediately, and the noobaa operator does not have the opportunity to run the steps required to allow deletion of the resources behind the backing store.

Usually we add the finalizer during our reconcile loop, but it seems we are not handling the edge case manifested in this bug.
I will update the code to handle this edge case, issue an upstream PR, and update here.
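
The three metadata fields named in this comment can be compared between a CLI-created and a UI-created backing store. This is a minimal sketch using oc's jsonpath output. The function name bs_metadata is hypothetical, and the default namespace is the one used in this report.

```shell
# Hypothetical helper: print the finalizers, labels and ownerReferences of a
# BackingStore so a UI-created and a CLI-created one can be compared.
# $1 = backingstore name, $2 = namespace (defaults to openshift-storage)
bs_metadata() {
  # Fields that are absent from the object print as empty values.
  oc get backingstore "$1" -n "${2:-openshift-storage}" \
    -o jsonpath='finalizers: {.metadata.finalizers}{"\n"}labels: {.metadata.labels}{"\n"}ownerReferences: {.metadata.ownerReferences}{"\n"}'
}
```

Example calls: bs_metadata nb-bs1-ui and bs_metadata nb-bs2-cli. An empty finalizers line on the UI-created store would match the diagnosis above.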

Comment 5 Ohad 2020-07-19 20:15:49 UTC
A PR with a fix was issued on the upstream project (see links section)

Comment 9 Neha Berry 2020-07-31 09:52:34 UTC
Created attachment 1703083 [details]
backingstore yamls

Verified the deletion of Backingstore created from UI is successful in OCS build - ocs-operator.v4.5.0-508.ci

OCP build - 4.5.0-0.nightly-2020-07-30-213620

$ ../nooba508 version
INFO[0000] CLI version: 2.3.0                           
INFO[0000] noobaa-image: noobaa/noobaa-core:5.5.0       
INFO[0000] operator-image: noobaa/noobaa-operator:2.3.0 




Tested on an External Mode cluster (OCP on VMware) and an Internal Mode cluster (AWS)

1. Created a BS with Provider:PVC from CLI with noobaa CLI
 </usr/local/bin/nooba-cli> backingstore create pv-pool nb-bs2-cli --num-volumes 1 --pv-size-gb 16 --storage-class ocs-storagecluster-ceph-rbd |tee cli-BS-creation.txt

2. Created a BS with Provider:PVC via UI. 
3. Checked both the backingstores and their corresponding PVC and PODs are created.

4. Deleted the above 2 Backingstores, either from CLI or UI. (I tested both and observed the same behavior.)

Observation:

Backingstore deletion - Success
Pod and PVC deletion - Success

___________________________________________________________________

Checked the BS created from the UI and it had the following new fields (attached in the BZ):

1.  Labels
2. Finalizers

@Ohad But the UI-based backingstore still doesn't have any OwnerReference set (Comment#4). Is this expected?





____________________________________________________________________________

After creation of 2 BS
--------------------------

========CSV ======
NAME                         DISPLAY                       VERSION        REPLACES   PHASE
ocs-operator.v4.5.0-508.ci   OpenShift Container Storage   4.5.0-508.ci              Succeeded
--------------
=======PODS ======
nb-bs2-cli-noobaa-pod-eb3d5c27                                    1/1     Running     0          14m     10.129.3.21    ip-10-0-135-115.us-east-2.compute.internal   <none>           <none>
nb-ui-bs1-noobaa-pod-e51a27c7                                     1/1     Running     0          14m     10.131.0.251   ip-10-0-180-109.us-east-2.compute.internal   <none>           <none>

--------------
======= PVC ==========
db-noobaa-db-0                   Bound    pvc-b3cecc87-2537-4d7d-93d2-276e4f21df7d   50Gi       RWO            ocs-storagecluster-ceph-rbd   4h16m
nb-bs2-cli-noobaa-pvc-eb3d5c27   Bound    pvc-e3530a31-a2f9-4e87-b202-33b422ef2c93   16Gi       RWO            ocs-storagecluster-ceph-rbd   14m
nb-ui-bs1-noobaa-pvc-e51a27c7    Bound    pvc-99c20328-0dfa-415f-aba1-c0f37eb8f532   50Gi       RWO            ocs-storagecluster-ceph-rbd   14m
--------------

======= backingstore ==========
NAME                           TYPE      PHASE   AGE
nb-bs2-cli                     pv-pool   Ready   14m
nb-ui-bs1                      pv-pool   Ready   14m


After deletion of the 2 BS
---------------------------

$  oc delete backingstore nb-ui-bs1
backingstore.noobaa.io "nb-ui-bs1" deleted


$  oc delete backingstore nb-bs2-cli
backingstore.noobaa.io "nb-bs2-cli" deleted

$ oc get pods -o wide -n openshift-storage|grep nobbaa-pod
[nberry@localhost akrai]$ oc get pvc -o wide -n openshift-storage|grep nobbaa-pvc
[nberry@localhost akrai]$ oc get backingstore -o wide -n openshift-storage
NAME                           TYPE     PHASE   AGE
noobaa-default-backing-store   aws-s3   Ready   4h34m


__________________________________________________________________
---
apiVersion: noobaa.io/v1alpha1
kind: BackingStore
metadata:
  creationTimestamp: "2020-07-31T09:13:38Z"
  finalizers:
  - noobaa.io/finalizer
  generation: 2
  labels:
    app: noobaa
  managedFields:
  - apiVersion: noobaa.io/v1alpha1
    fieldsType: FieldsV1
    fieldsV1:
      f:spec:
        .: {}
        f:pvPool:
          .: {}
          f:numVolumes: {}
          f:resources:
            .: {}
            f:requests:
              .: {}
              f:storage: {}
          f:storageClass: {}
        f:type: {}
    manager: Mozilla
    operation: Update
    time: "2020-07-31T09:13:38Z"
  - apiVersion: noobaa.io/v1alpha1
    fieldsType: FieldsV1
    fieldsV1:
      f:metadata:
        f:finalizers: {}
        f:labels:
          .: {}
          f:app: {}
      f:spec:
        f:pvPool:
          f:secret: {}
      f:status:
        .: {}
        f:conditions: {}
        f:mode:
          .: {}
          f:modeCode: {}
          f:timeStamp: {}
        f:phase: {}
    manager: noobaa-operator
    operation: Update
    time: "2020-07-31T09:20:27Z"
  name: nb-ui-bs1
  namespace: openshift-storage
  resourceVersion: "225075"
  selfLink: /apis/noobaa.io/v1alpha1/namespaces/openshift-storage/backingstores/nb-ui-bs1
  uid: 0f3a3cb9-a622-49f9-922e-872be1a6c8d8
spec:
  pvPool:
    numVolumes: 1
    resources:
      requests:
        storage: 50Gi
    secret: {}

___________________________________________________________________________________________________________________

Comment 10 Neha Berry 2020-07-31 09:57:47 UTC
One small query for Ohad (In reply to Ohad from comment #4)
> It seems I found the reason for the problem
> 
> The Backing store created from the UI is missing some metadata, the noobaa
> finalizer, the noobaa label and an ownerRef.

Hi Ohad, is there a plan to add an OwnerReference for the BS created from the UI in the next release? That is still a difference we can see between the CLI-based and UI-based yamls, hence I wanted to confirm.

Comment 11 Neha Berry 2020-07-31 10:00:27 UTC
Verified based on Comment#9. If a new bug for OwnerReference is needed, shall raise one based on Ohad's reply in Comment#10

Comment 14 errata-xmlrpc 2020-09-15 10:18:18 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Red Hat OpenShift Container Storage 4.5.0 bug fix and enhancement update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2020:3754

