Bug 1820238 - Manila shares are not deleted when the cluster is destroyed
Summary: Manila shares are not deleted when the cluster is destroyed
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: Storage
Version: 4.6
Hardware: Unspecified
OS: Unspecified
Priority: medium
Severity: medium
Target Milestone: ---
Target Release: 4.8.z
Assignee: Matthew Booth
QA Contact: rlobillo
URL:
Whiteboard:
Depends On:
Blocks:
 
Reported: 2020-04-02 14:59 UTC by Mike Fedosin
Modified: 2021-10-27 08:15 UTC
CC: 11 users

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2021-10-27 08:15:28 UTC
Target Upstream Version:




Links
System ID Private Priority Status Summary Last Updated
Github openshift cloud-provider-openstack pull 38 0 None Merged Bug 1820238: Fetch latest changes from upstream 2021-10-05 09:27:20 UTC
Github openshift csi-driver-manila-operator pull 73 0 None Merged Bug 1820238: add metadata with cluster ID to generated storage classes 2021-10-05 09:27:19 UTC
Github openshift csi-driver-manila-operator pull 98 0 None Merged Bug 1820238: add cluster id to share metadata 2021-10-05 09:27:15 UTC
Github openshift installer pull 4952 0 None Merged Bug 1820238: delete manila shares and snapshots along with the cluster 2021-10-05 09:27:23 UTC
Red Hat Product Errata RHBA-2021:3927 0 None None None 2021-10-27 08:15:53 UTC

Description Mike Fedosin 2020-04-02 14:59:25 UTC
When I remove the cluster, all previously created Manila shares remain in the system.

Comment 1 Pierre Prinetti 2020-05-07 14:28:44 UTC
The team considers this bug as valid. Considering this bug priority and our capacity, we are deferring this bug to an upcoming sprint. If there are reasons for us to reprioritise, please let us know.

Comment 2 Pierre Prinetti 2020-05-14 14:09:07 UTC
Considering the priority assigned to this bug and our team capacity, we are deferring this bug to an upcoming sprint. Please let us know if there are reasons for us to reprioritize.

Comment 4 Pierre Prinetti 2020-06-18 14:38:08 UTC
This known issue has been documented in 4.5[1]. A proper fix is planned for 4.6.

[1]: https://github.com/openshift/openshift-docs/pull/22199

Comment 5 Martin André 2020-06-25 14:38:02 UTC
Considering the priority assigned to this bug and our team capacity, we are deferring this bug to an upcoming sprint. Please let us know if there are reasons for us to reprioritize.

Planned for 4.6.

Comment 7 Jan Safranek 2020-09-09 13:39:27 UTC
We need to take a holistic approach to what happens to volumes / snapshots when a cluster is destroyed. Volumes of some in-tree volume plugins are destroyed (AWS); some are not (vSphere?).

Comment 8 Jan Safranek 2020-09-17 15:59:32 UTC
Result: 'openshift-install destroy cluster' should delete volumes & snapshots provisioned in the cluster.

Therefore we need:
1. The CSI driver to label snapshots accordingly.
2. openshift-install to list & delete them (this may be already implemented for Cinder).

Comment 9 Martin André 2020-12-10 09:09:52 UTC
We're missing a patch in the installer to properly set metadata on volumes created by Manila.

Comment 11 Wei Duan 2021-01-13 12:48:51 UTC
Currently, something goes wrong when getting the OCP ClusterID into the storage class, so the Manila shares are not removed when the cluster is destroyed.

parameters:
  appendShareMetadata: '{"openshiftClusterID": ""}'

Assign it back.
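For comparison, a correctly rendered storage class would carry the real cluster ID in that parameter. The value below is illustrative only (borrowed from the infra ID format seen later in this bug), not the actual format of the field:

```yaml
# Expected shape once the operator resolves the cluster ID correctly;
# "ostest-pghlt" is only an example value.
parameters:
  appendShareMetadata: '{"openshiftClusterID": "ostest-pghlt"}'
```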

Comment 15 Tom Barron 2021-05-21 13:15:08 UTC
I wonder if this behavior should be configurable.  I recall discussion upstream of use cases where a user wants to tear down their K8s cluster and then bring up a new one later using persistent storage from the original cluster.  This could make sense (in concert with a Retain reclaim policy) in clouds where there may be economies in only paying for Compute and Network when there is work to do but paying all the time for Storage to keep state in the idle intervals.

Do other provisioners all remove all volumes and snapshots when the k8s cluster is torn down?

Comment 21 Pierre Prinetti 2021-10-05 09:14:08 UTC
This bug needs a new round of triage.

Resetting Bugzilla keywords to retrigger pre-triage assignment.

Comment 26 Matthew Booth 2021-10-05 18:36:23 UTC
If this fails QA again, could you please also include the output of cluster destroy?

Comment 27 Matthew Booth 2021-10-06 16:02:17 UTC
manila-csi-driver has supported --cluster-id since 4.8
csi-driver-manila-operator has added --cluster-id to manila-csi-driver since 4.8
installer has supported deleting manila shares by tag since 4.8

The bug should be fixed in 4.8. I verified that it is fixed in 4.9.

With this bugfix, all manila volumes created after the fix was applied will be cleaned up by the installer when the cluster is destroyed, regardless of storageclass. Manila volumes created before the fix was applied will not be cleaned up by the installer.
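A quick way to check for the fix on a live cluster is to look for the `--cluster-id` flag on the deployed driver. The namespace and resource kinds below are assumptions based on the operator's defaults; adjust as needed:

```shell
# Look for the --cluster-id argument on the Manila CSI driver workloads.
# The namespace name is an assumption; adjust if the operator deploys elsewhere.
oc -n openshift-manila-csi-driver get deployment,daemonset -o yaml \
  | grep -E -- '--cluster-id' \
  && echo "driver is started with --cluster-id"
```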

Comment 28 Max Bridges 2021-10-06 19:02:10 UTC
To be clear, this was never listed as a "Known Issue" in the RNs for any version?

Comment 30 Emilien Macchi 2021-10-13 15:22:29 UTC
(In reply to Max Bridges from comment #28)
> To be clear, this was never listed as a "Known Issue" in the RNs for any
> version?

Right, it was never documented. We need to do it for 4.7, 4.6, 4.5.

And if possible, QE should verify it on 4.8, 4.9, etc., where the code was implemented.

Thanks

Comment 31 rlobillo 2021-10-19 17:27:04 UTC
Verified on OCP4.8.15 on top of RHOS-16.1-RHEL-8-20210818.n.0 with Manila enabled. 

1. Action: Install cluster with IPI 

DEBUG Time elapsed per stage:                                                                                      
DEBUG     Infrastructure: 1m52s                                                                                       
DEBUG Bootstrap Complete: 22m34s                                                                                                          
DEBUG                API: 3m53s                                                                                                  
DEBUG  Bootstrap Destroy: 33s                                                                                                    
DEBUG  Cluster Operators: 23m9s                                                                                    
INFO Time elapsed: 49m4s                                                         

$ oc get clusterversion
NAME      VERSION   AVAILABLE   PROGRESSING   SINCE   STATUS
version   4.8.15    True        False         20m     Cluster version is 4.8.15


2. Action: run:

$ sh multiple_pvc_manila.sh

where multiple_pvc_manila.sh contains: 

oc new-project manila-test
for i in {1..49}
do
  cat <<EOF | oc apply -f -
---
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: csi-manila-$i
parameters:
  csi.storage.k8s.io/node-publish-secret-name: csi-manila-secrets
  csi.storage.k8s.io/node-publish-secret-namespace: openshift-manila-csi-driver
  csi.storage.k8s.io/node-stage-secret-name: csi-manila-secrets
  csi.storage.k8s.io/node-stage-secret-namespace: openshift-manila-csi-driver
  csi.storage.k8s.io/provisioner-secret-name: csi-manila-secrets
  csi.storage.k8s.io/provisioner-secret-namespace: openshift-manila-csi-driver
  type: default
provisioner: manila.csi.openstack.org
reclaimPolicy: Delete
volumeBindingMode: Immediate
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: "pvc-manila-$i"
  namespace: "manila-test"
spec:
  accessModes:
  - ReadWriteOnce
  resources:
    requests:
      storage: 1Gi
  storageClassName: csi-manila-$i
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: demo-$i
  namespace: "manila-test"
spec:
  replicas: 1
  selector:
    matchLabels:
      app: demo-$i
  template:
    metadata:
      labels:
        app: demo-$i
    spec:
      containers:
      - name: demo
        image: quay.io/kuryr/demo
        ports:
        - containerPort: 80
          protocol: TCP
        volumeMounts:
          - mountPath: /var/lib/www/data
            name: mydata
      volumes:
        - name: mydata
          persistentVolumeClaim:
            claimName: pvc-manila-$i
            readOnly: false
EOF
done

3. Checks:
- Confirm that the SCs are created.

$ oc get sc | grep -e csi-manila-[0-9] | wc -l
50

- Confirm that the PVCs are created and Bound.

$ oc get pvc -n manila-test | grep Bound | grep -e csi-manila-[0-9]  | wc -l
50


- Confirm that pods are up and running.

$ oc get pods -n manila-test | grep Running | wc -l
50

- Confirm that shares are created on OSP (manila list) and they have the expected metadata:

(shiftstack) $ for i in $(manila list --columns ID | grep -v -e Id -e + | awk -F\| '{print $2}'); do manila metadata-show $i; done | grep -v -e Property -e + | sort | uniq -c
     50 | manila.csi.openstack.org/cluster | ostest-pghlt |


4. Action: Manually create an extra share through the OSP API:

(shiftstack) $ manila create NFS 1 --name personal_nfs --metadata personal=true


5. Action: Destroy the OCP cluster and check:
- Confirm that the shares created by OCP were removed from OSP.
- Confirm that only the manually created share persists.

DEBUG Purging asset "Metadata" from disk           
DEBUG Purging asset "Master Ignition Customization Check" from disk 
DEBUG Purging asset "Worker Ignition Customization Check" from disk 
DEBUG Purging asset "Terraform Variables" from disk 
DEBUG Purging asset "Kubeconfig Admin Client" from disk 
DEBUG Purging asset "Kubeadmin Password" from disk 
DEBUG Purging asset "Certificate (journal-gatewayd)" from disk 
DEBUG Purging asset "Cluster" from disk            
INFO Time elapsed: 1m10s                          

(shiftstack) $ for i in $(manila list --columns ID | grep -v -e Id -e + | awk -F\| '{print $2}'); do manila metadata-show $i; done | grep -v -e Property -e + | sort | uniq -c
      1 | personal | true  |


6. Remove the manually created share:

(shiftstack) $ manila delete personal_nfs
(shiftstack) $ manila list
+----+------+------+-------------+--------+-----------+-----------------+------+-------------------+
| ID | Name | Size | Share Proto | Status | Is Public | Share Type Name | Host | Availability Zone |
+----+------+------+-------------+--------+-----------+-----------------+------+-------------------+
+----+------+------+-------------+--------+-----------+-----------------+------+-------------------+

Comment 34 errata-xmlrpc 2021-10-27 08:15:28 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (OpenShift Container Platform 4.8.17 bug fix update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2021:3927

