Bug 2313203

Summary: After deleting the pods and PVCs on a 4.16 hosted cluster, the associated PVs are stuck in a Terminating state instead of being deleted
Product: [Red Hat Storage] Red Hat OpenShift Data Foundation Reporter: Itzhak <ikave>
Component: ceph-csi-operator Assignee: Leela Venkaiah Gangavarapu <lgangava>
Status: CLOSED ERRATA QA Contact: Itzhak <ikave>
Severity: high Docs Contact:
Priority: unspecified    
Version: 4.17 CC: jijoy, lgangava, muagarwa, nberry, odf-bz-bot, omitrani
Target Milestone: ---   
Target Release: ODF 4.17.0   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard: isf-provider
Fixed In Version: 4.17.0-105 Doc Type: No Doc Update
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2024-10-30 14:35:36 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:

Description Itzhak 2024-09-18 10:11:22 UTC
Description of problem:

As part of the test, we created PVCs, PVs, and pods associated with the PVCs on one of the 4.16-hosted clusters. After the test, we deleted the pods and PVCs, but the related PVs were stuck in a Terminating state instead of being deleted.

Version-Release number of selected component (if applicable):
Provider OCP/ODF 4.17 and hosted cluster OCP 4.16.

How reproducible:
Create PVCs, PVs, and pods associated with the PVCs on one of the 4.16-hosted clusters.
Delete the pods and PVCs and observe that the PVs are stuck in a Terminating state.
$ oc get pv | grep test | head -n 5
pvc-008acf6c-eba1-41f4-bffd-2a0a70d37081   20Gi       RWO            Delete           Terminating   namespace-test-157145d4389a470a862bdbe5b/pvc-test-9aa2739045104a3fbcddf53f44b0ed7   storage-client-ceph-rbd   <unset>                          33m
pvc-02895d5a-c3ad-4440-9d44-9c697a4748a2   25Gi       RWX            Delete           Terminating   namespace-test-157145d4389a470a862bdbe5b/pvc-test-c854bc87139046ff94bc350d170dc26   storage-client-cephfs     <unset>                          33m
pvc-045ea2d1-9502-424d-90de-8d97827384c3   25Gi       RWX            Delete           Terminating   namespace-test-157145d4389a470a862bdbe5b/pvc-test-158b97c8b62a42efb7d47bb880ed169   storage-client-cephfs     <unset>                          33m
pvc-0794ad8b-1ef8-46a2-9598-e450035591ec   5Gi        RWO            Delete           Terminating   namespace-test-f8459ec7d759437ca241c39a6/pvc-test-dec9a24095d24acb93abf68d874cfce   storage-client-ceph-rbd   <unset>                          3h30m
pvc-081a7303-e139-4a5b-ae5d-9bccec3b54e9   1Gi        RWO            Delete           Terminating   namespace-test-b1fe96d84f3345cbb5f88e226/pvc-test-1233a9b40a95417b9dfd3a2b1d028e6   storage-client-ceph-rbd   <unset>                          4h1m


Steps to Reproduce:
1. Create PVCs, PVs, and pods associated with the PVCs on one of the 4.16-hosted clusters.
2. Delete the pods and PVCs.
3. Check the state of the PVs.
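
Step 3 can be scripted along these lines. This is only a sketch: the function name terminating_pvs is hypothetical, and it assumes the default `oc get pv` table layout shown above, where STATUS is the fifth whitespace-separated field in each data row.

```shell
# Hypothetical helper: reads `oc get pv` table output on stdin and prints
# the names of PVs whose STATUS field (5th field in data rows) is Terminating.
# The header row is skipped naturally because its 5th field is not "Terminating".
terminating_pvs() {
    awk '$5 == "Terminating" { print $1 }'
}

# Example (requires cluster access):
#   oc get pv | terminating_pvs
```

For any PV name it prints, `oc get pv <name> -o jsonpath='{.metadata.finalizers}'` shows which finalizers are still set on the object, which may help narrow down what is holding the deletion.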

Actual results:
The PVs are stuck in a Terminating state.

Expected results:
The PVs should be deleted successfully.
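
One way to check this expectation in a script is to poll until no PV is reported as Terminating. The sketch below is not taken from the test suite: the function name wait_for_pv_cleanup, the OC_CMD override, and the default timeout and interval are all assumptions made for illustration.

```shell
# Hypothetical helper: returns 0 once no PV is listed as Terminating,
# 1 if the timeout expires first, and 2 if the listing command itself fails.
# Arguments: timeout in seconds (default 300), poll interval in seconds (default 10).
# OC_CMD is an assumed override so that kubectl or a stub can replace oc.
wait_for_pv_cleanup() {
    timeout_s="${1:-300}"
    interval_s="${2:-10}"
    elapsed=0
    while :; do
        # --no-headers drops the column header; STATUS is the 5th field.
        out=$(${OC_CMD:-oc} get pv --no-headers) || return 2
        count=$(printf '%s\n' "$out" | awk '$5 == "Terminating" { n++ } END { print n + 0 }')
        [ "$count" -eq 0 ] && return 0
        [ "$elapsed" -ge "$timeout_s" ] && return 1
        sleep "$interval_s"
        elapsed=$((elapsed + interval_s))
    done
}

# Example (requires cluster access):
#   wait_for_pv_cleanup 600 15 || echo "PVs still Terminating"
```

Polling the whole PV list rather than individual names keeps the check independent of the randomly generated PV names.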

Additional info:

Link to the Jenkins job: https://ocs4-jenkins-csb-odf-qe.apps.ocp-c1.prod.psi.redhat.com/job/qe-odf-multicluster/2656/

Provider and client logs: 
http://magna002.ceph.redhat.com/ocsci-jenkins/openshift-clusters/hcp416-bm1-b/hcp416-bm1-b_20240917T060102/logs/testcases_1726591157/

Provider basic versions: 
OC version:
Client Version: 4.10.24
Server Version: 4.17.0-rc.2
Kubernetes Version: v1.30.4

OCS version:
ocs-operator.v4.17.0-102.stable              OpenShift Container Storage        4.17.0-102.stable                                             Succeeded

Cluster version:
NAME      VERSION       AVAILABLE   PROGRESSING   SINCE   STATUS
version   4.17.0-rc.2   True        False         41h     Cluster version is 4.17.0-rc.2

Rook version:
2024/09/18 10:00:59 maxprocs: Leaving GOMAXPROCS=32: CPU quota undefined
rook: v4.17.0-0.e6141b3a7feaa29a7c995c9cfaa13d9e4f6d358d
go: go1.22.5 (Red Hat 1.22.5-1.el9)

Ceph version:
ceph version 18.2.1-229.el9cp (ef652b206f2487adfc86613646a4cac946f6b4e0) reef (stable) 


Hosted cluster versions:

NAME      VERSION   AVAILABLE   PROGRESSING   SINCE   STATUS
version   4.16.11   True        False         5h43m   Cluster version is 4.16.11
openshift-console          console                   console-openshift-console.apps.hcp416-bm1-b.apps.ibm-baremetal1.qe.rh-ocs.com                                  console             https   reencrypt/Redirect   None
openshift-console          downloads                 downloads-openshift-console.apps.hcp416-bm1-b.apps.ibm-baremetal1.qe.rh-ocs.com                                downloads           http    edge/Redirect        None
========CSV ======
NAME                                         DISPLAY                            VERSION             REPLACES   PHASE
cephcsi-operator.v4.17.0-103.stable          CephCSI operator                   4.17.0-103.stable              Succeeded
ocs-client-operator.v4.17.0-103.stable       OpenShift Data Foundation Client   4.17.0-103.stable              Succeeded
odf-csi-addons-operator.v4.17.0-103.stable   CSI Addons                         4.17.0-103.stable              Succeeded
--------------
=======PODS ======
NAME                                                              READY   STATUS      RESTARTS   AGE     IP             NODE                          NOMINATED NODE   READINESS GATES
ceph-csi-controller-manager-9548d44f8-v9jt4                       2/2     Running     0          5h34m   10.133.0.63    hcp416-bm1-b-2624094b-pvd2m   <none>           <none>
csi-addons-controller-manager-7fd6479b54-rz8mq                    2/2     Running     0          5h34m   10.133.0.64    hcp416-bm1-b-2624094b-pvd2m   <none>           <none>
ocs-client-operator-console-59b954978b-jgqg2                      1/1     Running     0          5h34m   10.133.0.66    hcp416-bm1-b-2624094b-pvd2m   <none>           <none>
ocs-client-operator-controller-manager-7dccfcb7c8-wkf7d           2/2     Running     0          5h34m   10.133.0.65    hcp416-bm1-b-2624094b-pvd2m   <none>           <none>
openshift-storage.cephfs.csi.ceph.com-ctrlplugin-747cffd642xhjv   7/7     Running     0          5h32m   10.132.0.25    hcp416-bm1-b-2624094b-dn4gn   <none>           <none>
openshift-storage.cephfs.csi.ceph.com-ctrlplugin-747cffd647xgwp   7/7     Running     0          5h32m   10.133.0.68    hcp416-bm1-b-2624094b-pvd2m   <none>           <none>
openshift-storage.cephfs.csi.ceph.com-nodeplugin-7snq5            3/3     Running     0          5h32m   10.128.3.248   hcp416-bm1-b-2624094b-dn4gn   <none>           <none>
openshift-storage.cephfs.csi.ceph.com-nodeplugin-wp89q            3/3     Running     0          5h32m   10.131.1.157   hcp416-bm1-b-2624094b-pvd2m   <none>           <none>
openshift-storage.rbd.csi.ceph.com-ctrlplugin-6444559dc-746mh     7/7     Running     0          5h32m   10.133.0.67    hcp416-bm1-b-2624094b-pvd2m   <none>           <none>
openshift-storage.rbd.csi.ceph.com-ctrlplugin-6444559dc-jzc2j     7/7     Running     0          5h32m   10.132.0.24    hcp416-bm1-b-2624094b-dn4gn   <none>           <none>
openshift-storage.rbd.csi.ceph.com-nodeplugin-fxk52               4/4     Running     0          5h32m   10.131.1.157   hcp416-bm1-b-2624094b-pvd2m   <none>           <none>
openshift-storage.rbd.csi.ceph.com-nodeplugin-stlw7               4/4     Running     0          5h32m   10.128.3.248   hcp416-bm1-b-2624094b-dn4gn   <none>           <none>
storageclient-e5fb1f06bee2517f-status-reporter-28776436-wv2cb     0/1     Completed   0          56s     10.132.1.173   hcp416-bm1-b-2624094b-dn4gn   <none>           <none>
--------------
======= PVC ==========
--------------
======= storageclient ==========
NAME             PHASE       CONSUMER
storage-client   Connected   40cc7af0-3172-45bf-be13-9c7970cd4d75
======= Storageclasses ==========
NAME                                   PROVISIONER                             RECLAIMPOLICY   VOLUMEBINDINGMODE   ALLOWVOLUMEEXPANSION   AGE
kubevirt-csi-infra-default (default)   csi.kubevirt.io                         Delete          Immediate           false                  5h50m
storage-client-ceph-rbd                openshift-storage.rbd.csi.ceph.com      Delete          Immediate           true                   5h30m
storage-client-cephfs                  openshift-storage.cephfs.csi.ceph.com   Delete          Immediate           true                   5h30m
--------------
======= Storageclaims ==========
NAME                      STORAGETYPE   STORAGEPROFILE   STORAGECLIENTNAME   PHASE
storage-client-ceph-rbd   block                          storage-client      Ready
storage-client-cephfs     sharedfile                     storage-client      Ready

Comment 2 Sunil Kumar Acharya 2024-09-18 11:53:45 UTC
Moving the non-blocker BZ out of ODF-4.17.0 as part of development freeze.

Comment 10 Sunil Kumar Acharya 2024-09-27 06:46:45 UTC
Please update the RDT flag/text appropriately.

Comment 12 errata-xmlrpc 2024-10-30 14:35:36 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Important: Red Hat OpenShift Data Foundation 4.17.0 Security, Enhancement, & Bug Fix Update), and where to find the updated files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2024:8676