Bug 2069367

Summary: [ODF to ODF] Volume snapshot creation failed
Product: [Red Hat Storage] Red Hat OpenShift Data Foundation
Reporter: Jilju Joy <jijoy>
Component: documentation
Assignee: Anjana Suparna Sriram <asriram>
Status: CLOSED CURRENTRELEASE
QA Contact: Neha Berry <nberry>
Severity: high
Docs Contact:
Priority: high
Version: 4.10
CC: aeyal, etamir, mmuench, muagarwa, nberry, odf-bz-bot, owasserm, sostapov, srai
Target Milestone: ---
Keywords: Automation
Target Release: ---
Hardware: Unspecified
OS: Unspecified
Whiteboard:
Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of:
: 2100326 (view as bug list)
Environment:
Last Closed: 2024-07-18 05:11:09 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:
Bug Depends On:
Bug Blocks: 2100326

Description Jilju Joy 2022-03-28 19:21:15 UTC
Description of problem:
Volume snapshot creation fails on the consumer cluster in an ODF to ODF setup because the secret referenced in the VolumeSnapshotClass does not exist.

Failed testcases:
tests/manage/pv_services/pvc_snapshot/test_pvc_snapshot.py::TestPvcSnapshot::test_pvc_snapshot[CephBlockPool]
tests/manage/pv_services/pvc_snapshot/test_pvc_snapshot.py::TestPvcSnapshot::test_pvc_snapshot[CephFileSystem]

$ oc -n namespace-test-a2277b1d30a14b4bbafb102dd describe VolumeSnapshot snapshot-test-591b37c06338411688d6b248e4
Name:         snapshot-test-591b37c06338411688d6b248e4
Namespace:    namespace-test-a2277b1d30a14b4bbafb102dd
Labels:       <none>
Annotations:  <none>
API Version:  snapshot.storage.k8s.io/v1
Kind:         VolumeSnapshot
Metadata:
  Creation Timestamp:  2022-03-28T19:02:10Z
  Finalizers:
    snapshot.storage.kubernetes.io/volumesnapshot-as-source-protection
    snapshot.storage.kubernetes.io/volumesnapshot-bound-protection
  Generation:  1
  Managed Fields:
    API Version:  snapshot.storage.k8s.io/v1beta1
    Fields Type:  FieldsV1
    fieldsV1:
      f:spec:
        .:
        f:source:
          .:
          f:persistentVolumeClaimName:
        f:volumeSnapshotClassName:
    Manager:      kubectl-create
    Operation:    Update
    Time:         2022-03-28T19:02:10Z
    API Version:  snapshot.storage.k8s.io/v1
    Fields Type:  FieldsV1
    fieldsV1:
      f:metadata:
        f:finalizers:
          .:
          v:"snapshot.storage.kubernetes.io/volumesnapshot-as-source-protection":
          v:"snapshot.storage.kubernetes.io/volumesnapshot-bound-protection":
    Manager:      snapshot-controller
    Operation:    Update
    Time:         2022-03-28T19:02:10Z
    API Version:  snapshot.storage.k8s.io/v1
    Fields Type:  FieldsV1
    fieldsV1:
      f:status:
        .:
        f:boundVolumeSnapshotContentName:
        f:error:
          .:
          f:message:
          f:time:
        f:readyToUse:
    Manager:         snapshot-controller
    Operation:       Update
    Subresource:     status
    Time:            2022-03-28T19:02:10Z
  Resource Version:  377868
  UID:               cad7fb45-355d-4c4f-85aa-2a1611fe11b5
Spec:
  Source:
    Persistent Volume Claim Name:  pvc-test-767b8ae5a2af43db9dbd09c8a9f8da0
  Volume Snapshot Class Name:      ocs-storagecluster-cephfsplugin-snapclass
Status:
  Bound Volume Snapshot Content Name:  snapcontent-cad7fb45-355d-4c4f-85aa-2a1611fe11b5
  Error:
    Message:     Failed to check and update snapshot content: failed to get input parameters to create snapshot for content snapcontent-cad7fb45-355d-4c4f-85aa-2a1611fe11b5: "cannot get credentials for snapshot content \"snapcontent-cad7fb45-355d-4c4f-85aa-2a1611fe11b5\""
    Time:        2022-03-28T19:02:10Z
  Ready To Use:  false
Events:
  Type    Reason            Age    From                 Message
  ----    ------            ----   ----                 -------
  Normal  CreatingSnapshot  5m44s  snapshot-controller  Waiting for a snapshot namespace-test-a2277b1d30a14b4bbafb102dd/snapshot-test-591b37c06338411688d6b248e4 to be created by the CSI driver.
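
The failure above can also be detected programmatically from the object's status. The following is a minimal sketch, assuming the VolumeSnapshot has been fetched as JSON (for example with `oc get volumesnapshot <name> -o json`) and loaded into a dict. The helper name `snapshot_error` is hypothetical, and the error message in the example is abbreviated from the one shown above.

```python
from typing import Optional


def snapshot_error(volume_snapshot: dict) -> Optional[str]:
    """Return the status error message if the snapshot is not ready
    and reports an error; return None otherwise."""
    status = volume_snapshot.get("status") or {}
    if status.get("readyToUse"):
        return None
    error = status.get("error") or {}
    return error.get("message")


# Trimmed object using the status field names listed under Managed Fields above.
example = {
    "status": {
        "boundVolumeSnapshotContentName": "snapcontent-cad7fb45-355d-4c4f-85aa-2a1611fe11b5",
        "readyToUse": False,
        "error": {
            # Abbreviated form of the message in the describe output.
            "message": "cannot get credentials for snapshot content",
            "time": "2022-03-28T19:02:10Z",
        },
    }
}

print(snapshot_error(example))
```

A snapshot that is still pending without an error also yields None, so a caller would need to check `readyToUse` separately to tell "pending" from "ready".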


VolumeSnapshotClass for CephFS:

$ oc get volumesnapshotclass  ocs-storagecluster-cephfsplugin-snapclass -o yaml 
apiVersion: snapshot.storage.k8s.io/v1
deletionPolicy: Delete
driver: openshift-storage.cephfs.csi.ceph.com
kind: VolumeSnapshotClass
metadata:
  creationTimestamp: "2022-03-28T13:47:42Z"
  generation: 1
  name: ocs-storagecluster-cephfsplugin-snapclass
  resourceVersion: "126026"
  uid: c7384b03-e9a4-4f8f-86f9-eec1c12cac3e
parameters:
  clusterID: openshift-storage
  csi.storage.k8s.io/snapshotter-secret-name: rook-csi-cephfs-provisioner
  csi.storage.k8s.io/snapshotter-secret-namespace: openshift-storage


VolumeSnapshotClass for RBD:

$ oc get volumesnapshotclass  ocs-storagecluster-rbdplugin-snapclass -o yaml 
apiVersion: snapshot.storage.k8s.io/v1
deletionPolicy: Delete
driver: openshift-storage.rbd.csi.ceph.com
kind: VolumeSnapshotClass
metadata:
  creationTimestamp: "2022-03-28T13:47:42Z"
  generation: 1
  name: ocs-storagecluster-rbdplugin-snapclass
  resourceVersion: "126028"
  uid: dccddcda-3b94-4b97-98fa-fdcefcb94d20
parameters:
  clusterID: openshift-storage
  csi.storage.k8s.io/snapshotter-secret-name: rook-csi-rbd-provisioner
  csi.storage.k8s.io/snapshotter-secret-namespace: openshift-storage


logs: http://magna002.ceph.redhat.com/ocsci-jenkins/openshift-clusters/jijoyc1/jijoyc1_20220328T143052/logs/failed_testcase_ocs_logs_1648492745/
==================================================================
Version-Release number of selected component (if applicable):

ODF 4.10.0-206
ceph version 16.2.7-76.el8cp (f4d6ada772570ae8b05c62ad79e222fbd3f04188) pacific (stable)
OCP 4.9.24

How reproducible:
Always

Steps to Reproduce:
1. Create CephFS and RBD PVC
2. Create snapshot of the PVCs

OR 
Run the test cases:

tests/manage/pv_services/pvc_snapshot/test_pvc_snapshot.py::TestPvcSnapshot::test_pvc_snapshot[CephBlockPool]
tests/manage/pv_services/pvc_snapshot/test_pvc_snapshot.py::TestPvcSnapshot::test_pvc_snapshot[CephFileSystem]
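
For the manual path, step 2 can be sketched as a small manifest builder. This is only an illustration: the function name `volume_snapshot_manifest` is hypothetical, and the field names are taken from the VolumeSnapshot shown in the description (`spec.source.persistentVolumeClaimName`, `spec.volumeSnapshotClassName`).

```python
import json


def volume_snapshot_manifest(name: str, namespace: str,
                             pvc_name: str, snapshot_class: str) -> dict:
    """Build a VolumeSnapshot manifest that snapshots an existing PVC."""
    return {
        "apiVersion": "snapshot.storage.k8s.io/v1",
        "kind": "VolumeSnapshot",
        "metadata": {"name": name, "namespace": namespace},
        "spec": {
            "volumeSnapshotClassName": snapshot_class,
            "source": {"persistentVolumeClaimName": pvc_name},
        },
    }


# Values copied from the failing CephFS snapshot in the description.
manifest = volume_snapshot_manifest(
    name="snapshot-test-591b37c06338411688d6b248e4",
    namespace="namespace-test-a2277b1d30a14b4bbafb102dd",
    pvc_name="pvc-test-767b8ae5a2af43db9dbd09c8a9f8da0",
    snapshot_class="ocs-storagecluster-cephfsplugin-snapclass",
)
print(json.dumps(manifest, indent=2))
```

The printed JSON could be saved to a file and passed to `oc create -f`. On the affected setup the resulting snapshot is expected to stay at Ready To Use: false, as reported above.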

Actual results:
Snapshot creation failed

Expected results:
Snapshot creation should succeed 

Additional info:
Adding storageclass yaml output to check the provisioner-secret-name used.

$ oc get sc ocs-storagecluster-ceph-rbd -o yaml
allowVolumeExpansion: true
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  annotations:
    description: Provides RWO Filesystem volumes, and RWO and RWX Block volumes
  creationTimestamp: "2022-03-28T13:47:41Z"
  name: ocs-storagecluster-ceph-rbd
  resourceVersion: "126017"
  uid: 56315e7a-defa-4be6-b49d-844b123894e8
parameters:
  clusterID: openshift-storage
  csi.storage.k8s.io/controller-expand-secret-name: rook-ceph-client-5100077e60a918a622ffb836a583d116
  csi.storage.k8s.io/controller-expand-secret-namespace: openshift-storage
  csi.storage.k8s.io/fstype: ext4
  csi.storage.k8s.io/node-stage-secret-name: rook-ceph-client-85823cb0ce1c7827af1012f29b725f7c
  csi.storage.k8s.io/node-stage-secret-namespace: openshift-storage
  csi.storage.k8s.io/provisioner-secret-name: rook-ceph-client-5100077e60a918a622ffb836a583d116
  csi.storage.k8s.io/provisioner-secret-namespace: openshift-storage
  imageFeatures: layering
  imageFormat: "2"
  pool: cephblockpool-storageconsumer-223e3a6c-205d-40ce-861e-2021a7ac4b62
provisioner: openshift-storage.rbd.csi.ceph.com
reclaimPolicy: Delete
volumeBindingMode: Immediate


$ oc get sc ocs-storagecluster-cephfs -o yaml
allowVolumeExpansion: true
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  annotations:
    description: Provides RWO and RWX Filesystem volumes
  creationTimestamp: "2022-03-28T13:47:41Z"
  name: ocs-storagecluster-cephfs
  resourceVersion: "126018"
  uid: b004319a-587c-44e0-81a5-bd47da7871b8
parameters:
  clusterID: 2dfd45ae0156a580200415cc29d91016
  csi.storage.k8s.io/controller-expand-secret-name: rook-ceph-client-c4584ece0ba61e159af04c6f6fadeeb0
  csi.storage.k8s.io/controller-expand-secret-namespace: openshift-storage
  csi.storage.k8s.io/node-stage-secret-name: rook-ceph-client-50de094cb3b690e0b80632e4132f70c3
  csi.storage.k8s.io/node-stage-secret-namespace: openshift-storage
  csi.storage.k8s.io/provisioner-secret-name: rook-ceph-client-c4584ece0ba61e159af04c6f6fadeeb0
  csi.storage.k8s.io/provisioner-secret-namespace: openshift-storage
  fsName: ocs-storagecluster-cephfilesystem
provisioner: openshift-storage.cephfs.csi.ceph.com
reclaimPolicy: Delete
volumeBindingMode: Immediate
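
The comparison that the outputs above invite can be sketched as a small check. It assumes, as a heuristic only, that a snapshot class should use the same secret as the provisioner-secret-name of a storage class with the same driver. The function name `secret_mismatches` is hypothetical, and the input dicts are trimmed copies of the RBD objects shown above.

```python
SNAPSHOTTER_KEY = "csi.storage.k8s.io/snapshotter-secret-name"
PROVISIONER_KEY = "csi.storage.k8s.io/provisioner-secret-name"


def secret_mismatches(snapshot_classes: list, storage_classes: list) -> list:
    """Return (snapshot class name, snapshotter secret, provisioner secret)
    for each snapshot class whose snapshotter secret differs from the
    provisioner secret of a storage class using the same driver."""
    provisioner_secret_by_driver = {
        sc["provisioner"]: sc["parameters"][PROVISIONER_KEY]
        for sc in storage_classes
    }
    mismatches = []
    for vsc in snapshot_classes:
        expected = provisioner_secret_by_driver.get(vsc["driver"])
        actual = vsc["parameters"].get(SNAPSHOTTER_KEY)
        if expected is not None and actual != expected:
            mismatches.append((vsc["metadata"]["name"], actual, expected))
    return mismatches


# Trimmed copies of the RBD VolumeSnapshotClass and StorageClass above.
snapshot_classes = [{
    "metadata": {"name": "ocs-storagecluster-rbdplugin-snapclass"},
    "driver": "openshift-storage.rbd.csi.ceph.com",
    "parameters": {SNAPSHOTTER_KEY: "rook-csi-rbd-provisioner"},
}]
storage_classes = [{
    "metadata": {"name": "ocs-storagecluster-ceph-rbd"},
    "provisioner": "openshift-storage.rbd.csi.ceph.com",
    "parameters": {
        PROVISIONER_KEY: "rook-ceph-client-5100077e60a918a622ffb836a583d116",
    },
}]

for name, actual, expected in secret_mismatches(snapshot_classes, storage_classes):
    print(f"{name}: uses {actual}, storage class uses {expected}")
```

A mismatch alone does not prove the secret is missing; confirming that would still need a lookup of the secret in the openshift-storage namespace.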

Comment 1 Jilju Joy 2022-03-28 19:24:13 UTC
The add-on should create a custom VolumeSnapshotClass for both CephFS and RBD, as snapshots are supported in V2.

Comment 2 Subham Rai 2022-03-30 08:18:44 UTC
It looks like it is waiting for a different secret, which is never created.

For example, for RBD it is looking for csi.storage.k8s.io/snapshotter-secret-name: rook-csi-rbd-provisioner, but it should be looking for `csi.storage.k8s.io/controller-expand-secret-name=rook-ceph-client-5100077e60a918a622ffb836a583d116`.

We simply need to override the default snapshot classes with our configuration (the same way we did for storage classes).
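
A rough sketch of what such an override might produce, not the actual ocs-operator change: it builds a VolumeSnapshotClass whose snapshotter secret is the storage class's controller-expand-secret-name, which in the storage class output in the description has the same value as provisioner-secret-name. The function name `override_snapshot_class` and the class name `custom-rbd-snapclass` are hypothetical, and copying clusterID from the storage class is an assumption.

```python
import json

EXPAND_NAME = "csi.storage.k8s.io/controller-expand-secret-name"
EXPAND_NAMESPACE = "csi.storage.k8s.io/controller-expand-secret-namespace"


def override_snapshot_class(storage_class: dict, name: str,
                            deletion_policy: str = "Delete") -> dict:
    """Build a VolumeSnapshotClass that reuses the driver, clusterID
    (assumption) and controller-expand secret of the given StorageClass."""
    params = storage_class["parameters"]
    return {
        "apiVersion": "snapshot.storage.k8s.io/v1",
        "kind": "VolumeSnapshotClass",
        "metadata": {"name": name},
        "driver": storage_class["provisioner"],
        "deletionPolicy": deletion_policy,
        "parameters": {
            "clusterID": params["clusterID"],
            "csi.storage.k8s.io/snapshotter-secret-name": params[EXPAND_NAME],
            "csi.storage.k8s.io/snapshotter-secret-namespace": params[EXPAND_NAMESPACE],
        },
    }


# Trimmed copy of the ocs-storagecluster-ceph-rbd StorageClass from the description.
rbd_storage_class = {
    "provisioner": "openshift-storage.rbd.csi.ceph.com",
    "parameters": {
        "clusterID": "openshift-storage",
        EXPAND_NAME: "rook-ceph-client-5100077e60a918a622ffb836a583d116",
        EXPAND_NAMESPACE: "openshift-storage",
    },
}

# "custom-rbd-snapclass" is a hypothetical name used only for illustration.
custom = override_snapshot_class(rbd_storage_class, "custom-rbd-snapclass")
print(json.dumps(custom, indent=2))
```

The same function would apply to the CephFS storage class; there the copied clusterID would differ from the one in the default CephFS snapshot class, which is another reason to treat that field as an assumption.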

Comment 3 Sahina Bose 2022-03-30 11:31:20 UTC
Subham, is this a change required in ocs-operator? Is anyone working on this?

Comment 5 Subham Rai 2022-03-30 15:09:26 UTC
(In reply to Sahina Bose from comment #3)
> Subham, is this a change required in ocs-operator? Is anyone working on this?

Yes, a change is required in ocs-operator. I'm working on it.

Comment 7 Mudit Agarwal 2022-04-04 05:32:55 UTC
This couldn't be fixed before the RC build of 4.10. As discussed in MS PGM, moving it out of 4.10.

Comment 8 Jilju Joy 2022-05-26 06:25:06 UTC
QE note:
This is the tracker issue to include the skipped snapshot test cases after the bug fix.
https://github.com/red-hat-storage/ocs-ci/issues/5948

Comment 9 Mudit Agarwal 2022-05-30 15:35:19 UTC
Not a blocker for core product.

Comment 13 Mudit Agarwal 2022-06-23 07:48:44 UTC
Madhu, please help the doc team with the details.
Marking it as a blocker so that we don't miss it; this is required only in 4.10.

Comment 16 Jilju Joy 2022-06-29 12:46:23 UTC
VolumeSnapshotClass creation was tested (comment #15) in these versions:

ODF 4.10.2-3
OCP 4.10.18

$ oc get csv
NAME                                      DISPLAY                       VERSION           REPLACES                                  PHASE
mcg-operator.v4.10.4                      NooBaa Operator               4.10.4            mcg-operator.v4.10.3                      Succeeded
ocs-operator.v4.10.2                      OpenShift Container Storage   4.10.2            ocs-operator.v4.10.1                      Succeeded
ocs-osd-deployer.v2.0.2                   OCS OSD Deployer              2.0.2             ocs-osd-deployer.v2.0.1                   Succeeded
odf-csi-addons-operator.v4.10.4           CSI Addons                    4.10.4            odf-csi-addons-operator.v4.10.2           Succeeded
odf-operator.v4.10.2                      OpenShift Data Foundation     4.10.2            odf-operator.v4.10.1                      Succeeded
ose-prometheus-operator.4.10.0            Prometheus Operator           4.10.0            ose-prometheus-operator.4.8.0             Succeeded
route-monitor-operator.v0.1.422-151be96   Route Monitor Operator        0.1.422-151be96   route-monitor-operator.v0.1.420-b65f47e   Succeeded

Comment 36 Red Hat Bugzilla 2024-11-16 04:25:06 UTC
The needinfo request[s] on this closed bug have been removed as they have been unresolved for 120 days