Bug 2134040 - [RDR][CEPHFS] volsync-dd-io-pvc pvc's are taking a lot of time to come in Bound state
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat OpenShift Data Foundation
Classification: Red Hat Storage
Component: odf-dr
Version: 4.12
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: high
Target Milestone: ---
Target Release: ODF 4.14.0
Assignee: Benamar Mekhissi
QA Contact: Pratik Surve
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2022-10-12 08:10 UTC by Pratik Surve
Modified: 2023-11-08 18:51 UTC (History)

Fixed In Version:
Doc Type: No Doc Update
Doc Text:
Clone Of:
Environment:
Last Closed: 2023-11-08 18:49:51 UTC
Embargoed:




Links
System | ID | Private | Priority | Status | Summary | Last Updated
Github | RamenDR ramen issues 831 | 0 | None | open | VolSync: Enhancing Snapshot Sync Process for Improved Data Consistency and Reliability | 2023-07-03 12:30:05 UTC
Red Hat Product Errata | RHSA-2023:6832 | 0 | None | None | None | 2023-11-08 18:51:03 UTC

Description Pratik Surve 2022-10-12 08:10:43 UTC
Description of problem (please be as detailed as possible and provide log
snippets):
[RDR][CEPHFS] volsync-dd-io-pvc pvc's are taking a lot of time to come in Bound state


Version of all relevant components (if applicable):

OCP version:- 4.12.0-0.nightly-2022-10-05-053337
ODF version:- 4.12.0-70
CEPH version:- ceph version 16.2.10-41.el8cp (26bc3d938546adfb098168b7b565d4f9fa377775) pacific (stable)
ACM version:- 2.6.1
SUBMARINER version:- v0.13.0
VOLSYNC version:- volsync-product.v0.5.0


Does this issue impact your ability to continue to work with the product
(please explain in detail what is the user impact)?
Yes

Is there any workaround available to the best of your knowledge?


Rate from 1 - 5 the complexity of the scenario you performed that caused this
bug (1 - very simple, 5 - very complex)?
2

Is this issue reproducible?
Yes

Can this issue reproduce from the UI?


If this is a regression, please provide more details to justify this:


Steps to Reproduce:
1. Deploy RDR cluster
2. Deploy CEPHFS DR Application
3. Observe volsyncPod and volsyncPVC 
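
Step 3 can be checked programmatically rather than by eye. The following is a minimal sketch, assuming the PVC list is obtained as JSON (for example from `oc get pvc -n busybox-workloads-8 -o json`); the function name `pending_pvcs` is hypothetical and not part of any tool mentioned in this report.

```python
import json
from typing import List


def pending_pvcs(pvc_list_json: str) -> List[str]:
    """Return the names of PVCs whose status.phase is "Pending".

    Expects the JSON document printed by `oc get pvc -o json`,
    i.e. a List object with an "items" array of PVC objects.
    """
    doc = json.loads(pvc_list_json)
    return [
        item["metadata"]["name"]
        for item in doc.get("items", [])
        if item.get("status", {}).get("phase") == "Pending"
    ]
```

Running this in a loop while the VolSync source PVCs are being created would show which claims stay in Pending between polls.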


Actual results:
volsync-dd-io-pvc-1-src   Pending                                                                        ocs-storagecluster-cephfs   27m
volsync-dd-io-pvc-2-src   Pending                                                                        ocs-storagecluster-cephfs   6m59s
volsync-dd-io-pvc-3-src   Pending                                                                        ocs-storagecluster-cephfs   37m
volsync-dd-io-pvc-4-src   Pending                                                                        ocs-storagecluster-cephfs   6m59s
volsync-dd-io-pvc-7-src   Pending                                                                        ocs-storagecluster-cephfs   37m
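
A listing like the one above can be triaged with a short script. This is a sketch under stated assumptions: each row is whitespace-separated with the PVC name first, the status second, and the age last (as in the rows above, where the volume, capacity, and access-mode columns are empty). The helper names and the 10-minute threshold are hypothetical.

```python
import re
from typing import List

# Seconds per unit for ages such as "27m" or "6m59s".
_UNITS = {"d": 86400, "h": 3600, "m": 60, "s": 1}


def age_to_seconds(age: str) -> int:
    """Convert an age string such as '6m59s' to a number of seconds."""
    return sum(int(n) * _UNITS[u] for n, u in re.findall(r"(\d+)([dhms])", age))


def slow_pending(listing: str, threshold_s: int = 600) -> List[str]:
    """Return names of PVCs that have been Pending longer than threshold_s."""
    slow = []
    for line in listing.splitlines():
        fields = line.split()
        if len(fields) < 3:
            continue  # skip blank or malformed rows
        name, status, age = fields[0], fields[1], fields[-1]
        if status == "Pending" and age_to_seconds(age) > threshold_s:
            slow.append(name)
    return slow
```

With the five rows above and the default threshold, the claims aged 27m and 37m would be reported, while the two aged 6m59s would not.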


oc describe pvc volsync-dd-io-pvc-1-src                                     
Name:          volsync-dd-io-pvc-1-src
Namespace:     busybox-workloads-8
StorageClass:  ocs-storagecluster-cephfs
Status:        Pending
Volume:        
Labels:        app.kubernetes.io/created-by=volsync
               volsync.backube/cleanup=9e226d9a-226e-434e-8cfd-f8ae0d8cdb86
Annotations:   volume.beta.kubernetes.io/storage-provisioner: openshift-storage.cephfs.csi.ceph.com
               volume.kubernetes.io/storage-provisioner: openshift-storage.cephfs.csi.ceph.com
Finalizers:    [kubernetes.io/pvc-protection]
Capacity:      
Access Modes:  
VolumeMode:    Filesystem
DataSource:
  APIGroup:  snapshot.storage.k8s.io
  Kind:      VolumeSnapshot
  Name:      volsync-dd-io-pvc-1-src
Used By:     volsync-rsync-src-dd-io-pvc-1-qnhql
Events:
  Type     Reason                Age                    From                                                                                                                      Message
  ----     ------                ----                   ----                                                                                                                      -------
  Warning  ProvisioningFailed    18m (x12 over 27m)     openshift-storage.cephfs.csi.ceph.com_csi-cephfsplugin-provisioner-6c56f845fc-qg4xx_703843f9-ea2a-49cb-85f1-317d06dcfdc7  failed to provision volume with StorageClass "ocs-storagecluster-cephfs": rpc error: code = Aborted desc = clone from snapshot is pending
  Warning  ProvisioningFailed    8m23s (x2 over 13m)    openshift-storage.cephfs.csi.ceph.com_csi-cephfsplugin-provisioner-6c56f845fc-qg4xx_703843f9-ea2a-49cb-85f1-317d06dcfdc7  failed to provision volume with StorageClass "ocs-storagecluster-cephfs": rpc error: code = Aborted desc = clone from snapshot is already in progress
  Normal   Provisioning          6m54s (x15 over 27m)   openshift-storage.cephfs.csi.ceph.com_csi-cephfsplugin-provisioner-6c56f845fc-qg4xx_703843f9-ea2a-49cb-85f1-317d06dcfdc7  External provisioner is provisioning volume for claim "busybox-workloads-8/volsync-dd-io-pvc-1-src"
  Normal   ExternalProvisioning  2m30s (x104 over 27m)  persistentvolume-controller                                                                                               waiting for a volume to be created, either by external provisioner "openshift-storage.cephfs.csi.ceph.com" or manually created by system administrator
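
The two ProvisioningFailed messages above share one shape: a gRPC status code followed by a description. A minimal sketch for pulling those two parts out of an event message follows; it assumes only the `rpc error: code = ... desc = ...` layout seen in the events above, and the names `parse_rpc_error` and `is_clone_wait` are hypothetical.

```python
import re
from typing import Optional, Tuple

# Matches the tail of messages such as
# '...: rpc error: code = Aborted desc = clone from snapshot is pending'
_RPC_ERROR = re.compile(r"rpc error: code = (\w+) desc = (.+)$")


def parse_rpc_error(message: str) -> Optional[Tuple[str, str]]:
    """Return (code, description) from an event message, or None if absent."""
    match = _RPC_ERROR.search(message)
    if match is None:
        return None
    return match.group(1), match.group(2).strip()


def is_clone_wait(message: str) -> bool:
    """True for the two 'clone from snapshot is ...' Aborted messages in this report."""
    parsed = parse_rpc_error(message)
    return (
        parsed is not None
        and parsed[0] == "Aborted"
        and parsed[1].startswith("clone from snapshot is")
    )
```

Both warnings in the describe output would be classified as clone waits by this check, which matches the reading that the provisioner keeps retrying while the snapshot clone has not finished.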


oc get volumesnapshots.snapshot.storage.k8s.io                         
NAME                      READYTOUSE   SOURCEPVC     SOURCESNAPSHOTCONTENT   RESTORESIZE   SNAPSHOTCLASS                               SNAPSHOTCONTENT                                    CREATIONTIME   AGE
volsync-dd-io-pvc-1-src   true         dd-io-pvc-1                           117Gi         ocs-storagecluster-cephfsplugin-snapclass   snapcontent-048024ce-42aa-43c1-81b2-e321702ac071   28m            28m
volsync-dd-io-pvc-2-src   true         dd-io-pvc-2                           143Gi         ocs-storagecluster-cephfsplugin-snapclass   snapcontent-126fe6bb-08a5-44c8-a1ee-0d61071d61f5   7m54s          8m2s
volsync-dd-io-pvc-3-src   true         dd-io-pvc-3                           134Gi         ocs-storagecluster-cephfsplugin-snapclass   snapcontent-1f540195-d5eb-4872-9154-c6d21ab88077   37m            38m
volsync-dd-io-pvc-4-src   true         dd-io-pvc-4                           106Gi         ocs-storagecluster-cephfsplugin-snapclass   snapcontent-712066ae-0fcc-4d50-ba14-953ed358cbed   7m59s          8m2s
volsync-dd-io-pvc-7-src   true         dd-io-pvc-7                           149Gi         ocs-storagecluster-cephfsplugin-snapclass   snapcontent-a4318685-6b6c-410d-95ed-70c65ac8a06d   37m            38m

oc describe volumesnapshots.snapshot.storage.k8s.io volsync-dd-io-pvc-1-src 
Name:         volsync-dd-io-pvc-1-src
Namespace:    busybox-workloads-8
Labels:       app.kubernetes.io/created-by=volsync
              volsync.backube/cleanup=9e226d9a-226e-434e-8cfd-f8ae0d8cdb86
Annotations:  <none>
API Version:  snapshot.storage.k8s.io/v1
Kind:         VolumeSnapshot
Metadata:
  Creation Timestamp:  2022-10-12T07:40:00Z
  Finalizers:
    snapshot.storage.kubernetes.io/volumesnapshot-as-source-protection
    snapshot.storage.kubernetes.io/volumesnapshot-bound-protection
  Generation:  1
  Managed Fields:
    API Version:  snapshot.storage.k8s.io/v1
    Fields Type:  FieldsV1
    fieldsV1:
      f:metadata:
        f:labels:
          .:
          f:app.kubernetes.io/created-by:
          f:volsync.backube/cleanup:
        f:ownerReferences:
          .:
          k:{"uid":"9e226d9a-226e-434e-8cfd-f8ae0d8cdb86"}:
      f:spec:
        .:
        f:source:
          .:
          f:persistentVolumeClaimName:
        f:volumeSnapshotClassName:
    Manager:      manager
    Operation:    Update
    Time:         2022-10-12T07:40:00Z
    API Version:  snapshot.storage.k8s.io/v1
    Fields Type:  FieldsV1
    fieldsV1:
      f:metadata:
        f:finalizers:
          .:
          v:"snapshot.storage.kubernetes.io/volumesnapshot-as-source-protection":
          v:"snapshot.storage.kubernetes.io/volumesnapshot-bound-protection":
    Manager:      snapshot-controller
    Operation:    Update
    Time:         2022-10-12T07:40:01Z
    API Version:  snapshot.storage.k8s.io/v1
    Fields Type:  FieldsV1
    fieldsV1:
      f:status:
        .:
        f:boundVolumeSnapshotContentName:
        f:creationTime:
        f:readyToUse:
        f:restoreSize:
    Manager:      snapshot-controller
    Operation:    Update
    Subresource:  status
    Time:         2022-10-12T07:40:06Z
  Owner References:
    API Version:           volsync.backube/v1alpha1
    Block Owner Deletion:  true
    Controller:            true
    Kind:                  ReplicationSource
    Name:                  dd-io-pvc-1
    UID:                   9e226d9a-226e-434e-8cfd-f8ae0d8cdb86
  Resource Version:        6485976
  UID:                     048024ce-42aa-43c1-81b2-e321702ac071
Spec:
  Source:
    Persistent Volume Claim Name:  dd-io-pvc-1
  Volume Snapshot Class Name:      ocs-storagecluster-cephfsplugin-snapclass
Status:
  Bound Volume Snapshot Content Name:  snapcontent-048024ce-42aa-43c1-81b2-e321702ac071
  Creation Time:                       2022-10-12T07:40:01Z
  Ready To Use:                        true
  Restore Size:                        117Gi
Events:
  Type    Reason            Age   From                 Message
  ----    ------            ----  ----                 -------
  Normal  CreatingSnapshot  28m   snapshot-controller  Waiting for a snapshot busybox-workloads-8/volsync-dd-io-pvc-1-src to be created by the CSI driver.
  Normal  SnapshotCreated   28m   snapshot-controller  Snapshot busybox-workloads-8/volsync-dd-io-pvc-1-src was successfully created by the CSI driver.
  Normal  SnapshotReady     28m   snapshot-controller  Snapshot busybox-workloads-8/volsync-dd-io-pvc-1-src is ready to use.

oc get volumesnapshotcontents.snapshot.storage.k8s.io
NAME                                               READYTOUSE   RESTORESIZE    DELETIONPOLICY   DRIVER                                  VOLUMESNAPSHOTCLASS                         VOLUMESNAPSHOT            VOLUMESNAPSHOTNAMESPACE   AGE
snapcontent-048024ce-42aa-43c1-81b2-e321702ac071   true         125627793408   Delete           openshift-storage.cephfs.csi.ceph.com   ocs-storagecluster-cephfsplugin-snapclass   volsync-dd-io-pvc-1-src   busybox-workloads-8       28m
snapcontent-126fe6bb-08a5-44c8-a1ee-0d61071d61f5   true         153545080832   Delete           openshift-storage.cephfs.csi.ceph.com   ocs-storagecluster-cephfsplugin-snapclass   volsync-dd-io-pvc-2-src   busybox-workloads-8       8m39s
snapcontent-1f540195-d5eb-4872-9154-c6d21ab88077   true         143881404416   Delete           openshift-storage.cephfs.csi.ceph.com   ocs-storagecluster-cephfsplugin-snapclass   volsync-dd-io-pvc-3-src   busybox-workloads-8       38m
snapcontent-712066ae-0fcc-4d50-ba14-953ed358cbed   true         113816633344   Delete           openshift-storage.cephfs.csi.ceph.com   ocs-storagecluster-cephfsplugin-snapclass   volsync-dd-io-pvc-4-src   busybox-workloads-8       8m39s
snapcontent-a4318685-6b6c-410d-95ed-70c65ac8a06d   true         159987531776   Delete           openshift-storage.cephfs.csi.ceph.com   ocs-storagecluster-cephfsplugin-snapclass   volsync-dd-io-pvc-7-src   busybox-workloads-8       38m
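
The RESTORESIZE values in the volumesnapshotcontents listing are in bytes, while the volumesnapshots listing shows the same sizes in Gi. A small sketch for cross-checking the two, assuming Gi means 1024^3 bytes; the helper name `bytes_to_gi` is hypothetical.

```python
GIB = 1024 ** 3  # bytes per Gi


def bytes_to_gi(size_bytes: int) -> str:
    """Format a byte count as Gi, exact when it divides evenly."""
    whole, remainder = divmod(size_bytes, GIB)
    if remainder == 0:
        return f"{whole}Gi"
    return f"{size_bytes / GIB:.2f}Gi"
```

For example, `bytes_to_gi(125627793408)` gives `117Gi`, which agrees with the restore size reported for volsync-dd-io-pvc-1-src.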




Expected results:
PVC should not take this long to come in Bound state

Additional info:

Comment 9 Subham Rai 2022-10-12 09:21:33 UTC
FYI,

We also have a similar BZ, https://bugzilla.redhat.com/show_bug.cgi?id=2115558 (tracked upstream as https://github.com/rook/rook/issues/10619), that covers a similar situation.

Comment 14 Madhu Rajanna 2022-10-19 07:30:24 UTC
Closing this one as NOT A BUG because this is the expected result. Please reopen if you disagree.

Comment 29 Karolin Seeger 2023-03-06 14:02:49 UTC
This is a known issue that requires a large effort in CephFS; lowering severity.

Comment 32 krishnaram Karthick 2023-03-30 12:47:14 UTC
Moving the bug to NEW as there is no fix, and setting needinfo for Pratik to check this out.

Comment 34 Karolin Seeger 2023-04-16 12:53:00 UTC
Based on comment #33 moving out to 4.14.

Comment 43 errata-xmlrpc 2023-11-08 18:49:51 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Important: Red Hat OpenShift Data Foundation 4.14.0 security, enhancement & bug fix update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2023:6832

