Bug 2239776
| Summary: | [Tracker ACM-7600][RDR] Source pods remain stuck on the primary cluster and sync stops for cephfs workloads | | |
|---|---|---|---|
| Product: | [Red Hat Storage] Red Hat OpenShift Data Foundation | Reporter: | Aman Agrawal <amagrawa> |
| Component: | odf-dr | Assignee: | Benamar Mekhissi <bmekhiss> |
| odf-dr sub component: | ramen | QA Contact: | kmanohar |
| Status: | CLOSED ERRATA | Docs Contact: | |
| Severity: | urgent | | |
| Priority: | unspecified | CC: | bmekhiss, kseeger, muagarwa, srangana |
| Version: | 4.14 | | |
| Target Milestone: | --- | | |
| Target Release: | ODF 4.14.0 | | |
| Hardware: | Unspecified | | |
| OS: | Unspecified | | |
| Whiteboard: | | | |
| Fixed In Version: | | Doc Type: | No Doc Update |
| Doc Text: | | Story Points: | --- |
| Clone Of: | | Environment: | |
| Last Closed: | 2023-11-08 18:54:58 UTC | Type: | Bug |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | | Category: | --- |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | | | |
Description
Aman Agrawal
2023-09-20 06:46:21 UTC
ACM team addressed this issue; the fixes are included in submariner-operator-bundle-container-v0.16.0-23 (and later). -> ON_QA

Newer version available, please use submariner-operator-bundle-container-v0.16.0-25 (and later).

VERIFICATION COMMENTS
=====================

Steps to Reproduce:
1. On an RDR setup, deploy cephfs-based DR-protected workloads on both the primary and secondary clusters. Do **not** perform any failover/relocate operations on the workloads.
2. Run I/Os for a week or so and keep monitoring the pod/PVC status on the primary and secondary managed clusters, the lastGroupSyncTime on the hub, etc. (a monitoring sketch follows the C2 output below).

Actual issue: source pods remain stuck on the primary cluster and sync stops for cephfs workloads.

With the fix, the above behavior was not observed.

Output on C1
------------

```
$ oc get replicationsources.volsync.backube
NAME          SOURCE        LAST SYNC              DURATION          NEXT SYNC
dd-io-pvc-1   dd-io-pvc-1   2023-10-30T04:41:28Z   1m28.064760303s   2023-10-30T04:50:00Z
dd-io-pvc-2   dd-io-pvc-2   2023-10-30T04:41:27Z   1m27.962788093s   2023-10-30T04:50:00Z
dd-io-pvc-3   dd-io-pvc-3   2023-10-30T04:41:21Z   1m21.964984125s   2023-10-30T04:50:00Z
dd-io-pvc-4   dd-io-pvc-4   2023-10-30T04:41:24Z   1m24.495460567s   2023-10-30T04:50:00Z
dd-io-pvc-5   dd-io-pvc-5   2023-10-30T04:41:22Z   1m22.898981791s   2023-10-30T04:50:00Z
dd-io-pvc-6   dd-io-pvc-6   2023-10-30T04:41:29Z   1m29.050621317s   2023-10-30T04:50:00Z
dd-io-pvc-7   dd-io-pvc-7   2023-10-30T04:41:30Z   1m30.858916464s   2023-10-30T04:50:00Z

$ pods
NAME                       READY   STATUS    RESTARTS   AGE
dd-io-1-5dbcfccf76-q4twv   1/1     Running   3          4d20h
dd-io-2-684fc84b64-f4ztj   1/1     Running   2          2d17h
dd-io-3-68bf99586d-7czc4   1/1     Running   3          4d20h
dd-io-4-757c8d8b7b-2xgm2   1/1     Running   3          4d20h
dd-io-5-74768ccf84-s9gqr   1/1     Running   3          4d20h
dd-io-6-68d5769c76-qkfvm   1/1     Running   3          4d20h
dd-io-7-67d87688b4-kpnnm   1/1     Running   2          2d17h

$ pvc
NAME          STATUS   VOLUME                                     CAPACITY   ACCESS MODES   STORAGECLASS                AGE
dd-io-pvc-1   Bound    pvc-e479369e-8ea3-416c-8f51-1bee7c26b471   117Gi      RWO            ocs-storagecluster-cephfs   4d20h
dd-io-pvc-2   Bound    pvc-e17ca78e-e2bf-466e-be8e-d44ba14cc14d   143Gi      RWO            ocs-storagecluster-cephfs   4d20h
dd-io-pvc-3   Bound    pvc-f00f84ed-c662-4b91-a781-8cf27912e54f   134Gi      RWO            ocs-storagecluster-cephfs   4d20h
dd-io-pvc-4   Bound    pvc-9fc0cce1-18c7-4612-84c4-8cf4b839cc49   106Gi      RWO            ocs-storagecluster-cephfs   4d20h
dd-io-pvc-5   Bound    pvc-e7bf1a2d-3d77-4b76-92c5-1bce1854072e   115Gi      RWO            ocs-storagecluster-cephfs   4d20h
dd-io-pvc-6   Bound    pvc-9f1c932d-69d0-4f89-a938-717f7beaf516   129Gi      RWO            ocs-storagecluster-cephfs   4d20h
dd-io-pvc-7   Bound    pvc-b07966c2-83bf-489d-9c32-0656f3ea8622   149Gi      RWO            ocs-storagecluster-cephfs   4d20h

$ oc get vrg
NAME                                 DESIREDSTATE   CURRENTSTATE
busybox-1-cephfs-c1-placement-drpc   primary        Primary
```

On C2
-----

```
$ oc get replicationdestinations.volsync.backube
NAME          LAST SYNC              DURATION           NEXT SYNC
dd-io-pvc-1   2023-10-30T04:41:32Z   9m43.583270288s
dd-io-pvc-2   2023-10-30T04:41:36Z   9m53.64786527s
dd-io-pvc-3   2023-10-30T04:41:23Z   9m38.1503665s
dd-io-pvc-4   2023-10-30T04:41:29Z   10m14.653941103s
dd-io-pvc-5   2023-10-30T04:41:23Z   9m40.087622682s
dd-io-pvc-6   2023-10-30T04:41:35Z   10m3.789656323s
dd-io-pvc-7   2023-10-30T04:41:39Z   10m29.950678547s

$ pods
NAME                                      READY   STATUS    RESTARTS   AGE
volsync-rsync-tls-dst-dd-io-pvc-1-q6nhw   1/1     Running   0          8m3s
volsync-rsync-tls-dst-dd-io-pvc-2-z8vk2   1/1     Running   0          7m58s
volsync-rsync-tls-dst-dd-io-pvc-3-ssmn6   1/1     Running   0          8m12s
volsync-rsync-tls-dst-dd-io-pvc-4-m5qnn   1/1     Running   0          8m6s
volsync-rsync-tls-dst-dd-io-pvc-5-mk64p   1/1     Running   0          8m12s
volsync-rsync-tls-dst-dd-io-pvc-6-7n4jh   1/1     Running   0          7m59s
volsync-rsync-tls-dst-dd-io-pvc-7-798gl   1/1     Running   0          7m56s
```
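For step 2 above, a minimal sketch of how the hub-side check can be scripted, assuming the DRPC status exposes a `.status.lastGroupSyncTime` field (the field name comes from the monitoring step; the command itself uses only standard `oc get` options):

```
# Minimal sketch, run on the hub: list every DRPC with its last group sync
# time. Assumption: DRPC status exposes .status.lastGroupSyncTime. A
# timestamp that stops advancing indicates the sync stall this bug describes.
oc get drpc --all-namespaces \
  -o custom-columns='NAMESPACE:.metadata.namespace,NAME:.metadata.name,LASTGROUPSYNCTIME:.status.lastGroupSyncTime'
```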
Verified on
-----------
ODF - 4.14.0-150
OCP - 4.14.0-0.nightly-2023-10-17-113123
MCO - 4.14.0-150
Submariner - 0.16.0 (brew.registry.redhat.io/rh-osbs/iib:599799)
ACM - 2.9.0 (2.9.0-DOWNSTREAM-2023-10-03-20-08-35)

Must gather
-----------
C1 - http://rhsqe-repo.lab.eng.blr.redhat.com/OCS/ocs-qe-bugs/keerthana/bz-v/bz-2239776/c1/
C2 - http://rhsqe-repo.lab.eng.blr.redhat.com/OCS/ocs-qe-bugs/keerthana/bz-v/bz-2239776/c2/
HUB - http://rhsqe-repo.lab.eng.blr.redhat.com/OCS/ocs-qe-bugs/keerthana/bz-v/bz-2239776/hub/

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA.

For information on the advisory (Important: Red Hat OpenShift Data Foundation 4.14.0 security, enhancement & bug fix update), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2023:6832
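For anyone re-running this verification, a minimal sketch of the primary-cluster check, assuming a placeholder workload namespace (healthy behavior is LAST SYNC advancing at every scheduled interval, as in the C1 output above):

```
# Watch ReplicationSources on the primary managed cluster.
# <workload-namespace> is a placeholder. In a healthy setup LAST SYNC keeps
# advancing; a LAST SYNC frozen past NEXT SYNC reproduces the reported stall.
oc get replicationsources.volsync.backube -n <workload-namespace> -w
```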