Bug 2134115

Summary: [RDR][CEPHFS] After performing Failover volsync-rsync-src are still running on Secondary cluster
Product: [Red Hat Storage] Red Hat OpenShift Data Foundation
Component: odf-dr
Sub component: ramen
Reporter: Pratik Surve <prsurve>
Assignee: Benamar Mekhissi <bmekhiss>
QA Contact: Pratik Surve <prsurve>
Status: CLOSED ERRATA
Severity: high
Priority: unspecified
CC: bmekhiss, kseeger, muagarwa, odf-bz-bot, rtalur, srangana
Version: 4.12
Target Release: ODF 4.14.0
Hardware: Unspecified
OS: Unspecified
Doc Type: No Doc Update
Type: Bug
Last Closed: 2023-11-08 18:49:51 UTC

Description Pratik Surve 2022-10-12 14:10:47 UTC
Description of problem (please be as detailed as possible and provide log
snippets):
[RDR][CEPHFS] After performing failover, volsync-rsync-src pods are still running on the Secondary cluster.

Version of all relevant components (if applicable):
OCP version:- 4.12.0-0.nightly-2022-10-05-053337
ODF version:- 4.12.0-70
CEPH version:- ceph version 16.2.10-41.el8cp
(26bc3d938546adfb098168b7b565d4f9fa377775) pacific (stable)
ACM version:- 2.6.1
SUBMARINER version:- v0.13.0
VOLSYNC version:- volsync-product.v0.5.0

Steps to Reproduce:
1. Deploy an RDR cluster
2. Deploy a CephFS DR workload
3. After some time, perform failover
4. Check pod status on both clusters
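The check in step 4 can be scripted. A minimal sketch — the `oc` context and namespace in the comment are placeholders, and the pod listing is simulated from the output below so the snippet runs standalone:

```shell
#!/bin/sh
# On a live cluster the check would be something like (context and
# namespace are assumptions, substitute your own):
#   oc --context <secondary-context> -n <workload-namespace> get pods \
#     --no-headers | grep -c '^volsync-rsync-src'
# After a completed failover this count should be 0 on the Secondary
# cluster, since replication sources should now run on the new Primary.
# Simulated here with two pods taken from this report:
pods='volsync-rsync-src-dd-io-pvc-1-7tcp4   0/1     Pending   0          40m
volsync-rsync-src-dd-io-pvc-5-dbkxm   0/1     Error     0          79m'
count=$(printf '%s\n' "$pods" | grep -c '^volsync-rsync-src')
echo "$count"
```

A non-zero count on the Secondary cluster after failover reproduces the bug.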


Actual results:

### Output from Secondary cluster after failover
oc get pods
NAME                                  READY   STATUS    RESTARTS   AGE
volsync-rsync-src-dd-io-pvc-1-7tcp4   0/1     Pending   0          40m
volsync-rsync-src-dd-io-pvc-2-7ckzj   0/1     Pending   0          29m
volsync-rsync-src-dd-io-pvc-3-lsdr8   1/1     Running   0          59m
volsync-rsync-src-dd-io-pvc-4-g9ldq   0/1     Pending   0          40m
volsync-rsync-src-dd-io-pvc-5-dbkxm   0/1     Error     0          79m
volsync-rsync-src-dd-io-pvc-5-jpt7g   0/1     Error     0          7m5s
volsync-rsync-src-dd-io-pvc-5-zqqvr   0/1     Error     0          8m18s
volsync-rsync-src-dd-io-pvc-6-56qmw   0/1     Error     0          15m
volsync-rsync-src-dd-io-pvc-6-g2bzq   0/1     Error     0          16m
volsync-rsync-src-dd-io-pvc-6-wqbl7   0/1     Error     0          90m
volsync-rsync-src-dd-io-pvc-7-g9hxf   0/1     Error     0          90m
volsync-rsync-src-dd-io-pvc-7-rkn9m   0/1     Error     0          5m32s
volsync-rsync-src-dd-io-pvc-7-t44rc   0/1     Error     0          6m46s
 

### Output from Primary cluster after failover
oc get pods                       
NAME                                  READY   STATUS    RESTARTS   AGE
dd-io-1-5857bfdcd9-wbx8r              1/1     Running   0          36m
dd-io-2-bcd6d9f65-9tzml               1/1     Running   0          36m
dd-io-3-5d6b4b84df-v2855              1/1     Running   0          36m
dd-io-4-6f6db89fbf-hgdq7              1/1     Running   0          36m
dd-io-5-7868bc6b5c-wbx2p              1/1     Running   0          36m
dd-io-6-58c98598d5-88fhz              1/1     Running   0          36m
dd-io-7-694958ff97-5cmql              1/1     Running   0          36m
volsync-rsync-src-dd-io-pvc-1-j76zr   1/1     Running   0          60s
volsync-rsync-src-dd-io-pvc-1-m654v   0/1     Error     0          2m12s
volsync-rsync-src-dd-io-pvc-2-9cgfw   0/1     Error     0          2m12s
volsync-rsync-src-dd-io-pvc-2-rfcxq   1/1     Running   0          60s
volsync-rsync-src-dd-io-pvc-3-frnl4   1/1     Running   0          41s
volsync-rsync-src-dd-io-pvc-3-hhf9v   0/1     Error     0          112s
volsync-rsync-src-dd-io-pvc-4-gd5c5   0/1     Error     0          112s
volsync-rsync-src-dd-io-pvc-4-qmjfl   1/1     Running   0          40s
volsync-rsync-src-dd-io-pvc-5-76rs2   0/1     Error     0          112s
volsync-rsync-src-dd-io-pvc-5-lr5ds   1/1     Running   0          41s
volsync-rsync-src-dd-io-pvc-6-rhsz7   1/1     Running   0          60s
volsync-rsync-src-dd-io-pvc-6-rpmtm   0/1     Error     0          2m11s
volsync-rsync-src-dd-io-pvc-7-bb7gn   0/1     Error     0          2m12s
volsync-rsync-src-dd-io-pvc-7-sgxn5   1/1     Running   0          60s


Expected results:
After failover completes, no volsync-rsync-src pods should remain running on the Secondary cluster.

Additional info:

Comment 13 Karolin Seeger 2023-03-06 14:05:44 UTC
No clone optimisation work was done for 4.13, moving out to 4.14.

Comment 24 errata-xmlrpc 2023-11-08 18:49:51 UTC
Since the problem described in this bug report should be resolved in a
recent advisory, it has been closed with a resolution of ERRATA.

For information on the advisory (Important: Red Hat OpenShift Data Foundation 4.14.0 security, enhancement & bug fix update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2023:6832