Description of problem (please be as detailed as possible and provide log snippets):

When a workload has failed over or been relocated before Hub Recovery, the DRPC is restored from the hub backup without its last known status. The DRPC then attempts to rebuild its status, which may involve generating the PlacementDecision before the managed cluster has finished restoring the PV/PVC. This creates a race condition: the application can deploy before the PV/PVC restore on the managed cluster completes, and a new PV is created instead of the restored one being used.

Version of all relevant components (if applicable):
4.13
4.14

Does this issue impact your ability to continue to work with the product (please explain in detail what is the user impact)?
There is a chance of data loss.

Is there any workaround available to the best of your knowledge?
There is, but it is not pretty.

Rate from 1 - 5 the complexity of the scenario you performed that caused this bug (1 - very simple, 5 - very complex)?
5

Is this issue reproducible?
Possibly

Can this issue be reproduced from the UI?
NA

If this is a regression, please provide more details to justify this:
This is not a regression. The issue has been there all along.

Steps to Reproduce:
1. To reproduce this reliably, stage the target cluster so that it has no access to the s3 store.
2. Recover the hub.

Actual results:
A new PV/PVC is created instead of the existing one being restored.

Expected results:
PV/PVC are restored from the s3 store before the application is redeployed.
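
The ordering described under Expected results can be illustrated with a toy model. This is only a sketch in Python, not the actual DRPC controller code: `ClusterState`, `pv_restore_done`, and `decide_placement` are hypothetical names invented for illustration, and the real fix may work differently. The idea is simply that no placement decision is published until the PV/PVC restore on the target cluster is confirmed.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ClusterState:
    """Hypothetical view of a managed cluster as seen from the recovered hub."""
    name: str
    pv_restore_done: bool  # True once PV/PVC have been restored from the s3 store


def decide_placement(target: ClusterState) -> Optional[str]:
    """Return the cluster to place the workload on, or None to wait and retry."""
    if not target.pv_restore_done:
        # Restore still pending (for example, the s3 store is unreachable):
        # publishing a decision now would let the application create a new PV.
        return None
    return target.name


# Same cluster before and after the restore completes.
blocked = ClusterState(name="cluster-2", pv_restore_done=False)
ready = ClusterState(name="cluster-2", pv_restore_done=True)
print(decide_placement(blocked))
print(decide_placement(ready))
```

In this model the reproduction step (target cluster without s3 access) corresponds to `pv_restore_done` staying false, so the decision is withheld instead of racing the restore.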
Bringing this one back as a potential blocker for 4.14.z for now.
Was this backported to 4.15?
Found the backport PR
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory (Important: Red Hat OpenShift Data Foundation 4.15.0 security, enhancement, & bug fix update), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHSA-2024:1383