Description of problem (please be as detailed as possible and provide log snippets):

A Subscription application with a VM referencing a standalone DataVolume does not fail over the DataVolume, Secret, or VM. Only the PVC is failed over.

Version of all relevant components (if applicable):

oc get csv -n openshift-cnv
NAME                                       DISPLAY                         VERSION            REPLACES                                   PHASE
kubevirt-hyperconverged-operator.v4.16.0   OpenShift Virtualization        4.16.0             kubevirt-hyperconverged-operator.v4.15.2   Succeeded
odr-cluster-operator.v4.16.0-90.stable     Openshift DR Cluster Operator   4.16.0-90.stable                                              Succeeded
openshift-gitops-operator.v1.12.1          Red Hat OpenShift GitOps        1.12.1             openshift-gitops-operator.v1.12.0          Succeeded
recipe.v4.16.0-90.stable                   Recipe                          4.16.0-90.stable                                              Failed
volsync-product.v0.9.1                     VolSync                         0.9.1              volsync-product.v0.9.0                     Succeeded

[cloud-user@ocp-psi-executor-xl vm16-pull-app]$ oc version
Client Version: 4.16.0-ec.5
Kustomize Version: v5.0.4-0.20230601165947-6ce0bf390ce3
Server Version: 4.16.0-ec.5
Kubernetes Version: v1.29.2+258f1d5

Does this issue impact your ability to continue to work with the product (please explain in detail what is the user impact)?
Subscription app VMs with standalone DataVolumes are not failing over.

Is there any workaround available to the best of your knowledge?
No

Rate from 1 - 5 the complexity of the scenario you performed that caused this bug (1 - very simple, 5 - very complex)?
2

Can this issue be reproduced?
Yes

Can this issue be reproduced from the UI?
Yes

If this is a regression, please provide more details to justify this:

Steps to Reproduce:
1. Created the Subscription app with a VM using a standalone DataVolume - VM deployed on the primary cluster.
2. Applied the DRPolicy.
3. Accessed the VM on the primary cluster, wrote 2 text files and ran 'tail -10 /var/log/ramen.log'.
4. Fenced the primary cluster - the VM is paused on the primary cluster.
5. Failed over the Subscription application >>>> the VM is not failed over to the secondary cluster; it remains in a paused state on the primary cluster.
6. Checked the secondary cluster - the namespace was created and the PVC was failed over, but the public key Secret, the DataVolume, and the VM were not failed over.
7. Also noted that the cdi.kubevirt.io/allowClaimAdoption: "true" annotation was not present on the failed-over PVC.

An AppSet pull VM app using a standalone DataVolume had been successfully failed over and relocated before this, and failing over DataVolume templates worked fine.

Actual results:
The namespace is created on the secondary cluster during failover and only the PVC is failed over.

Expected results:
The PVC, Secret, DataVolume, and VM should fail over to the secondary cluster.

Additional info:
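For reference, the post-failover state on the secondary cluster was checked roughly as follows (the namespace and PVC names below are placeholders, not the exact names used in this run):

# On the secondary cluster: list the workload resources that should have been failed over
oc get vm,vmi,dv,pvc,secret -n <app-namespace>

# Check whether the claim-adoption annotation is present on the failed-over PVC
oc get pvc <pvc-name> -n <app-namespace> -o yaml | grep allowClaimAdoption

Only the PVC is returned by the first command, and the second returns nothing.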
Update on failed Subscription apps on the Metro DR environment!

Hi,

I have done some additional troubleshooting with manual testing around the failed Subscription applications on the Metro DR environment to understand the scope and try to get to the root cause.

I ran all variants of the Subscription app. In all cases the DRPC status is Failed over and stuck on Cleaning up. In all cases the namespace is created on the secondary 'c2' cluster and contains only the PVC; the VM, pod, and/or DataVolume are not failed over. The per-cluster checks used are sketched after the test list below.

[1] Subscription app VM using a PVC - failed failover
[2] Subscription app VM using a standalone DataVolume - failed failover (https://bugzilla.redhat.com/show_bug.cgi?id=2291343)
[3] Subscription app using a DataVolume template - failed failover (note: this specific test passed a few weeks ago on this same environment!)
[4] ApplicationSet push app with a PVC - passed failover and relocate!
[5] ApplicationSet push app with a standalone DataVolume - passed failover and relocate a few weeks ago - will retest again now
[6] ApplicationSet pull app with a DataVolume template - passed failover and relocate a few weeks ago - will retest again now

I used the scenario below in each test:
- Created the Subscription pvc/dv/dvt application - deployed successfully on the primary cluster
- Accessed the VM, wrote a text file, and ran 'tail -10 /var/log/ramen.log'
- Enrolled the Subscription application in the DRPolicy
- Fenced the primary cluster
- Failed over the Subscription application >>>>> only the PVC is created; no pods and no VM are created

Conclusion:
[a] It seems we have a specific issue with failing over Subscription applications on our MDR environment.
[b] Subscription applications with DataVolume templates passed a few weeks ago on this same environment - retested and now it is failing!
[c] ApplicationSet push apps are passing failover and relocate.
[d] So it seems that something has broken on our MDR environment!
[e] Could this be another issue with the S3/secrets configuration being deleted when the operators are upgraded, as we saw on the Regional DR environment?
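For completeness, the checks behind the observations above looked roughly like this (namespace names are placeholders; the DRPC for a Subscription app was queried in the application namespace on the hub):

# On the hub cluster: DRPC phase/progression for the Subscription app
oc get drpc -n <app-namespace> -o wide

# On the secondary 'c2' cluster: only the PVC is present
oc get vm,dv,pvc,pods -n <app-namespace>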
Please update the RDT flag/text appropriately.