Bug 2219797 - [IBM Z/MDR]: With ACM 2.8 applying DRpolicy to subscription workload fails
Summary: [IBM Z/MDR]: With ACM 2.8 applying DRpolicy to subscription workload fails
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat OpenShift Data Foundation
Classification: Red Hat Storage
Component: odf-dr
Version: 4.14
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: high
Target Milestone: ---
Target Release: ODF 4.14.0
Assignee: Shyamsundar
QA Contact: Shrivaibavi Raghaventhiran
URL:
Whiteboard:
Depends On: 2218181
Blocks:
 
Reported: 2023-07-05 12:03 UTC by Shyamsundar
Modified: 2023-11-08 18:54 UTC
CC: 8 users

Fixed In Version:
Doc Type: No Doc Update
Doc Text:
Clone Of: 2218181
Environment:
Last Closed: 2023-11-08 18:52:23 UTC
Embargoed:




Links:
Red Hat Product Errata RHSA-2023:6832 (last updated 2023-11-08 18:54:09 UTC)

Description Shyamsundar 2023-07-05 12:03:04 UTC
+++ This bug was initially created as a clone of Bug #2218181 +++

---> This clone tracks the fix for 4.14.0 as well, as the initial fix shipped with a 4.13.z release.

Description of problem (please be as detailed as possible and provide log
snippets):
[IBM Z/MDR]: With ACM 2.8 applying DRpolicy to subscription workload fails

Version of all relevant components (if applicable):
OCP version: 4.13.0
ODF version: 4.13.0-218
ACM version: 2.8

Does this issue impact your ability to continue to work with the product
(please explain in detail what the user impact is)?
Yes

Is there any workaround available to the best of your knowledge?
No

Rate from 1 - 5 the complexity of the scenario you performed that caused this
bug (1 - very simple, 5 - very complex)?


Is this issue reproducible?
Yes

Can this issue be reproduced from the UI?


If this is a regression, please provide more details to justify this:


Steps to Reproduce:
1. Deploy MDR environment
2. Create subscription based application from hub cluster
3. Apply DR policy to the application 
4. When the Apply DRPolicy modal is displayed, select the application and enter PVC label as appname=""

Actual results:
Failed to deploy the DRPC; the DRPC status on the hub cluster does not reach the Deployed state.


2023-06-28T11:10:11.589Z        ERROR   controller/controller.go:329    Reconciler error        {"controller": "drplacementcontrol", "controllerGroup": "ramendr.openshift.io", "controllerKind": "DRPlacementControl", "DRPlacementControl": {"name":"busybox-placement-1-drpc","namespace":"busybox-sample"}, "namespace": "busybox-sample", "name": "busybox-placement-1-drpc", "reconcileID": "9dc4b924-bf27-4df9-a999-0ada10d51efc", "error": "failed to create DRPC instance (ApplicationSet list: no matches for kind \"ApplicationSet\" in version \"argoproj.io/v1alpha1\") and (<nil>)"}
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).reconcileHandler
        /remote-source/deps/gomod/pkg/mod/sigs.k8s.io/controller-runtime.5/pkg/internal/controller/controller.go:329
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).processNextWorkItem
        /remote-source/deps/gomod/pkg/mod/sigs.k8s.io/controller-runtime.5/pkg/internal/controller/controller.go:274
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Start.func2.2
        /remote-source/deps/gomod/pkg/mod/sigs.k8s.io/controller-runtime.5/pkg/internal/controller/controller.go:235



Expected results:

The DRPC should be in the Deployed state.

Additional info:

Must-gather logs:
https://drive.google.com/file/d/1IpiP2Wj4YOpObjxqL2_IT8xdA_0wmSCI/view?usp=sharing

--- Additional comment from RHEL Program Management on 2023-06-28 07:30:54 EDT ---

This bug previously had no release flag set; the release flag 'odf-4.14.0' has now been set to '?', and the bug is therefore proposed to be fixed in the ODF 4.14.0 release. Note that any of the 3 Acks (pm_ack, devel_ack, qa_ack) that were set while the release flag was missing have been reset, since Acks must be set against a release flag.

--- Additional comment from RHEL Program Management on 2023-07-03 05:12:12 EDT ---

This bug report has Keywords: Regression or TestBlocker.
Since no regressions or test blockers are allowed between releases, it is also being proposed as a blocker for this release. Please resolve ASAP.

--- Additional comment from Shyamsundar on 2023-07-03 08:44:05 EDT ---

The ramen code looks for the ApplicationSet CRD unconditionally, causing the issue reported: https://github.com/RamenDR/ramen/blob/9598d67fe09eacc2c6e2c406a38ee7832e3808bf/controllers/drplacementcontrol_controller.go#L1703-L1711

As a workaround, installing openshift-gitops would bring the ApplicationSet CRD into the setup, and the failure can be avoided. This is, however, a rather invasive workaround, as it would also start up Argo CD and related components in the same namespace.

A fix would be to detect the missing AppSet CRD in the function pointed to above and handle that case correctly in the code; a sketch of such a guard follows.

--- Additional comment from Shyamsundar on 2023-07-03 12:20:56 EDT ---

Note for QE:
- The code change avoids looking for AppSets if the CRD is not present (i.e., openshift-gitops is not installed)
- An upgrade to this version only enables workloads to be DR protected when the AppSet CRD is absent; there are no other upgrade constraints
  - As a result, an upgrade test is not strictly required in this case

--- Additional comment from rakesh on 2023-07-04 03:47:41 EDT ---

PR has been posted upstream: https://github.com/RamenDR/ramen/pull/954

--- Additional comment from Shyamsundar on 2023-07-04 10:19:25 EDT ---

Backport PR for 4.13 is present here: https://github.com/red-hat-storage/ramen/pull/115

If accepted by the program, we can merge and clone this BZ for 4.14 tracking of the fix.

--- Additional comment from Karolin Seeger on 2023-07-04 10:19:43 EDT ---

Updating flags and Internal Whiteboard as we target 4.13.1

--- Additional comment from RHEL Program Management on 2023-07-05 07:25:01 EDT ---

This BZ is being approved for an ODF 4.13.z z-stream update, upon receipt of the 3 ACKs (PM, Devel, QA) for the release flag 'odf-4.13.z', and having been marked for an approved z-stream update.

--- Additional comment from RHEL Program Management on 2023-07-05 07:25:01 EDT ---

Since this bug has been approved for ODF 4.13.1 release, through release flag 'odf-4.13.z+', and appropriate update number entry at the 'Internal Whiteboard', the Target Release is being set to 'ODF 4.13.1'

Comment 4 umanga 2023-07-10 05:46:17 UTC
Doesn't look like multicluster orchestrator needs to fix anything here.
What am I missing?

Comment 5 Shyamsundar 2023-07-11 12:49:48 UTC
Updated the component; it was set incorrectly.

Comment 6 Shyamsundar 2023-08-01 01:46:31 UTC
Part of the release-4.14 branch: commit ID 098592209cecc54d62cc06499c507da048c0488d

It was merged as part of the rebase to main and the creation of the 4.14 branch.

Comment 10 Shrivaibavi Raghaventhiran 2023-09-12 16:16:59 UTC
Tested Versions:
----------------
OCP - 4.14.0-0.nightly-2023-09-02-132842
ODF - 4.14.0-126.stable
ACM - 2.9.0-115

Test Steps:
----------
1. Create subscription apps without ArgoCD. 
Perform Failover and Failback without disruption

2. Install ArgoCD, Create subscription apps. 
Perform Failover and Failback without disruption

3. Delete ArgoCD, delete subscription apps,
reinstall ArgoCD, create AppSet apps.
Perform Failover and Failback without disruption


In all the scenarios, we were able to create apps, assign the policy, and fail over/fail back. Moving to Verified.

Comment 15 errata-xmlrpc 2023-11-08 18:52:23 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Important: Red Hat OpenShift Data Foundation 4.14.0 security, enhancement & bug fix update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2023:6832

