Bug 2189864 - [IBM Z] MDR policy creation fails unless the ocs-operator pod is restarted on the managed clusters
Summary: [IBM Z] MDR policy creation fails unless the ocs-operator pod is restarted on the managed clusters
Keywords:
Status: CLOSED NOTABUG
Alias: None
Product: Red Hat OpenShift Data Foundation
Classification: Red Hat Storage
Component: ocs-operator
Version: 4.13
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: high
Target Milestone: ---
Target Release: ---
Assignee: umanga
QA Contact: Elad
URL:
Whiteboard:
Depends On: 2182644
Blocks:
 
Reported: 2023-04-26 08:55 UTC by umanga
Modified: 2023-08-09 17:00 UTC
CC List: 7 users

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of: 2182644
Environment:
Last Closed: 2023-04-26 11:46:18 UTC
Embargoed:


Attachments

Description umanga 2023-04-26 08:55:20 UTC
+++ This bug was initially created as a clone of Bug #2182644 +++

Description of problem (please be as detailed as possible and provide log
snippets):
Metro Disaster Recovery (MDR) policy creation fails unless the ocs-operator pod is restarted on the managed clusters. All clusters (hub and managed) run the same ODF version, v4.13.0-110.stable.

Version of all relevant components (if applicable):
openshift-install: 4.13.0-rc.0
ODF: v4.13.0-110.stable
odr-hub-operator: v4.13.0-110.stable
odf-multicluster-orchestrator: v4.13.0-110.stable
ACM: 2.7.2

Does this issue impact your ability to continue to work with the product
(please explain in detail what is the user impact)?


Is there any workaround available to the best of your knowledge?
Restart the ocs-operator pod on managed clusters
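
A minimal sketch of the workaround from the CLI, assuming the default openshift-storage namespace and the standard ocs-operator deployment name (adjust if your deployment differs):

# Run against each managed cluster
$ oc rollout restart deployment/ocs-operator -n openshift-storage

# Alternatively, delete the pod and let the deployment recreate it.
# The label selector name=ocs-operator is an assumption; verify the pod
# labels first with `oc get pods -n openshift-storage --show-labels`.
$ oc delete pod -n openshift-storage -l name=ocs-operator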

Rate from 1 - 5 the complexity of the scenario you performed that caused this
bug (1 - very simple, 5 - very complex)?


Is this issue reproducible?
Yes, reproduced every time (5 times)

Can this issue be reproduced from the UI?
Yes

If this is a regression, please provide more details to justify this:


Steps to Reproduce:
1. Create a Metro DR setup with a hub cluster and 2 managed clusters
2. Install ODF on the managed clusters and connect them to external storage
3. Install the ODF Multicluster Orchestrator operator on the hub cluster
4. Configure SSL access across the clusters
5. Create a Disaster Recovery policy on the hub cluster (a rough sketch of the DRPolicy resource follows below)
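
For step 5, the policy can also be created by applying a DRPolicy resource on the hub cluster instead of going through the console. The manifest below is only a rough sketch based on the upstream RamenDR DRPolicy CRD; the API version, field names, and cluster names are assumptions and may differ between ODF versions:

# Hypothetical Metro DR policy applied on the hub cluster;
# the drClusters entries must match the ManagedCluster names in ACM.
$ cat <<'EOF' | oc apply -f -
apiVersion: ramendr.openshift.io/v1alpha1
kind: DRPolicy
metadata:
  name: mdr-policy
spec:
  drClusters:
    - managed-cluster-1
    - managed-cluster-2
  # Metro DR is synchronous, so no Regional DR scheduling interval is set here;
  # whether the field must be omitted or given a zero value may vary by version.
EOF

After applying, `oc get drpolicy mdr-policy -o yaml` should show the same validation status that the console reports.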


Actual results:
Creation of the DR policy reports an unsupported version of ODF on the managed clusters, although the ODF version is supported and identical on all three clusters.
Disaster Recovery policy creation fails unless the ocs-operator pod is restarted on the managed clusters.
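
One way to double-check the version parity that the error message disputes is to compare the installed CSVs on the hub and both managed clusters (assuming ODF is installed in the default openshift-storage namespace):

# Run on each of the three clusters and compare the ODF/OCS operator versions
$ oc get csv -n openshift-storage

The odf-operator and ocs-operator CSV versions should match across all three clusters.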

Attaching the screenshot of the DR policy creation failure.

Expected results:
Disaster Recovery policy creation should succeed without restarting the ocs-operator pod on the managed clusters.

Additional info:

--- Additional comment from RHEL Program Management on 2023-03-29 13:28:01 IST ---

This bug, which previously had no release flag set, now has the release flag 'odf-4.13.0' set to '?', and so is being proposed to be fixed in the ODF 4.13.0 release. Note that the 3 Acks (pm_ack, devel_ack, qa_ack), if any were set while the release flag was missing, have now been reset, since the Acks are to be set against a release flag.

--- Additional comment from Sravika on 2023-03-29 13:31:46 IST ---



--- Additional comment from umanga on 2023-04-05 11:06:44 IST ---

Looking at the log, this does not look like a DR issue.
It's a bug somewhere in ocs-operator. We should try to recreate it without DR to isolate the issue.

The logs show that the StorageCluster has this namespace/name: `"Request.Namespace":"openshift-storage","Request.Name":"ocs-external-storagecluster"`
However, ocs-operator is looking for `"msg":"No StorageCluster resource.","Request.Namespace":"openshift-storage","Request.Name":"ocsinit"`
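
As a quick check, the StorageCluster resources that actually exist on a managed cluster can be listed with the following (assuming the default openshift-storage namespace):

$ oc get storagecluster -n openshift-storage

On this external-mode deployment that is expected to return `ocs-external-storagecluster`, which does not match the `ocsinit` name that ocs-operator is reconciling.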

Moving it to ocs-operator for RCA.

--- Additional comment from umanga on 2023-04-10 16:28:23 IST ---

This might have occurred because of another bug: https://bugzilla.redhat.com/show_bug.cgi?id=2185188.
Once that is fixed, we need to check if this issue still exists.

--- Additional comment from umanga on 2023-04-18 16:36:17 IST ---

Please verify the issue with the "4.13.0-166" build.

--- Additional comment from Mudit Agarwal on 2023-04-24 14:12:42 IST ---

Sravika, please reopen if this still exists.

--- Additional comment from Sravika on 2023-04-25 15:11:07 IST ---

@uchapaga @muagarwa: I currently don't have a 4.13 environment; I will verify the BZ once I move to 4.13 verification later.

--- Additional comment from umanga on 2023-04-26 14:14:31 IST ---

Reopening the issue as we were able to reproduce it on the latest builds.

--- Additional comment from umanga on 2023-04-26 14:16:13 IST ---

We need to fix this in 4.13 as it blocks DR workflows on clusters with existing ODF deployments.

Comment 2 umanga 2023-04-26 11:46:18 UTC
Based on the latest information, we did not hit this issue with ODF 4.12.2, so marking this as NOTABUG.
The fix will only be in ODF 4.13.0.

