Bug 2187986 - [MDR] ramen-dr-cluster-operator pod is in CLBO after assigning dr policy to an appset based app
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat OpenShift Data Foundation
Classification: Red Hat Storage
Component: odf-dr
Version: 4.13
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: urgent
Target Milestone: ---
Target Release: ODF 4.13.0
Assignee: Benamar Mekhissi
QA Contact: Parikshith
URL:
Whiteboard:
Depends On:
Blocks:
 
Reported: 2023-04-19 11:13 UTC by Parikshith
Modified: 2023-08-09 17:00 UTC
CC List: 6 users

Fixed In Version: 4.13.0-170
Doc Type: No Doc Update
Doc Text:
Clone Of:
Environment:
Last Closed: 2023-06-21 15:25:08 UTC
Embargoed:




Links
Red Hat Product Errata RHBA-2023:3742 (last updated 2023-06-21 15:25:52 UTC)

Description Parikshith 2023-04-19 11:13:28 UTC
Description of problem (please be as detailed as possible and provide log
snippets):
After creating an AppSet-based app and assigning a DR policy to it, the ramen-dr-cluster-operator pod is in CLBO (CrashLoopBackOff) state on the managed cluster where the app was deployed.

2023-04-19T10:11:04.953Z INFO controllers.VolumeReplicationGroup.vrginstance runtime/panic.go:838 Exiting processing VolumeReplicationGroup {"VolumeReplicationGroup": "temp-app/temp-app-placement-drpc", "rid": "f02417de-48f9-4cf6-b1ae-09f36d46f63b", "State": "primary"}
2023-04-19T10:11:04.953Z INFO controllers.VolumeReplicationGroup runtime/panic.go:838 Exiting reconcile loop {"VolumeReplicationGroup": "temp-app/temp-app-placement-drpc", "rid": "f02417de-48f9-4cf6-b1ae-09f36d46f63b"}
2023-04-19T10:11:04.953Z INFO controller/controller.go:117 Observed a panic in reconciler: runtime error: invalid memory address or nil pointer dereference {"controller": "volumereplicationgroup", "controllerGroup": "ramendr.openshift.io", "controllerKind": "VolumeReplicationGroup", "VolumeReplicationGroup": {"name":"temp-app-placement-drpc","namespace":"temp-app"}, "namespace": "temp-app", "name": "temp-app-placement-drpc", "reconcileID": "f8691f72-03d4-46c1-9380-dc55744b0e95"}
2023-04-19T10:11:04.953Z INFO controllers.VolumeReplicationGroup controllers/volumereplicationgroup_controller.go:324 Entering reconcile loop {"VolumeReplicationGroup": "temp-app/temp-app-placement-drpc", "rid": "05f64826-1a91-4153-8cb3-769d2b210db9"}
panic: runtime error: invalid memory address or nil pointer dereference [recovered]
panic: runtime error: invalid memory address or nil pointer dereference
[signal SIGSEGV: segmentation violation code=0x1 addr=0x0 pc=0x1ef1559]
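
The panic is an unguarded nil pointer dereference hit while reconciling the VRG created for the AppSet-based app (the actual fix is in the PR linked in comment 5). As a general illustration only, using hypothetical stand-in types rather than the real Ramen API, the guard pattern that turns a missing optional section into a returned error instead of a crash looks roughly like this:

// Illustrative sketch only, not the actual Ramen fix (see comment 5 for the PR).
// The types below are hypothetical stand-ins for the VolumeReplicationGroup API;
// the point is the nil-guard pattern that keeps the reconciler from panicking
// and crash-looping the operator pod.
package main

import (
	"context"
	"fmt"
)

// Hypothetical, simplified stand-ins for the real VRG types.
type vrgAsyncSpec struct {
	SchedulingInterval string
}

type vrgSpec struct {
	Async *vrgAsyncSpec // optional section; nil when not populated
}

type volumeReplicationGroup struct {
	Namespace, Name string
	Spec            vrgSpec
}

// reconcileVRG checks the optional pointer before dereferencing it, so a
// missing section becomes an error rather than a SIGSEGV.
func reconcileVRG(_ context.Context, vrg *volumeReplicationGroup) error {
	if vrg.Spec.Async == nil {
		return fmt.Errorf("async spec not set for VRG %s/%s", vrg.Namespace, vrg.Name)
	}

	_ = vrg.Spec.Async.SchedulingInterval // safe to read after the nil check

	return nil
}

func main() {
	err := reconcileVRG(context.Background(), &volumeReplicationGroup{
		Namespace: "temp-app",
		Name:      "temp-app-placement-drpc",
	})
	fmt.Println(err) // prints the error instead of panicking
}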


Version of all relevant components (if applicable):
ocp: 4.13.0-0.nightly-2023-04-18-005127
odf: 4.13.0-168
acm: 2.7.2(GA)

Does this issue impact your ability to continue to work with the product
(please explain in detail what is the user impact)?
Yes, the ramen-dr-cluster-operator pod should not be in CLBO.

Is there any workaround available to the best of your knowledge?
no

Rate from 1 - 5 the complexity of the scenario you performed that caused this
bug (1 - very simple, 5 - very complex)?
1

Is this issue reproducible?
1/1

If this is a regression, please provide more details to justify this:
Yes, this issue was not seen in earlier 4.13 builds such as 4.13.0-124.

Steps to Reproduce:
1. Configure three OCP 4.13 clusters: hub, c1 and c2.
2. Configure MDR. (The OADP operator was installed on c1 and c2 to work around https://bugzilla.redhat.com/show_bug.cgi?id=2176456.)
3. Deploy AppSet-based apps on c1 and apply a DR policy to the apps.
4. On c1, check whether the ramen-dr-cluster-operator pod is running (see the sketch below).
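
For step 4, a minimal programmatic check is sketched below. It assumes the operator runs in the openshift-dr-system namespace (adjust if your deployment differs) and that KUBECONFIG points at cluster c1; an oc get pods in that namespace on c1 shows the same information.

// Sketch only: list the ramen-dr-cluster-operator pod on managed cluster c1
// and print its phase, restart counts, and any waiting reason. The namespace
// is an assumption for this sketch.
package main

import (
	"context"
	"fmt"
	"os"
	"strings"

	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/client-go/kubernetes"
	"k8s.io/client-go/tools/clientcmd"
)

func main() {
	// Build a client from the kubeconfig pointing at cluster c1.
	cfg, err := clientcmd.BuildConfigFromFlags("", os.Getenv("KUBECONFIG"))
	if err != nil {
		panic(err)
	}
	clientset, err := kubernetes.NewForConfig(cfg)
	if err != nil {
		panic(err)
	}

	pods, err := clientset.CoreV1().Pods("openshift-dr-system").List(context.TODO(), metav1.ListOptions{})
	if err != nil {
		panic(err)
	}

	for _, p := range pods.Items {
		if !strings.HasPrefix(p.Name, "ramen-dr-cluster-operator") {
			continue
		}
		fmt.Printf("%s phase=%s\n", p.Name, p.Status.Phase)
		for _, cs := range p.Status.ContainerStatuses {
			// A crash-looping pod shows a climbing restart count and a
			// container waiting reason of "CrashLoopBackOff".
			reason := ""
			if cs.State.Waiting != nil {
				reason = cs.State.Waiting.Reason
			}
			fmt.Printf("  container=%s restarts=%d waiting=%s\n", cs.Name, cs.RestartCount, reason)
		}
	}
}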

Actual results:
The ramen-dr-cluster-operator pod is in CLBO (CrashLoopBackOff) state.

Expected results:
The ramen-dr-cluster-operator pod should be in Running state.

Additional info:

Comment 5 Benamar Mekhissi 2023-04-19 12:37:09 UTC
https://github.com/RamenDR/ramen/pull/839

Comment 8 Shyamsundar 2023-04-19 13:17:37 UTC
Fix merged into release-4.13 branch: https://github.com/red-hat-storage/ramen/commit/15b154891a00a542453d0acc7e4509aad6a05812

Comment 13 errata-xmlrpc 2023-06-21 15:25:08 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Red Hat OpenShift Data Foundation 4.13.0 enhancement and bug fix update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2023:3742

