Bug 2301889

Summary: noobaa-operator pod is consistently entering CLBO
Product: [Red Hat Storage] Red Hat OpenShift Data Foundation Reporter: Amrita Mahapatra <ammahapa>
Component: Multi-Cloud Object Gateway Assignee: aberner
Status: CLOSED ERRATA QA Contact: Amrita Mahapatra <ammahapa>
Severity: high Docs Contact:
Priority: unspecified    
Version: 4.17 CC: aberner, dosypenk, fbalak, lmauda, muagarwa, nbecker, nberry, nigoyal, odf-bz-bot
Target Milestone: ---   
Target Release: ODF 4.17.0   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard: isf-provider
Fixed In Version: 4.17.0-87 Doc Type: No Doc Update
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2024-10-30 14:29:38 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:

Description Amrita Mahapatra 2024-07-31 04:41:32 UTC
Description of problem (please be as detailed as possible and provide log
snippets):
The noobaa-operator pod is consistently entering CrashLoopBackOff (CLBO) on a provider-client cluster created with OCP and ODF 4.17.

➜  oc get pods -l app=noobaa -n openshift-storage
NAME                               READY   STATUS             RESTARTS          AGE
noobaa-db-pg-0                     1/1     Running            0                 14h
noobaa-operator-57b47b496c-hjltk   0/1     CrashLoopBackOff   153 (3m15s ago)   14h

The noobaa-operator pod YAML shows that the termination reason is Error:
```
➜ oc get pod noobaa-operator-57b47b496c-hjltk -o yaml -n openshift-storage| grep lastState -A6
    lastState:
      terminated:
        containerID: cri-o://2899ca9ad99ff3127c91db5541ddbb53117221a5f0820c24f27393cd1ca20952
        exitCode: 2
        finishedAt: "2024-07-31T04:22:34Z"
        reason: Error
        startedAt: "2024-07-31T04:22:01Z"

```
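
For triage, the same fields can be read programmatically. The following is a minimal sketch, assuming the pod is fetched as JSON (`-o json`) instead of YAML; the type and function names (`pod`, `containerStatus`, `summarize`) are illustrative and not part of any NooBaa or OpenShift tooling, and the container name in the sample input is an assumption. Exit code 2 is the status the Go runtime uses when a program terminates on an unrecovered panic, which is consistent with the panic in the operator log.

```go
package main

import (
	"encoding/json"
	"fmt"
	"time"
)

// terminated mirrors the subset of
// status.containerStatuses[].lastState.terminated used here.
type terminated struct {
	ExitCode   int       `json:"exitCode"`
	Reason     string    `json:"reason"`
	StartedAt  time.Time `json:"startedAt"`
	FinishedAt time.Time `json:"finishedAt"`
}

type containerStatus struct {
	Name         string `json:"name"`
	RestartCount int    `json:"restartCount"`
	LastState    struct {
		Terminated *terminated `json:"terminated"`
	} `json:"lastState"`
}

type pod struct {
	Status struct {
		ContainerStatuses []containerStatus `json:"containerStatuses"`
	} `json:"status"`
}

// summarize returns one line per container that has a recorded termination.
func summarize(raw []byte) ([]string, error) {
	var p pod
	if err := json.Unmarshal(raw, &p); err != nil {
		return nil, err
	}
	var out []string
	for _, cs := range p.Status.ContainerStatuses {
		t := cs.LastState.Terminated
		if t == nil {
			continue // container has never terminated
		}
		out = append(out, fmt.Sprintf("%s: exit=%d reason=%s ran=%s restarts=%d",
			cs.Name, t.ExitCode, t.Reason, t.FinishedAt.Sub(t.StartedAt), cs.RestartCount))
	}
	return out, nil
}

func main() {
	// Values copied from this report; the container name is assumed.
	raw := []byte(`{"status":{"containerStatuses":[{"name":"noobaa-operator","restartCount":153,
	  "lastState":{"terminated":{"exitCode":2,"reason":"Error",
	  "startedAt":"2024-07-31T04:22:01Z","finishedAt":"2024-07-31T04:22:34Z"}}}]}}`)
	lines, err := summarize(raw)
	if err != nil {
		panic(err)
	}
	for _, l := range lines {
		fmt.Println(l)
	}
}
```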

The noobaa-operator log:
panic: runtime error: invalid memory address or nil pointer dereference [recovered]
	panic: runtime error: invalid memory address or nil pointer dereference
[signal SIGSEGV: segmentation violation code=0x1 addr=0x18 pc=0x1eeeb8f]

goroutine 2855 [running]:
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Reconcile.func1()
	/remote-source/deps/gomod/pkg/mod/sigs.k8s.io/controller-runtime.2/pkg/internal/controller/controller.go:116 +0x1e5
panic({0x231e3c0?, 0x42faab0?})
	/usr/lib/golang/src/runtime/panic.go:770 +0x132
github.com/noobaa/noobaa-operator/v5/pkg/system.(*Reconciler).SetDesiredCoreApp(0xc00200a2c8)
	/remote-source/app/pkg/system/phase2_creating.go:455 +0x318f
github.com/noobaa/noobaa-operator/v5/pkg/system.(*Reconciler).reconcileObjectAndGetResult.func1()
	/remote-source/app/pkg/system/reconciler.go:655 +0x18
sigs.k8s.io/controller-runtime/pkg/controller/controllerutil.mutate(0x2ea2560?, {{0xc00231b800?, 0x0?}, {0xc000cfca50?, 0x2ec6890?}}, {0x2ee7d00, 0xc00202a508})
	/remote-source/deps/gomod/pkg/mod/sigs.k8s.io/controller-runtime.2/pkg/controller/controllerutil/controllerutil.go:426 +0x49
sigs.k8s.io/controller-runtime/pkg/controller/controllerutil.CreateOrUpdate({0x2ec6890, 0x43aa080}, {0x2ed6f60, 0xc000c8e750}, {0x2ee7d00, 0xc00202a508}, 0xc0009f58c0)
	/remote-source/deps/gomod/pkg/mod/sigs.k8s.io/controller-runtime.2/pkg/controller/controllerutil/controllerutil.go:282 +0x128
github.com/noobaa/noobaa-operator/v5/pkg/system.(*Reconciler).reconcileObjectAndGetResult(0xc00200a2c8, {0x2ee7d00, 0xc00202a508}, 0xc0009f5940, 0x0)
	/remote-source/app/pkg/system/reconciler.go:652 +0x15b
github.com/noobaa/noobaa-operator/v5/pkg/system.(*Reconciler).reconcileObject(...)
	/remote-source/app/pkg/system/reconciler.go:643
github.com/noobaa/noobaa-operator/v5/pkg/system.(*Reconciler).ReconcileObject(...)
	/remote-source/app/pkg/system/reconciler.go:634
github.com/noobaa/noobaa-operator/v5/pkg/system.(*Reconciler).ReconcilePhaseCreatingForMainClusters(0xc00200a2c8)
	/remote-source/app/pkg/system/phase2_creating.go:141 +0x4b7
github.com/noobaa/noobaa-operator/v5/pkg/system.(*Reconciler).ReconcilePhaseCreating(0xc00200a2c8)
	/remote-source/app/pkg/system/phase2_creating.go:55 +0xa8
github.com/noobaa/noobaa-operator/v5/pkg/system.(*Reconciler).ReconcilePhases(0xc00200a2c8)
	/remote-source/app/pkg/system/reconciler.go:557 +0x45
github.com/noobaa/noobaa-operator/v5/pkg/system.(*Reconciler).Reconcile(0xc00200a2c8)
	/remote-source/app/pkg/system/reconciler.go:428 +0x326
github.com/noobaa/noobaa-operator/v5/pkg/controller/noobaa.Add.func1({0xc002006420?, 0x0?}, {{{0xc00231b800?, 0xc001d57ce0?}, {0xc002085f70?, 0x411a9b?}}})
	/remote-source/app/pkg/controller/noobaa/noobaa_controller.go:53 +0xd6
sigs.k8s.io/controller-runtime/pkg/reconcile.Func.Reconcile(0x7f8ec2706a68?, {0x2ec69e0?, 0xc00200e090?}, {{{0xc00231b800?, 0x0?}, {0xc002085f70?, 0xc001d57d10?}}})
	/remote-source/deps/gomod/pkg/mod/sigs.k8s.io/controller-runtime.2/pkg/reconcile/reconcile.go:113 +0x3d
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Reconcile(0x2eccaf8?, {0x2ec69e0?, 0xc00200e090?}, {{{0xc00231b800?, 0xb?}, {0xc002085f70?, 0x0?}}})
	/remote-source/deps/gomod/pkg/mod/sigs.k8s.io/controller-runtime.2/pkg/internal/controller/controller.go:119 +0xb7
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).reconcileHandler(0xc000fb2960, {0x2ec6a18, 0xc000b8d0e0}, {0x243db40, 0xc0007cea80})
	/remote-source/deps/gomod/pkg/mod/sigs.k8s.io/controller-runtime.2/pkg/internal/controller/controller.go:316 +0x3bc
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).processNextWorkItem(0xc000fb2960, {0x2ec6a18, 0xc000b8d0e0})
	/remote-source/deps/gomod/pkg/mod/sigs.k8s.io/controller-runtime.2/pkg/internal/controller/controller.go:266 +0x1be
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Start.func2.2()
	/remote-source/deps/gomod/pkg/mod/sigs.k8s.io/controller-runtime.2/pkg/internal/controller/controller.go:227 +0x79
created by sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Start.func2 in goroutine 349
	/remote-source/deps/gomod/pkg/mod/sigs.k8s.io/controller-runtime.2/pkg/internal/controller/controller.go:223 +0x50c
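
The trace shows the panic originating in `SetDesiredCoreApp` (phase2_creating.go:455), which runs as the mutate callback of `controllerutil.CreateOrUpdate`. The offending line is not shown in this report, so the sketch below only illustrates the general failure class: dereferencing an optional pointer field without a nil check. All types and functions here (`systemSpec`, `dbSpec`, `setDesiredUnsafe`, `setDesiredSafe`) are hypothetical and are not the operator's actual code or fix.

```go
package main

import (
	"errors"
	"fmt"
)

// dbSpec and systemSpec are hypothetical stand-ins for an optional nested
// section of a custom resource spec.
type dbSpec struct {
	Image string
}

type systemSpec struct {
	DB *dbSpec // nil when the section is omitted from the resource
}

// setDesiredUnsafe dereferences the optional pointer unconditionally and
// panics with a nil pointer dereference when spec.DB is nil.
func setDesiredUnsafe(spec *systemSpec) string {
	return spec.DB.Image
}

// setDesiredSafe checks the pointer first and returns an error that a
// reconciler could report and retry instead of crashing the process.
func setDesiredSafe(spec *systemSpec) (string, error) {
	if spec == nil || spec.DB == nil {
		return "", errors.New("spec.DB is not set")
	}
	return spec.DB.Image, nil
}

// panics reports whether calling f panicked.
func panics(f func()) (p bool) {
	defer func() {
		if recover() != nil {
			p = true
		}
	}()
	f()
	return false
}

func main() {
	empty := &systemSpec{} // the optional section is missing

	fmt.Println("unsafe panics:", panics(func() { setDesiredUnsafe(empty) }))

	if _, err := setDesiredSafe(empty); err != nil {
		fmt.Println("safe returns error:", err)
	}
}
```

Returning an error from the callback lets the reconcile loop fail and retry without taking the whole operator process down, which is what turns a single bad input into CLBO.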

Version of all relevant components (if applicable):
odf-operator.v4.17.0-56.stable
ocp: 4.17.0-ec.2

Does this issue impact your ability to continue to work with the product
(please explain in detail what is the user impact)? This impacts ODF 4.17 cluster creation.


Is there any workaround available to the best of your knowledge? No


Rate from 1 - 5 the complexity of the scenario you performed that caused this
bug (1 - very simple, 5 - very complex)? 3


Is this issue reproducible? Yes


Can this issue be reproduced from the UI? Yes


If this is a regression, please provide more details to justify this:


Steps to Reproduce:
1. Create a provider-client cluster with ocp: 4.17.0-ec.2 and odf: v4.17.0-56.stable
2. Create storage-system
3. Check whether the noobaa-operator pod enters CLBO


Actual results:
The noobaa-operator pod enters CLBO on a provider-client cluster created with
ocp: 4.17.0-ec.2 and odf: v4.17.0-56.stable

Expected results:
The noobaa-operator pod should be in 'Running' status.


Additional info:

Comment 12 Sunil Kumar Acharya 2024-09-18 12:06:54 UTC
Please update the RDT flag/text appropriately.

Comment 14 errata-xmlrpc 2024-10-30 14:29:38 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Important: Red Hat OpenShift Data Foundation 4.17.0 Security, Enhancement, & Bug Fix Update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2024:8676