Bug 2301889 - noobaa-operator pod is consistently entering CLBO
Summary: noobaa-operator pod is consistently entering CLBO
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat OpenShift Data Foundation
Classification: Red Hat Storage
Component: Multi-Cloud Object Gateway
Version: 4.17
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: high
Target Milestone: ---
Target Release: ODF 4.17.0
Assignee: aberner
QA Contact: Amrita Mahapatra
URL:
Whiteboard: isf-provider
Duplicates: 2305920 (view as bug list)
Depends On:
Blocks:
 
Reported: 2024-07-31 04:41 UTC by Amrita Mahapatra
Modified: 2024-10-30 14:29 UTC
CC: 9 users

Fixed In Version: 4.17.0-87
Doc Type: No Doc Update
Doc Text:
Clone Of:
Environment:
Last Closed: 2024-10-30 14:29:38 UTC
Embargoed:




Links
System ID Private Priority Status Summary Last Updated
Github noobaa noobaa-operator pull 1402 0 None Merged Fix provider env var panic 2024-08-27 15:05:40 UTC
Github noobaa noobaa-operator pull 1421 0 None Merged [backport into 5.17] Backports into 5.17 2024-08-27 15:05:41 UTC
Red Hat Issue Tracker OCSBZM-8778 0 None None None 2024-07-31 04:43:42 UTC
Red Hat Product Errata RHSA-2024:8676 0 None None None 2024-10-30 14:29:41 UTC

Description Amrita Mahapatra 2024-07-31 04:41:32 UTC
Description of problem (please be as detailed as possible and provide log
snippets):
The noobaa-operator pod is consistently entering CrashLoopBackOff (CLBO) on a provider-client cluster created with OCP and ODF 4.17.

➜  oc get pods -l app=noobaa -n openshift-storage
NAME                               READY   STATUS             RESTARTS          AGE
noobaa-db-pg-0                     1/1     Running            0                 14h
noobaa-operator-57b47b496c-hjltk   0/1     CrashLoopBackOff   153 (3m15s ago)   14h

The noobaa-operator pod YAML shows the termination reason is Error:
```
➜ oc get pod noobaa-operator-57b47b496c-hjltk -o yaml -n openshift-storage| grep lastState -A6
    lastState:
      terminated:
        containerID: cri-o://2899ca9ad99ff3127c91db5541ddbb53117221a5f0820c24f27393cd1ca20952
        exitCode: 2
        finishedAt: "2024-07-31T04:22:34Z"
        reason: Error
        startedAt: "2024-07-31T04:22:01Z"

```

The noobaa-operator log shows a nil pointer panic:
panic: runtime error: invalid memory address or nil pointer dereference [recovered]
	panic: runtime error: invalid memory address or nil pointer dereference
[signal SIGSEGV: segmentation violation code=0x1 addr=0x18 pc=0x1eeeb8f]

goroutine 2855 [running]:
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Reconcile.func1()
	/remote-source/deps/gomod/pkg/mod/sigs.k8s.io/controller-runtime.2/pkg/internal/controller/controller.go:116 +0x1e5
panic({0x231e3c0?, 0x42faab0?})
	/usr/lib/golang/src/runtime/panic.go:770 +0x132
github.com/noobaa/noobaa-operator/v5/pkg/system.(*Reconciler).SetDesiredCoreApp(0xc00200a2c8)
	/remote-source/app/pkg/system/phase2_creating.go:455 +0x318f
github.com/noobaa/noobaa-operator/v5/pkg/system.(*Reconciler).reconcileObjectAndGetResult.func1()
	/remote-source/app/pkg/system/reconciler.go:655 +0x18
sigs.k8s.io/controller-runtime/pkg/controller/controllerutil.mutate(0x2ea2560?, {{0xc00231b800?, 0x0?}, {0xc000cfca50?, 0x2ec6890?}}, {0x2ee7d00, 0xc00202a508})
	/remote-source/deps/gomod/pkg/mod/sigs.k8s.io/controller-runtime.2/pkg/controller/controllerutil/controllerutil.go:426 +0x49
sigs.k8s.io/controller-runtime/pkg/controller/controllerutil.CreateOrUpdate({0x2ec6890, 0x43aa080}, {0x2ed6f60, 0xc000c8e750}, {0x2ee7d00, 0xc00202a508}, 0xc0009f58c0)
	/remote-source/deps/gomod/pkg/mod/sigs.k8s.io/controller-runtime.2/pkg/controller/controllerutil/controllerutil.go:282 +0x128
github.com/noobaa/noobaa-operator/v5/pkg/system.(*Reconciler).reconcileObjectAndGetResult(0xc00200a2c8, {0x2ee7d00, 0xc00202a508}, 0xc0009f5940, 0x0)
	/remote-source/app/pkg/system/reconciler.go:652 +0x15b
github.com/noobaa/noobaa-operator/v5/pkg/system.(*Reconciler).reconcileObject(...)
	/remote-source/app/pkg/system/reconciler.go:643
github.com/noobaa/noobaa-operator/v5/pkg/system.(*Reconciler).ReconcileObject(...)
	/remote-source/app/pkg/system/reconciler.go:634
github.com/noobaa/noobaa-operator/v5/pkg/system.(*Reconciler).ReconcilePhaseCreatingForMainClusters(0xc00200a2c8)
	/remote-source/app/pkg/system/phase2_creating.go:141 +0x4b7
github.com/noobaa/noobaa-operator/v5/pkg/system.(*Reconciler).ReconcilePhaseCreating(0xc00200a2c8)
	/remote-source/app/pkg/system/phase2_creating.go:55 +0xa8
github.com/noobaa/noobaa-operator/v5/pkg/system.(*Reconciler).ReconcilePhases(0xc00200a2c8)
	/remote-source/app/pkg/system/reconciler.go:557 +0x45
github.com/noobaa/noobaa-operator/v5/pkg/system.(*Reconciler).Reconcile(0xc00200a2c8)
	/remote-source/app/pkg/system/reconciler.go:428 +0x326
github.com/noobaa/noobaa-operator/v5/pkg/controller/noobaa.Add.func1({0xc002006420?, 0x0?}, {{{0xc00231b800?, 0xc001d57ce0?}, {0xc002085f70?, 0x411a9b?}}})
	/remote-source/app/pkg/controller/noobaa/noobaa_controller.go:53 +0xd6
sigs.k8s.io/controller-runtime/pkg/reconcile.Func.Reconcile(0x7f8ec2706a68?, {0x2ec69e0?, 0xc00200e090?}, {{{0xc00231b800?, 0x0?}, {0xc002085f70?, 0xc001d57d10?}}})
	/remote-source/deps/gomod/pkg/mod/sigs.k8s.io/controller-runtime.2/pkg/reconcile/reconcile.go:113 +0x3d
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Reconcile(0x2eccaf8?, {0x2ec69e0?, 0xc00200e090?}, {{{0xc00231b800?, 0xb?}, {0xc002085f70?, 0x0?}}})
	/remote-source/deps/gomod/pkg/mod/sigs.k8s.io/controller-runtime.2/pkg/internal/controller/controller.go:119 +0xb7
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).reconcileHandler(0xc000fb2960, {0x2ec6a18, 0xc000b8d0e0}, {0x243db40, 0xc0007cea80})
	/remote-source/deps/gomod/pkg/mod/sigs.k8s.io/controller-runtime.2/pkg/internal/controller/controller.go:316 +0x3bc
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).processNextWorkItem(0xc000fb2960, {0x2ec6a18, 0xc000b8d0e0})
	/remote-source/deps/gomod/pkg/mod/sigs.k8s.io/controller-runtime.2/pkg/internal/controller/controller.go:266 +0x1be
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Start.func2.2()
	/remote-source/deps/gomod/pkg/mod/sigs.k8s.io/controller-runtime.2/pkg/internal/controller/controller.go:227 +0x79
created by sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Start.func2 in goroutine 349
	/remote-source/deps/gomod/pkg/mod/sigs.k8s.io/controller-runtime.2/pkg/internal/controller/controller.go:223 +0x50c
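The trace points at `SetDesiredCoreApp` in `phase2_creating.go:455`, and the linked fix is titled "Fix provider env var panic", so the crash is consistent with dereferencing an optional pointer field while processing the container's env vars. The sketch below is a hypothetical, reduced reproduction of that class of bug (the type and function names mirror the Kubernetes `EnvVar` shape but are redefined locally; they are illustrative, not the operator's actual code): an env entry that carries a plain `Value` has a nil `ValueFrom`, so unconditionally chasing `ValueFrom.SecretKeyRef` panics, while the guarded version does not.

```go
package main

import "fmt"

// Local stand-ins for the corev1 EnvVar shape (illustrative only).
type SecretKeySelector struct{ Name, Key string }
type EnvVarSource struct{ SecretKeyRef *SecretKeySelector }
type EnvVar struct {
	Name      string
	Value     string
	ValueFrom *EnvVarSource
}

// unsafeRefName mimics the buggy pattern: it assumes every env var
// uses a secret reference, so a plain-value entry makes it panic with
// "invalid memory address or nil pointer dereference".
func unsafeRefName(e EnvVar) string {
	return e.ValueFrom.SecretKeyRef.Name
}

// safeRefName guards both pointers before dereferencing, falling back
// to the literal value -- the general shape of this kind of fix.
func safeRefName(e EnvVar) string {
	if e.ValueFrom != nil && e.ValueFrom.SecretKeyRef != nil {
		return e.ValueFrom.SecretKeyRef.Name
	}
	return e.Value
}

func main() {
	env := []EnvVar{
		{Name: "ROOT_SECRET", ValueFrom: &EnvVarSource{
			SecretKeyRef: &SecretKeySelector{Name: "root-secret", Key: "key"}}},
		{Name: "LOG_LEVEL", Value: "info"}, // ValueFrom is nil here
	}
	for _, e := range env {
		// unsafeRefName(e) would crash on the second entry.
		fmt.Println(e.Name, "->", safeRefName(e))
	}
}
```

Because the panic happens inside the reconcile loop, the process exits (exit code 2 above), kubelet restarts the container, and the same NooBaa CR triggers the same panic again, which is exactly the CrashLoopBackOff pattern reported here.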

Version of all relevant components (if applicable):
odf-operator.v4.17.0-56.stable
ocp: 4.17.0-ec.2

Does this issue impact your ability to continue to work with the product
(please explain in detail what is the user impact)? Yes, this blocks ODF 4.17 cluster creation.


Is there any workaround available to the best of your knowledge? No


Rate from 1 - 5 the complexity of the scenario you performed that caused this
bug (1 - very simple, 5 - very complex)? 3


Is this issue reproducible? Yes


Can this issue be reproduced from the UI? Yes


If this is a regression, please provide more details to justify this:


Steps to Reproduce:
1. Create a provider-client cluster with ocp: 4.17.0-ec.2 and odf: v4.17.0-56.stable
2. Create a storage system
3. Observe that the noobaa-operator pod enters CLBO


Actual results:
The noobaa-operator pod enters CLBO on a provider-client cluster created with
ocp: 4.17.0-ec.2 and odf: v4.17.0-56.stable

Expected results:
The noobaa-operator pod should be in 'Running' status.


Additional info:

Comment 12 Sunil Kumar Acharya 2024-09-18 12:06:54 UTC
Please update the RDT flag/text appropriately.

Comment 14 errata-xmlrpc 2024-10-30 14:29:38 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Important: Red Hat OpenShift Data Foundation 4.17.0 Security, Enhancement, & Bug Fix Update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2024:8676

