Bug 2049509
| Summary: | ocs operator stuck on CrashLoopBackOff while installing with KMS | | |
|---|---|---|---|
| Product: | [Red Hat Storage] Red Hat OpenShift Data Foundation | Reporter: | aberner |
| Component: | ocs-operator | Assignee: | Jiffin <jthottan> |
| Status: | CLOSED ERRATA | QA Contact: | aberner |
| Severity: | urgent | Docs Contact: | |
| Priority: | unspecified | | |
| Version: | 4.10 | CC: | ikave, jthottan, madam, muagarwa, nberry, nibalach, ocs-bugs, odf-bz-bot, prasriva, rgeorge, sostapov |
| Target Milestone: | --- | Keywords: | AutomationBackLog, Regression, TestBlocker |
| Target Release: | ODF 4.10.0 | | |
| Hardware: | Unspecified | | |
| OS: | Unspecified | | |
| Whiteboard: | | | |
| Fixed In Version: | 4.10.0-141 | Doc Type: | No Doc Update |
| Doc Text: | | Story Points: | --- |
| Clone Of: | | Environment: | |
| Last Closed: | 2022-04-13 18:52:41 UTC | Type: | Bug |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | | Category: | --- |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | | | |
Description
aberner
2022-02-02 11:20:46 UTC
I was able to reproduce this in ODF 4.10.0-137. Both failures occurred on vSphere, while a deployment on AWS with KMS enabled succeeded, so we suspect the issue is platform related.
Thanks Amit!!
{"level":"info","ts":1643849128.0077307,"logger":"cmd","msg":"Go Version: go1.16.6"}
{"level":"info","ts":1643849128.0080402,"logger":"cmd","msg":"Go OS/Arch: linux/amd64"}
I0203 00:45:29.057166 1 request.go:668] Waited for 1.0379668s due to client-side throttling, not priority and fairness, request: GET:https://172.30.0.1:443/apis/discovery.k8s.io/v1?timeout=32s
{"level":"info","ts":1643849130.7604914,"logger":"controller-runtime.metrics","msg":"metrics server is starting to listen","addr":":8080"}
{"level":"info","ts":1643849130.7768652,"logger":"cmd","msg":"OCSInitialization resource already exists"}
{"level":"info","ts":1643849133.5396976,"logger":"cmd","msg":"starting manager"}
I0203 00:45:33.539974 1 leaderelection.go:243] attempting to acquire leader lease openshift-storage/ab76f4c9.openshift.io...
{"level":"info","ts":1643849133.5400252,"logger":"controller-runtime.manager","msg":"starting metrics server","path":"/metrics"}
I0203 00:45:51.972963 1 leaderelection.go:253] successfully acquired lease openshift-storage/ab76f4c9.openshift.io
{"level":"info","ts":1643849151.9731855,"logger":"controller-runtime.manager.controller.storagecluster","msg":"Starting EventSource","reconciler group":"ocs.openshift.io","reconciler kind":"StorageCluster","source":"kind source: /, Kind="}
{"level":"info","ts":1643849151.9731698,"logger":"controller-runtime.manager.controller.storageconsumer","msg":"Starting EventSource","reconciler group":"ocs.openshift.io","reconciler kind":"StorageConsumer","source":"kind source: /, Kind="}
{"level":"info","ts":1643849151.9732287,"logger":"controller-runtime.manager.controller.persistentvolume","msg":"Starting EventSource","reconciler group":"","reconciler kind":"PersistentVolume","source":"kind source: /, Kind="}
{"level":"info","ts":1643849151.973263,"logger":"controller-runtime.manager.controller.persistentvolume","msg":"Starting Controller","reconciler group":"","reconciler kind":"PersistentVolume"}
{"level":"info","ts":1643849151.9732392,"logger":"controller-runtime.manager.controller.storagecluster","msg":"Starting EventSource","reconciler group":"ocs.openshift.io","reconciler kind":"StorageCluster","source":"kind source: /, Kind="}
{"level":"info","ts":1643849151.9739225,"logger":"controller-runtime.manager.controller.storagecluster","msg":"Starting EventSource","reconciler group":"ocs.openshift.io","reconciler kind":"StorageCluster","source":"kind source: /, Kind="}
{"level":"info","ts":1643849151.9739504,"logger":"controller-runtime.manager.controller.storagecluster","msg":"Starting EventSource","reconciler group":"ocs.openshift.io","reconciler kind":"StorageCluster","source":"kind source: /, Kind="}
{"level":"info","ts":1643849151.9739673,"logger":"controller-runtime.manager.controller.storagecluster","msg":"Starting EventSource","reconciler group":"ocs.openshift.io","reconciler kind":"StorageCluster","source":"kind source: /, Kind="}
{"level":"info","ts":1643849151.9739735,"logger":"controller-runtime.manager.controller.storagecluster","msg":"Starting EventSource","reconciler group":"ocs.openshift.io","reconciler kind":"StorageCluster","source":"kind source: /, Kind="}
{"level":"info","ts":1643849151.9739833,"logger":"controller-runtime.manager.controller.storagecluster","msg":"Starting Controller","reconciler group":"ocs.openshift.io","reconciler kind":"StorageCluster"}
{"level":"info","ts":1643849151.974249,"logger":"controller-runtime.manager.controller.storageconsumer","msg":"Starting Controller","reconciler group":"ocs.openshift.io","reconciler kind":"StorageConsumer"}
{"level":"info","ts":1643849151.9743242,"logger":"controller-runtime.manager.controller.ocsinitialization","msg":"Starting EventSource","reconciler group":"ocs.openshift.io","reconciler kind":"OCSInitialization","source":"kind source: /, Kind="}
{"level":"info","ts":1643849151.9743755,"logger":"controller-runtime.manager.controller.ocsinitialization","msg":"Starting EventSource","reconciler group":"ocs.openshift.io","reconciler kind":"OCSInitialization","source":"kind source: /, Kind="}
{"level":"info","ts":1643849151.9743888,"logger":"controller-runtime.manager.controller.ocsinitialization","msg":"Starting Controller","reconciler group":"ocs.openshift.io","reconciler kind":"OCSInitialization"}
{"level":"info","ts":1643849152.0760622,"logger":"controller-runtime.manager.controller.storageconsumer","msg":"Starting workers","reconciler group":"ocs.openshift.io","reconciler kind":"StorageConsumer","worker count":1}
{"level":"info","ts":1643849152.0762343,"logger":"controller-runtime.manager.controller.storagecluster","msg":"Starting workers","reconciler group":"ocs.openshift.io","reconciler kind":"StorageCluster","worker count":1}
{"level":"info","ts":1643849152.0763292,"logger":"controller-runtime.manager.controller.persistentvolume","msg":"Starting workers","reconciler group":"","reconciler kind":"PersistentVolume","worker count":1}
{"level":"info","ts":1643849152.0763702,"logger":"controllers.StorageCluster","msg":"Reconciling StorageCluster.","Request.Namespace":"openshift-storage","Request.Name":"ocs-storagecluster","StorageCluster":"openshift-storage/ocs-storagecluster"}
{"level":"info","ts":1643849152.0764022,"logger":"controllers.StorageCluster","msg":"Spec.AllowRemoteStorageConsumers is disabled","Request.Namespace":"openshift-storage","Request.Name":"ocs-storagecluster"}
{"level":"info","ts":1643849152.0764332,"logger":"controller-runtime.manager.controller.ocsinitialization","msg":"Starting workers","reconciler group":"ocs.openshift.io","reconciler kind":"OCSInitialization","worker count":1}
{"level":"info","ts":1643849152.0765593,"logger":"controllers.OCSInitialization","msg":"Reconciling OCSInitialization.","Request.Namespace":"openshift-storage","Request.Name":"ocsinit","OCSInitialization":"openshift-storage/ocsinit"}
{"level":"info","ts":1643849152.082092,"logger":"controllers.OCSInitialization","msg":"Updating SecurityContextConstraint.","Request.Namespace":"openshift-storage","Request.Name":"ocsinit","SecurityContextConstraint":"rook-ceph"}
{"level":"info","ts":1643849152.0877435,"logger":"controllers.StorageCluster","msg":"Resource deletion for provider succeeded","Request.Namespace":"openshift-storage","Request.Name":"ocs-storagecluster"}
{"level":"info","ts":1643849152.091996,"logger":"controllers.OCSInitialization","msg":"Updating SecurityContextConstraint.","Request.Namespace":"openshift-storage","Request.Name":"ocsinit","SecurityContextConstraint":"rook-ceph-csi"}
{"level":"info","ts":1643849152.101062,"logger":"controllers.OCSInitialization","msg":"Updating SecurityContextConstraint.","Request.Namespace":"openshift-storage","Request.Name":"ocsinit","SecurityContextConstraint":"noobaa"}
{"level":"info","ts":1643849152.1127658,"logger":"controllers.OCSInitialization","msg":"Updating SecurityContextConstraint.","Request.Namespace":"openshift-storage","Request.Name":"ocsinit","SecurityContextConstraint":"noobaa-endpoint"}
{"level":"info","ts":1643849152.1313975,"logger":"controllers.OCSInitialization","msg":"Reconciling OCSInitialization.","Request.Namespace":"openshift-storage","Request.Name":"ocsinit","OCSInitialization":"openshift-storage/ocsinit"}
{"level":"info","ts":1643849152.1351314,"logger":"controllers.OCSInitialization","msg":"Updating SecurityContextConstraint.","Request.Namespace":"openshift-storage","Request.Name":"ocsinit","SecurityContextConstraint":"rook-ceph"}
{"level":"info","ts":1643849152.1491027,"logger":"controllers.OCSInitialization","msg":"Updating SecurityContextConstraint.","Request.Namespace":"openshift-storage","Request.Name":"ocsinit","SecurityContextConstraint":"rook-ceph-csi"}
{"level":"info","ts":1643849152.160247,"logger":"controllers.OCSInitialization","msg":"Updating SecurityContextConstraint.","Request.Namespace":"openshift-storage","Request.Name":"ocsinit","SecurityContextConstraint":"noobaa"}
{"level":"info","ts":1643849152.1706407,"logger":"controllers.OCSInitialization","msg":"Updating SecurityContextConstraint.","Request.Namespace":"openshift-storage","Request.Name":"ocsinit","SecurityContextConstraint":"noobaa-endpoint"}
{"level":"info","ts":1643849152.9305122,"logger":"controllers.StorageCluster","msg":"Restoring original CephBlockPool.","Request.Namespace":"openshift-storage","Request.Name":"ocs-storagecluster","CephBlockPool":"openshift-storage/ocs-storagecluster-cephblockpool"}
{"level":"info","ts":1643849153.0419562,"logger":"controllers.StorageCluster","msg":"Restoring original CephFilesystem.","Request.Namespace":"openshift-storage","Request.Name":"ocs-storagecluster","CephFileSystem":"openshift-storage/ocs-storagecluster-cephfilesystem"}
panic: runtime error: invalid memory address or nil pointer dereference
[signal SIGSEGV: segmentation violation code=0x1 addr=0x0 pc=0x170d20a]
goroutine 884 [running]:
github.com/red-hat-storage/ocs-operator/controllers/storagecluster.(*StorageClusterReconciler).newCephObjectStoreInstances(0xc00048a0c0, 0xc000c32000, 0xc000bc0b40, 0x1e2bf80, 0xc00059cb40, 0xc000bc0b40, 0x0, 0x0)
/remote-source/app/controllers/storagecluster/cephobjectstores.go:218 +0x96a
github.com/red-hat-storage/ocs-operator/controllers/storagecluster.(*ocsCephObjectStores).ensureCreated(0x2a22f90, 0xc00048a0c0, 0xc000c32000, 0x0, 0x0, 0x0, 0x0)
/remote-source/app/controllers/storagecluster/cephobjectstores.go:59 +0x12c
github.com/red-hat-storage/ocs-operator/controllers/storagecluster.(*StorageClusterReconciler).reconcilePhases(0xc00048a0c0, 0xc000c32000, 0xc000951140, 0x11, 0xc000951128, 0x12, 0x0, 0x0, 0xc000c32000, 0x0)
/remote-source/app/controllers/storagecluster/reconcile.go:394 +0xd08
github.com/red-hat-storage/ocs-operator/controllers/storagecluster.(*StorageClusterReconciler).Reconcile(0xc00048a0c0, 0x1e174b8, 0xc000e44f90, 0xc000951140, 0x11, 0xc000951128, 0x12, 0xc000e44f00, 0x0, 0x0, ...)
/remote-source/app/controllers/storagecluster/reconcile.go:161 +0x6c5
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).reconcileHandler(0xc000ba25a0, 0x1e17410, 0xc0005b94c0, 0x19e0a40, 0xc0007ec080)
/remote-source/app/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller/controller.go:298 +0x30d
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).processNextWorkItem(0xc000ba25a0, 0x1e17410, 0xc0005b94c0, 0xc000c82f00)
/remote-source/app/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller/controller.go:253 +0x205
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Start.func2.2(0xc000e07b70, 0xc000ba25a0, 0x1e17410, 0xc0005b94c0)
/remote-source/app/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller/controller.go:214 +0x6b
created by sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Start.func2
/remote-source/app/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller/controller.go:210 +0x425
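The trace above ends in a nil pointer dereference inside `newCephObjectStoreInstances`, and because the panic is raised during reconciliation it takes down the whole operator process, which is why the pod cycles through CrashLoopBackOff. The report does not say which value was nil, so the following is only a minimal sketch of the general failure mode and a defensive guard. All type and function names here (`clusterSpec`, `optionalSection`, `enabledUnsafe`, `enabledSafe`) are hypothetical and are not taken from ocs-operator.

```go
package main

import (
	"errors"
	"fmt"
)

// optionalSection and clusterSpec are hypothetical stand-ins for a
// resource whose optional sub-structure may be left unset (nil).
type optionalSection struct {
	Enabled bool
}

type clusterSpec struct {
	Optional *optionalSection // nil when the section is absent
}

var errMissingSection = errors.New("optional section is not set")

// enabledUnsafe dereferences the pointer without checking it. With a nil
// Optional field it panics with "invalid memory address or nil pointer
// dereference", the same runtime error that appears in the trace.
func enabledUnsafe(c *clusterSpec) bool {
	return c.Optional.Enabled
}

// enabledSafe returns an error instead of panicking, so a caller such as
// a reconcile loop could log the problem and retry rather than crash.
func enabledSafe(c *clusterSpec) (bool, error) {
	if c == nil || c.Optional == nil {
		return false, errMissingSection
	}
	return c.Optional.Enabled, nil
}

// panics reports whether calling f results in a panic.
func panics(f func()) (p bool) {
	defer func() {
		if recover() != nil {
			p = true
		}
	}()
	f()
	return false
}

func main() {
	c := &clusterSpec{} // Optional left nil on purpose

	fmt.Println("unsafe access panics:", panics(func() { enabledUnsafe(c) }))

	_, err := enabledSafe(c)
	fmt.Println("safe access error:", err)
}
```

Returning an error from the guarded path lets the controller surface the missing value in its logs and requeue, instead of restarting the pod on every reconcile attempt.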
Jiffin/Pranshu, PTAL

Are we supposed to test this on baremetal? Was it tested before, or is it being tested for the first time in 4.10?

Verified on ODF version 4.10.0-143.

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory (Important: Red Hat OpenShift Data Foundation 4.10.0 enhancement, security & bug fix update), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2022:1372