Description of problem (please be as detailed as possible and provide log snippets):

Deploying a new cluster on multiple cloud platforms (confirmed on ODF 4.13, 4.14 and 4.15), ocs-storagecluster stays in the Progressing phase (over 4h) and never reaches Ready. The ocs-operator logs show repeated Reconciler errors.

```
oc get storageclusters.ocs.openshift.io -A
NAMESPACE           NAME                 AGE    PHASE         EXTERNAL   CREATED AT             VERSION
openshift-storage   ocs-storagecluster   4h1m   Progressing              2024-01-31T08:41:24Z   4.13.7
```

```
oc logs ocs-operator-868455d4bb-kpl8w -n openshift-storage | grep error
{"level":"error","ts":"2024-01-31T09:21:19Z","msg":"Reconciler error","controller":"storagecluster","controllerGroup":"ocs.openshift.io","controllerKind":"StorageCluster","StorageCluster":{"name":"ocs-storagecluster","namespace":"openshift-storage"},"namespace":"openshift-storage","name":"ocs-storagecluster","reconcileID":"da39bff0-b0fb-4acd-96c1-e439c64fff80","error":"Operation cannot be fulfilled on storageclusters.ocs.openshift.io \"ocs-storagecluster\": the object has been modified; please apply your changes to the latest version and try again","stacktrace":"sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).reconcileHandler\n\t/remote-source/app/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller/controller.go:329\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).processNextWorkItem\n\t/remote-source/app/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller/controller.go:274\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Start.func2.2\n\t/remote-source/app/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller/controller.go:235"}
{"level":"error","ts":"2024-01-31T09:30:26Z","msg":"Reconciler error","controller":"storagecluster","controllerGroup":"ocs.openshift.io","controllerKind":"StorageCluster","StorageCluster":{"name":"ocs-storagecluster","namespace":"openshift-storage"},"namespace":"openshift-storage","name":"ocs-storagecluster","reconcileID":"e53ad2b1-c132-438d-b796-86298c180a2e","error":"Operation cannot be fulfilled on storageclusters.ocs.openshift.io \"ocs-storagecluster\": the object has been modified; please apply your changes to the latest version and try again","stacktrace":"sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).reconcileHandler\n\t/remote-source/app/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller/controller.go:329\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).processNextWorkItem\n\t/remote-source/app/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller/controller.go:274\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller)
```
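For context, the "Operation cannot be fulfilled ... the object has been modified" message above is the standard Kubernetes optimistic-concurrency conflict: the operator tried to update the StorageCluster using a stale resourceVersion. The sketch below is not ocs-operator source code; it is a minimal, hypothetical controller-runtime helper (the name updateStatusWithConflictRetry and the mutate callback are assumptions) showing how such conflicts are typically absorbed by re-fetching the object and retrying with client-go's retry.RetryOnConflict.

```
// Sketch only: a hypothetical helper, not ocs-operator source.
package storageclusterutil

import (
	"context"

	"k8s.io/client-go/util/retry"
	"sigs.k8s.io/controller-runtime/pkg/client"
)

// updateStatusWithConflictRetry re-reads the object before every attempt,
// applies mutate to the freshly fetched copy, and updates the status
// subresource. retry.RetryOnConflict retries only when the API server
// returns the optimistic-concurrency conflict seen in the log above.
func updateStatusWithConflictRetry(ctx context.Context, c client.Client, obj client.Object, mutate func()) error {
	key := client.ObjectKeyFromObject(obj)
	return retry.RetryOnConflict(retry.DefaultRetry, func() error {
		// Fetch the latest resourceVersion so the update is not stale.
		if err := c.Get(ctx, key, obj); err != nil {
			return err
		}
		mutate()
		// A Conflict error returned here triggers another attempt.
		return c.Status().Update(ctx, obj)
	})
}
```

Such conflicts are normally transient and are retried on the next reconcile, so on their own they are not necessarily the root cause of the stuck Progressing phase.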
odf-operator logs:

```
oc logs odf-operator-controller-manager-f56b885d5-sh287 -n openshift-storage
Defaulted container "kube-rbac-proxy" out of: kube-rbac-proxy, manager
Flag --logtostderr has been deprecated, will be removed in a future release, see https://github.com/kubernetes/enhancements/tree/master/keps/sig-instrumentation/2845-deprecate-klog-specific-flags-in-k8s-components
W0131 08:39:37.215604       1 kube-rbac-proxy.go:156] ==== Deprecation Warning ======================
Insecure listen address will be removed. Using --insecure-listen-address won't be possible!
The ability to run kube-rbac-proxy without TLS certificates will be removed. Not using --tls-cert-file and --tls-private-key-file won't be possible!
For more information, please go to https://github.com/brancz/kube-rbac-proxy/issues/187
===============================================
I0131 08:39:37.215763       1 kube-rbac-proxy.go:285] Valid token audiences:
I0131 08:39:37.215834       1 kube-rbac-proxy.go:383] Generating self signed cert as no cert is provided
I0131 08:39:37.729261       1 kube-rbac-proxy.go:447] Starting TCP socket on 0.0.0.0:8443
I0131 08:39:37.729650       1 kube-rbac-proxy.go:454] Listening securely on 0.0.0.0:8443
```

The storage system itself is OK, with no warnings or errors, and the cluster is in a working state.

Version of all relevant components (if applicable):
The problem has been seen on multiple versions.

OC version:
```
Client Version: 4.13.4
Kustomize Version: v4.5.7
Server Version: 4.13.0-0.nightly-2024-01-30-181028
Kubernetes Version: v1.26.13+77e61a2
```

OCS version:
```
ocs-operator.v4.13.7-rhodf   OpenShift Container Storage   4.13.7-rhodf   ocs-operator.v4.13.6-rhodf   Succeeded
```

Cluster version:
```
NAME      VERSION                              AVAILABLE   PROGRESSING   SINCE   STATUS
version   4.13.0-0.nightly-2024-01-30-181028   True        False         4h14m   Cluster version is 4.13.0-0.nightly-2024-01-30-181028
```

Rook version:
```
rook: v4.13.7-0.42f43768ad57d91be47327f83653c05eeb721977
go: go1.19.13
```

Ceph version:
```
ceph version 17.2.6-170.el9cp (59bbeb8815ec3aeb3c8bba1e1866f8f6729eb840) quincy (stable)
```

Does this issue impact your ability to continue to work with the product (please explain in detail what is the user impact)?

Is there any workaround available to the best of your knowledge?

Rate from 1 - 5 the complexity of the scenario you performed that caused this bug (1 - very simple, 5 - very complex)?
1

Can this issue be reproduced? yes

Can this issue be reproduced from the UI? no

If this is a regression, please provide more details to justify this:

Steps to Reproduce:
1. Deploy a cluster on one of the supported cloud providers
2.
3.

Actual results:
ocs-storagecluster never becomes Ready.

Expected results:
ocs-storagecluster becomes Ready within a reasonable time after deployment.

Additional info:
must-gather logs: https://drive.google.com/file/d/1yPNggohjwcb2Ndg_cnzKmgIkROFc7KcY/view?usp=sharing
Retested with IBM Cloud and odf-operator.v4.15.0-134.stable; no issue.

```
oc get storageclusters.ocs.openshift.io -A
NAMESPACE           NAME                 AGE   PHASE   EXTERNAL   CREATED AT             VERSION
openshift-storage   ocs-storagecluster   67m   Ready              2024-02-06T10:49:35Z   4.15.0
```

```
oc get noobaa -A
NAMESPACE           NAME     S3-ENDPOINTS                   STS-ENDPOINTS                  IMAGE                                                                                                                 PHASE   AGE
openshift-storage   noobaa   ["https://10.240.0.4:30157"]   ["https://10.240.0.4:30181"]   registry.redhat.io/odf4/mcg-core-rhel9@sha256:1d79a2ac176ca6e69c3198d0e35537aaf29373440d214d324d0d433d1473d9a1   Ready   67m
```

```
oc get backingstores.noobaa.io -A
NAMESPACE           NAME                           TYPE      PHASE   AGE
openshift-storage   noobaa-default-backing-store   ibm-cos   Ready   67m
```
I agree with comment 6
Please refer to #comment6
Sorry, pressed enter too early. Please see #comment5 and #comment6.

*** This bug has been marked as a duplicate of bug 2255557 ***