Description of problem (please be as detailed as possible and provide log snippets):
When deploying ODF with cluster-wide encryption enabled using Azure KMS, the NooBaa service never finishes initializing, and the storage cluster stays in 'Progressing' with reason 'NoobaaInitializing' indefinitely. The NooBaa conditions report that the required Azure credentials (AZURE_SECRET_ID or AZURE_CLIENT_CERT_PATH) are not set. Because NooBaa cannot initialize, the deployment fails.

Version of all relevant components (if applicable):
4.16

Does this issue impact your ability to continue to work with the product (please explain in detail what is the user impact)?
Yes

Is there any workaround available to the best of your knowledge?
No

Rate from 1 - 5 the complexity of the scenario you performed that caused this bug (1 - very simple, 5 - very complex)?

Is this issue reproducible?
Yes

Can this issue be reproduced from the UI?

If this is a regression, please provide more details to justify this:

Steps to reproduce:
1. Start deployment of an ODF cluster on the Azure cloud platform with version 4.16.
2. During storagecluster setup, configure cluster-wide encryption with the Azure KMS service.
3. Configure the required parameters for the Azure KMS connection.
4. Complete the steps and wait for the storagecluster to reach the 'Ready' state.

Actual results:
1. Storage cluster is stuck in the 'Progressing' state.
2. NooBaa service is stuck initializing (reason 'NoobaaInitializing').

Expected results:
1. Storage cluster should be in the 'Ready' state.
Additional info:

Cluster version info
-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-
> ocs get csv
NAME                                        DISPLAY                            VERSION            REPLACES   PHASE
mcg-operator.v4.16.0-73.stable              NooBaa Operator                    4.16.0-73.stable              Succeeded
ocs-client-operator.v4.16.0-73.stable       OpenShift Data Foundation Client   4.16.0-73.stable              Succeeded
ocs-operator.v4.16.0-73.stable              OpenShift Container Storage        4.16.0-73.stable              Succeeded
odf-csi-addons-operator.v4.16.0-73.stable   CSI Addons                         4.16.0-73.stable              Succeeded
odf-operator.v4.16.0-73.stable              OpenShift Data Foundation          4.16.0-73.stable              Succeeded
odf-prometheus-operator.v4.16.0-73.stable   Prometheus Operator                4.16.0-73.stable              Succeeded
rook-ceph-operator.v4.16.0-73.stable        Rook-Ceph                          4.16.0-73.stable              Succeeded

Storage Cluster state
-=-=-=-=-=-=-=-=-=-=-=-=-=-
> sc
NAME                 AGE   PHASE         EXTERNAL   CREATED AT             VERSION
ocs-storagecluster   3d    Progressing              2024-04-12T06:28:37Z   4.16.0

Storagecluster details
-=-=-=-=-=-=-=-=-=-=-=--=-=
> ocs describe storagecluster ocs-storagecluster
Name:         ocs-storagecluster
Namespace:    openshift-storage
Labels:       <none>
Annotations:  uninstall.ocs.openshift.io/cleanup-policy: delete
              uninstall.ocs.openshift.io/mode: graceful
API Version:  ocs.openshift.io/v1
Kind:         StorageCluster
Metadata:
  Creation Timestamp:  2024-04-12T06:28:37Z
  Finalizers:
    storagecluster.ocs.openshift.io
  Generation:  2
  Owner References:
    API Version:  odf.openshift.io/v1alpha1
    Kind:  StorageSystem
    Name:  ocs-storagecluster-storagesystem
    UID:  57695244-3004-4620-b54a-c96b24fb9a2a
  Resource Version:  4496657
  UID:  f0a09c5a-6893-4f21-ba5c-a39f7b14b840
Spec:
  Arbiter:
  Encryption:
    Cluster Wide:  true
    Enable:  true
    Key Rotation:
      Schedule:  @weekly
    Kms:
      Enable:  true
  External Storage:
  Managed Resources:
    Ceph Block Pools:
    Ceph Cluster:
    Ceph Config:
    Ceph Dashboard:
    Ceph Filesystems:
      Data Pool Spec:
        Application:
        Erasure Coded:
          Coding Chunks:  0
          Data Chunks:  0
        Mirroring:
        Quotas:
        Replicated:
          Size:  0
        Status Check:
          Mirror:
    Ceph Non Resilient Pools:
      Count:  1
      Resources:
      Volume Claim Template:
        Metadata:
        Spec:
          Resources:
        Status:
    Ceph Object Store Users:
    Ceph Object Stores:
    Ceph RBD Mirror:
      Daemon Count:  1
    Ceph Toolbox:
  Mirroring:
  Network:
    Connections:
      Encryption:
    Multi Cluster Service:
  Node Topologies:
  Resource Profile:  balanced
  Storage Device Sets:
    Config:
    Count:  1
    Data PVC Template:
      Metadata:
      Spec:
        Access Modes:
          ReadWriteOnce
        Resources:
          Requests:
            Storage:  512Gi
        Storage Class Name:  managed-csi
        Volume Mode:  Block
      Status:
    Name:  ocs-deviceset-managed-csi
    Placement:
    Portable:  true
    Prepare Placement:
    Replica:  3
    Resources:
Status:
  Conditions:
    Last Heartbeat Time:  2024-04-12T06:28:37Z
    Last Transition Time:  2024-04-12T06:28:37Z
    Message:  Version check successful
    Reason:  VersionMatched
    Status:  False
    Type:  VersionMismatch

    Last Heartbeat Time:  2024-04-15T06:54:42Z
    Last Transition Time:  2024-04-14T21:50:07Z
    Message:  Reconcile completed successfully
    Reason:  ReconcileCompleted
    Status:  True
    Type:  ReconcileComplete

    Last Heartbeat Time:  2024-04-12T06:28:37Z
    Last Transition Time:  2024-04-12T06:28:37Z
    Message:  Initializing StorageCluster
    Reason:  Init
    Status:  False
    Type:  Available

    Last Heartbeat Time:  2024-04-15T06:54:42Z
    Last Transition Time:  2024-04-12T06:28:37Z
    Message:  Waiting on Nooba instance to finish initialization
    Reason:  NoobaaInitializing
    Status:  True
    Type:  Progressing

    Last Heartbeat Time:  2024-04-12T06:28:37Z
    Last Transition Time:  2024-04-12T06:28:37Z
    Message:  Initializing StorageCluster
    Reason:  Init
    Status:  False
    Type:  Degraded

    Last Heartbeat Time:  2024-04-12T06:34:36Z
    Last Transition Time:  2024-04-12T06:32:34Z
    Message:  CephCluster is creating: Processing OSD 2 on PVC "ocs-deviceset-managed-csi-0-data-0dh498"
    Reason:  ClusterStateCreating
    Status:  False
    Type:  Upgradeable
  Current Mon Count:  3
  Failure Domain:  zone
  Failure Domain Key:  topology.kubernetes.io/zone
  Failure Domain Values:
    eastus-1
    eastus-2
    eastus-3
  Images:
    Ceph:
      Actual Image:  registry.redhat.io/rhceph/rhceph-6-rhel9@sha256:500a744b3be913216d8164131ab97e1b29e112491709be65b30d8fb2d7f61ca0
      Desired Image:  registry.redhat.io/rhceph/rhceph-6-rhel9@sha256:500a744b3be913216d8164131ab97e1b29e112491709be65b30d8fb2d7f61ca0
    Noobaa Core:
      Actual Image:  registry.redhat.io/odf4/mcg-core-rhel9@sha256:b30a5087373a5b3378fd09807399dd0340973e891a410b1bfa74bac634926621
      Desired Image:  registry.redhat.io/odf4/mcg-core-rhel9@sha256:b30a5087373a5b3378fd09807399dd0340973e891a410b1bfa74bac634926621
    Noobaa DB:
      Actual Image:  registry.redhat.io/rhel9/postgresql-15@sha256:76ff2541e3ff13b7f5feb1662597f33283bf9dc80e110bef2fb39633e8bbac00
      Desired Image:  registry.redhat.io/rhel9/postgresql-15@sha256:76ff2541e3ff13b7f5feb1662597f33283bf9dc80e110bef2fb39633e8bbac00
  Kms Server Connection:
    Kms Server Address:  https://ocsqe-azure-kv.vault.azure.net/
  Last Applied Resource Profile:  balanced
  Node Topologies:
    Labels:
      kubernetes.io/hostname:
        pakamble-az-8k8vp-worker-eastus1-c4j6n
        pakamble-az-8k8vp-worker-eastus2-rn7q9
        pakamble-az-8k8vp-worker-eastus3-cwhsz
      topology.kubernetes.io/region:
        eastus
      topology.kubernetes.io/zone:
        eastus-1
        eastus-2
        eastus-3
  Phase:  Progressing
  Related Objects:
    API Version:  ceph.rook.io/v1
    Kind:  CephCluster
    Name:  ocs-storagecluster-cephcluster
    Namespace:  openshift-storage
    Resource Version:  4496144
    UID:  6b1cc477-d876-4d35-af90-fafc53f1df71

    API Version:  noobaa.io/v1alpha1
    Kind:  NooBaa
    Name:  noobaa
    Namespace:  openshift-storage
    Resource Version:  4496639
    UID:  976f5fa0-2c38-4f07-86ac-58931ba3d738
  Version:  4.16.0
Events:  <none>

NooBaa service details
-=-=-=-=-=-=-=-=-=-=-=-=-
> ocs describe noobaas.noobaa.io
Name:         noobaa
Namespace:    openshift-storage
Labels:       app=noobaa
Annotations:  <none>
API Version:  noobaa.io/v1alpha1
Kind:         NooBaa
Metadata:
  Creation Timestamp:  2024-04-12T06:32:27Z
  Finalizers:
    noobaa.io/graceful_finalizer
  Generation:  1
  Owner References:
    API Version:  ocs.openshift.io/v1
    Block Owner Deletion:  true
    Controller:  true
    Kind:  StorageCluster
    Name:  ocs-storagecluster
    UID:  f0a09c5a-6893-4f21-ba5c-a39f7b14b840
  Resource Version:  4497145
  UID:  976f5fa0-2c38-4f07-86ac-58931ba3d738
Spec:
  Affinity:
    Node Affinity:
      Required During Scheduling Ignored During Execution:
        Node Selector Terms:
          Match Expressions:
            Key:  cluster.ocs.openshift.io/openshift-storage
            Operator:  Exists
  Autoscaler:
    Autoscaler Type:  hpav2
    Prometheus Namespace:  openshift-monitoring
  Cleanup Policy:
  Core Resources:
    Limits:
      Cpu:  999m
      Memory:  4Gi
    Requests:
      Cpu:  999m
      Memory:  4Gi
  Db Image:  registry.redhat.io/rhel9/postgresql-15@sha256:76ff2541e3ff13b7f5feb1662597f33283bf9dc80e110bef2fb39633e8bbac00
  Db Resources:
    Limits:
      Cpu:  500m
      Memory:  4Gi
    Requests:
      Cpu:  500m
      Memory:  4Gi
  Db Storage Class:  ocs-storagecluster-ceph-rbd
  Db Type:  postgres
  Db Volume Resources:
    Requests:
      Storage:  50Gi
  Endpoints:
    Max Count:  2
    Min Count:  1
    Resources:
      Limits:
        Cpu:  999m
        Memory:  2Gi
      Requests:
        Cpu:  999m
        Memory:  2Gi
  Image:  registry.redhat.io/odf4/mcg-core-rhel9@sha256:b30a5087373a5b3378fd09807399dd0340973e891a410b1bfa74bac634926621
  Labels:
    Monitoring:
  Load Balancer Source Subnets:
  Pv Pool Default Storage Class:  ocs-storagecluster-ceph-rbd
  Security:
    Kms:
      Connection Details:
        AZURE_CERT_SECRET_NAME:  azure-ocs-xtcg1e53
        AZURE_CLIENT_ID:  ec78e481-8052-4ba1-b01d-ce5a47827ab5
        AZURE_TENANT_ID:  9cf78105-e3e9-4321-b88d-b001b66c762b
        AZURE_VAULT_URL:  https://ocsqe-azure-kv.vault.azure.net/
        KMS_PROVIDER:  azure-kv
        KMS_SERVICE_NAME:  Azure-kv-connection
      Schedule:  @weekly
  Tolerations:
    Effect:  NoSchedule
    Key:  node.ocs.openshift.io/storage
    Operator:  Equal
    Value:  true
Status:
  Accounts:
    Admin:
      Secret Ref:
  Actual Image:  registry.redhat.io/odf4/mcg-core-rhel9@sha256:b30a5087373a5b3378fd09807399dd0340973e891a410b1bfa74bac634926621
  Conditions:
    Last Heartbeat Time:  2024-04-15T06:55:33Z
    Last Transition Time:  2024-04-12T06:32:27Z
    Message:  AZURE_SECRET_ID or AZURE_CLIENT_CERT_PATH not set
    Reason:  TemporaryError
    Status:  False
    Type:  Available

    Last Heartbeat Time:  2024-04-15T06:55:33Z
    Last Transition Time:  2024-04-12T06:32:27Z
    Message:  AZURE_SECRET_ID or AZURE_CLIENT_CERT_PATH not set
    Reason:  TemporaryError
    Status:  True
    Type:  Progressing

    Last Heartbeat Time:  2024-04-15T06:55:33Z
    Last Transition Time:  2024-04-12T06:32:27Z
    Message:  AZURE_SECRET_ID or AZURE_CLIENT_CERT_PATH not set
    Reason:  TemporaryError
    Status:  False
    Type:  Degraded

    Last Heartbeat Time:  2024-04-15T06:55:33Z
    Last Transition Time:  2024-04-12T06:32:27Z
    Message:  AZURE_SECRET_ID or AZURE_CLIENT_CERT_PATH not set
    Reason:  TemporaryError
    Status:  False
    Type:  Upgradeable

    Last Heartbeat Time:  2024-04-15T06:55:33Z
    Last Transition Time:  2024-04-12T06:32:27Z
    Status:  Invalid
    Type:  KMS-Status
  Observed Generation:  1
  Phase:  Creating
  Readme:
    NooBaa operator is still working to reconcile this system.
    Check out the system status.phase, status.conditions, and events with:

      kubectl -n openshift-storage describe noobaa
      kubectl -n openshift-storage get noobaa -o yaml
      kubectl -n openshift-storage get events --sort-by=metadata.creationTimestamp

    You can wait for a specific condition with:

      kubectl -n openshift-storage wait noobaa/noobaa --for condition=available --timeout -1s

    NooBaa Core Version:  master-20240314
    NooBaa Operator Version:  5.17.0
  Services:
    Service Mgmt:
    serviceS3:
    Service Sts:
    Service Syslog:
Events:  <none>
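
For triage, the two signals in the NooBaa output above (the repeated TemporaryError message and the absence of either credential key in the KMS connection details) can be checked with a small script. This is only a sketch: it assumes the resource has been exported as JSON and that the fields appear as `spec.security.kms.connectionDetails` and `status.conditions` with lowercase `reason` and `message` keys, which is inferred from the describe output rather than confirmed here. The function names are hypothetical, and the check mirrors the wording of the condition message, not the operator's actual validation logic.

```python
# Keys named in the NooBaa condition message quoted in this report.
CREDENTIAL_KEYS = ("AZURE_SECRET_ID", "AZURE_CLIENT_CERT_PATH")


def has_azure_credential(connection_details):
    """Return True if at least one credential key is present and non-empty."""
    return any(connection_details.get(key) for key in CREDENTIAL_KEYS)


def blocking_messages(noobaa):
    """Collect distinct messages from conditions whose reason is TemporaryError.

    `noobaa` is assumed to be the parsed JSON of the NooBaa resource.
    """
    conditions = noobaa.get("status", {}).get("conditions", [])
    messages = []
    for condition in conditions:
        if condition.get("reason") != "TemporaryError":
            continue
        message = condition.get("message", "")
        if message and message not in messages:
            messages.append(message)
    return messages


if __name__ == "__main__":
    # Connection details as shown in the describe output above.
    details = {
        "AZURE_CERT_SECRET_NAME": "azure-ocs-xtcg1e53",
        "AZURE_CLIENT_ID": "ec78e481-8052-4ba1-b01d-ce5a47827ab5",
        "AZURE_TENANT_ID": "9cf78105-e3e9-4321-b88d-b001b66c762b",
        "AZURE_VAULT_URL": "https://ocsqe-azure-kv.vault.azure.net/",
        "KMS_PROVIDER": "azure-kv",
        "KMS_SERVICE_NAME": "Azure-kv-connection",
    }
    print("credential key present:", has_azure_credential(details))
```

With the connection details from this report, neither credential key is present, which is consistent with the condition message.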
Verified with ODF 4.16.0-78. Deployment was completed without any issue.

$ oc get storagecluster
NAME                 AGE     PHASE   EXTERNAL   CREATED AT             VERSION
ocs-storagecluster   5h31m   Ready              2024-04-18T10:43:02Z   4.16.0

$ oc get pod
NAME                                                              READY   STATUS      RESTARTS   AGE
console-7f45ffc7d7-zg5cc                                          1/1     Running     0          5h34m
csi-addons-controller-manager-7f46789597-dvnkh                    2/2     Running     0          5h34m
csi-cephfsplugin-7fwwb                                            2/2     Running     0          5h32m
csi-cephfsplugin-provisioner-77dd4b4978-5pqb8                     6/6     Running     0          5h32m
csi-cephfsplugin-provisioner-77dd4b4978-tlgbv                     6/6     Running     0          5h32m
csi-cephfsplugin-swtz2                                            2/2     Running     0          5h32m
csi-cephfsplugin-xptcg                                            2/2     Running     0          5h32m
csi-rbdplugin-76c4k                                               3/3     Running     0          5h32m
csi-rbdplugin-f72tr                                               3/3     Running     0          5h32m
csi-rbdplugin-provisioner-7cb98fd4cf-87lt4                        6/6     Running     0          5h32m
csi-rbdplugin-provisioner-7cb98fd4cf-9c5pk                        6/6     Running     0          5h32m
csi-rbdplugin-w6fcm                                               3/3     Running     0          5h32m
noobaa-core-0                                                     2/2     Running     0          5h28m
noobaa-db-pg-0                                                    1/1     Running     0          5h28m
noobaa-endpoint-7d79f779cb-585nq                                  1/1     Running     0          5h27m
noobaa-operator-7f67cf86fb-pqx6m                                  1/1     Running     0          5h34m
ocs-client-operator-console-7f45ffc7d7-9s6zm                      1/1     Running     0          5h34m
ocs-client-operator-controller-manager-fbd4c858f-fwtpf            2/2     Running     0          5h34m
ocs-metrics-exporter-78c77bdfff-cckq4                             1/1     Running     0          5h28m
ocs-operator-8644dfb4fc-r86qm                                     1/1     Running     0          5h33m
odf-console-69579fbbf9-dnbxf                                      1/1     Running     0          5h34m
odf-operator-controller-manager-d9c7696bc-g2pwb                   2/2     Running     0          5h34m
rook-ceph-crashcollector-pakamble-az-dsqk9-worker-eastus1-6nhnf   1/1     Running     0          5h30m
rook-ceph-crashcollector-pakamble-az-dsqk9-worker-eastus2-8p8dh   1/1     Running     0          5h30m
rook-ceph-crashcollector-pakamble-az-dsqk9-worker-eastus3-ccxz6   1/1     Running     0          5h30m
rook-ceph-exporter-pakamble-az-dsqk9-worker-eastus1-f74tz-h4jbq   1/1     Running     0          5h30m
rook-ceph-exporter-pakamble-az-dsqk9-worker-eastus2-wn4pv-r2wfd   1/1     Running     0          5h30m
rook-ceph-exporter-pakamble-az-dsqk9-worker-eastus3-2bxx4-xhfvg   1/1     Running     0          5h30m
rook-ceph-mds-ocs-storagecluster-cephfilesystem-a-5f7bf45bdzp4d   2/2     Running     0          5h29m
rook-ceph-mds-ocs-storagecluster-cephfilesystem-b-65b779cb4ct7s   2/2     Running     0          5h29m
rook-ceph-mgr-a-7bff64bfb9-nvhl8                                  3/3     Running     0          5h30m
rook-ceph-mgr-b-fdcc97c7d-k2lkj                                   3/3     Running     0          5h30m
rook-ceph-mon-a-5c8db74579-h5s6r                                  2/2     Running     0          5h31m
rook-ceph-mon-b-584b6678d7-hw646                                  2/2     Running     0          5h31m
rook-ceph-mon-c-7f6556f5d5-nvmtb                                  2/2     Running     0          5h30m
rook-ceph-operator-5b76cf76b7-ckfbj                               1/1     Running     0          5h33m
rook-ceph-osd-0-7c8ddc5564-tmlxm                                  2/2     Running     0          5h27m
rook-ceph-osd-1-796884ff6d-l5hjj                                  2/2     Running     0          5h27m
rook-ceph-osd-2-5957585cc7-g2tg7                                  2/2     Running     0          5h26m
rook-ceph-osd-prepare-70e15aa433d0c7ee511e0b867b50ad1b-66v4z      0/1     Completed   0          5h30m
rook-ceph-osd-prepare-b02b497d667ab52861c8ce6b484242e0-4c7q5      0/1     Completed   0          5h30m
rook-ceph-osd-prepare-d1cbfe39935e78481f6b7657a4ce7948-t7vln      0/1     Completed   0          5h30m
ux-backend-server-78db4d8c5-7g882                                 2/2     Running     0          5h33m
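
A pod listing like the one above can be checked mechanically instead of by eye. The following is a minimal sketch, assuming the default `oc get pod` text layout with NAME, READY and STATUS as the first three whitespace-separated columns. The helper name `unhealthy_pods` is hypothetical, and treating every `Completed` pod as healthy is a simplification that suits the osd-prepare jobs seen here.

```python
def unhealthy_pods(table_text):
    """Return names of pods that are not Completed and not fully ready and Running."""
    bad = []
    lines = table_text.strip().splitlines()
    for line in lines[1:]:  # skip the header row
        fields = line.split()
        if len(fields) < 3:
            continue  # ignore blank or malformed rows
        name, ready, status = fields[0], fields[1], fields[2]
        if status == "Completed":
            continue  # finished jobs report 0/1 and are expected
        ready_count, _, total = ready.partition("/")
        if status != "Running" or ready_count != total:
            bad.append(name)
    return bad


if __name__ == "__main__":
    # A few rows copied from the verification output above.
    sample = """\
NAME                                                           READY   STATUS      RESTARTS   AGE
noobaa-core-0                                                  2/2     Running     0          5h28m
noobaa-db-pg-0                                                 1/1     Running     0          5h28m
rook-ceph-osd-prepare-70e15aa433d0c7ee511e0b867b50ad1b-66v4z   0/1     Completed   0          5h30m
"""
    print(unhealthy_pods(sample))
```

An empty result means every listed pod is either fully ready and Running or has Completed.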
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory (Important: Red Hat OpenShift Data Foundation 4.16.0 security, enhancement & bug fix update), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHSA-2024:4591