+++ This bug was initially created as a clone of Bug #2255557 +++

Description of problem (please be detailed as possible and provide log snippets):

When trying to deploy ODF to an IBM IPI cluster using a COS-backed backingstore, NooBaa becomes stuck in the Configuring phase.

-----------------
phase: Configuring
readme: "\n\n\tNooBaa operator is still working to reconcile this system.\n\tCheck out the system status.phase, status.conditions, and events with:\n\n\t\tkubectl -n openshift-storage describe noobaa\n\t\tkubectl -n openshift-storage get noobaa -o yaml\n\t\tkubectl -n openshift-storage get events --sort-by=metadata.creationTimestamp\n\n\tYou can wait for a specific condition with:\n\n\t\tkubectl -n openshift-storage wait noobaa/noobaa --for condition=available --timeout -1s\n\n\tNooBaa Core Version: \ master-20230920\n\tNooBaa Operator Version: 5.15.0\n"
services:
-----------------
- lastHeartbeatTime: "2023-12-18T10:39:32Z"
  lastTransitionTime: "2023-12-18T10:30:28Z"
  message: |-
    RequestError: send request failed
    caused by: Put "https://s3.direct..cloud-object-storage.appdomain.cloud/nb.1702895972648.apps.jnk-pr9072b6235.ibmcloud2.qe.rh-ocs.com": dial tcp: lookup s3.direct..cloud-object-storage.appdomain.cloud: no such host
  reason: TemporaryError
-----------------

Version of all relevant components (if applicable):
Server Version: 4.15.0-0.nightly-2023-12-19-033450

Does this issue impact your ability to continue to work with the product (please explain in detail what is the user impact)?
Yes, NooBaa cannot be installed in this scenario (IBM IPI with a COS-backed backingstore).

Is there any workaround available to the best of your knowledge?
No

Rate from 1 - 5 the complexity of the scenario you performed that caused this bug (1 - very simple, 5 - very complex)?
2

Can this issue be reproduced?
Yes

Can this issue be reproduced from the UI?
Unknown

If this is a regression, please provide more details to justify this:

Steps to Reproduce:
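The unresolvable host in the error above contains an empty region segment ("s3.direct.." with a double dot), which is consistent with the region label being absent from the nodes. A minimal sketch of this failure mode (the variable names and the assumption that the endpoint is assembled from the node's region label are hypothetical, not taken from the operator's source):

```shell
# Hypothetical sketch: if the COS endpoint is built from the value of the
# ibm-cloud.kubernetes.io/region node label, a missing label leaves the
# region segment empty and produces the double-dot host from the error.
REGION=""   # empty when the node label is absent
ENDPOINT="s3.direct.${REGION}.cloud-object-storage.appdomain.cloud"
echo "$ENDPOINT"   # prints s3.direct..cloud-object-storage.appdomain.cloud
```

The printed hostname matches the `no such host` lookup failure in the operator status above.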
1. Deploy OCP to IBM Cloud (IPI).
2. Label the worker nodes with region information:
   oc label node <worker-name> ibm-cloud.kubernetes.io/region=<REGION>
   In our case <REGION> is us-south.
3. Deploy ODF, creating the Secret via YAML as described here:
   https://access.redhat.com/documentation/en-us/red_hat_openshift_data_foundation/4.9/html-single/deploying_and_managing_openshift_data_foundation_using_google_cloud/index#creating-an-IBM-COS-backed-backingstore_rhodf

Actual results:
NooBaa is stuck in the Configuring phase.

Expected results:
NooBaa creation is successful and the deployment succeeds.

Additional info:
https://cloud.ibm.com/docs/containers?topic=containers-storage_cos_install
https://access.redhat.com/documentation/en-us/red_hat_openshift_data_foundation/4.9/html-single/deploying_and_managing_openshift_data_foundation_using_google_cloud/index#creating-an-IBM-COS-backed-backingstore_rhodf

noobaa operator logs: https://url.corp.redhat.com/d7c188f
noobaa.yaml: https://url.corp.redhat.com/a9c0127
full ocs must gather: https://url.corp.redhat.com/b9c7f09

--- Additional comment from RHEL Program Management on 2023-12-21 22:49:06 IST ---

This bug, which previously had no release flag set, now has the release flag 'odf-4.15.0' set to '?', and so is being proposed to be fixed in the ODF 4.15.0 release. Note that the 3 Acks (pm_ack, devel_ack, qa_ack), if any were set while the release flag was missing, have now been reset, since the Acks are to be set against a release flag.
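Step 3 creates the Secret via YAML per the linked documentation. A minimal sketch of the Secret plus a COS-backed BackingStore is below; the resource names, bucket, and placeholder credentials are hypothetical, and the `ibmCos` field names follow my understanding of the NooBaa BackingStore CRD, so verify them against the linked docs. The script only prints the YAML so it can be reviewed before piping it to `oc apply -f -`:

```shell
# Hypothetical sketch of the Secret + BackingStore YAML from step 3.
# Replace the placeholders, review the output, then pipe it to: oc apply -f -
YAML=$(cat <<'EOF'
apiVersion: v1
kind: Secret
metadata:
  name: ibm-cos-secret            # hypothetical name
  namespace: openshift-storage
type: Opaque
data:
  IBM_COS_ACCESS_KEY_ID: <base64-encoded-access-key>
  IBM_COS_SECRET_ACCESS_KEY: <base64-encoded-secret-key>
---
apiVersion: noobaa.io/v1alpha1
kind: BackingStore
metadata:
  name: ibm-cos-backingstore      # hypothetical name
  namespace: openshift-storage
spec:
  type: ibm-cos
  ibmCos:
    targetBucket: <bucket-name>
    endpoint: https://s3.direct.us-south.cloud-object-storage.appdomain.cloud
    signatureVersion: v2
    secret:
      name: ibm-cos-secret
      namespace: openshift-storage
EOF
)
echo "$YAML"
```

Note that the endpoint embeds the region (us-south here), which is why the deployment fails when the region label is missing from the nodes.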
--- Additional comment from Ben Eli on 2024-01-02 16:19:06 IST ---

The node labels do not seem to be present - http://magna002.ceph.redhat.com/ocsci-jenkins/openshift-clusters/jnk-pr9072b6235/jnk-pr9072b6235_20231218T081756/logs/deployment_1702894821/jnk-pr9072b6235/ocs_must_gather/c2f9bca16b1fcd4caaeb21cefc9c0f5835b8c6d068b152724d7c54e451b83af3/cluster-scoped-resources/oc_output/desc_nodes

--- Additional comment from Petr Balogh on 2024-01-02 16:47:57 IST ---

I am not sure how Coady's PR validated this, but when I tried manually before the shutdown, I collected a must gather here:
http://magna002.ceph.redhat.com/ocsci-jenkins/openshift-clusters/pbalogh-cos/pbalogh-cos_20231221T102303/logs/deployment_1703160725/

The validation job was triggered here:
https://ocs4-jenkins-csb-odf-qe.apps.ocp-c1.prod.psi.redhat.com/job/qe-deploy-ocs-cluster/32237/console

I see labels on all 3 worker nodes:
http://magna002.ceph.redhat.com/ocsci-jenkins/openshift-clusters/pbalogh-cos/pbalogh-cos_20231221T102303/logs/deployment_1703160725/pbalogh-cos/ocs_must_gather/21ea0203971406f15ee064be07fc5878d185202baf8e79078a0893b8db582ef5/cluster-scoped-resources/oc_output/get_nodes_-o_wide_--show-labels
ibm-cloud.kubernetes.io/region=us-south

I did the labeling manually right after OCP deployment, like:
$ oc label node worker-X-node-name ibm-cloud.kubernetes.io/region=us-south

I will let Coady check why his verification job doesn't have the labels, but when I tried manually I also did not succeed in getting it working.

--- Additional comment from Coady LaCroix on 2024-01-03 00:25:11 IST ---

We were able to get a successful deployment by adding the region label to all of the cluster nodes. Applying it to the workers only was insufficient.
https://url.corp.redhat.com/8b2fc5d
Logs are linked in the description of the job if necessary.
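The workaround described in the comment above (label all cluster nodes, not only the workers) can be sketched as a single command. The script only prints the command so it can be reviewed first; running it requires oc logged in to the cluster, and us-south is the region from this report:

```shell
# Sketch of the workaround from the comment above: apply the region label
# to ALL nodes, not only the workers. Printed for review; run with eval "$CMD".
REGION="us-south"
CMD="oc label nodes --all ibm-cloud.kubernetes.io/region=${REGION} --overwrite"
echo "$CMD"
```

`--overwrite` makes the command safe to re-run on nodes that already carry the label.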
--- Additional comment from RHEL Program Management on 2024-01-09 11:37:35 IST ---

This BZ is being approved for the ODF 4.15.0 release, upon receipt of the 3 ACKs (PM, Devel, QA) for the release flag 'odf-4.15.0'.

--- Additional comment from RHEL Program Management on 2024-01-09 11:37:35 IST ---

Since this bug has been approved for the ODF 4.15.0 release through release flag 'odf-4.15.0+', the Target Release is being set to 'ODF 4.15.0'.

--- Additional comment from Coady LaCroix on 2024-01-11 23:11:03 IST ---

Verified the deployment was successful after labeling only the worker nodes.
Jenkins: https://url.corp.redhat.com/7db156b
Verified on Server Version: 4.15.0-0.nightly-2024-01-10-101042

--- Additional comment from errata-xmlrpc on 2024-02-06 16:49:43 IST ---

This bug has been added to advisory RHBA-2023:118688 by Boris Ranto (branto).

--- Additional comment from Mudit Agarwal on 2024-02-07 11:13:02 IST ---

--- Additional comment from Eran Tamir on 2024-02-11 09:48:46 IST ---

Please backport to 4.14.
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA.

For information on the advisory (Red Hat OpenShift Data Foundation 4.14.7 Bug Fix Update), and where to find the updated files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2024:3443