Version: 4.9.0-0.nightly-2022-05-24-200205

The scale-up job sometimes fails with the following error, although all nodes eventually become Ready and the cluster is healthy.

TASK [openshift_node : Wait for node to report ready] **************************
Wednesday 25 May 2022 14:25:10 +0800 (0:00:19.202) 0:13:32.778 *********
FAILED - RETRYING: Wait for node to report ready (30 retries left).
<--SNIP-->
FAILED - RETRYING: Wait for node to report ready (1 retries left).
fatal: [ip-10-0-60-71.us-east-2.compute.internal -> localhost]: FAILED! => {"attempts": 30, "changed": false, "cmd": ["oc", "get", "node", "ip-10-0-60-71.us-east-2.compute.internal", "--kubeconfig=/tmp/installer-aVed14/auth/kubeconfig", "--output=jsonpath={.status.conditions[?(@.type==\"Ready\")].status}"], "delta": "0:00:00.249540", "end": "2022-05-25 14:35:24.212666", "rc": 0, "start": "2022-05-25 14:35:23.963126", "stderr": "", "stderr_lines": [], "stdout": "False", "stdout_lines": ["False"]}
fatal: [ip-10-0-61-254.us-east-2.compute.internal -> localhost]: FAILED! => {"attempts": 30, "changed": false, "cmd": ["oc", "get", "node", "ip-10-0-61-254.us-east-2.compute.internal", "--kubeconfig=/tmp/installer-aVed14/auth/kubeconfig", "--output=jsonpath={.status.conditions[?(@.type==\"Ready\")].status}"], "delta": "0:00:00.266898", "end": "2022-05-25 14:35:24.213355", "rc": 0, "start": "2022-05-25 14:35:23.946457", "stderr": "", "stderr_lines": [], "stdout": "False", "stdout_lines": ["False"]}

The timeline is as follows (times in UTC; the Ansible timestamps are UTC+8):

1. [6:24-6:34] CSRs were approved and the job waited 10 minutes for the nodes to report Ready.

TASK [openshift_node : Approve node CSRs] **************************************
Wednesday 25 May 2022 14:24:51 +0800 (0:04:04.743) 0:13:13.576 *********

2. [6:34] The scale-up job reported the error (timeout).

3. [6:37:09] The node reported Ready.

May 25 06:37:09 ip-10-0-60-71.us-east-2.compute.internal hyperkube[2526]: I0525 06:37:09.201219 2526 kubelet_node_status.go:581] "Recording event message for node" node="ip-10-0-60-71.us-east-2.compute.internal" event="NodeReady"

  - lastHeartbeatTime: "2022-05-25T07:16:01Z"
    lastTransitionTime: "2022-05-25T06:37:09Z"
    message: kubelet is posting ready status
    reason: KubeletReady
    status: "True"
    type: Ready

How to reproduce it (as minimally and precisely as possible)?
Reproduction rate: > 30%

Steps to Reproduce:
1. Create a cluster with the OVN network.
2. Run a scale-up against that cluster.

Expected results:
The scale-up job finishes successfully.

Suggestion:
Increase the wait time to 16-18 minutes.

Additional info:
This issue applies to 4.9, 4.10, and 4.11.
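To illustrate the suggestion above, here is a minimal Python sketch of the same readiness poll with a longer retry budget. It is not the openshift-ansible change itself. The function names `node_ready_status` and `wait_for_node_ready` are hypothetical, and the 20-second delay is an assumption inferred from the log (30 attempts over roughly 10 minutes). With that delay, 54 retries give about 18 minutes.

```python
import subprocess
import time
from typing import Callable


def node_ready_status(node: str, kubeconfig: str) -> str:
    """Return the node's Ready condition status as printed by oc.

    Runs the same `oc get node` command that appears in the failed task.
    """
    cmd = [
        "oc", "get", "node", node,
        f"--kubeconfig={kubeconfig}",
        '--output=jsonpath={.status.conditions[?(@.type=="Ready")].status}',
    ]
    result = subprocess.run(cmd, capture_output=True, text=True, check=False)
    return result.stdout.strip()


def wait_for_node_ready(
    node: str,
    kubeconfig: str,
    retries: int = 54,        # assumed: 54 * 20 s is about 18 minutes
    delay_s: float = 20.0,    # assumed from the log: 30 attempts in ~10 minutes
    status_fn: Callable[[str, str], str] = node_ready_status,
    sleep_fn: Callable[[float], None] = time.sleep,
) -> bool:
    """Poll until the node reports Ready or the retry budget is exhausted."""
    for attempt in range(1, retries + 1):
        if status_fn(node, kubeconfig) == "True":
            return True
        # Do not sleep after the final attempt.
        if attempt < retries:
            sleep_fn(delay_s)
    return False
```

In the timeline above the node became Ready about 12 minutes after the wait started, so under these assumptions a 30-retry budget times out while a 54-retry budget succeeds. `status_fn` and `sleep_fn` are injectable so the loop can be exercised without a cluster.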
Verified: PASS with openshift-ansible-4.11.0-202206240216.p0.g9de1722.assembly.stream.el8.noarch.rpm
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory (Important: OpenShift Container Platform 4.11.0 bug fix and security update), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHSA-2022:5069