Bug 1852357
Summary: | RHEL worker scale-up failed for OCP 4.6 due to "No package matching 'cri-o-1.18.*' found available" | ||
---|---|---|---|
Product: | OpenShift Container Platform | Reporter: | xiyuan |
Component: | Installer | Assignee: | aos-install |
Installer sub component: | openshift-ansible | QA Contact: | Gaoyun Pei <gpei> |
Status: | CLOSED ERRATA | Docs Contact: | |
Severity: | high | ||
Priority: | high | CC: | adahiya, aos-bugs, jiajliu, jialiu, jokerman, wjiang, yanyang |
Version: | 4.6 | Keywords: | TestBlocker |
Target Milestone: | --- | ||
Target Release: | 4.6.0 | ||
Hardware: | Unspecified | ||
OS: | Unspecified | ||
Whiteboard: | |||
Fixed In Version: | Doc Type: | If docs needed, set a value | |
Doc Text: | Story Points: | --- | |
Clone Of: | Environment: | ||
Last Closed: | 2020-10-27 16:10:28 UTC | Type: | Bug |
Bug Depends On: | 1861097 | ||
Bug Blocks: |
Description
xiyuan
2020-06-30 08:59:19 UTC
The scaleup scripts should be using 1.19, as that's what's in the 4.6 puddle now. Moving to installer.

The Kubernetes version is still 1.18 in the OCP 4.6 nightly; we need the rebase of OCP on top of k8s 1.19. The steps the openshift-ansible playbook took were expected.

TASK [openshift_node : Set fact l_kubernetes_version] **************************
Tuesday 30 June 2020 15:55:55 +0800 (0:00:00.577) 0:05:39.098 **********
ok: [ip-10-0-58-77.us-east-2.compute.internal] => {"ansible_facts": {"l_kubernetes_version": "1.18"}, "changed": false}
ok: [ip-10-0-50-230.us-east-2.compute.internal] => {"ansible_facts": {"l_kubernetes_version": "1.18"}, "changed": false}

By design, the openshift-ansible scaleup playbooks install cri-o based on the cluster Kubernetes version [1]. In CI, we have an override (ci_version_override) for this behavior because during development cycles the available package versions are not always in step with Kubernetes rebases or release branching. This override should not be used outside of CI, so that we continue to validate installs with proper versions. This is not a bug.

[1] https://github.com/openshift/openshift-ansible/blob/master/roles/openshift_node/defaults/main.yml#L13

We are waiting for the kube rebase; when that happens, scaleup should start picking up the correct version. So this bug will have to wait until then.

In today's scaleup testing with openshift-ansible-4.6.0-202007161549.p0.git.0.8f2f0c3.el7.noarch, the error message seems to have changed: it now expects cri-o-4.6 rather than cri-o 1.18/1.19.

TASK [openshift_node : Install openshift packages] *****************************
Monday 20 July 2020 21:02:01 +0800 (0:00:00.085) 0:04:40.552 ***********
FAILED - RETRYING: Install openshift packages (3 retries left).
FAILED - RETRYING: Install openshift packages (3 retries left).
FAILED - RETRYING: Install openshift packages (2 retries left).
FAILED - RETRYING: Install openshift packages (2 retries left).
FAILED - RETRYING: Install openshift packages (1 retries left).
FAILED - RETRYING: Install openshift packages (1 retries left).
fatal: [10.0.32.6]: FAILED! => {"ansible_job_id": "152086687514.8199", "attempts": 3, "changed": false, "finished": 1, "msg": "No package matching 'cri-o-4.6.*' found available, installed or updated", "rc": 126, "results": ["No package matching 'cri-o-4.6.*' found available, installed or updated"]}
fatal: [10.0.32.5]: FAILED! => {"ansible_job_id": "713708506834.8148", "attempts": 3, "changed": false, "finished": 1, "msg": "No package matching 'cri-o-4.6.*' found available, installed or updated", "rc": 126, "results": ["No package matching 'cri-o-4.6.*' found available, installed or updated"]}

Also hit the issue in the upgrade CI test from 4.5.2-x86_64 to 4.6.0-0.nightly-2020-07-20-093546. The cluster upgrade succeeded, but the subsequent RHEL worker upgrade failed.

TASK [openshift_node : Install openshift packages] *****************************
Monday 20 July 2020 21:19:54 +0800 (0:00:00.120) 0:04:21.097 ***********
FAILED - RETRYING: Install openshift packages (3 retries left).
FAILED - RETRYING: Install openshift packages (2 retries left).
FAILED - RETRYING: Install openshift packages (1 retries left).
fatal: [10.0.32.62]: FAILED! => {"ansible_job_id": "446048133576.57238", "attempts": 3, "changed": false, "finished": 1, "msg": "No package matching 'cri-o-4.6.*' found available, installed or updated", "rc": 126, "results": ["No package matching 'cri-o-4.6.*' found available, installed or updated"]}

TASK [openshift_node : Package install failure message] ************************
Monday 20 July 2020 21:22:19 +0800 (0:02:24.795) 0:06:45.893 ***********
fatal: [10.0.32.62]: FAILED! => {"changed": false, "msg": "Unable to install cri-o-4.6.*, openshift-clients-4.6*, openshift-hyperkube-4.6*, podman. Please ensure repos are configured properly to provide these packages and indicated versions.\n"}

A known regression (1861097) is impacting the install due to the kube version not being reported correctly.

As the Kube API returns version 1.19 now, this is no longer a blocker for RHEL scale-up. Thanks.

TASK [openshift_node : Install openshift packages] *****************************
Monday 10 August 2020 20:22:06 +0800 (0:00:00.069) 0:06:37.777 *********
changed: [10.0.96.117] => {"ansible_job_id": "317997613290.7515", "attempts": 1, "changed": true, "changes": {"installed": ["cri-o-1.19.*", "openshift-clients-4.6*", "openshift-hyperkube-4.6*", "podman"], "updated": []}, "finished": 1, "msg": "", "obsoletes": {"python-urllib3": {"dist": "noarch", "repo": "installed", "version": "1.10.2-7.el7"}}, "rc": 0, "results": ["Loaded plugins: search-disabled-repos\nResolving Dependencies\n--> Running transaction check\n---> Package cri-o.x86_64 0:1.19.0-69.rhaos4.6.git707b4b9.el7 will be installed ...

# oc get node -o wide
NAME STATUS ROLES AGE VERSION INTERNAL-IP EXTERNAL-IP OS-IMAGE KERNEL-VERSION CONTAINER-RUNTIME
...
wj46uos810z-hz29f-rhel3-0 Ready worker 2m7s v1.19.0-rc.2+5241b27-dirty 192.168.3.185 10.0.96.117 Red Hat Enterprise Linux Server 7.8 (Maipo) 3.10.0-1127.18.2.el7.x86_64 cri-o://1.19.0-69.rhaos4.6.git707b4b9.el7-dev

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory (OpenShift Container Platform 4.6 GA Images), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2020:4196
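
For readers following the version mismatch discussed in the comments above, here is a minimal Python sketch of why the install fails when the reported Kubernetes version lags behind the packages in the repository. It is an illustration only, not the openshift-ansible implementation: the function names (kube_minor_version, crio_package_glob, find_matching) are hypothetical, and the only data used is taken from the logs in this report.

```python
import fnmatch
import re


def kube_minor_version(version: str) -> str:
    """Return 'MAJOR.MINOR' from a string such as 'v1.19.0-rc.2+5241b27-dirty'."""
    match = re.match(r"v?(\d+)\.(\d+)", version)
    if match is None:
        raise ValueError(f"unrecognized version string: {version!r}")
    return f"{match.group(1)}.{match.group(2)}"


def crio_package_glob(version: str) -> str:
    """Build a package glob like 'cri-o-1.19.*' (hypothetical helper)."""
    return f"cri-o-{kube_minor_version(version)}.*"


def find_matching(pattern: str, available: list[str]) -> list[str]:
    """Return the available package names that match the glob pattern."""
    return [name for name in available if fnmatch.fnmatchcase(name, pattern)]


# Package name-version-release assembled from the successful install log.
available = ["cri-o-1.19.0-69.rhaos4.6.git707b4b9.el7"]

# Cluster still reporting 1.18: the glob finds nothing in the repo.
stale = crio_package_glob("1.18")
print(stale, find_matching(stale, available))

# Cluster reporting the version seen in the 'oc get node' output.
current = crio_package_glob("v1.19.0-rc.2+5241b27-dirty")
print(current, find_matching(current, available))
```

With the cluster reporting 1.18, the derived glob has no match among the 1.19 packages, which corresponds to the "No package matching" failure; once the API reports 1.19, the glob matches the package that was installed in the final log.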