Note: This bug is displayed in read-only format because the product is no longer active in Red Hat Bugzilla.

Bug 1895309

Summary: [OCP v47] The RHEL node scaleup fails due to "No package matching 'cri-o-1.19.*' found available" on OCP 4.7 cluster
Product: OpenShift Container Platform Reporter: Prashant Dhamdhere <pdhamdhe>
Component: Installer Assignee: Russell Teague <rteague>
Installer sub component: openshift-ansible QA Contact: Gaoyun Pei <gpei>
Status: CLOSED ERRATA Docs Contact:
Severity: high    
Priority: high CC: aos-bugs, jokerman, mfojtik, pehunt, rteague, xiyuan, xxia
Version: 4.7   
Target Milestone: ---   
Target Release: 4.7.0   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version: Doc Type: Bug Fix
Doc Text:
Cause: Package install logic required an exact match of Kube API version to CRI-O version.
Consequence: Newer versions of CRI-O could not be installed, although they should still function correctly.
Fix: Changed package install logic to allow newer versions of CRI-O to be installed while still requiring a minimum of the current Kube API version.
Result: Newer versions of CRI-O can be installed for older Kube API versions.
Story Points: ---
Clone Of: Environment:
Last Closed: 2021-02-24 15:31:26 UTC Type: Bug
Bug Depends On:    
Bug Blocks: 1870490    

Description Prashant Dhamdhere 2020-11-06 10:13:04 UTC
Description of problem:

The RHEL node scaleup fails due to "No package matching 'cri-o-1.19.*' found available" on OCP 4.7 cluster. 

TASK [openshift_node : Install openshift packages] *****************************
Friday 06 November 2020  17:02:45 +0800 (0:00:00.089)       0:08:14.213 ******* 

FAILED - RETRYING: Install openshift packages (3 retries left).
FAILED - RETRYING: Install openshift packages (3 retries left).

FAILED - RETRYING: Install openshift packages (2 retries left).
FAILED - RETRYING: Install openshift packages (2 retries left).

FAILED - RETRYING: Install openshift packages (1 retries left).
FAILED - RETRYING: Install openshift packages (1 retries left).

fatal: [ip-10-0-53-18.us-east-2.compute.internal]: FAILED! => {"ansible_job_id": "224354404120.3488", "attempts": 3, "changed": false, "finished": 1, "msg": "No package matching 'cri-o-1.19.*' found available, installed or updated", "rc": 126, "results": ["No package matching 'cri-o-1.19.*' found available, installed or updated"]}
fatal: [ip-10-0-48-123.us-east-2.compute.internal]: FAILED! => {"ansible_job_id": "372095033110.3575", "attempts": 3, "changed": false, "finished": 1, "msg": "No package matching 'cri-o-1.19.*' found available, installed or updated", "rc": 126, "results": ["No package matching 'cri-o-1.19.*' found available, installed or updated"]}

TASK [openshift_node : Package install failure message] ************************
Friday 06 November 2020  17:06:07 +0800 (0:03:22.442)       0:11:36.656 ******* 
fatal: [ip-10-0-53-18.us-east-2.compute.internal]: FAILED! => {"changed": false, "msg": "Unable to install cri-o-1.19.*, openshift-clients-4.7*, openshift-hyperkube-4.7*, podman. Please ensure repos are configured properly to provide these packages and indicated versions.\n"}
fatal: [ip-10-0-48-123.us-east-2.compute.internal]: FAILED! => {"changed": false, "msg": "Unable to install cri-o-1.19.*, openshift-clients-4.7*, openshift-hyperkube-4.7*, podman. Please ensure repos are configured properly to provide these packages and indicated versions.\n"}


Version-Release number of selected component (if applicable):

4.7.0-0.nightly-2020-10-27-051128

How reproducible:

Always

Steps to Reproduce:

1. Deploy OCP 4.7 cluster
2. Scale up RHEL worker nodes


Actual results:

The RHEL scaleup failed to add RHEL worker nodes to the OCP 4.7 cluster due to "No package 
matching 'cri-o-1.19.*' found available"

Expected results:

The RHEL node scaleup should not fail with the above error, and it should add RHEL 
worker nodes to the OCP 4.7 cluster without issue.

Additional info:

This is blocking BZ 1870490 verification

Comment 1 Stefan Schimanski 2020-11-06 10:46:14 UTC
What is the reason this is assigned to the kube-apiserver component?

Comment 2 Russell Teague 2020-11-06 13:17:39 UTC
It looks like cri-o-1.20 has been built and tagged for 4.7 (which is ultimately correct); however, 4.7 has not yet been rebased to Kube 1.20, so the scaleup playbook is still looking for 1.19.

We've had a check in the playbooks to install the same version of cri-o as the kube api version since we started OCP 4 and we seem to hit some combination of version mismatch issues during every release cycle.  Is there something better we can do to make sure we test the right versions, but also allow for this skew that happens each release?

Comment 3 Peter Hunt 2020-11-06 14:18:58 UTC
I would say it'd be better to first look for version N, then look for version N-1, for both kube and cri-o. 

We like to get cri-o 1.20 in early to catch any regressions early.

From my experience, having cri-o on version N and kube on N-1 has not caused any issues in recent memory. The CRI has been fairly stable and backward compatible.

Comment 4 Russell Teague 2020-11-06 14:57:48 UTC
I will put together a change in openshift-ansible to verify the available cri-o version is at or above the current kubernetes version.
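
For illustration only, the "at or above" check described here can be sketched in Python. This is not the openshift-ansible change itself (that lives in the playbooks); the function names parse_version and crio_candidates are hypothetical, and the sketch assumes plain dotted numeric version strings such as "1.19" or "1.20.0".

```python
def parse_version(text):
    """Turn a dotted version string such as '1.20.0' into a tuple of ints."""
    return tuple(int(part) for part in text.split("."))


def crio_candidates(kube_version, available_versions):
    """Return the available cri-o versions whose major.minor is at or above
    the Kube major.minor, newest first (hypothetical helper)."""
    minimum = parse_version(kube_version)[:2]
    usable = [v for v in available_versions if parse_version(v)[:2] >= minimum]
    return sorted(usable, key=parse_version, reverse=True)


# Kube 1.19 with only cri-o 1.20.0 in the repo, as in this bug:
# an exact-match rule finds nothing, a minimum-version rule accepts 1.20.0.
print(crio_candidates("1.19", ["1.20.0"]))

# Made-up repo contents: 1.18.3 is rejected because it is below the Kube version.
print(crio_candidates("1.19", ["1.18.3", "1.19.0", "1.20.0"]))
```

With a rule like this, a repo that only carries a newer cri-o no longer blocks the scaleup, while a repo that only carries an older cri-o still yields no candidate.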

Comment 5 Russell Teague 2020-11-09 14:33:05 UTC
This is not a release blocker because the current code would work correctly once OCP is rebased to Kube 1.20.  The open PR will allow versions of cri-o newer than the current kube version to be installed for testing during the development cycle.

Comment 7 Gaoyun Pei 2020-11-10 04:14:34 UTC
Verified this bug with openshift-ansible-4.7.0-202011092117.p0.git.0.ec2dd4f.el7.noarch.rpm

cri-o 1.20 could be installed when the k8s version is 1.19.


TASK [openshift_node : Set fact l_kubernetes_server_version] *******************
Tuesday 10 November 2020  11:59:49 +0800 (0:00:00.471)       0:07:14.303 ****** 
ok: [10.0.32.4] => {"ansible_facts": {"l_kubernetes_server_version": "1.19"}, "changed": false}

TASK [openshift_node : Get available cri-o RPM versions] ***********************
Tuesday 10 November 2020  11:59:49 +0800 (0:00:00.077)       0:07:14.380 ****** 
ok: [10.0.32.4] => {"changed": false, "results": [{"arch": "x86_64", "envra": "0:cri-o-1.20.0-0.rhaos4.7.git8e23406.el7.9.x86_64", "epoch": "0", "name": "cri-o", "release": "0.rhaos4.7.git8e23406.el7.9", "repo": "aos-v4-devel-install", "version": "1.20.0", "yumstate": "available"}]}
...

TASK [openshift_node : Install openshift packages] *****************************

\nInstalled:\n  cri-o.x86_64 0:1.20.0-0.rhaos4.7.git8e23406.el7.9

Comment 10 errata-xmlrpc 2021-02-24 15:31:26 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Moderate: OpenShift Container Platform 4.7.0 security, bug fix, and enhancement update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2020:5633