Bug 1877070 - [4.4.z] RHCOS nodes' cri-o version is not consistent with RHEL
Summary: [4.4.z] RHCOS nodes' cri-o version is not consistent with RHEL
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: RHCOS
Version: 4.4
Hardware: Unspecified
OS: Unspecified
Severity: high
Priority: high
Target Milestone: ---
Target Release: 4.4.z
Assignee: Micah Abbott
QA Contact: Michael Nguyen
URL:
Whiteboard:
Depends On: 1876724 1877069
Blocks:
Reported: 2020-09-08 19:18 UTC by Micah Abbott
Modified: 2020-09-15 17:32 UTC

Fixed In Version:
Doc Type: No Doc Update
Doc Text:
Clone Of: 1877069
Environment:
Last Closed: 2020-09-15 17:32:44 UTC
Target Upstream Version:
Embargoed:


Links
System                  ID              Private  Priority  Status  Summary  Last Updated
Red Hat Product Errata  RHBA-2020:3605  0        None      None    None     2020-09-15 17:32:57 UTC

Description Micah Abbott 2020-09-08 19:18:51 UTC
+++ This bug was initially created as a clone of Bug #1877069 +++

+++ This bug was initially created as a clone of Bug #1876724 +++

Description of problem:

After scaling up a cluster with RHEL 7.8 workers, the cri-o version on the RHCOS nodes does not match the version on the RHEL nodes:

Red Hat Enterprise Linux CoreOS 44.82.202009030930-0 (Ootpa)   4.18.0-193.14.3.el8_2.x86_64   cri-o://1.16.6-18.rhaos4.3.git538d861.el8
Red Hat Enterprise Linux Server 7.8 (Maipo)                    3.10.0-1127.19.1.el7.x86_64    cri-o://1.17.5-4.rhaos4.4.git7f0085b.el7


TASK [Get the detailed cri-o version from RHCOS worker] ************************
Monday 07 September 2020  18:41:17 +0800 (0:00:00.799)       0:00:08.459 ****** 
changed: [localhost] => {"changed": true, "cmd": "oc get node ip-10-0-56-49.us-east-2.compute.internal  --output=jsonpath='{.status.nodeInfo.containerRuntimeVersion}' | awk -F '//' '{print $2}' | awk -F '-' '{print $1}'\n", "delta": "0:00:00.213235", "end": "2020-09-07 18:41:17.873516", "rc": 0, "start": "2020-09-07 18:41:17.660281", "stderr": "", "stderr_lines": [], "stdout": "1.16.6", "stdout_lines": ["1.16.6"]}

TASK [Compare the RHEL nodes' cri-o major version is consistent with RHCOS] ****
Monday 07 September 2020  18:41:17 +0800 (0:00:00.379)       0:00:08.839 ****** 
failed: [localhost] (item=ip-10-0-51-48.us-east-2.compute.internal) => {"ansible_loop_var": "item", "changed": true, "cmd": "oc get node ip-10-0-51-48.us-east-2.compute.internal --output=jsonpath='{.status.nodeInfo.containerRuntimeVersion}'  | awk -F '//' '{print $2}' | awk -F '-' '{print $1}'\n", "delta": "0:00:00.207157", "end": "2020-09-07 18:41:18.284359", "failed_when_result": true, "item": "ip-10-0-51-48.us-east-2.compute.internal", "rc": 0, "start": "2020-09-07 18:41:18.077202", "stderr": "", "stderr_lines": [], "stdout": "1.17.5", "stdout_lines": ["1.17.5"]}

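The two awk stages in the tasks above split the runtime string on `//` and then on `-`, keeping only the upstream version. A minimal Python sketch of the same comparison, using the two runtime strings from the node listing in this report; the function name `crio_version` is hypothetical and is not part of the test playbook:

```python
def crio_version(runtime: str) -> str:
    """Return the upstream version from a containerRuntimeVersion string.

    For inputs of this shape it mirrors:
        awk -F '//' '{print $2}' | awk -F '-' '{print $1}'
    """
    # Everything after the "cri-o://" scheme prefix.
    _, _, nvr = runtime.partition("//")
    # Keep the part before the first "-", which drops the RPM release.
    return nvr.split("-", 1)[0]


# Runtime strings copied from the node listing in this report.
rhcos = "cri-o://1.16.6-18.rhaos4.3.git538d861.el8"
rhel = "cri-o://1.17.5-4.rhaos4.4.git7f0085b.el7"

rhcos_version = crio_version(rhcos)
rhel_version = crio_version(rhel)
versions_match = rhcos_version == rhel_version
print(rhcos_version, rhel_version, versions_match)
```

With these inputs the RHCOS node yields 1.16.6 and the RHEL node yields 1.17.5, which is the mismatch the failed task reports.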

Version-Release number of selected component (if applicable):
4.4.0-0.nightly-2020-09-07-061145

How reproducible:
100%

Steps to Reproduce:
1. Install a 4.4 cluster on AWS and scale it up with RHEL 7.8 workers.

Actual results:
RHCOS nodes' cri-o version is not consistent with RHEL

Expected results:
RHCOS nodes' cri-o version is consistent with RHEL

Additional info:

--- Additional comment from Micah Abbott on 2020-09-08 17:27:41 UTC ---

Between Aug 25 and Sep 2, the correct version of `cri-o` was no longer available in the RHAOS 4.4 repo.  The RHCOS build process looked for the best match for the `cri-o` package and selected the version from the RHAOS 4.3 repos.

I'll work with ART to make sure that the correct version of `cri-o` is re-included in the RHAOS 4.4 repo.

--- Additional comment from Eric Paris on 2020-09-08 18:00:26 UTC ---

This bug sets Target Release equal to a z-stream but has no bug in the 'Depends On' field. As such this is not a valid bug state and the target release is being unset.

Any bug targeting 4.1.z must have a bug targeting 4.2 in 'Depends On.'
Similarly, any bug targeting 4.2.z must have a bug with Target Release of 4.3 in 'Depends On.'

Comment 1 Yuxiang Zhu 2020-09-09 12:42:10 UTC
This happened because older versions of cri-o (cri-o-1.17.5-2.rhaos4.4.git7f0085b.el8 and cri-o-1.17.5-3.rhaos4.4.git6b97f81.el8) were tagged into rhaos-4.4-rhel-8-candidate after cri-o-1.17.5-4.rhaos4.4.git7f0085b.el8 had already been tagged in:

$ brew list-history --package=cri-o --tag=rhaos-4.4-rhel-8-candidate | tail
Mon Aug 10 16:49:31 2020 cri-o-1.17.4-24.rhaos4.4.git73658e6.el8 tagged into rhaos-4.4-rhel-8-candidate by lmandvek [still active]
Wed Aug 12 04:23:51 2020 cri-o-1.17.4-15.dev.rhaos4.4.git3572ab6.el8 untagged from rhaos-4.4-rhel-8-candidate by garbage-collector
Tue Aug 18 00:44:32 2020 cri-o-1.17.4-25.rhaos4.4.git462bd29.el8 tagged into rhaos-4.4-rhel-8-candidate by lmandvek [still active]
Tue Aug 18 11:21:45 2020 cri-o-1.17.5-2.rhaos4.4.git34a1ed2.el8 tagged into rhaos-4.4-rhel-8-candidate by lmandvek [still active]
Fri Aug 21 06:47:45 2020 cri-o-1.17.5-3.rhaos4.4.git75b5183.el8 tagged into rhaos-4.4-rhel-8-candidate by lmandvek [still active]
Mon Aug 24 18:48:22 2020 cri-o-1.17.5-4.rhaos4.4.git7f0085b.el8 tagged into rhaos-4.4-rhel-8-candidate by lmandvek [still active]
Sun Aug 30 07:15:07 2020 cri-o-1.17.5-2.rhaos4.4.git7f0085b.el8 tagged into rhaos-4.4-rhel-8-candidate by pehunt
Tue Sep  1 13:27:36 2020 cri-o-1.17.5-3.rhaos4.4.git6b97f81.el8 tagged into rhaos-4.4-rhel-8-candidate by pehunt
Mon Sep  7 23:18:47 2020 cri-o-1.17.5-3.rhaos4.4.git6b97f81.el8 untagged from rhaos-4.4-rhel-8-candidate by yuxzhu
Mon Sep  7 23:19:25 2020 cri-o-1.17.5-2.rhaos4.4.git7f0085b.el8 untagged from rhaos-4.4-rhel-8-candidate by yuxzhu

On Sep 7 I untagged those 2 older versions but apparently it was too late. 

The root cause is that ART leverages Errata Tool to sign RPMs. However, Errata Tool doesn't allow us to attach an older version after a newer version is released. As a result, cri-o was excluded from the RHAOS repo created by our pipeline.

I've triggered a force rebuild of the repo. Hopefully cri-o will be there and ready for RHCOS to consume.
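
The out-of-order tagging in the history above can be spotted mechanically. The sketch below is illustrative only: it assumes the most recently tagged build is the one a consumer of the tag sees as "latest", it compares only the numeric version and the leading integer of the release (not a full RPM version comparison), and the helper name `tagged_builds` is hypothetical. The input is the subset of `tagged into` lines for the 1.17.5 builds, as they stood before the Sep 7 untagging.

```python
import re

# "tagged into" lines for the 1.17.5 builds, copied from the history above.
HISTORY = """\
Tue Aug 18 11:21:45 2020 cri-o-1.17.5-2.rhaos4.4.git34a1ed2.el8 tagged into rhaos-4.4-rhel-8-candidate by lmandvek [still active]
Fri Aug 21 06:47:45 2020 cri-o-1.17.5-3.rhaos4.4.git75b5183.el8 tagged into rhaos-4.4-rhel-8-candidate by lmandvek [still active]
Mon Aug 24 18:48:22 2020 cri-o-1.17.5-4.rhaos4.4.git7f0085b.el8 tagged into rhaos-4.4-rhel-8-candidate by lmandvek [still active]
Sun Aug 30 07:15:07 2020 cri-o-1.17.5-2.rhaos4.4.git7f0085b.el8 tagged into rhaos-4.4-rhel-8-candidate by pehunt
Tue Sep  1 13:27:36 2020 cri-o-1.17.5-3.rhaos4.4.git6b97f81.el8 tagged into rhaos-4.4-rhel-8-candidate by pehunt
"""

# Captures the full NVR, the dotted version, and the leading release number.
# "untagged from" lines do not match because of the " tagged into " suffix.
EVENT = re.compile(r"(cri-o-(\d+(?:\.\d+)*)-(\d+)\.\S+) tagged into ")


def tagged_builds(history: str):
    """Return [(nvr, (version_tuple, release_int)), ...] in log order."""
    builds = []
    for line in history.splitlines():
        match = EVENT.search(line)
        if match:
            version = tuple(int(part) for part in match.group(2).split("."))
            release = int(match.group(3))
            builds.append((match.group(1), (version, release)))
    return builds


builds = tagged_builds(HISTORY)
last_tagged = builds[-1][0]                       # newest by tagging time
highest = max(builds, key=lambda b: b[1])[0]      # newest by version-release
print("last tagged:", last_tagged)
print("highest    :", highest)
print("out of order:", last_tagged != highest)
```

Here the last-tagged build is a 1.17.5-3 build while the highest version-release is 1.17.5-4, which matches the situation described in this comment.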

Comment 3 Micah Abbott 2020-09-09 14:16:35 UTC
The 4.4 plashet was updated this morning, and a new RHCOS build (44.82.202009091324-0) is available with the correct version of `cri-o`.

Moving to MODIFIED

Comment 7 weiwei jiang 2020-09-10 07:51:46 UTC
Checked with 4.4.0-0.nightly-2020-09-09-153044, and it's fixed now. 

qeci-8067-545wx-m-0.c.openshift-qe.internal     Ready    master   60m   v1.17.1+6af3663   10.0.0.5                    Red Hat Enterprise Linux CoreOS 44.82.202009091324-0 (Ootpa)   4.18.0-193.19.1.el8_2.x86_64   cri-o://1.17.5-4.rhaos4.4.git7f0085b.el8   beta.kubernetes.io/arch=amd64,beta.kubernetes.io/instance-type=n1-standard-4,beta.kubernetes.io/os=linux,failure-domain.beta.kubernetes.io/region=us-central1,failure-domain.beta.kubernetes.io/zone=us-central1-a,kubernetes.io/arch=amd64,kubernetes.io/hostname=qeci-8067-545wx-m-0.c.openshift-qe.internal,kubernetes.io/os=linux,node-role.kubernetes.io/master=,node.kubernetes.io/instance-type=n1-standard-4,node.openshift.io/os_id=rhcos,topology.kubernetes.io/region=us-central1,topology.kubernetes.io/zone=us-central1-a
qeci-8067-545wx-m-1.c.openshift-qe.internal     Ready    master   59m   v1.17.1+6af3663   10.0.0.6                    Red Hat Enterprise Linux CoreOS 44.82.202009091324-0 (Ootpa)   4.18.0-193.19.1.el8_2.x86_64   cri-o://1.17.5-4.rhaos4.4.git7f0085b.el8   beta.kubernetes.io/arch=amd64,beta.kubernetes.io/instance-type=n1-standard-4,beta.kubernetes.io/os=linux,failure-domain.beta.kubernetes.io/region=us-central1,failure-domain.beta.kubernetes.io/zone=us-central1-b,kubernetes.io/arch=amd64,kubernetes.io/hostname=qeci-8067-545wx-m-1.c.openshift-qe.internal,kubernetes.io/os=linux,node-role.kubernetes.io/master=,node.kubernetes.io/instance-type=n1-standard-4,node.openshift.io/os_id=rhcos,topology.kubernetes.io/region=us-central1,topology.kubernetes.io/zone=us-central1-b
qeci-8067-545wx-m-2.c.openshift-qe.internal     Ready    master   59m   v1.17.1+6af3663   10.0.0.4                    Red Hat Enterprise Linux CoreOS 44.82.202009091324-0 (Ootpa)   4.18.0-193.19.1.el8_2.x86_64   cri-o://1.17.5-4.rhaos4.4.git7f0085b.el8   beta.kubernetes.io/arch=amd64,beta.kubernetes.io/instance-type=n1-standard-4,beta.kubernetes.io/os=linux,failure-domain.beta.kubernetes.io/region=us-central1,failure-domain.beta.kubernetes.io/zone=us-central1-c,kubernetes.io/arch=amd64,kubernetes.io/hostname=qeci-8067-545wx-m-2.c.openshift-qe.internal,kubernetes.io/os=linux,node-role.kubernetes.io/master=,node.kubernetes.io/instance-type=n1-standard-4,node.openshift.io/os_id=rhcos,topology.kubernetes.io/region=us-central1,topology.kubernetes.io/zone=us-central1-c
qeci-8067-545wx-w-a-0.c.openshift-qe.internal   Ready    worker   45m   v1.17.1+6af3663   10.0.32.2                   Red Hat Enterprise Linux CoreOS 44.82.202009091324-0 (Ootpa)   4.18.0-193.19.1.el8_2.x86_64   cri-o://1.17.5-4.rhaos4.4.git7f0085b.el8   beta.kubernetes.io/arch=amd64,beta.kubernetes.io/instance-type=n1-standard-4,beta.kubernetes.io/os=linux,failure-domain.beta.kubernetes.io/region=us-central1,failure-domain.beta.kubernetes.io/zone=us-central1-a,kubernetes.io/arch=amd64,kubernetes.io/hostname=qeci-8067-545wx-w-a-0.c.openshift-qe.internal,kubernetes.io/os=linux,node-role.kubernetes.io/worker=,node.kubernetes.io/instance-type=n1-standard-4,node.openshift.io/os_id=rhcos,topology.kubernetes.io/region=us-central1,topology.kubernetes.io/zone=us-central1-a
qeci-8067-545wx-w-a-l-rhel-0                    Ready    worker   54s   v1.17.1+6af3663   10.0.32.5                   Red Hat Enterprise Linux Server 7.7 (Maipo)                    3.10.0-1127.19.1.el7.x86_64    cri-o://1.17.5-4.rhaos4.4.git7f0085b.el7   beta.kubernetes.io/arch=amd64,beta.kubernetes.io/instance-type=n1-standard-4,beta.kubernetes.io/os=linux,failure-domain.beta.kubernetes.io/region=us-central1,failure-domain.beta.kubernetes.io/zone=us-central1-a,kubernetes.io/arch=amd64,kubernetes.io/hostname=qeci-8067-545wx-w-a-l-rhel-0,kubernetes.io/os=linux,node-role.kubernetes.io/worker=,node.kubernetes.io/instance-type=n1-standard-4,node.openshift.io/os_id=rhel,topology.kubernetes.io/region=us-central1,topology.kubernetes.io/zone=us-central1-a
qeci-8067-545wx-w-a-l-rhel-1                    Ready    worker   48s   v1.17.1+6af3663   10.0.32.6                   Red Hat Enterprise Linux Server 7.7 (Maipo)                    3.10.0-1127.19.1.el7.x86_64    cri-o://1.17.5-4.rhaos4.4.git7f0085b.el7   beta.kubernetes.io/arch=amd64,beta.kubernetes.io/instance-type=n1-standard-4,beta.kubernetes.io/os=linux,failure-domain.beta.kubernetes.io/region=us-central1,failure-domain.beta.kubernetes.io/zone=us-central1-a,kubernetes.io/arch=amd64,kubernetes.io/hostname=qeci-8067-545wx-w-a-l-rhel-1,kubernetes.io/os=linux,node-role.kubernetes.io/worker=,node.kubernetes.io/instance-type=n1-standard-4,node.openshift.io/os_id=rhel,topology.kubernetes.io/region=us-central1,topology.kubernetes.io/zone=us-central1-a
qeci-8067-545wx-w-b-1.c.openshift-qe.internal   Ready    worker   44m   v1.17.1+6af3663   10.0.32.3                   Red Hat Enterprise Linux CoreOS 44.82.202009091324-0 (Ootpa)   4.18.0-193.19.1.el8_2.x86_64   cri-o://1.17.5-4.rhaos4.4.git7f0085b.el8   beta.kubernetes.io/arch=amd64,beta.kubernetes.io/instance-type=n1-standard-4,beta.kubernetes.io/os=linux,failure-domain.beta.kubernetes.io/region=us-central1,failure-domain.beta.kubernetes.io/zone=us-central1-b,kubernetes.io/arch=amd64,kubernetes.io/hostname=qeci-8067-545wx-w-b-1.c.openshift-qe.internal,kubernetes.io/os=linux,node-role.kubernetes.io/worker=,node.kubernetes.io/instance-type=n1-standard-4,node.openshift.io/os_id=rhcos,topology.kubernetes.io/region=us-central1,topology.kubernetes.io/zone=us-central1-b
qeci-8067-545wx-w-c-2.c.openshift-qe.internal   Ready    worker   41m   v1.17.1+6af3663   10.0.32.4                   Red Hat Enterprise Linux CoreOS 44.82.202009091324-0 (Ootpa)   4.18.0-193.19.1.el8_2.x86_64   cri-o://1.17.5-4.rhaos4.4.git7f0085b.el8   beta.kubernetes.io/arch=amd64,beta.kubernetes.io/instance-type=n1-standard-4,beta.kubernetes.io/os=linux,failure-domain.beta.kubernetes.io/region=us-central1,failure-domain.beta.kubernetes.io/zone=us-central1-c,kubernetes.io/arch=amd64,kubernetes.io/hostname=qeci-8067-545wx-w-c-2.c.openshift-qe.internal,kubernetes.io/os=linux,node-role.kubernetes.io/worker=,node.kubernetes.io/instance-type=n1-standard-4,node.openshift.io/os_id=rhcos,topology.kubernetes.io/region=us-central1,topology.kubernetes.io/zone=us-central1-c

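A stricter check than comparing only the upstream version is to compare the full version-release of cri-o across nodes while ignoring the distribution suffix (`.el7` vs `.el8`), since RHEL 7 and RHCOS nodes necessarily carry different dist tags. The sketch below is a hypothetical helper, not the QE tooling; the rows are trimmed to node name and runtime, copied from the listings in this report.

```python
import re

# Captures the version-release of cri-o without the trailing ".elN" dist tag.
RUNTIME = re.compile(r"cri-o://(\S+)\.el\d+(?=\s|$)")


def crio_builds(rows):
    """Return the set of distinct cri-o version-release strings in rows."""
    found = set()
    for row in rows:
        match = RUNTIME.search(row)
        if match:
            found.add(match.group(1))
    return found


# Before the fix (runtime strings from the original description).
BEFORE = [
    "RHCOS  cri-o://1.16.6-18.rhaos4.3.git538d861.el8",
    "RHEL   cri-o://1.17.5-4.rhaos4.4.git7f0085b.el7",
]

# After the fix (two of the nodes from the verification listing above).
AFTER = [
    "qeci-8067-545wx-w-a-0.c.openshift-qe.internal  cri-o://1.17.5-4.rhaos4.4.git7f0085b.el8",
    "qeci-8067-545wx-w-a-l-rhel-0                   cri-o://1.17.5-4.rhaos4.4.git7f0085b.el7",
]

before_builds = crio_builds(BEFORE)
after_builds = crio_builds(AFTER)
print("before consistent:", len(before_builds) == 1)
print("after consistent: ", len(after_builds) == 1)
```

A single distinct value means every node runs the same cri-o build for its platform; more than one indicates the kind of mismatch originally reported.
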
Comment 9 errata-xmlrpc 2020-09-15 17:32:44 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (OpenShift Container Platform 4.4.21 bug fix update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2020:3605

