Bug 1838984 - 4.4 MachineSet with 4.2 or earlier bootimages fails to scale up because old CRI-O chokes on new CRI-O config
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: Machine Config Operator
Version: 4.3.z
Hardware: Unspecified
OS: Unspecified
Priority: urgent
Severity: medium
Target Milestone: ---
Target Release: 4.3.z
Assignee: Sinny Kumari
QA Contact: Michael Nguyen
URL:
Whiteboard:
Depends On: 1830102
Blocks:
Reported: 2020-05-22 08:57 UTC by Sinny Kumari
Modified: 2020-06-17 20:28 UTC

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of: 1830102
Environment:
Last Closed: 2020-06-17 20:28:32 UTC
Target Upstream Version:
Embargoed:

Links
| System | ID | Private | Priority | Status | Summary | Last Updated |
|---|---|---|---|---|---|---|
| Github | openshift machine-config-operator pull 1747 | 0 | None | closed | Bug 1838984: [release-4.3] templates: Don't enable machine-config-daemon-host.service by default | 2020-06-05 17:50:27 UTC |
| Red Hat Product Errata | RHBA-2020:2436 | 0 | None | None | None | 2020-06-17 20:28:49 UTC |

Comment 3 Micah Abbott 2020-06-02 22:05:14 UTC
Verified with 4.3.0-0.nightly-2020-06-01-225519

1. Booted 4.2.34 cluster

```
$ oc get clusterversion
NAME      VERSION   AVAILABLE   PROGRESSING   SINCE   STATUS
version   4.2.34    True        False         5m20s   Cluster version is 4.2.34

$ oc get nodes
NAME                                         STATUS   ROLES    AGE   VERSION
ip-10-0-129-85.us-west-2.compute.internal    Ready    master   23m   v1.14.6-152-g117ba1f
ip-10-0-133-59.us-west-2.compute.internal    Ready    worker   15m   v1.14.6-152-g117ba1f
ip-10-0-146-102.us-west-2.compute.internal   Ready    worker   15m   v1.14.6-152-g117ba1f
ip-10-0-153-172.us-west-2.compute.internal   Ready    master   22m   v1.14.6-152-g117ba1f
ip-10-0-173-143.us-west-2.compute.internal   Ready    master   22m   v1.14.6-152-g117ba1f
ip-10-0-173-154.us-west-2.compute.internal   Ready    worker   14m   v1.14.6-152-g117ba1f
```
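The "all nodes Ready" check above can be scripted rather than eyeballed. This is only a minimal sketch, assuming the column layout that `oc get nodes` prints above (STATUS in the second column). The helper name `all_nodes_ready` is hypothetical and it reads the listing from stdin:

```shell
#!/usr/bin/env bash
# Hypothetical helper: succeeds only if every node row on stdin has STATUS "Ready".
# Assumes the column layout shown above: NAME STATUS ROLES AGE VERSION.
all_nodes_ready() {
  awk '
    $1 == "NAME" { next }            # skip the header row
    NF == 0      { next }            # skip blank lines
    { seen++ }
    $2 != "Ready" { bad++ }          # e.g. NotReady or Ready,SchedulingDisabled
    END {
      if (seen > 0 && bad == 0) exit 0
      exit 1
    }
  '
}

# Usage against a live cluster (assumes a logged-in oc session):
#   oc get nodes | all_nodes_ready && echo "all nodes Ready"
```

A node that is cordoned during an upgrade reports `Ready,SchedulingDisabled`, so this check deliberately fails until the rollout has finished.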

2. Upgraded to 4.3.0-0.nightly-2020-06-01-225519

```
$ oc patch clusterversion/version --patch '{"spec":{"upstream":"https://openshift-release.svc.ci.openshift.org/graph"}}' --type=merge
clusterversion.config.openshift.io/version patched

$ oc adm upgrade --allow-explicit-upgrade=true --force=true --to-image=registry.svc.ci.openshift.org/ocp/release:4.3.0-0.nightly-2020-06-01-225519
Updating to release image registry.svc.ci.openshift.org/ocp/release:4.3.0-0.nightly-2020-06-01-225519

$ oc get clusterversion
NAME      VERSION                             AVAILABLE   PROGRESSING   SINCE   STATUS
version   4.3.0-0.nightly-2020-06-01-225519   True        False         3m35s   Cluster version is 4.3.0-0.nightly-2020-06-01-225519

$ oc get nodes
NAME                                         STATUS   ROLES    AGE   VERSION
ip-10-0-129-85.us-west-2.compute.internal    Ready    master   81m   v1.16.2+18cfcc9
ip-10-0-133-59.us-west-2.compute.internal    Ready    worker   73m   v1.16.2+18cfcc9
ip-10-0-146-102.us-west-2.compute.internal   Ready    worker   73m   v1.16.2+18cfcc9
ip-10-0-153-172.us-west-2.compute.internal   Ready    master   81m   v1.16.2+18cfcc9
ip-10-0-173-143.us-west-2.compute.internal   Ready    master   81m   v1.16.2+18cfcc9
ip-10-0-173-154.us-west-2.compute.internal   Ready    worker   73m   v1.16.2+18cfcc9

$ oc debug node/ip-10-0-129-85.us-west-2.compute.internal
Starting pod/ip-10-0-129-85us-west-2computeinternal-debug ...
To use host binaries, run `chroot /host`
Pod IP: 10.0.129.85
If you don't see a command prompt, try pressing enter.
sh-4.2# chroot /host
sh-4.4# rpm-ostree status
State: idle
AutomaticUpdates: disabled
Deployments:
* pivot://quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:a20fcdb0e02b8bfd610711c9231789c54b403d4fcf91c9eb8a89a31bb52d0b87
              CustomOrigin: Managed by machine-config-operator
                   Version: 43.81.202006011853.0 (2020-06-01T18:58:42Z)

  pivot://quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:b64e472b57538ebd6808a1e0528d9ea83877207d29c091634f41627a609f9b04
              CustomOrigin: Managed by machine-config-operator
                   Version: 42.81.20200525.0 (2020-05-25T20:53:09Z)
```
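To confirm that a node pivoted to the new OS image, the interesting line in the `rpm-ostree status` output above is the Version of the deployment marked with `*` (the booted one). A rough sketch for pulling that out, assuming the text layout shown above. The helper name `booted_version` is hypothetical:

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the Version of the booted deployment,
# i.e. the entry marked with a leading "*" in the output above.
booted_version() {
  awk '
    /^\* /    { booted = 1; next }   # start of the booted deployment
    /^  [^ ]/ { booted = 0 }         # start of another (non-booted) deployment
    booted && $1 == "Version:" { print $2; exit }
  '
}

# Usage on the host (after `chroot /host` in the debug pod):
#   rpm-ostree status | booted_version
```

Run against the master node output above, this should print `43.81.202006011853.0`.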

3. Checked MachineSets and scaled up one of the worker pools

```
$ oc get machinesets -n openshift-machine-api
NAME                                      DESIRED   CURRENT   READY   AVAILABLE   AGE
miabbott-4-2-34-84khx-worker-us-west-2a   1         1         1       1           81m
miabbott-4-2-34-84khx-worker-us-west-2b   1         1         1       1           81m
miabbott-4-2-34-84khx-worker-us-west-2c   1         1         1       1           81m
miabbott-4-2-34-84khx-worker-us-west-2d   0         0                             81m
$ oc scale --replicas=2 machineset miabbott-4-2-34-84khx-worker-us-west-2a -n openshift-machine-api
machineset.machine.openshift.io/miabbott-4-2-34-84khx-worker-us-west-2a scaled

$ oc get machinesets -n openshift-machine-api
NAME                                      DESIRED   CURRENT   READY   AVAILABLE   AGE
miabbott-4-2-34-84khx-worker-us-west-2a   2         2         2       2           93m
miabbott-4-2-34-84khx-worker-us-west-2b   1         1         1       1           93m
miabbott-4-2-34-84khx-worker-us-west-2c   1         1         1       1           93m
miabbott-4-2-34-84khx-worker-us-west-2d   0         0                             93m
$ oc get nodes
NAME                                         STATUS   ROLES    AGE     VERSION
ip-10-0-129-85.us-west-2.compute.internal    Ready    master   94m     v1.16.2+18cfcc9
ip-10-0-132-188.us-west-2.compute.internal   Ready    worker   7m38s   v1.16.2+18cfcc9
ip-10-0-133-59.us-west-2.compute.internal    Ready    worker   86m     v1.16.2+18cfcc9
ip-10-0-146-102.us-west-2.compute.internal   Ready    worker   86m     v1.16.2+18cfcc9
ip-10-0-153-172.us-west-2.compute.internal   Ready    master   94m     v1.16.2+18cfcc9
ip-10-0-173-143.us-west-2.compute.internal   Ready    master   94m     v1.16.2+18cfcc9
ip-10-0-173-154.us-west-2.compute.internal   Ready    worker   86m     v1.16.2+18cfcc9

$ oc debug node/ip-10-0-132-188.us-west-2.compute.internal
Starting pod/ip-10-0-132-188us-west-2computeinternal-debug ...
To use host binaries, run `chroot /host`
Pod IP: 10.0.132.188
If you don't see a command prompt, try pressing enter.
sh-4.2# chroot /host
sh-4.4# rpm-ostree status
State: idle
AutomaticUpdates: disabled
Deployments:
* pivot://quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:a20fcdb0e02b8bfd610711c9231789c54b403d4fcf91c9eb8a89a31bb52d0b87
              CustomOrigin: Managed by machine-config-operator
                   Version: 43.81.202006011853.0 (2020-06-01T18:58:42Z)

  pivot://quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:cc71fbd134f063d9fc0ccc78933b89c8dd2b1418b7a7b85bb70de87bc80486d7
              CustomOrigin: Image generated via coreos-assembler
                   Version: 42.80.20191002.0 (2019-10-02T13:31:28Z)
```
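The scale-up in this step counts as successful once DESIRED, CURRENT, READY and AVAILABLE all agree for the scaled MachineSet. A minimal sketch of that check, assuming the `oc get machinesets` table layout shown above. The helper name `machineset_converged` is hypothetical and it reads the table from stdin:

```shell
#!/usr/bin/env bash
# Hypothetical helper: succeeds if the named MachineSet row on stdin shows
# DESIRED == CURRENT == READY == AVAILABLE.
# Assumes the columns shown above: NAME DESIRED CURRENT READY AVAILABLE AGE.
machineset_converged() {
  local name="$1"
  awk -v name="$name" '
    $1 == name {
      found = 1
      # Rows with 0 replicas leave READY/AVAILABLE blank, so only 4 fields remain.
      if (NF == 4) ok = ($2 == 0 && $3 == 0)
      else         ok = ($2 == $3 && $2 == $4 && $2 == $5)
    }
    END {
      if (found && ok) exit 0
      exit 1
    }
  '
}

# Usage against a live cluster (assumes a logged-in oc session):
#   until oc get machinesets -n openshift-machine-api \
#       | machineset_converged miabbott-4-2-34-84khx-worker-us-west-2a; do
#     sleep 30
#   done
```

With the bug present, the new machine would never join as a Ready node, so a loop like this would spin indefinitely. A timeout around it is advisable.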

4. Upgraded to 4.4.0-0.nightly-2020-06-02-093230

```
$ oc adm upgrade --allow-explicit-upgrade=true --force=true --to-image=registry.svc.ci.openshift.org/ocp/release:4.4.0-0.nightly-2020-06-02-093230
Updating to release image registry.svc.ci.openshift.org/ocp/release:4.4.0-0.nightly-2020-06-02-093230                                                                                                                                                                                       

$ oc get clusterversion
NAME      VERSION                             AVAILABLE   PROGRESSING   SINCE   STATUS
version   4.4.0-0.nightly-2020-06-02-093230   True        False         15m     Cluster version is 4.4.0-0.nightly-2020-06-02-093230
$ oc get nodes
NAME                                         STATUS   ROLES    AGE    VERSION
ip-10-0-129-85.us-west-2.compute.internal    Ready    master   145m   v1.17.1+f5fb168
ip-10-0-132-188.us-west-2.compute.internal   Ready    worker   58m    v1.17.1+f5fb168
ip-10-0-133-59.us-west-2.compute.internal    Ready    worker   137m   v1.17.1+f5fb168
ip-10-0-146-102.us-west-2.compute.internal   Ready    worker   137m   v1.17.1+f5fb168
ip-10-0-153-172.us-west-2.compute.internal   Ready    master   145m   v1.17.1+f5fb168
ip-10-0-173-143.us-west-2.compute.internal   Ready    master   145m   v1.17.1+f5fb168
ip-10-0-173-154.us-west-2.compute.internal   Ready    worker   137m   v1.17.1+f5fb168

$ oc debug node/ip-10-0-132-188.us-west-2.compute.internal
Starting pod/ip-10-0-132-188us-west-2computeinternal-debug ...
To use host binaries, run `chroot /host`
Pod IP: 10.0.132.188
If you don't see a command prompt, try pressing enter.
sh-4.2# chroot /host
sh-4.4# rpm-ostree status
State: idle
AutomaticUpdates: disabled
Deployments:
* pivot://quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:6ce91fcf3b244f86b2fbda6daa6a80c76ea99b0d4640dd64e10469999b540be2
              CustomOrigin: Managed by machine-config-operator
                   Version: 44.81.202006011547-0 (2020-06-01T15:52:16Z)

  pivot://quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:a20fcdb0e02b8bfd610711c9231789c54b403d4fcf91c9eb8a89a31bb52d0b87
              CustomOrigin: Managed by machine-config-operator
                   Version: 43.81.202006011853.0 (2020-06-01T18:58:42Z)
```

5. Checked MachineSets and scaled up a different worker pool

```
$ oc get machinesets -n openshift-machine-api
NAME                                      DESIRED   CURRENT   READY   AVAILABLE   AGE
miabbott-4-2-34-84khx-worker-us-west-2a   2         2         2       2           145m
miabbott-4-2-34-84khx-worker-us-west-2b   1         1         1       1           145m
miabbott-4-2-34-84khx-worker-us-west-2c   1         1         1       1           145m
miabbott-4-2-34-84khx-worker-us-west-2d   0         0                             145m
$ oc scale --replicas=2 machineset miabbott-4-2-34-84khx-worker-us-west-2b -n openshift-machine-api
machineset.machine.openshift.io/miabbott-4-2-34-84khx-worker-us-west-2b scaled

$ oc get machinesets -n openshift-machine-api
NAME                                      DESIRED   CURRENT   READY   AVAILABLE   AGE
miabbott-4-2-34-84khx-worker-us-west-2a   2         2         2       2           154m
miabbott-4-2-34-84khx-worker-us-west-2b   2         2         2       2           154m
miabbott-4-2-34-84khx-worker-us-west-2c   1         1         1       1           154m
miabbott-4-2-34-84khx-worker-us-west-2d   0         0                             154m
$ oc get nodes
NAME                                         STATUS   ROLES    AGE     VERSION
ip-10-0-129-85.us-west-2.compute.internal    Ready    master   155m    v1.17.1+f5fb168
ip-10-0-132-188.us-west-2.compute.internal   Ready    worker   68m     v1.17.1+f5fb168
ip-10-0-133-59.us-west-2.compute.internal    Ready    worker   147m    v1.17.1+f5fb168
ip-10-0-146-102.us-west-2.compute.internal   Ready    worker   147m    v1.17.1+f5fb168
ip-10-0-153-172.us-west-2.compute.internal   Ready    master   155m    v1.17.1+f5fb168
ip-10-0-159-46.us-west-2.compute.internal    Ready    worker   3m59s   v1.17.1+f5fb168
ip-10-0-173-143.us-west-2.compute.internal   Ready    master   155m    v1.17.1+f5fb168
ip-10-0-173-154.us-west-2.compute.internal   Ready    worker   147m    v1.17.1+f5fb168
$ oc debug node/ip-10-0-159-46.us-west-2.compute.internal
Starting pod/ip-10-0-159-46us-west-2computeinternal-debug ...
To use host binaries, run `chroot /host`
Pod IP: 10.0.159.46
If you don't see a command prompt, try pressing enter.
sh-4.2# chroot /host
sh-4.4# rpm-ostree status
State: idle
AutomaticUpdates: disabled
Deployments:
* pivot://quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:6ce91fcf3b244f86b2fbda6daa6a80c76ea99b0d4640dd64e10469999b540be2
              CustomOrigin: Managed by machine-config-operator
                   Version: 44.81.202006011547-0 (2020-06-01T15:52:16Z)

  pivot://quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:cc71fbd134f063d9fc0ccc78933b89c8dd2b1418b7a7b85bb70de87bc80486d7
              CustomOrigin: Image generated via coreos-assembler
                   Version: 42.80.20191002.0 (2019-10-02T13:31:28Z)
```
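In the output above, the non-booted deployment with `CustomOrigin: Image generated via coreos-assembler` is the image the new node first booted from, here `42.80.20191002.0`, which is what shows the node came up from the old bootimage before pivoting. A small sketch for extracting that version, assuming the `rpm-ostree status` text layout shown above. The helper name `bootimage_version` is hypothetical:

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the Version of the deployment whose CustomOrigin is
# "Image generated via coreos-assembler", i.e. the original bootimage.
# Prints nothing if no such deployment is listed.
bootimage_version() {
  awk '
    /CustomOrigin: Image generated via coreos-assembler/ { hit = 1; next }
    /CustomOrigin:/ { hit = 0 }      # any other origin belongs to a different deployment
    hit && $1 == "Version:" { print $2; exit }
  '
}

# Usage on the host (after `chroot /host` in the debug pod):
#   rpm-ostree status | bootimage_version
```

On nodes that have already been through two upgrades, such as ip-10-0-132-188 after step 4, the bootimage deployment has been rotated out and this prints nothing.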

Comment 7 errata-xmlrpc 2020-06-17 20:28:32 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2020:2436
