Description of problem: MCO is doing an extra node reboot on 4.5 -> 4.6 upgrade. After a 4.5 install the nodes have 2 boots (initial RHCOS + pivoted). After upgrade to 4.6.fc5 the nodes have 2 more boots. It seems one of the upgrade reboots could be avoided.

STR:
* Launch 4.5
* Check `journalctl --list-boots` on any node
* Upgrade to 4.6
* Check `journalctl --list-boots` on any node

I'll link the collected must-gather in follow-up comments.
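For convenience, the per-node check can be wrapped in a loop like the one below. This is just a minimal sketch of the STR check above; it assumes `oc debug` access to the nodes and that `journalctl --list-boots` prints one line per boot with no header (as it does in the output pasted later in this bug), so the line count equals the boot count:

$ for node in $(oc get nodes -o name); do
    echo -n "${node}: "
    oc debug "${node}" -- chroot /host journalctl --list-boots 2>/dev/null | wc -l
  done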
I installed my own cluster and was able to reproduce this. Here is what I think is happening: worker nodes only have 3 boots, masters have 4. There are actually 3 rendered master configs vs. 2 for workers, and this is because the OLD machine-config-controller renders one last config before it dies. The only diff between those configs is the following two files:

/etc/kubernetes/static-pod-resources/etcd-member/ca.crt
/etc/kubernetes/static-pod-resources/etcd-member/metric-ca.crt

With the introduction of https://github.com/openshift/machine-config-operator/pull/1933 we removed the etcdCAData and etcdMetricCAData keys from the controllerconfig. This causes a new controllerconfig, stripped of those keys, to be generated when you upgrade to 4.6, which the controller then takes and uses to render a new machineconfig. I believe this happens because the new 4.6 MCO rolls out in this order:

1. RenderConfig
2. MachineConfigPools
3. MachineConfigDaemon
4. MachineConfigController
5. MachineConfigServer
6. RequiredPools

So before the new MCC takes over, the old MCC still has one last chance to render a new config and start the upgrade process. If my assessment is correct, I don't believe this is a regression, since we've always rolled out in this order. It's somewhat odd that this has never been reported before (or has it?): the MCO will roll out an extra config IF the controllerconfig has changed. Now that I think about it, I've seen quite a few must-gathers that exhibit similar behaviour. I guess most people don't look at the boots.
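To double-check this on a reproducer, comparing the two newest rendered master configs should show only the two etcd CA files above. A minimal sketch (the rendered-master names here are placeholders; substitute the hashes from your own cluster):

$ oc get machineconfigs --sort-by=.metadata.creationTimestamp | grep rendered-master
$ diff <(oc get mc rendered-master-<old-hash> -o yaml) <(oc get mc rendered-master-<new-hash> -o yaml)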
I took a quick look at the code and I think the steps are as follows:

1. The new MCO rolls out. It first syncs the controllerconfig to the new one as part of the RenderConfig step above.
2. The running (old) MCC's template controller has an informer on the controllerconfig, which, since the previous step updated the config, triggers a syncControllerConfig. This re-generates the base master config (the 00-master MC).
3. The running (old) MCC now sees an MC change, which triggers another sync to generate a new rendered config, and it instructs the master pool to start an update.
4. The new MCC rolls out and does the whole rendering process, creating the actual updated MC for the master pool. It waits until the previous update is finished and then starts the new (real) update.

I don't recall us modifying this behaviour recently, so I think we've always done it this way. I'm surprised that the controllerconfig being synced first by the old MCC hasn't caused us trouble in the past (as far as I know). Do you think this is high enough impact to warrant high priority as a BZ? It shouldn't affect anything currently other than incurring one extra reboot.
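If that reading is right, the intermediate rendered-master config should be stamped with the old controller version while the final one carries the new controller version. A rough way to eyeball this (assuming the rendered configs carry the machineconfiguration.openshift.io/generated-by-controller-version annotation, which I believe they do):

$ oc get mc --sort-by=.metadata.creationTimestamp \
    -o custom-columns='NAME:.metadata.name,GENERATED-BY:.metadata.annotations.machineconfiguration\.openshift\.io/generated-by-controller-version'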
>Do you think this is high enough impact to warrant high priority as a BZ?

I think it's pretty important. Every additional boot takes a lot of time, especially on bare metal. Also, a mix of MCC and operator versions is concerning and may bite us later.

Another problem I've hit is OKD-specific, but related to this. In OKD 4.6 we need to install NetworkManager-ovs on the hosts for networking to function correctly. For clean installs we've chosen to install it using OS extensions, and during the 4.5 -> 4.6 OKD upgrade we add an MC which contains the necessary extensions. But because of the additional reboot the nodes don't come up: they already use the new network configuration, but don't yet have the necessary packages installed (the new MCC hasn't generated Ignition with the 4.6 osImageURL yet). As a result the whole upgrade stops.
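For reference, the OKD machineconfig in question is roughly of the following shape. This is only a sketch: the MC name, pool label, Ignition version and the exact extension name are illustrative (written from memory) and may differ from what OKD actually ships.

$ cat <<'EOF' | oc apply -f -
apiVersion: machineconfiguration.openshift.io/v1
kind: MachineConfig
metadata:
  name: 99-worker-okd-extensions
  labels:
    machineconfiguration.openshift.io/role: worker
spec:
  config:
    ignition:
      version: 3.1.0
  extensions:
    - NetworkManager-ovs
EOF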
Looking at this now with Jerry, what I'm feeling is that in this ordering:

RenderConfig
...
MachineConfigController

we really want both of these to happen before we start rolling out any changes to the pools. There were a few changes in the MCO around similar "version mismatch" issues:

- https://github.com/openshift/machine-config-operator/commit/c2ef2c1bad545eb8d60e3f57024526dfb49829df
- https://github.com/openshift/machine-config-operator/commit/c7d3d9a805a0932afbb4141600ff0038c480c59c

Does the RenderConfig not have the operator version embedded today? I think the simplest fix along those lines would be to embed it, and have the controller check it and ignore the render config if it doesn't match.
Sorry, I think it's the controllerconfig object that needs a version; the renderconfig is just what the operator uses.
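In the meantime, the mismatch window can at least be observed by hand during an upgrade: compare the version reported by the machine-config ClusterOperator against the generated-by-controller-version values from the `oc get mc` query in my earlier comment. A rough sketch:

$ oc get clusteroperator machine-config \
    -o jsonpath='{range .status.versions[*]}{.name}{"\t"}{.version}{"\n"}{end}'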
OK, since I had this in mind I just tried to dive in and do a fix: https://github.com/openshift/machine-config-operator/pull/2112
Tentatively setting this to 4.7 as this shouldn't block the existing 4.5 -> 4.6 upgrade. As discussed, there are a few approaches:

1. Backport a fix to 4.5 that has the 4.5 controller check which operator version generated the controllerconfig. If there is a mismatch, don't render the MC (basically it would wait for the new controller to roll out first).
   - This approach comes with the issue that we must block all 4.5 -> 4.6 upgrades without the fix.
2. Have the 4.6 operator pod "pause" the old controller pod (before it generates the new controllerconfig) until the daemonset finishes the rolling update for the pod, then resume operations.
   - We can potentially expand this so that all operations are paused while the new operator is rolling out pods. This may have other side effects.

Given the timelines I'm not sure we will be able to fix this for 4.6.
Following a conversation with Jerry, we will move this to 4.7 pending a conversation about the proposed change with the MCO team next week.
I think we should fix this for 4.7, marking blocker.
Verified on 4.7.0-0.nightly-2020-12-11-135127. Upgraded from 4.6.8 to 4.7.0-0.nightly-2020-12-11-135127 -- only one additional boot per node is observed.

Before the upgrade:

$ for i in $(oc get nodes -o name); do oc debug $i -- chroot /host journalctl --list-boots; done
Starting pod/ip-10-0-139-254us-west-2computeinternal-debug ...
To use host binaries, run `chroot /host`
-1 50766f40d79940d4ba3e00d02994e533 Fri 2020-12-11 21:52:58 UTC—Fri 2020-12-11 22:02:56 UTC
 0 c77f9a72e7044ec990173b52a8cdad0c Fri 2020-12-11 22:03:15 UTC—Fri 2020-12-11 22:30:20 UTC
Removing debug pod ...
Starting pod/ip-10-0-144-25us-west-2computeinternal-debug ...
To use host binaries, run `chroot /host`
-1 e6803cbd7a1f4bdb946abdaac0e62de0 Fri 2020-12-11 22:04:04 UTC—Fri 2020-12-11 22:10:26 UTC
 0 7900e377606b4e2c9a01520b0ddec718 Fri 2020-12-11 22:10:43 UTC—Fri 2020-12-11 22:30:32 UTC
Removing debug pod ...
Starting pod/ip-10-0-165-187us-west-2computeinternal-debug ...
To use host binaries, run `chroot /host`
-1 1855da4e7bc247aa865e21ef6ce6f898 Fri 2020-12-11 22:03:05 UTC—Fri 2020-12-11 22:10:22 UTC
 0 776ff364a2a447bc8e42abadeb1874ca Fri 2020-12-11 22:10:39 UTC—Fri 2020-12-11 22:30:45 UTC
Removing debug pod ...
Starting pod/ip-10-0-173-136us-west-2computeinternal-debug ...
To use host binaries, run `chroot /host`
-1 77fcc9652021428ebd8b12d4c76346c2 Fri 2020-12-11 21:52:46 UTC—Fri 2020-12-11 21:58:35 UTC
 0 332db58320bd4257a93de7e19da70ac8 Fri 2020-12-11 21:58:53 UTC—Fri 2020-12-11 22:30:57 UTC
Removing debug pod ...
Starting pod/ip-10-0-212-236us-west-2computeinternal-debug ...
To use host binaries, run `chroot /host`
-1 edc410c7d3874c899e28e5bb6f82b709 Fri 2020-12-11 21:52:46 UTC—Fri 2020-12-11 22:02:22 UTC
 0 57a062b876234798be5ace680cddd9d5 Fri 2020-12-11 22:02:39 UTC—Fri 2020-12-11 22:31:07 UTC
Removing debug pod ...
Starting pod/ip-10-0-221-39us-west-2computeinternal-debug ...
To use host binaries, run `chroot /host`
-1 628feefe3c2945b785b448aedc132270 Fri 2020-12-11 22:03:09 UTC—Fri 2020-12-11 22:04:50 UTC
 0 d09c0fb7e0294e099cf910e58f244ebf Fri 2020-12-11 22:05:07 UTC—Fri 2020-12-11 22:31:20 UTC
Removing debug pod ...

$ oc adm upgrade --force --to-image=registry.svc.ci.openshift.org/ocp/release:4.7.0-0.nightly-2020-12-11-135127
Updating to release image registry.svc.ci.openshift.org/ocp/release:4.7.0-0.nightly-2020-12-11-135127
$ watch oc get clusterversion
$ oc get clusterversion
NAME      VERSION                             AVAILABLE   PROGRESSING   SINCE   STATUS
version   4.7.0-0.nightly-2020-12-11-135127   True        False         4m52s   Cluster version is 4.7.0-0.nightly-2020-12-11-135127

After the upgrade:

$ for i in $(oc get nodes -o name); do oc debug $i -- chroot /host journalctl --list-boots; done
Starting pod/ip-10-0-139-254us-west-2computeinternal-debug ...
To use host binaries, run `chroot /host`
-2 50766f40d79940d4ba3e00d02994e533 Fri 2020-12-11 21:52:58 UTC—Fri 2020-12-11 22:02:56 UTC
-1 c77f9a72e7044ec990173b52a8cdad0c Fri 2020-12-11 22:03:15 UTC—Fri 2020-12-11 23:41:24 UTC
 0 0c41547572a54f53afe9bfe2e215fdf5 Fri 2020-12-11 23:41:42 UTC—Fri 2020-12-11 23:49:59 UTC
Removing debug pod ...
Starting pod/ip-10-0-144-25us-west-2computeinternal-debug ...
To use host binaries, run `chroot /host`
-2 e6803cbd7a1f4bdb946abdaac0e62de0 Fri 2020-12-11 22:04:04 UTC—Fri 2020-12-11 22:10:26 UTC
-1 7900e377606b4e2c9a01520b0ddec718 Fri 2020-12-11 22:10:43 UTC—Fri 2020-12-11 23:35:09 UTC
 0 53b3715f1c9547f68691bf1832609cd9 Fri 2020-12-11 23:35:26 UTC—Fri 2020-12-11 23:50:02 UTC
Removing debug pod ...
Starting pod/ip-10-0-165-187us-west-2computeinternal-debug ...
To use host binaries, run `chroot /host`
-2 1855da4e7bc247aa865e21ef6ce6f898 Fri 2020-12-11 22:03:05 UTC—Fri 2020-12-11 22:10:22 UTC
-1 776ff364a2a447bc8e42abadeb1874ca Fri 2020-12-11 22:10:39 UTC—Fri 2020-12-11 23:38:50 UTC
 0 ba0db9c1fe4e4770b6beb9a766ed654a Fri 2020-12-11 23:39:07 UTC—Fri 2020-12-11 23:50:06 UTC
Removing debug pod ...
Starting pod/ip-10-0-173-136us-west-2computeinternal-debug ...
To use host binaries, run `chroot /host`
-2 77fcc9652021428ebd8b12d4c76346c2 Fri 2020-12-11 21:52:46 UTC—Fri 2020-12-11 21:58:35 UTC
-1 332db58320bd4257a93de7e19da70ac8 Fri 2020-12-11 21:58:53 UTC—Fri 2020-12-11 23:32:04 UTC
 0 989640a9d5064e72a95cbdf9130f93eb Fri 2020-12-11 23:32:22 UTC—Fri 2020-12-11 23:50:09 UTC
Removing debug pod ...
Starting pod/ip-10-0-212-236us-west-2computeinternal-debug ...
To use host binaries, run `chroot /host`
-2 edc410c7d3874c899e28e5bb6f82b709 Fri 2020-12-11 21:52:46 UTC—Fri 2020-12-11 22:02:22 UTC
-1 57a062b876234798be5ace680cddd9d5 Fri 2020-12-11 22:02:39 UTC—Fri 2020-12-11 23:36:46 UTC
 0 3cfc0873b2c24284bc42fd0d30d4d696 Fri 2020-12-11 23:37:04 UTC—Fri 2020-12-11 23:50:12 UTC
Removing debug pod ...
Starting pod/ip-10-0-221-39us-west-2computeinternal-debug ...
To use host binaries, run `chroot /host`
-2 628feefe3c2945b785b448aedc132270 Fri 2020-12-11 22:03:09 UTC—Fri 2020-12-11 22:04:50 UTC
-1 d09c0fb7e0294e099cf910e58f244ebf Fri 2020-12-11 22:05:07 UTC—Fri 2020-12-11 23:31:31 UTC
 0 9ee0a30ec09049d4806254ca1c7f5157 Fri 2020-12-11 23:31:49 UTC—Fri 2020-12-11 23:50:16 UTC
Removing debug pod ...
$
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory (Moderate: OpenShift Container Platform 4.7.0 security, bug fix, and enhancement update), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHSA-2020:5633