+++ This bug was initially created as a clone of Bug #1995785 +++

Description of problem:

Another part of the fallout from https://bugzilla.redhat.com/show_bug.cgi?id=1993385 is an interaction between rpm-ostree and older versions of MCO. If a cluster was ever at a version where the MCO configured /etc/crio/crio.conf (4.5 or earlier), then updates to the cri-o rpm won't update the crio.conf file (in ways like updating the conmon path).

The fix for https://bugzilla.redhat.com/show_bug.cgi?id=1993385 only updated MCO to *not* specify the conmon path in the drop-in template, on the assumption that this would leave it at the CRI-O default of "". On these clusters, however, the pre-existing value in /etc/crio/crio.conf (left unchanged by the rpm fix) prevails, so cri-o expects conmon to be at /usr/libexec/crio/conmon, which no longer exists. This causes nodes to not come up.

Version-Release number of selected component (if applicable):

How reproducible:

Steps to Reproduce:
1. Upgrade a node from 4.5 to an affected version (going through each minor version)
2. Notice that cri-o does not come up, in ways similar to https://bugzilla.redhat.com/show_bug.cgi?id=1993385

Actual results:
The node does not come up

Expected results:
The node starts

Additional info:
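The precedence problem described above can be illustrated with a minimal sketch. It assumes that drop-in settings override the base file key by key and that any key missing from a drop-in falls through to the base file. The function name `effective_config`, the flat dict representation, and the second drop-in (which sets conmon to "" explicitly) are hypothetical illustrations, not CRI-O's actual loader or the exact change that shipped.

```python
def effective_config(base, dropins):
    """Merge drop-in settings over a base config, key by key (simplified model)."""
    merged = dict(base)
    for dropin in dropins:
        # A key present in the drop-in replaces the base value;
        # a key absent from the drop-in leaves the base value untouched.
        merged.update(dropin)
    return merged


# Stale value left in /etc/crio/crio.conf by an older MCO (4.5 or earlier).
base = {"conmon": "/usr/libexec/crio/conmon"}

# Drop-in that simply omits conmon, as in the first fix.
dropin_without_conmon = {"pids_limit": 2048}

# Hypothetical drop-in that sets conmon to "" explicitly.
dropin_with_empty_conmon = {"pids_limit": 2048, "conmon": ""}

stale = effective_config(base, [dropin_without_conmon])
overridden = effective_config(base, [dropin_with_empty_conmon])

print(repr(stale["conmon"]))
print(repr(overridden["conmon"]))
```

Under this model, omitting the key is not the same as resetting it: only a drop-in that names the key can displace the stale value in the base file.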
Verified on 4.8.0-0.nightly-2021-08-21-050932

1. Install 4.7.24
2. oc debug to a worker, edit /etc/crio/crio.conf, make some changes (I changed loglevel and turned metrics on), and save the file
3. Create a containerruntime config with the following contents

   apiVersion: machineconfiguration.openshift.io/v1
   kind: ContainerRuntimeConfig
   metadata:
     name: set-pids-limit
   spec:
     machineConfigPoolSelector:
       matchLabels:
         custom-crio: high-pid-limit
     containerRuntimeConfig:
       pidsLimit: 2048

4. oc label machineconfigpool worker custom-crio=high-pid-limit
5. oc get mcp worker -w and watch for all workers to be ready
6. oc adm upgrade --force --allow-explicit-upgrade --to-image registry.ci.openshift.org/ocp/release:4.8.0-0.nightly-2021-08-21-050932

- verify upgrade successful
- oc debug to the node where crio.conf was modified and verify customizations are still in place
- crio config | grep conmon and verify value is "" and not /usr/libexec/crio/conmon
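The final check above (the conmon value in the `crio config` output) could be scripted along these lines. This is a rough sketch: the helper name `conmon_value` is hypothetical, and the sample text is a made-up fragment in the `key = "value"` style, not captured output from a real node.

```python
import re


def conmon_value(config_text):
    """Return the value of the `conmon` key, or None if the key is absent."""
    # Anchor on the exact key name so conmon_cgroup, conmon_env, etc. are ignored.
    match = re.search(r'^\s*conmon\s*=\s*"([^"]*)"\s*$', config_text, re.MULTILINE)
    return match.group(1) if match else None


# Made-up sample in the style of `crio config` output (assumption).
sample = '''
[crio.runtime]
conmon = ""
conmon_cgroup = "system.slice"
'''

value = conmon_value(sample)
if value == "":
    print("conmon is unset; cri-o will use its default lookup")
else:
    print("unexpected conmon value:", value)
```

Unlike a plain grep for conmon, this matches only the conmon key itself, so related keys such as conmon_cgroup do not produce false positives.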
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA.

For information on the advisory (OpenShift Container Platform 4.8.9 bug fix), and where to find the updated files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2021:3247