Description of problem:

It is possible to trigger duplication of a ContainerRuntimeConfig's rendered MachineConfig when multiple ContainerRuntimeConfigs exist for a given set of nodes, and the config that gets duplicated is the first in the list. When each config manages the same setting, this effectively means what was the second config is now overridden.

Version-Release number of selected component (if applicable):

Tested and reproduced on 4.10.18 OSD clusters. Observed on a production customer OSD cluster at version 4.10.6.

How reproducible:

About 25% of the time.

Steps to Reproduce:

1. Create an OSD cluster and set up an IDP. (Note: SRE used backplane for access and did not set up an IDP.)

2. Log in to the cluster as soon as possible.

3. Wait for at least one worker to have pids_limit = 4096, applied by the "custom-crio" ContainerRuntimeConfig:

oc -n default debug node/$(oc get nodes | grep worker | grep -v infra | awk '{print $1}' | head -n1) -- chroot /host sh -c "crio config | grep pids_limit"

4. Apply a new ContainerRuntimeConfig to bump pids_limit to 65000:

cat << EOF | oc create -f -
apiVersion: machineconfiguration.openshift.io/v1
kind: ContainerRuntimeConfig
metadata:
  name: new-large-pidlimit
spec:
  containerRuntimeConfig:
    pidsLimit: 65000
  machineConfigPoolSelector:
    matchExpressions:
      - key: pools.operator.machineconfiguration.openshift.io/worker
        operator: Exists
EOF

5. Wait for at least one worker to have pids_limit = 65000, applied by the "new-large-pidlimit" ContainerRuntimeConfig:

oc -n default debug node/$(oc get nodes | grep worker | grep -v infra | awk '{print $1}' | head -n1) -- chroot /host sh -c "crio config | grep pids_limit"

6. Verify there are only 2 MachineConfigs for containerruntime:

oc get machineconfig | grep containerruntime

7. Force the CVO to reconcile:

oc -n openshift-cluster-version scale deployment cluster-version-operator --replicas=0
sleep 5
oc -n openshift-cluster-version scale deployment cluster-version-operator --replicas=1

8. Check the MachineConfigs for containerruntime again. If the problem is triggered (a 25% chance observed in testing), you will now see a third. This third one (with a -2 suffix) is a duplicate of the original MachineConfig created for "custom-crio".

oc get machineconfig | grep containerruntime

Actual results:

3 MachineConfigs for containerruntime exist, in this order:
1. custom-crio
2. new-large-pidlimit
3. custom-crio (duplicate)

Expected results:

2 MachineConfigs for containerruntime exist, in this order:
1. custom-crio
2. new-large-pidlimit

Additional info:

OSD creates a ContainerRuntimeConfig called "custom-crio" that sets pids_limit for workers to 4096. We support customers creating a second ContainerRuntimeConfig to adjust that limit and other settings, so the second, customer-created ContainerRuntimeConfig is expected to be (and usually is) rendered into a MachineConfig.

Given this reproduces while the cluster is new, while Nodes are being updated and ClusterOperators are progressing, it is likely a timing issue. While this is happening the "master" nodes are also being updated. To reproduce more consistently, the CVO was scaled down and back up to trigger a reconcile, which creates the third, rogue MachineConfig.

Must-gathers will be provided in a private comment.
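To confirm the rogue MachineConfig really is a byte-for-byte duplicate of the original, comparing the rendered specs should work. A minimal sketch, assuming the generated names seen on the affected cluster (99-worker-generated-containerruntime and its -2 copy):

# Empty diff output means the two rendered specs are identical.
diff <(oc get machineconfig 99-worker-generated-containerruntime -o jsonpath='{.spec}') \
     <(oc get machineconfig 99-worker-generated-containerruntime-2 -o jsonpath='{.spec}')

A matching GENERATEDBYCONTROLLER hash in plain "oc get machineconfig" output is another quick indicator.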
Note: I tested my theory of a race condition at startup on 11 clusters (user error on the 12th!). I did NOT reproduce the issue if all Nodes were done progressing and all ClusterOperators were done progressing with none degraded. The test was the same except for the wait conditions. Changes (a scripted sketch of the waits follows this list):

* After login, wait for all Nodes to finish progressing and for all ClusterOperators to be done progressing with none degraded.
* After creating the second ContainerRuntimeConfig, wait for pids_limit to be updated on all nodes before scaling the CVO.
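One way to approximate those wait conditions with oc (a sketch; the timeouts are arbitrary and the MachineConfigPool check stands in for "nodes done progressing"):

# All nodes Ready and all MachineConfigPools finished rolling out
oc wait --for=condition=Ready nodes --all --timeout=15m
oc wait --for=condition=Updated machineconfigpools --all --timeout=30m

# All ClusterOperators done progressing and none degraded
oc wait --for=condition=Progressing=false clusteroperators --all --timeout=30m
oc wait --for=condition=Degraded=false clusteroperators --all --timeout=30m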
The timeline on the customer cluster makes this hard to be 100% certain about. What I do see are the creation timestamps on resources in the cluster. Further complicating this, additional changes were made on the cluster after the issue triggered, so the -1 MachineConfig has been deleted. What is of interest, though, is the age of 99-worker-generated-containerruntime-2, which is a duplicate of 99-worker-generated-containerruntime. It was created 44 days later!

$ oc get machineconfig | grep containerruntime
99-worker-generated-containerruntime     e6ba00b885558712d660a3704c071490d999de6f   3.2.0   79d
99-worker-generated-containerruntime-2   e6ba00b885558712d660a3704c071490d999de6f   3.2.0   35d
99-worker-generated-containerruntime-3   e6ba00b885558712d660a3704c071490d999de6f   3.2.0   17d
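The AGE column rounds to days, so to pin the 44-day gap down precisely the raw creation timestamps help; a sketch:

oc get machineconfig -o custom-columns=NAME:.metadata.name,CREATED:.metadata.creationTimestamp | grep containerruntime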
*** Bug 2104160 has been marked as a duplicate of this bug. ***
% oc get clusterversion
NAME      VERSION                              AVAILABLE   PROGRESSING   SINCE   STATUS
version   4.10.0-0.nightly-2022-07-26-232654   True        False         79m     Cluster version is 4.10.0-0.nightly-2022-07-26-232654

% oc get nodes
NAME                                         STATUS   ROLES    AGE   VERSION
ip-10-0-142-35.us-east-2.compute.internal    Ready    worker   88m   v1.23.5+012e945
ip-10-0-149-126.us-east-2.compute.internal   Ready    master   93m   v1.23.5+012e945
ip-10-0-168-61.us-east-2.compute.internal    Ready    master   93m   v1.23.5+012e945
ip-10-0-179-76.us-east-2.compute.internal    Ready    worker   88m   v1.23.5+012e945
ip-10-0-218-35.us-east-2.compute.internal    Ready    master   94m   v1.23.5+012e945
ip-10-0-219-184.us-east-2.compute.internal   Ready    worker   88m   v1.23.5+012e945

% oc debug node/ip-10-0-142-35.us-east-2.compute.internal
Starting pod/ip-10-0-142-35us-east-2computeinternal-debug ...
…
sh-4.4# crio config | grep pids_limit
INFO[2022-07-27 13:09:33.787028081Z] Starting CRI-O, version: 1.23.3-11.rhaos4.10.gitddf4b1a.1.el8, git: ()
INFO Using default capabilities: CAP_CHOWN, CAP_DAC_OVERRIDE, CAP_FSETID, CAP_FOWNER, CAP_SETGID, CAP_SETUID, CAP_SETPCAP, CAP_NET_BIND_SERVICE, CAP_KILL
pids_limit = 4096

% oc get containerruntimeconfig
NAME               AGE
new-max-pidlimit   6m35s
pidlimit           23m

% oc get mc
NAME                                               GENERATEDBYCONTROLLER                      IGNITIONVERSION   AGE
00-master                                          dc29945da95a65f460ad50ad1bbc10e1918a9c61   3.2.0             88m
00-worker                                          dc29945da95a65f460ad50ad1bbc10e1918a9c61   3.2.0             88m
01-master-container-runtime                        dc29945da95a65f460ad50ad1bbc10e1918a9c61   3.2.0             88m
01-master-kubelet                                  dc29945da95a65f460ad50ad1bbc10e1918a9c61   3.2.0             88m
01-worker-container-runtime                        dc29945da95a65f460ad50ad1bbc10e1918a9c61   3.2.0             88m
01-worker-kubelet                                  dc29945da95a65f460ad50ad1bbc10e1918a9c61   3.2.0             88m
99-master-generated-crio-seccomp-use-default                                                  3.2.0             88m
99-master-generated-registries                     dc29945da95a65f460ad50ad1bbc10e1918a9c61   3.2.0             88m
99-master-ssh                                                                                 3.2.0             90m
99-worker-generated-containerruntime               dc29945da95a65f460ad50ad1bbc10e1918a9c61   3.2.0             23m
99-worker-generated-containerruntime-1             dc29945da95a65f460ad50ad1bbc10e1918a9c61   3.2.0             6m40s
99-worker-generated-crio-seccomp-use-default                                                  3.2.0             88m
99-worker-generated-registries                     dc29945da95a65f460ad50ad1bbc10e1918a9c61   3.2.0             88m
99-worker-ssh                                                                                 3.2.0             90m
rendered-master-1f5449d03a8fb49f0ff3d741eb363a4c   dc29945da95a65f460ad50ad1bbc10e1918a9c61   3.2.0             88m
rendered-worker-d229647baf68ce03bce6557c7890110d   dc29945da95a65f460ad50ad1bbc10e1918a9c61   3.2.0             23m
rendered-worker-d92fd0744b797e11843570f0b681e971   dc29945da95a65f460ad50ad1bbc10e1918a9c61   3.2.0             88m
rendered-worker-efaf76f5ebf797d15ef5c6014919afed   dc29945da95a65f460ad50ad1bbc10e1918a9c61   3.2.0             6m35s

% oc debug node/ip-10-0-142-35.us-east-2.compute.internal
Starting pod/ip-10-0-142-35us-east-2computeinternal-debug ...
…
sh-4.4# crio config | grep pids_limit
INFO[2022-07-27 13:17:32.805457991Z] Starting CRI-O, version: 1.23.3-11.rhaos4.10.gitddf4b1a.1.el8, git: ()
INFO Using default capabilities: CAP_CHOWN, CAP_DAC_OVERRIDE, CAP_FSETID, CAP_FOWNER, CAP_SETGID, CAP_SETUID, CAP_SETPCAP, CAP_NET_BIND_SERVICE, CAP_KILL
pids_limit = 65000

% oc get mc | grep -i containerruntime
99-worker-generated-containerruntime     dc29945da95a65f460ad50ad1bbc10e1918a9c61   3.2.0   29m
99-worker-generated-containerruntime-1   dc29945da95a65f460ad50ad1bbc10e1918a9c61   3.2.0   12m
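The debug sessions above sample a single worker. To verify pids_limit landed on every worker rather than just one, a loop along these lines works (a sketch, assuming the standard worker role label):

for node in $(oc get nodes -l node-role.kubernetes.io/worker -o name); do
  echo "== $node =="
  oc debug "$node" -- chroot /host sh -c 'crio config 2>/dev/null | grep pids_limit'
done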
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA.

For information on the advisory (Moderate: OpenShift Container Platform 4.10.25 bug fix and security update), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2022:5730