Bug 1825417 - The containerruntimecontroller doesn't roll back to CR-1 if we delete CR-2
Summary: The containerruntimecontroller doesn't roll back to CR-1 if we delete CR-2
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: Node
Version: 4.5
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: medium
Target Milestone: ---
Target Release: 4.8.0
Assignee: Urvashi Mohnani
QA Contact: MinLi
URL:
Whiteboard:
Depends On:
Blocks: 1926271 1941367
 
Reported: 2020-04-17 23:24 UTC by Urvashi Mohnani
Modified: 2021-07-27 22:33 UTC
CC List: 7 users

Fixed In Version:
Doc Type: Known Issue
Doc Text:
This has been documented as well, I don't think anything needs to be done apart from verifying the doc.
Clone Of:
: 1926271 (view as bug list)
Environment:
Last Closed: 2021-07-27 22:32:23 UTC
Target Upstream Version:
Embargoed:


Attachments
ctrcfg must-gather (13.58 MB, application/gzip)
2021-02-05 05:06 UTC, MinLi
mcc log (33.76 KB, text/plain)
2021-02-19 10:54 UTC, MinLi


Links
System ID Private Priority Status Summary Last Updated
Github openshift machine-config-operator pull 2310 0 None closed Bug 1825417: Make the ctrcfg CR to MC mapping 1:1 2021-02-18 06:22:19 UTC
Github openshift machine-config-operator pull 2458 0 None open Bug 1825417: Make getting the suffix of an MC more robust 2021-03-10 19:23:15 UTC
Red Hat Product Errata RHSA-2021:2438 0 None None None 2021-07-27 22:33:11 UTC

Description Urvashi Mohnani 2020-04-17 23:24:30 UTC
Description of problem:

If two ctrcfg CRs, CR-1 and CR-2, have been created and we delete CR-2, the configuration rolls back to the default settings rather than to the changes that were introduced by CR-1.
This looks like a design flaw: we create only one MC for a CR and overwrite it whenever we update the CR or add a new one. We need to change this to create a new MC for each CR that is added, so that we can roll back when one CR is deleted while the previous one is still there.
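For illustration, a minimal pair of ctrcfg CRs of the kind described above might look like the following sketch (the names, pool label, and values here are hypothetical, not taken from this report):

apiVersion: machineconfiguration.openshift.io/v1
kind: ContainerRuntimeConfig
metadata:
  name: cr-1
spec:
  machineConfigPoolSelector:
    matchLabels:
      custom-crio: test   # label that must also be present on the target MachineConfigPool
  containerRuntimeConfig:
    pidsLimit: 2048
---
apiVersion: machineconfiguration.openshift.io/v1
kind: ContainerRuntimeConfig
metadata:
  name: cr-2
spec:
  machineConfigPoolSelector:
    matchLabels:
      custom-crio: test
  containerRuntimeConfig:
    logLevel: debug

With the pre-fix behavior described above, deleting cr-2 drops the nodes back to the defaults instead of back to cr-1's pidsLimit change.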

Version-Release number of selected component (if applicable):


How reproducible:
Always

Comment 5 MinLi 2021-01-26 10:11:44 UTC
Not fixed in version 4.7.0-0.nightly-2021-01-25-160335.

1) I created three ctrcfg CRs: CR-1, CR-2, and CR-3. CR-1 and CR-2 had matching MC objects (MC-1 and MC-2), but CR-3 did not get a new MC; instead CR-3 shared CR-2's MC (MC-2). That is not reasonable: a new MC should be created for each ctrcfg according to https://github.com/openshift/machine-config-operator/pull/2310.

And when I delete CR-3, the changes roll back to MC-1; when I delete CR-2, nothing changes; when I delete CR-1, the changes roll back to the defaults.

2) Also, not every ctrcfg CR's annotations show the suffix of the MC name. In my case, CR-1 does not show the suffix of the MC name, but CR-2 does.
$ oc get containerruntimeconfig set-pids-limit-master -o yaml 
apiVersion: machineconfiguration.openshift.io/v1
kind: ContainerRuntimeConfig
metadata:
  creationTimestamp: "2021-01-26T08:20:41Z"
  finalizers:
  - 99-worker-generated-containerruntime
  generation: 1
  managedFields:

$ oc get containerruntimeconfig overlay-size -o yaml 
apiVersion: machineconfiguration.openshift.io/v1
kind: ContainerRuntimeConfig
metadata:
  annotations:
    machineconfiguration.openshift.io/mc-name-suffix: "1" # this line shows the suffix of the MC name
  creationTimestamp: "2021-01-26T08:34:39Z"
  finalizers:
  - 99-worker-generated-containerruntime-1
  generation: 2

3) Besides, I have a question: must ctrcfg CRs be deleted in the reverse order in which they were created (first created, last deleted)? In a real customer scenario, a user may delete any ctrcfg, regardless of when it was created.

Comment 6 Urvashi Mohnani 2021-01-26 14:54:03 UTC
So the way this works is that the MC with the higher alphanumeric name gets the higher priority; that is why the deletions have to be in order, and it is something we should make clear in the docs. So if you create cr-1, cr-2, cr-3, that will create mc, mc-1, and mc-2 respectively. mc-2 has higher priority than mc-1, and that is why the config from that ctrcfg is rolled out to the nodes. You can delete cr-1, but that won't make a difference while cr-2 and cr-3 are still in place; the deletion does have to be in order.

If cr-3 shared cr-2's config, no new MC will be created, as expected. We only make a new MC when a new change is detected and something new needs to be rolled out to the nodes, so this worked as expected.

The reason not all the ctrcfg CRs have a suffix annotation is that the MC created for the first ctrcfg does not have a suffix in its name, so the annotation is left empty; the next one will have "-1" as the suffix, and so forth. This was done to stay compatible when upgrading from a previous version, so an existing ctrcfg does not require a name change. So that works as expected. The user never needs to care about the suffix used by the MC, so it should make no difference.

This looks like it passed QE. The next step would be to make the docs clearer so users understand exactly how this works. Moving back to ON_QA.
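For reference, the CR-to-MC mapping described above can be inspected with commands along these lines (the ctrcfg names are the ones from comment 5, and the annotation key is the one shown there; the first ctrcfg prints an empty suffix because its MC name has none):

$ oc get mc | grep generated-containerruntime
$ oc get ctrcfg overlay-size -o yaml | grep mc-name-suffix
$ oc get ctrcfg set-pids-limit-master -o jsonpath='{.metadata.annotations.machineconfiguration\.openshift\.io/mc-name-suffix}'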

Comment 7 MinLi 2021-01-28 03:25:46 UTC
(In reply to Urvashi Mohnani from comment #6)
> So the way this works is that higher alphanumeric MC gets a higher priority,
> that is why they have to be in order and is something we should make clear
> in the docs. So if you create cr-1, cr-2, cr-3 that will create mc, mc-1,
> and mc-2 respectively.

@Urvashi Mohnani, in my test steps, if I create cr-1, cr-2 and cr-3, only mc and mc-1 are created, not mc-2.
And I made a different change in every CR, so this is not expected.

cr-1:
pidsLimit: 4095, with MC name: 99-worker-generated-containerruntime

cr-2:
overlaySize: 10G, with MC name: 99-worker-generated-containerruntime-1

cr-3:
logLevel: debug, does not create any new MC, but has MC name: 99-worker-generated-containerruntime-1 (I think it should be 99-worker-generated-containerruntime-2)

Comment 8 Urvashi Mohnani 2021-01-29 01:45:31 UTC
Hi Min,

I tested this out on the latest nightly build 4.7.0-0.nightly-2021-01-28-203708 and it works as expected for me. I created 3 different ctrcfg CRs and got 3 MCs as expected.

➜  ~ oc get ctrcfg
NAME         AGE
ctr-test     24m
ctr-test-1   15m
ctr-test-2   5m45s


➜  ~ oc get mc | grep container
01-master-container-runtime                        b5c5119de007945b6fe6fb215db3b8e2ceb12511   3.2.0             57m
01-worker-container-runtime                        b5c5119de007945b6fe6fb215db3b8e2ceb12511   3.2.0             57m
99-worker-generated-containerruntime               b5c5119de007945b6fe6fb215db3b8e2ceb12511   3.2.0             26m
99-worker-generated-containerruntime-1             b5c5119de007945b6fe6fb215db3b8e2ceb12511   3.2.0             17m
99-worker-generated-containerruntime-2             b5c5119de007945b6fe6fb215db3b8e2ceb12511   3.2.0             7m26s

Each of my ctrcfgs was different, just like yours. Please test it again, and if it doesn't roll out, check the machine-config-controller logs to see whether there were any errors or issues.
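For reference, the machine-config-controller logs can be pulled with something like the following (assuming the usual controller deployment name in the openshift-machine-config-operator namespace):

$ oc -n openshift-machine-config-operator get pods | grep machine-config-controller
$ oc -n openshift-machine-config-operator logs deployment/machine-config-controller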

Comment 9 MinLi 2021-02-01 09:15:38 UTC
Hi, Urvashi

I tested the fix on the same version you mentioned, 4.7.0-0.nightly-2021-01-28-203708, and I can still reproduce the failing case.
The reproducer depends on the order in which the specific ctrcfgs are created.
For example, with ctr-1 (pidsLimit: 4095), ctr-2 (overlaySize: 10G), ctr-3 (logLevel: debug):
If I create the ctrcfgs in the order ctr-1 (pidsLimit: 4095) -> ctr-3 (logLevel: debug) -> ctr-2 (overlaySize: 10G), I get the expected result, that is:
99-worker-generated-containerruntime               b5c5119de007945b6fe6fb215db3b8e2ceb12511   3.2.0             33m
99-worker-generated-containerruntime-1             b5c5119de007945b6fe6fb215db3b8e2ceb12511   3.2.0             23m
99-worker-generated-containerruntime-2             b5c5119de007945b6fe6fb215db3b8e2ceb12511   3.2.0             10m

But if I create the ctrcfgs in the order ctr-1 (pidsLimit: 4095) -> ctr-2 (overlaySize: 10G) -> ctr-3 (logLevel: debug), then ctr-3 does not generate a matching MC (such as 99-worker-generated-containerruntime-2); instead a new rendered MC is generated to sync the change from ctr-3.

the ctr-2 description:
$ oc get containerruntimeconfig overlay-size -o yaml 
apiVersion: machineconfiguration.openshift.io/v1
kind: ContainerRuntimeConfig
metadata:
  annotations:
    machineconfiguration.openshift.io/mc-name-suffix: "1"
  creationTimestamp: "2021-02-01T07:54:07Z"
  finalizers:
  - 99-worker-generated-containerruntime-1
  generation: 2
  managedFields:
  - apiVersion: machineconfiguration.openshift.io/v1
    fieldsType: FieldsV1
    fieldsV1:
      f:metadata:
        f:annotations:
          .: {}
          f:machineconfiguration.openshift.io/mc-name-suffix: {}
        f:finalizers:
          .: {}
          v:"99-worker-generated-containerruntime-1": {}
      f:spec:
        f:containerRuntimeConfig:
          f:logSizeMax: {}
      f:status:
        .: {}
        f:conditions: {}
        f:observedGeneration: {}
    manager: machine-config-controller
    operation: Update
    time: "2021-02-01T07:54:07Z"
  - apiVersion: machineconfiguration.openshift.io/v1
    fieldsType: FieldsV1
    fieldsV1:
      f:spec:
        .: {}
        f:containerRuntimeConfig:
          .: {}
          f:overlaySize: {}
        f:machineConfigPoolSelector:
          .: {}
          f:matchLabels:
            .: {}
            f:custom-crio-overlay: {}
    manager: oc
    operation: Update
    time: "2021-02-01T07:54:07Z"
  name: overlay-size
  resourceVersion: "102303"
  selfLink: /apis/machineconfiguration.openshift.io/v1/containerruntimeconfigs/overlay-size
  uid: 08785247-0fdf-4de4-ade9-9c959be268cf
spec:
  containerRuntimeConfig:
    logSizeMax: "0"
    overlaySize: 10G
  machineConfigPoolSelector:
    matchLabels:
      custom-crio-overlay: overlay-size
status:
  conditions:
  - lastTransitionTime: "2021-02-01T07:54:07Z"
    message: Success
    status: "True"
    type: Success
  observedGeneration: 2


the ctr-3 description:
$ oc get containerruntimeconfig set-loglevel -o yaml 
apiVersion: machineconfiguration.openshift.io/v1
kind: ContainerRuntimeConfig
metadata:
  creationTimestamp: "2021-02-01T08:12:36Z"
  finalizers:
  - 99-worker-generated-containerruntime-1
  generation: 1
  managedFields:
  - apiVersion: machineconfiguration.openshift.io/v1
    fieldsType: FieldsV1
    fieldsV1:
      f:metadata:
        f:finalizers:
          .: {}
          v:"99-worker-generated-containerruntime-1": {}
      f:spec:
        f:containerRuntimeConfig:
          f:logSizeMax: {}
          f:overlaySize: {}
      f:status:
        .: {}
        f:conditions: {}
        f:observedGeneration: {}
    manager: machine-config-controller
    operation: Update
    time: "2021-02-01T08:12:36Z"
  - apiVersion: machineconfiguration.openshift.io/v1
    fieldsType: FieldsV1
    fieldsV1:
      f:spec:
        .: {}
        f:containerRuntimeConfig:
          .: {}
          f:logLevel: {}
        f:machineConfigPoolSelector:
          .: {}
          f:matchLabels:
            .: {}
            f:custom-loglevel: {}
    manager: oc
    operation: Update
    time: "2021-02-01T08:12:36Z"
  name: set-loglevel
  resourceVersion: "110349"
  selfLink: /apis/machineconfiguration.openshift.io/v1/containerruntimeconfigs/set-loglevel
  uid: 3c8ef065-8ecc-4608-9b5b-6b053aca6347
spec:
  containerRuntimeConfig:
    logLevel: debug
  machineConfigPoolSelector:
    matchLabels:
      custom-loglevel: debug
status:
  conditions:
  - lastTransitionTime: "2021-02-01T08:12:36Z"
    message: Success
    status: "True"
    type: Success
  observedGeneration: 1

$ oc get mc 
NAME                                               GENERATEDBYCONTROLLER                      IGNITIONVERSION   AGE
00-master                                          b5c5119de007945b6fe6fb215db3b8e2ceb12511   3.2.0             4h58m
00-worker                                          b5c5119de007945b6fe6fb215db3b8e2ceb12511   3.2.0             4h58m
01-master-container-runtime                        b5c5119de007945b6fe6fb215db3b8e2ceb12511   3.2.0             4h58m
01-master-kubelet                                  b5c5119de007945b6fe6fb215db3b8e2ceb12511   3.2.0             4h58m
01-worker-container-runtime                        b5c5119de007945b6fe6fb215db3b8e2ceb12511   3.2.0             4h58m
01-worker-kubelet                                  b5c5119de007945b6fe6fb215db3b8e2ceb12511   3.2.0             4h58m
99-master-generated-registries                     b5c5119de007945b6fe6fb215db3b8e2ceb12511   3.2.0             4h58m
99-master-ssh                                                                                 3.1.0             5h2m
99-worker-generated-containerruntime               b5c5119de007945b6fe6fb215db3b8e2ceb12511   3.2.0             57m
99-worker-generated-containerruntime-1             b5c5119de007945b6fe6fb215db3b8e2ceb12511   3.2.0             47m
99-worker-generated-registries                     b5c5119de007945b6fe6fb215db3b8e2ceb12511   3.2.0             4h58m
99-worker-ssh                                                                                 3.1.0             5h2m
rendered-master-026e7825f35e20927a8d21965cd9b231   b5c5119de007945b6fe6fb215db3b8e2ceb12511   3.2.0             4h58m
rendered-master-2af96245bd8bf03da524d65fbfc95d23   b5c5119de007945b6fe6fb215db3b8e2ceb12511   3.2.0             4h35m
rendered-worker-1bd8eaa86b66e4e141086c17f4cd439b   b5c5119de007945b6fe6fb215db3b8e2ceb12511   3.2.0             4h35m
rendered-worker-21352c2f273d829f15ab85dfe6a81acd   b5c5119de007945b6fe6fb215db3b8e2ceb12511   3.2.0             56m
rendered-worker-5f7846c808113744345e7ad4d09da393   b5c5119de007945b6fe6fb215db3b8e2ceb12511   3.2.0             47m
rendered-worker-9786b62fd62a373a02836c98d9e9dbdd   b5c5119de007945b6fe6fb215db3b8e2ceb12511   3.2.0             11m (this one syncs the change from ctr-3)
rendered-worker-aed2e4875537b5df3396f31821a75b63   b5c5119de007945b6fe6fb215db3b8e2ceb12511   3.2.0             4h58m



machine-config-controller log:
I0201 07:44:27.431907       1 container_runtime_config_controller.go:617] Applied ContainerRuntimeConfig set-pids-limit-master on MachineConfigPool worker
I0201 07:44:32.478117       1 render_controller.go:498] Generated machineconfig rendered-worker-21352c2f273d829f15ab85dfe6a81acd from 6 configs: [{MachineConfig  00-worker  machineconfiguration.openshift.io/v1  } {MachineConfig  01-worker-container-runtime  machineconfiguration.openshift.io/v1  } {MachineConfig  01-worker-kubelet  machineconfiguration.openshift.io/v1  } {MachineConfig  99-worker-generated-containerruntime  machineconfiguration.openshift.io/v1  } {MachineConfig  99-worker-generated-registries  machineconfiguration.openshift.io/v1  } {MachineConfig  99-worker-ssh  machineconfiguration.openshift.io/v1  }]
I0201 07:44:32.492590       1 render_controller.go:522] Pool worker: now targeting: rendered-worker-21352c2f273d829f15ab85dfe6a81acd
I0201 07:44:37.492894       1 node_controller.go:414] Pool worker: 2 candidate nodes for update, capacity: 1
I0201 07:44:37.492921       1 node_controller.go:414] Pool worker: Setting node minmli020147-m2chk-worker-a-kcqq4.c.openshift-qe.internal target to rendered-worker-21352c2f273d829f15ab85dfe6a81acd
I0201 07:44:37.520361       1 node_controller.go:419] Pool worker: node minmli020147-m2chk-worker-a-kcqq4.c.openshift-qe.internal: changed annotation machineconfiguration.openshift.io/desiredConfig = rendered-worker-21352c2f273d829f15ab85dfe6a81acd
I0201 07:44:37.521047       1 event.go:282] Event(v1.ObjectReference{Kind:"MachineConfigPool", Namespace:"", Name:"worker", UID:"f44dec88-8c15-4293-9105-9fc3f4534e64", APIVersion:"machineconfiguration.openshift.io/v1", ResourceVersion:"97691", FieldPath:""}): type: 'Normal' reason: 'SetDesiredConfig' Targeted node minmli020147-m2chk-worker-a-kcqq4.c.openshift-qe.internal to config rendered-worker-21352c2f273d829f15ab85dfe6a81acd
E0201 07:44:37.565413       1 render_controller.go:460] Error updating MachineConfigPool worker: Operation cannot be fulfilled on machineconfigpools.machineconfiguration.openshift.io "worker": the object has been modified; please apply your changes to the latest version and try again
I0201 07:44:37.565449       1 render_controller.go:377] Error syncing machineconfigpool worker: Operation cannot be fulfilled on machineconfigpools.machineconfiguration.openshift.io "worker": the object has been modified; please apply your changes to the latest version and try again
I0201 07:44:38.537612       1 node_controller.go:419] Pool worker: node minmli020147-m2chk-worker-a-kcqq4.c.openshift-qe.internal: changed annotation machineconfiguration.openshift.io/state = Working
I0201 07:44:38.632649       1 node_controller.go:419] Pool worker: node minmli020147-m2chk-worker-a-kcqq4.c.openshift-qe.internal: Reporting unready: node minmli020147-m2chk-worker-a-kcqq4.c.openshift-qe.internal is reporting Unschedulable
I0201 07:46:32.952827       1 node_controller.go:419] Pool worker: node minmli020147-m2chk-worker-a-kcqq4.c.openshift-qe.internal: Reporting unready: node minmli020147-m2chk-worker-a-kcqq4.c.openshift-qe.internal is reporting OutOfDisk=Unknown
I0201 07:47:02.994360       1 node_controller.go:419] Pool worker: node minmli020147-m2chk-worker-a-kcqq4.c.openshift-qe.internal: Reporting unready: node minmli020147-m2chk-worker-a-kcqq4.c.openshift-qe.internal is reporting NotReady=False
I0201 07:47:12.133838       1 node_controller.go:419] Pool worker: node minmli020147-m2chk-worker-a-kcqq4.c.openshift-qe.internal: Completed update to rendered-worker-21352c2f273d829f15ab85dfe6a81acd
I0201 07:47:13.028507       1 node_controller.go:419] Pool worker: node minmli020147-m2chk-worker-a-kcqq4.c.openshift-qe.internal: Reporting ready
I0201 07:47:17.134255       1 node_controller.go:414] Pool worker: 1 candidate nodes for update, capacity: 1
I0201 07:47:17.134284       1 node_controller.go:414] Pool worker: Setting node minmli020147-m2chk-worker-b-8bvpl.c.openshift-qe.internal target to rendered-worker-21352c2f273d829f15ab85dfe6a81acd
I0201 07:47:17.158404       1 event.go:282] Event(v1.ObjectReference{Kind:"MachineConfigPool", Namespace:"", Name:"worker", UID:"f44dec88-8c15-4293-9105-9fc3f4534e64", APIVersion:"machineconfiguration.openshift.io/v1", ResourceVersion:"97782", FieldPath:""}): type: 'Normal' reason: 'SetDesiredConfig' Targeted node minmli020147-m2chk-worker-b-8bvpl.c.openshift-qe.internal to config rendered-worker-21352c2f273d829f15ab85dfe6a81acd
I0201 07:47:17.159903       1 node_controller.go:419] Pool worker: node minmli020147-m2chk-worker-b-8bvpl.c.openshift-qe.internal: changed annotation machineconfiguration.openshift.io/desiredConfig = rendered-worker-21352c2f273d829f15ab85dfe6a81acd
I0201 07:47:18.184495       1 node_controller.go:419] Pool worker: node minmli020147-m2chk-worker-b-8bvpl.c.openshift-qe.internal: changed annotation machineconfiguration.openshift.io/state = Working
I0201 07:47:18.274304       1 node_controller.go:419] Pool worker: node minmli020147-m2chk-worker-b-8bvpl.c.openshift-qe.internal: Reporting unready: node minmli020147-m2chk-worker-b-8bvpl.c.openshift-qe.internal is reporting Unschedulable
E0201 07:47:22.259284       1 render_controller.go:460] Error updating MachineConfigPool worker: Operation cannot be fulfilled on machineconfigpools.machineconfiguration.openshift.io "worker": the object has been modified; please apply your changes to the latest version and try again
I0201 07:47:22.259392       1 render_controller.go:377] Error syncing machineconfigpool worker: Operation cannot be fulfilled on machineconfigpools.machineconfiguration.openshift.io "worker": the object has been modified; please apply your changes to the latest version and try again
I0201 07:48:55.451535       1 container_runtime_config_controller.go:769] Applied ImageConfig cluster on MachineConfigPool master
I0201 07:48:55.511049       1 container_runtime_config_controller.go:769] Applied ImageConfig cluster on MachineConfigPool worker
I0201 07:49:43.731033       1 node_controller.go:419] Pool worker: node minmli020147-m2chk-worker-b-8bvpl.c.openshift-qe.internal: Reporting unready: node minmli020147-m2chk-worker-b-8bvpl.c.openshift-qe.internal is reporting OutOfDisk=Unknown
I0201 07:50:07.934037       1 node_controller.go:419] Pool worker: node minmli020147-m2chk-worker-b-8bvpl.c.openshift-qe.internal: Reporting unready: node minmli020147-m2chk-worker-b-8bvpl.c.openshift-qe.internal is reporting NotReady=False
I0201 07:50:17.495604       1 node_controller.go:419] Pool worker: node minmli020147-m2chk-worker-b-8bvpl.c.openshift-qe.internal: Completed update to rendered-worker-21352c2f273d829f15ab85dfe6a81acd
I0201 07:50:17.966983       1 node_controller.go:419] Pool worker: node minmli020147-m2chk-worker-b-8bvpl.c.openshift-qe.internal: Reporting ready
I0201 07:50:22.495873       1 status.go:90] Pool worker: All nodes are updated with rendered-worker-21352c2f273d829f15ab85dfe6a81acd
I0201 07:54:07.859784       1 container_runtime_config_controller.go:617] Applied ContainerRuntimeConfig overlay-size on MachineConfigPool worker
I0201 07:54:12.903079       1 render_controller.go:498] Generated machineconfig rendered-worker-5f7846c808113744345e7ad4d09da393 from 7 configs: [{MachineConfig  00-worker  machineconfiguration.openshift.io/v1  } {MachineConfig  01-worker-container-runtime  machineconfiguration.openshift.io/v1  } {MachineConfig  01-worker-kubelet  machineconfiguration.openshift.io/v1  } {MachineConfig  99-worker-generated-containerruntime  machineconfiguration.openshift.io/v1  } {MachineConfig  99-worker-generated-containerruntime-1  machineconfiguration.openshift.io/v1  } {MachineConfig  99-worker-generated-registries  machineconfiguration.openshift.io/v1  } {MachineConfig  99-worker-ssh  machineconfiguration.openshift.io/v1  }]
I0201 07:54:12.917685       1 render_controller.go:522] Pool worker: now targeting: rendered-worker-5f7846c808113744345e7ad4d09da393
I0201 07:54:17.917496       1 node_controller.go:414] Pool worker: 2 candidate nodes for update, capacity: 1
I0201 07:54:17.917523       1 node_controller.go:414] Pool worker: Setting node minmli020147-m2chk-worker-a-kcqq4.c.openshift-qe.internal target to rendered-worker-5f7846c808113744345e7ad4d09da393
I0201 07:54:17.942896       1 event.go:282] Event(v1.ObjectReference{Kind:"MachineConfigPool", Namespace:"", Name:"worker", UID:"f44dec88-8c15-4293-9105-9fc3f4534e64", APIVersion:"machineconfiguration.openshift.io/v1", ResourceVersion:"102324", FieldPath:""}): type: 'Normal' reason: 'SetDesiredConfig' Targeted node minmli020147-m2chk-worker-a-kcqq4.c.openshift-qe.internal to config rendered-worker-5f7846c808113744345e7ad4d09da393
I0201 07:54:17.948025       1 node_controller.go:419] Pool worker: node minmli020147-m2chk-worker-a-kcqq4.c.openshift-qe.internal: changed annotation machineconfiguration.openshift.io/desiredConfig = rendered-worker-5f7846c808113744345e7ad4d09da393
E0201 07:54:18.006364       1 render_controller.go:460] Error updating MachineConfigPool worker: Operation cannot be fulfilled on machineconfigpools.machineconfiguration.openshift.io "worker": the object has been modified; please apply your changes to the latest version and try again
I0201 07:54:18.006479       1 render_controller.go:377] Error syncing machineconfigpool worker: Operation cannot be fulfilled on machineconfigpools.machineconfiguration.openshift.io "worker": the object has been modified; please apply your changes to the latest version and try again
I0201 07:54:18.963520       1 node_controller.go:419] Pool worker: node minmli020147-m2chk-worker-a-kcqq4.c.openshift-qe.internal: changed annotation machineconfiguration.openshift.io/state = Working
I0201 07:54:19.050062       1 node_controller.go:419] Pool worker: node minmli020147-m2chk-worker-a-kcqq4.c.openshift-qe.internal: Reporting unready: node minmli020147-m2chk-worker-a-kcqq4.c.openshift-qe.internal is reporting Unschedulable
I0201 07:56:24.240933       1 node_controller.go:419] Pool worker: node minmli020147-m2chk-worker-a-kcqq4.c.openshift-qe.internal: Reporting unready: node minmli020147-m2chk-worker-a-kcqq4.c.openshift-qe.internal is reporting OutOfDisk=Unknown
I0201 07:56:49.031344       1 node_controller.go:419] Pool worker: node minmli020147-m2chk-worker-a-kcqq4.c.openshift-qe.internal: Reporting unready: node minmli020147-m2chk-worker-a-kcqq4.c.openshift-qe.internal is reporting NotReady=False
I0201 07:56:57.971240       1 node_controller.go:419] Pool worker: node minmli020147-m2chk-worker-a-kcqq4.c.openshift-qe.internal: Completed update to rendered-worker-5f7846c808113744345e7ad4d09da393
I0201 07:56:59.059012       1 node_controller.go:419] Pool worker: node minmli020147-m2chk-worker-a-kcqq4.c.openshift-qe.internal: Reporting ready
I0201 07:57:02.971714       1 node_controller.go:414] Pool worker: 1 candidate nodes for update, capacity: 1
I0201 07:57:02.971741       1 node_controller.go:414] Pool worker: Setting node minmli020147-m2chk-worker-b-8bvpl.c.openshift-qe.internal target to rendered-worker-5f7846c808113744345e7ad4d09da393
I0201 07:57:02.999368       1 event.go:282] Event(v1.ObjectReference{Kind:"MachineConfigPool", Namespace:"", Name:"worker", UID:"f44dec88-8c15-4293-9105-9fc3f4534e64", APIVersion:"machineconfiguration.openshift.io/v1", ResourceVersion:"102663", FieldPath:""}): type: 'Normal' reason: 'SetDesiredConfig' Targeted node minmli020147-m2chk-worker-b-8bvpl.c.openshift-qe.internal to config rendered-worker-5f7846c808113744345e7ad4d09da393
I0201 07:57:03.004447       1 node_controller.go:419] Pool worker: node minmli020147-m2chk-worker-b-8bvpl.c.openshift-qe.internal: changed annotation machineconfiguration.openshift.io/desiredConfig = rendered-worker-5f7846c808113744345e7ad4d09da393
I0201 07:57:04.024857       1 node_controller.go:419] Pool worker: node minmli020147-m2chk-worker-b-8bvpl.c.openshift-qe.internal: changed annotation machineconfiguration.openshift.io/state = Working
I0201 07:57:04.126107       1 node_controller.go:419] Pool worker: node minmli020147-m2chk-worker-b-8bvpl.c.openshift-qe.internal: Reporting unready: node minmli020147-m2chk-worker-b-8bvpl.c.openshift-qe.internal is reporting Unschedulable
E0201 07:57:08.176006       1 render_controller.go:460] Error updating MachineConfigPool worker: Operation cannot be fulfilled on machineconfigpools.machineconfiguration.openshift.io "worker": the object has been modified; please apply your changes to the latest version and try again
I0201 07:57:08.176047       1 render_controller.go:377] Error syncing machineconfigpool worker: Operation cannot be fulfilled on machineconfigpools.machineconfiguration.openshift.io "worker": the object has been modified; please apply your changes to the latest version and try again
I0201 07:59:19.692607       1 node_controller.go:419] Pool worker: node minmli020147-m2chk-worker-b-8bvpl.c.openshift-qe.internal: Reporting unready: node minmli020147-m2chk-worker-b-8bvpl.c.openshift-qe.internal is reporting OutOfDisk=Unknown
I0201 07:59:50.603328       1 node_controller.go:419] Pool worker: node minmli020147-m2chk-worker-b-8bvpl.c.openshift-qe.internal: Reporting unready: node minmli020147-m2chk-worker-b-8bvpl.c.openshift-qe.internal is reporting NotReady=False
I0201 07:59:59.338349       1 node_controller.go:419] Pool worker: node minmli020147-m2chk-worker-b-8bvpl.c.openshift-qe.internal: Completed update to rendered-worker-5f7846c808113744345e7ad4d09da393
I0201 08:00:00.637906       1 node_controller.go:419] Pool worker: node minmli020147-m2chk-worker-b-8bvpl.c.openshift-qe.internal: Reporting ready
I0201 08:00:04.338711       1 status.go:90] Pool worker: All nodes are updated with rendered-worker-5f7846c808113744345e7ad4d09da393
I0201 08:02:52.225053       1 container_runtime_config_controller.go:346] Error syncing containerruntimeconfig set-loglevel: could not find any MachineConfigPool set for ContainerRuntimeConfig set-loglevel
I0201 08:02:52.238149       1 container_runtime_config_controller.go:346] Error syncing containerruntimeconfig set-loglevel: could not find any MachineConfigPool set for ContainerRuntimeConfig set-loglevel
I0201 08:02:52.255877       1 container_runtime_config_controller.go:346] Error syncing containerruntimeconfig set-loglevel: could not find any MachineConfigPool set for ContainerRuntimeConfig set-loglevel
I0201 08:02:52.284233       1 container_runtime_config_controller.go:346] Error syncing containerruntimeconfig set-loglevel: could not find any MachineConfigPool set for ContainerRuntimeConfig set-loglevel
I0201 08:02:52.333317       1 container_runtime_config_controller.go:346] Error syncing containerruntimeconfig set-loglevel: could not find any MachineConfigPool set for ContainerRuntimeConfig set-loglevel
I0201 08:02:52.420960       1 container_runtime_config_controller.go:346] Error syncing containerruntimeconfig set-loglevel: could not find any MachineConfigPool set for ContainerRuntimeConfig set-loglevel
I0201 08:02:52.590602       1 container_runtime_config_controller.go:346] Error syncing containerruntimeconfig set-loglevel: could not find any MachineConfigPool set for ContainerRuntimeConfig set-loglevel
I0201 08:02:52.920007       1 container_runtime_config_controller.go:346] Error syncing containerruntimeconfig set-loglevel: could not find any MachineConfigPool set for ContainerRuntimeConfig set-loglevel
I0201 08:02:53.572082       1 container_runtime_config_controller.go:346] Error syncing containerruntimeconfig set-loglevel: could not find any MachineConfigPool set for ContainerRuntimeConfig set-loglevel
I0201 08:02:54.863409       1 container_runtime_config_controller.go:346] Error syncing containerruntimeconfig set-loglevel: could not find any MachineConfigPool set for ContainerRuntimeConfig set-loglevel
I0201 08:02:57.435071       1 container_runtime_config_controller.go:346] Error syncing containerruntimeconfig set-loglevel: could not find any MachineConfigPool set for ContainerRuntimeConfig set-loglevel
I0201 08:03:02.566978       1 container_runtime_config_controller.go:346] Error syncing containerruntimeconfig set-loglevel: could not find any MachineConfigPool set for ContainerRuntimeConfig set-loglevel
I0201 08:03:12.816869       1 container_runtime_config_controller.go:346] Error syncing containerruntimeconfig set-loglevel: could not find any MachineConfigPool set for ContainerRuntimeConfig set-loglevel
I0201 08:03:33.318292       1 container_runtime_config_controller.go:346] Error syncing containerruntimeconfig set-loglevel: could not find any MachineConfigPool set for ContainerRuntimeConfig set-loglevel
I0201 08:04:14.278564       1 container_runtime_config_controller.go:487] ContainerRuntimeConfig set-loglevel has been deleted
E0201 08:12:36.455829       1 render_controller.go:203] machineconfig has changed controller, not allowed.
I0201 08:12:36.476710       1 container_runtime_config_controller.go:617] Applied ContainerRuntimeConfig set-loglevel on MachineConfigPool worker
I0201 08:13:43.477939       1 container_runtime_config_controller.go:769] Applied ImageConfig cluster on MachineConfigPool master
I0201 08:13:43.536518       1 container_runtime_config_controller.go:769] Applied ImageConfig cluster on MachineConfigPool worker
I0201 08:25:11.826941       1 container_runtime_config_controller.go:769] Applied ImageConfig cluster on MachineConfigPool master
I0201 08:25:11.887347       1 container_runtime_config_controller.go:769] Applied ImageConfig cluster on MachineConfigPool worker
I0201 08:30:09.678226       1 render_controller.go:498] Generated machineconfig rendered-worker-9786b62fd62a373a02836c98d9e9dbdd from 7 configs: [{MachineConfig  00-worker  machineconfiguration.openshift.io/v1  } {MachineConfig  01-worker-container-runtime  machineconfiguration.openshift.io/v1  } {MachineConfig  01-worker-kubelet  machineconfiguration.openshift.io/v1  } {MachineConfig  99-worker-generated-containerruntime  machineconfiguration.openshift.io/v1  } {MachineConfig  99-worker-generated-containerruntime-1  machineconfiguration.openshift.io/v1  } {MachineConfig  99-worker-generated-registries  machineconfiguration.openshift.io/v1  } {MachineConfig  99-worker-ssh  machineconfiguration.openshift.io/v1  }]
I0201 08:30:09.692867       1 render_controller.go:522] Pool worker: now targeting: rendered-worker-9786b62fd62a373a02836c98d9e9dbdd
I0201 08:30:14.695273       1 node_controller.go:414] Pool worker: 2 candidate nodes for update, capacity: 1
I0201 08:30:14.695405       1 node_controller.go:414] Pool worker: Setting node minmli020147-m2chk-worker-a-kcqq4.c.openshift-qe.internal target to rendered-worker-9786b62fd62a373a02836c98d9e9dbdd
I0201 08:30:14.719956       1 node_controller.go:419] Pool worker: node minmli020147-m2chk-worker-a-kcqq4.c.openshift-qe.internal: changed annotation machineconfiguration.openshift.io/desiredConfig = rendered-worker-9786b62fd62a373a02836c98d9e9dbdd
I0201 08:30:14.723203       1 event.go:282] Event(v1.ObjectReference{Kind:"MachineConfigPool", Namespace:"", Name:"worker", UID:"f44dec88-8c15-4293-9105-9fc3f4534e64", APIVersion:"machineconfiguration.openshift.io/v1", ResourceVersion:"115301", FieldPath:""}): type: 'Normal' reason: 'SetDesiredConfig' Targeted node minmli020147-m2chk-worker-a-kcqq4.c.openshift-qe.internal to config rendered-worker-9786b62fd62a373a02836c98d9e9dbdd
E0201 08:30:14.774631       1 render_controller.go:460] Error updating MachineConfigPool worker: Operation cannot be fulfilled on machineconfigpools.machineconfiguration.openshift.io "worker": the object has been modified; please apply your changes to the latest version and try again
I0201 08:30:14.774688       1 render_controller.go:377] Error syncing machineconfigpool worker: Operation cannot be fulfilled on machineconfigpools.machineconfiguration.openshift.io "worker": the object has been modified; please apply your changes to the latest version and try again
I0201 08:30:15.411599       1 node_controller.go:419] Pool worker: node minmli020147-m2chk-worker-a-kcqq4.c.openshift-qe.internal: changed annotation machineconfiguration.openshift.io/state = Working
I0201 08:30:15.501899       1 node_controller.go:419] Pool worker: node minmli020147-m2chk-worker-a-kcqq4.c.openshift-qe.internal: Reporting unready: node minmli020147-m2chk-worker-a-kcqq4.c.openshift-qe.internal is reporting Unschedulable
I0201 08:32:20.412963       1 node_controller.go:419] Pool worker: node minmli020147-m2chk-worker-a-kcqq4.c.openshift-qe.internal: Reporting unready: node minmli020147-m2chk-worker-a-kcqq4.c.openshift-qe.internal is reporting OutOfDisk=Unknown
I0201 08:32:44.581308       1 node_controller.go:419] Pool worker: node minmli020147-m2chk-worker-a-kcqq4.c.openshift-qe.internal: Reporting unready: node minmli020147-m2chk-worker-a-kcqq4.c.openshift-qe.internal is reporting NotReady=False
I0201 08:32:53.672797       1 node_controller.go:419] Pool worker: node minmli020147-m2chk-worker-a-kcqq4.c.openshift-qe.internal: Completed update to rendered-worker-9786b62fd62a373a02836c98d9e9dbdd
I0201 08:32:54.617431       1 node_controller.go:419] Pool worker: node minmli020147-m2chk-worker-a-kcqq4.c.openshift-qe.internal: Reporting ready
I0201 08:32:58.673162       1 node_controller.go:414] Pool worker: 1 candidate nodes for update, capacity: 1
I0201 08:32:58.673192       1 node_controller.go:414] Pool worker: Setting node minmli020147-m2chk-worker-b-8bvpl.c.openshift-qe.internal target to rendered-worker-9786b62fd62a373a02836c98d9e9dbdd
I0201 08:32:58.698142       1 event.go:282] Event(v1.ObjectReference{Kind:"MachineConfigPool", Namespace:"", Name:"worker", UID:"f44dec88-8c15-4293-9105-9fc3f4534e64", APIVersion:"machineconfiguration.openshift.io/v1", ResourceVersion:"115312", FieldPath:""}): type: 'Normal' reason: 'SetDesiredConfig' Targeted node minmli020147-m2chk-worker-b-8bvpl.c.openshift-qe.internal to config rendered-worker-9786b62fd62a373a02836c98d9e9dbdd
I0201 08:32:58.699031       1 node_controller.go:419] Pool worker: node minmli020147-m2chk-worker-b-8bvpl.c.openshift-qe.internal: changed annotation machineconfiguration.openshift.io/desiredConfig = rendered-worker-9786b62fd62a373a02836c98d9e9dbdd
I0201 08:32:59.721398       1 node_controller.go:419] Pool worker: node minmli020147-m2chk-worker-b-8bvpl.c.openshift-qe.internal: changed annotation machineconfiguration.openshift.io/state = Working
I0201 08:32:59.819253       1 node_controller.go:419] Pool worker: node minmli020147-m2chk-worker-b-8bvpl.c.openshift-qe.internal: Reporting unready: node minmli020147-m2chk-worker-b-8bvpl.c.openshift-qe.internal is reporting Unschedulable
I0201 08:37:35.937394       1 node_controller.go:419] Pool worker: node minmli020147-m2chk-worker-b-8bvpl.c.openshift-qe.internal: Reporting unready: node minmli020147-m2chk-worker-b-8bvpl.c.openshift-qe.internal is reporting OutOfDisk=Unknown
I0201 08:38:07.935689       1 node_controller.go:419] Pool worker: node minmli020147-m2chk-worker-b-8bvpl.c.openshift-qe.internal: Reporting unready: node minmli020147-m2chk-worker-b-8bvpl.c.openshift-qe.internal is reporting NotReady=False
I0201 08:38:16.776181       1 node_controller.go:419] Pool worker: node minmli020147-m2chk-worker-b-8bvpl.c.openshift-qe.internal: Completed update to rendered-worker-9786b62fd62a373a02836c98d9e9dbdd
I0201 08:38:17.964040       1 node_controller.go:419] Pool worker: node minmli020147-m2chk-worker-b-8bvpl.c.openshift-qe.internal: Reporting ready
I0201 08:38:21.776592       1 status.go:90] Pool worker: All nodes are updated with rendered-worker-9786b62fd62a373a02836c98d9e9dbdd
I0201 08:44:01.727446       1 container_runtime_config_controller.go:769] Applied ImageConfig cluster on MachineConfigPool master
I0201 08:44:01.827633       1 container_runtime_config_controller.go:769] Applied ImageConfig cluster on MachineConfigPool worker


the line"E0201 08:12:36.455829       1 render_controller.go:203] machineconfig has changed controller, not allowed." matched the time of creation of ctr3.
Btw, when I first created ctr3, I didn't set correct MachineConfigPool label, so generate the error "could not find any MachineConfigPool set for ContainerRuntimeConfig set-loglevel", so I deleted it, and then add correct MachineConfigPool label, and created ctr3
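For context, a ctrcfg is matched to a pool by labeling the MachineConfigPool so that the CR's machineConfigPoolSelector selects it; for the set-loglevel CR shown above that would be something along the lines of (label key and value taken from its selector in this comment):

$ oc label machineconfigpool worker custom-loglevel=debug

Without that label on the pool, the controller keeps logging the "could not find any MachineConfigPool set for ContainerRuntimeConfig set-loglevel" error seen above.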

Comment 10 Urvashi Mohnani 2021-02-01 20:06:52 UTC
Hi Min,

I tried the order you mentioned, ctr-1 (pidsLimit: 2048), ctr-2 (overlaySize: 3G), ctr-3 (logLevel: debug), and the controller generated 3 MCs as expected.

99-worker-generated-containerruntime               a060922f97f16ea7c7d60efe4f92b34c2e5b4ec0   3.2.0             47m
99-worker-generated-containerruntime-1             a060922f97f16ea7c7d60efe4f92b34c2e5b4ec0   3.2.0             24m
99-worker-generated-containerruntime-2             a060922f97f16ea7c7d60efe4f92b34c2e5b4ec0   3.2.0             9s

Name:         ctr-test
Namespace:    
Labels:       <none>
Annotations:  <none>
API Version:  machineconfiguration.openshift.io/v1
Kind:         ContainerRuntimeConfig
Metadata:
  Creation Timestamp:  2021-02-01T18:29:02Z
  Finalizers:
    99-worker-generated-containerruntime
  Generation:  1
  Managed Fields:
    API Version:  machineconfiguration.openshift.io/v1
    Fields Type:  FieldsV1
    fieldsV1:
      f:spec:
        .:
        f:containerRuntimeConfig:
          .:
          f:pidsLimit:
        f:machineConfigPoolSelector:
          .:
          f:matchLabels:
            .:
            f:custom-crio:
    Manager:      kubectl-create
    Operation:    Update
    Time:         2021-02-01T18:29:02Z
    API Version:  machineconfiguration.openshift.io/v1
    Fields Type:  FieldsV1
    fieldsV1:
      f:metadata:
        f:finalizers:
          .:
          v:"99-worker-generated-containerruntime":
      f:spec:
        f:containerRuntimeConfig:
          f:logSizeMax:
          f:overlaySize:
      f:status:
        .:
        f:conditions:
        f:observedGeneration:
    Manager:         machine-config-controller
    Operation:       Update
    Time:            2021-02-01T18:29:02Z
  Resource Version:  35013
  Self Link:         /apis/machineconfiguration.openshift.io/v1/containerruntimeconfigs/ctr-test
  UID:               ed246c1f-60b1-475f-a10f-1bcf36fb00e3
Spec:
  Container Runtime Config:
    Pids Limit:  2048
  Machine Config Pool Selector:
    Match Labels:
      Custom - Crio:  test
Status:
  Conditions:
    Last Transition Time:  2021-02-01T18:29:02Z
    Message:               Success
    Status:                True
    Type:                  Success
  Observed Generation:     1
Events:                    <none>


Name:         ctr-test-1
Namespace:    
Labels:       <none>
Annotations:  machineconfiguration.openshift.io/mc-name-suffix: 1
API Version:  machineconfiguration.openshift.io/v1
Kind:         ContainerRuntimeConfig
Metadata:
  Creation Timestamp:  2021-02-01T18:52:23Z
  Finalizers:
    99-worker-generated-containerruntime-1
  Generation:  2
  Managed Fields:
    API Version:  machineconfiguration.openshift.io/v1
    Fields Type:  FieldsV1
    fieldsV1:
      f:spec:
        .:
        f:containerRuntimeConfig:
          .:
          f:overlaySize:
        f:machineConfigPoolSelector:
          .:
          f:matchLabels:
            .:
            f:custom-crio:
    Manager:      kubectl-create
    Operation:    Update
    Time:         2021-02-01T18:52:23Z
    API Version:  machineconfiguration.openshift.io/v1
    Fields Type:  FieldsV1
    fieldsV1:
      f:metadata:
        f:annotations:
          .:
          f:machineconfiguration.openshift.io/mc-name-suffix:
        f:finalizers:
          .:
          v:"99-worker-generated-containerruntime-1":
      f:spec:
        f:containerRuntimeConfig:
          f:logSizeMax:
      f:status:
        .:
        f:conditions:
        f:observedGeneration:
    Manager:         machine-config-controller
    Operation:       Update
    Time:            2021-02-01T18:52:23Z
  Resource Version:  44349
  Self Link:         /apis/machineconfiguration.openshift.io/v1/containerruntimeconfigs/ctr-test-1
  UID:               e1ceeedd-d3e5-4cc8-892e-e0eda27f7600
Spec:
  Container Runtime Config:
    Log Size Max:  0
    Overlay Size:  2G
  Machine Config Pool Selector:
    Match Labels:
      Custom - Crio:  test
Status:
  Conditions:
    Last Transition Time:  2021-02-01T18:52:23Z
    Message:               Success
    Status:                True
    Type:                  Success
  Observed Generation:     2
Events:                    <none>


Name:         ctr-test-2
Namespace:    
Labels:       <none>
Annotations:  machineconfiguration.openshift.io/mc-name-suffix: 2
API Version:  machineconfiguration.openshift.io/v1
Kind:         ContainerRuntimeConfig
Metadata:
  Creation Timestamp:  2021-02-01T19:16:26Z
  Finalizers:
    99-worker-generated-containerruntime-2
  Generation:  2
  Managed Fields:
    API Version:  machineconfiguration.openshift.io/v1
    Fields Type:  FieldsV1
    fieldsV1:
      f:spec:
        .:
        f:containerRuntimeConfig:
          .:
          f:logLevel:
        f:machineConfigPoolSelector:
          .:
          f:matchLabels:
            .:
            f:custom-crio:
    Manager:      kubectl-create
    Operation:    Update
    Time:         2021-02-01T19:16:26Z
    API Version:  machineconfiguration.openshift.io/v1
    Fields Type:  FieldsV1
    fieldsV1:
      f:metadata:
        f:annotations:
          .:
          f:machineconfiguration.openshift.io/mc-name-suffix:
        f:finalizers:
          .:
          v:"99-worker-generated-containerruntime-2":
      f:spec:
        f:containerRuntimeConfig:
          f:logSizeMax:
          f:overlaySize:
      f:status:
        .:
        f:conditions:
        f:observedGeneration:
    Manager:         machine-config-controller
    Operation:       Update
    Time:            2021-02-01T19:16:27Z
  Resource Version:  53465
  Self Link:         /apis/machineconfiguration.openshift.io/v1/containerruntimeconfigs/ctr-test-2
  UID:               22803757-e4db-4ed6-8df4-a191a78e5739
Spec:
  Container Runtime Config:
    Log Level:     debug
    Log Size Max:  0
    Overlay Size:  0
  Machine Config Pool Selector:
    Match Labels:
      Custom - Crio:  test
Status:
  Conditions:
    Last Transition Time:  2021-02-01T19:16:27Z
    Message:               Success
    Status:                True
    Type:                  Success
  Observed Generation:     2
Events:                    <none>


I am not entirely sure why it isn't working for you; maybe try a different overlaySize value?

Comment 11 MinLi 2021-02-04 11:41:25 UTC
Hi, Urvashi Mohnani

Unfortunately I reproduced this issue on version 4.7.0-0.nightly-2021-02-02-223803.
This time I tried an overlaySize of 3G in ctr-2 and kept everything else the same,
i.e. ctr-1 (pidsLimit: 4095) -> ctr-2 (overlaySize: 3G) -> ctr-3 (logLevel: debug), but ctr-3 still did not generate a matching MC (such as 99-worker-generated-containerruntime-2); instead a new rendered MC was generated that synced the change from ctr-3 to the nodes.

And I think the following error message in the log is abnormal:
E0204 11:14:29.162268       1 render_controller.go:203] machineconfig has changed controller, not allowed.

You can refer to the whole log in comment 9; they are similar.

Comment 12 Urvashi Mohnani 2021-02-04 21:52:52 UTC
Hi Min,

I am still unable to reproduce the issue you are seeing. I tried again on Server Version: 4.7.0-fc.5 and this time went a step further and created more ctrcfgs; I got an MC for each of them:

1) pidsLimit: 2048
2) overlaySize: 3G
3) logLevel: debug
4) overlaySize: 2G

99-worker-generated-containerruntime               a060922f97f16ea7c7d60efe4f92b34c2e5b4ec0   3.2.0             79m
99-worker-generated-containerruntime-1             a060922f97f16ea7c7d60efe4f92b34c2e5b4ec0   3.2.0             18m
99-worker-generated-containerruntime-2             a060922f97f16ea7c7d60efe4f92b34c2e5b4ec0   3.2.0             10m
99-worker-generated-containerruntime-3             a060922f97f16ea7c7d60efe4f92b34c2e5b4ec0   3.2.0             13s

I used 2 different labels this time and alternated the labels across the ctrcfgs. An MC was created for each of the ctrcfgs.
I also do not see the "E0204 11:14:29.162268       1 render_controller.go:203] machineconfig has changed controller, not allowed." error in my MCC logs. The controller should not be changing, so it is weird that you are seeing this error. Can you try this test again, and if you hit the issue, grab the must-gather logs and preserve the environment so we can poke around and see what is causing that odd controller change.

My cluster is running on AWS, is there something specific/different about your environment?
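For completeness, the must-gather logs requested above are typically collected with the following (the --dest-dir flag is optional):

$ oc adm must-gather --dest-dir=./must-gather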

Comment 13 MinLi 2021-02-05 03:42:10 UTC
Hi, Urvashi Mohnani

you said "I used 2 different labels as well this time and alternated the labels in the ctrcfg", maybe it's the key point, I used an unique label in each ctrcfg, so if I created 3 ctrcfg, then I will add 3 different labels to mcp. 

Also, I will try again and upload the must-gather log. And you can test with my method.
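As a sketch of that method, each ctrcfg selects the pool through its own label, so the worker MCP ends up carrying one label per ctrcfg (custom-crio-overlay and custom-loglevel are the keys from comment 9; the first label is only illustrative, since the pidsLimit CR's selector is not shown in this report):

$ oc label machineconfigpool worker custom-pids-limit=set-pids-limit   # illustrative label for the pidsLimit ctrcfg
$ oc label machineconfigpool worker custom-crio-overlay=overlay-size
$ oc label machineconfigpool worker custom-loglevel=debug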

Comment 14 MinLi 2021-02-05 05:06:38 UTC
Created attachment 1755176 [details]
ctrcfg must-gather

Comment 15 MinLi 2021-02-05 05:09:11 UTC
Hi, Urvashi Mohnani
I am leaving my test cluster up for you to debug: https://mastern-jenkins-csb-openshift-qe.cloud.paas.psi.redhat.com/job/Launch%20Environment%20Flexy/137480/artifact/workdir/install-dir/auth/kubeconfig/*view*/

I am not sure when this cluster will be removed, so please debug it as soon as possible, thanks.

Comment 16 Urvashi Mohnani 2021-02-12 03:11:05 UTC
Hi Min,

Thanks for keeping that cluster around. I wasn't able to reproduce the issue on your cluster, and I tried another cluster and couldn't reproduce it there either. I have created a release image with more logging in the ctrcfg controller; can you spin up a cluster with it and see if you hit the issue again? If you do, please preserve the MCC logs and, if possible, leave the cluster up for me to take a look.

The release image is quay.io/umohnani8/4.7-with-logs.

Thanks!

Comment 17 MinLi 2021-02-19 10:27:57 UTC
Hi, Urvashi 
I reproduced the issue with the image you provided; please refer to: https://mastern-jenkins-csb-openshift-qe.cloud.paas.psi.redhat.com/job/Launch%20Environment%20Flexy/139411/artifact/workdir/install-dir/auth/kubeconfig/*view*/

Comment 18 MinLi 2021-02-19 10:54:10 UTC
Created attachment 1758128 [details]
mcc log

Attaching the MCC log.

Comment 23 MinLi 2021-03-22 10:43:58 UTC
Verified on version 4.8.0-0.nightly-2021-03-21-224928.

Comment 26 errata-xmlrpc 2021-07-27 22:32:23 UTC
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA.

For information on the advisory (Moderate: OpenShift Container Platform 4.8.2 bug fix and security update), and where to find the updated files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2021:2438

