Description of problem: If we create two ctrcfg CRs, CR-1 and CR-2, and then delete CR-2, the cluster rolls back to the default settings rather than to the settings introduced by CR-1. This is a design flaw: we create only one MC for all ctrcfg CRs and overwrite it whenever a CR is updated or a new one is added. We need to change this to create a new MC for each CR that is added, so that when one CR is deleted we can roll back to the configuration of the CRs that remain.

Version-Release number of selected component (if applicable):

How reproducible: Always
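To make the requested behavior concrete, here is a minimal sketch in Go (the MCO's language) of the per-CR MC naming scheme this bug asks for. nextGeneratedMCName is a hypothetical helper written for illustration, not the actual controller code:

package main

import (
	"fmt"
	"strconv"
)

// nextGeneratedMCName returns the name the controller would give to the MC
// backing a newly created ctrcfg under the scheme described in this bug:
// the first CR keeps the bare name, each later CR gets "-1", "-2", and so on.
func nextGeneratedMCName(base string, existing []string) string {
	maxSuffix := -1 // -1 means not even the bare (unsuffixed) name exists yet
	for _, name := range existing {
		if name == base {
			if maxSuffix < 0 {
				maxSuffix = 0
			}
			continue
		}
		var n int
		// Parse names of the form "<base>-<number>"; ignore everything else.
		if _, err := fmt.Sscanf(name, base+"-%d", &n); err == nil && n > maxSuffix {
			maxSuffix = n
		}
	}
	if maxSuffix < 0 {
		return base // first ctrcfg: keep the unsuffixed name
	}
	return base + "-" + strconv.Itoa(maxSuffix+1)
}

func main() {
	existing := []string{
		"99-worker-generated-containerruntime",
		"99-worker-generated-containerruntime-1",
	}
	// A third ctrcfg should get "99-worker-generated-containerruntime-2".
	fmt.Println(nextGeneratedMCName("99-worker-generated-containerruntime", existing))
}

Deleting the MC that belongs to one CR then leaves the MCs of the remaining CRs in place, which is what makes the rollback behave correctly.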
Not fixed on version 4.7.0-0.nightly-2021-01-25-160335.

1) I created three ctrcfg CRs: CR-1, CR-2, and CR-3. CR-1 and CR-2 had matching MC objects (MC-1 and MC-2), but CR-3 did not get any new MC; instead CR-3 shared CR-2's MC (MC-2). That is not reasonable; we should create a new MC for each ctrcfg according to https://github.com/openshift/machine-config-operator/pull/2310. And when I delete CR-3, the changes roll back to MC-1; when I delete CR-2, nothing happens; when I delete CR-1, the changes roll back to the defaults.

2) Also, not every ctrcfg CR's annotations show the suffix of the MC name. In my case, CR-1 does not show the suffix of the MC name, but CR-2 does.

$ oc get containerruntimeconfig set-pids-limit-master -o yaml
apiVersion: machineconfiguration.openshift.io/v1
kind: ContainerRuntimeConfig
metadata:
  creationTimestamp: "2021-01-26T08:20:41Z"
  finalizers:
  - 99-worker-generated-containerruntime
  generation: 1
  managedFields:

$ oc get containerruntimeconfig overlay-size -o yaml
apiVersion: machineconfiguration.openshift.io/v1
kind: ContainerRuntimeConfig
metadata:
  annotations:
    machineconfiguration.openshift.io/mc-name-suffix: "1"   # this line shows the suffix of the MC name
  creationTimestamp: "2021-01-26T08:34:39Z"
  finalizers:
  - 99-worker-generated-containerruntime-1
  generation: 2

3) Besides, I have a question: must ctrcfg CRs be deleted in the reverse order of their creation (first created, last deleted)? In a real customer scenario the user may delete any ctrcfg, regardless of when it was created.
The way this works is that the MC with the higher alphanumeric name gets higher priority, which is why deletions have to be in order; this is something we should make clear in the docs. If you create cr-1, cr-2, and cr-3, that creates mc, mc-1, and mc-2 respectively. mc-2 has higher priority than mc-1, which is why the config from the latest ctrcfg is rolled out to the nodes. You can delete cr-1, but that won't make a difference while cr-2 and cr-3 are still in place; the deletion does have to be in order. If cr-3 shared cr-2's config, no new MC would be created, as expected: we only create a new MC when a new change is detected and something new needs to be rolled out to the nodes, so this worked as designed. The reason not all ctrcfg CRs have a suffix annotation is that the MC created for the first ctrcfg has no suffix in its name, so the annotation is left empty; the next one gets "-1" as the suffix, and so forth. This was done so that upgrades from a previous version stay compatible and an existing ctrcfg does not need a name change, so that also works as expected. The user never needs to care about the suffix used by the MC, so it should make no difference. This looks like it passed QE; the next step is to make the docs clearer so users understand exactly how this works. Moving back to ON_QA.
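To illustrate the priority rule described above, here is a minimal sketch, assuming (as stated in this comment) that generated MCs are merged in ascending lexical name order so the highest-sorting name wins on any overlapping field. mergeByName and the map-of-settings representation are simplifications for illustration, not the MCO's actual rendering code:

package main

import (
	"fmt"
	"sort"
)

// mergeByName applies MC fragments in ascending lexical name order, so a
// later (higher-sorting) name overwrites earlier values for the same key.
// This mirrors the "higher alphanumeric MC wins" rule described above.
func mergeByName(fragments map[string]map[string]string) map[string]string {
	names := make([]string, 0, len(fragments))
	for name := range fragments {
		names = append(names, name)
	}
	sort.Strings(names) // "...containerruntime" < "...containerruntime-1" < "...-2"
	merged := map[string]string{}
	for _, name := range names {
		for k, v := range fragments[name] {
			merged[k] = v
		}
	}
	return merged
}

func main() {
	merged := mergeByName(map[string]map[string]string{
		"99-worker-generated-containerruntime":   {"pidsLimit": "4095"},
		"99-worker-generated-containerruntime-1": {"overlaySize": "10G"},
		"99-worker-generated-containerruntime-2": {"logLevel": "debug"},
	})
	// All three settings apply; on a conflicting key the "-2" MC would win,
	// which is why deleting CRs out of order does not change what is rolled out.
	fmt.Println(merged)
}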
(In reply to Urvashi Mohnani from comment #6)
> The way this works is that the MC with the higher alphanumeric name gets
> higher priority, which is why deletions have to be in order; this is
> something we should make clear in the docs. If you create cr-1, cr-2, and
> cr-3, that creates mc, mc-1, and mc-2 respectively.

@Urvashi Mohnani, in my test steps, if I create cr-1, cr-2, and cr-3, only mc and mc-1 are created, not mc-2. And I made a different change in every CR, so this is not expected:

cr-1: pidsLimit: 4095, with MC name 99-worker-generated-containerruntime
cr-2: overlaySize: 10G, with MC name 99-worker-generated-containerruntime-1
cr-3: logLevel: debug, does not create any new MC, but carries MC name 99-worker-generated-containerruntime-1 (I think it should be 99-worker-generated-containerruntime-2)
Hi Min, I tested this out on the latest nightly build 4.7.0-0.nightly-2021-01-28-203708 and it works as expected for me. I created 3 different ctrcfg CRs and got 3 MCs as expected.

➜ ~ oc get ctrcfg
NAME         AGE
ctr-test     24m
ctr-test-1   15m
ctr-test-2   5m45s

➜ ~ oc get mc | grep container
01-master-container-runtime              b5c5119de007945b6fe6fb215db3b8e2ceb12511   3.2.0   57m
01-worker-container-runtime              b5c5119de007945b6fe6fb215db3b8e2ceb12511   3.2.0   57m
99-worker-generated-containerruntime     b5c5119de007945b6fe6fb215db3b8e2ceb12511   3.2.0   26m
99-worker-generated-containerruntime-1   b5c5119de007945b6fe6fb215db3b8e2ceb12511   3.2.0   17m
99-worker-generated-containerruntime-2   b5c5119de007945b6fe6fb215db3b8e2ceb12511   3.2.0   7m26s

Each of my ctrcfgs was different, just like yours. Please test it again, and if the change doesn't roll out, check the machine-config-controller logs for any errors or issues.
Hi Urvashi, I tested the fix on the same version you mentioned (4.7.0-0.nightly-2021-01-28-203708), and I can still reproduce the failing case. The reproducer depends on the order in which the specific ctrcfgs are created. For example, take ctr-1 (pidsLimit: 4095), ctr-2 (overlaySize: 10G), ctr-3 (logLevel: debug).

If I create the ctrcfgs in the order ctr-1 (pidsLimit: 4095) -> ctr-3 (logLevel: debug) -> ctr-2 (overlaySize: 10G), I get the expected result, that is:

99-worker-generated-containerruntime     b5c5119de007945b6fe6fb215db3b8e2ceb12511   3.2.0   33m
99-worker-generated-containerruntime-1   b5c5119de007945b6fe6fb215db3b8e2ceb12511   3.2.0   23m
99-worker-generated-containerruntime-2   b5c5119de007945b6fe6fb215db3b8e2ceb12511   3.2.0   10m

But if I create the ctrcfgs in the order ctr-1 (pidsLimit: 4095) -> ctr-2 (overlaySize: 10G) -> ctr-3 (logLevel: debug), then ctr-3 does not generate a matching MC (such as 99-worker-generated-containerruntime-2), although I do see a new rendered MC generated to sync the change from ctr-3.

The ctr-2 description:

$ oc get containerruntimeconfig overlay-size -o yaml
apiVersion: machineconfiguration.openshift.io/v1
kind: ContainerRuntimeConfig
metadata:
  annotations:
    machineconfiguration.openshift.io/mc-name-suffix: "1"
  creationTimestamp: "2021-02-01T07:54:07Z"
  finalizers:
  - 99-worker-generated-containerruntime-1
  generation: 2
  managedFields:
  - apiVersion: machineconfiguration.openshift.io/v1
    fieldsType: FieldsV1
    fieldsV1:
      f:metadata:
        f:annotations:
          .: {}
          f:machineconfiguration.openshift.io/mc-name-suffix: {}
        f:finalizers:
          .: {}
          v:"99-worker-generated-containerruntime-1": {}
      f:spec:
        f:containerRuntimeConfig:
          f:logSizeMax: {}
      f:status:
        .: {}
        f:conditions: {}
        f:observedGeneration: {}
    manager: machine-config-controller
    operation: Update
    time: "2021-02-01T07:54:07Z"
  - apiVersion: machineconfiguration.openshift.io/v1
    fieldsType: FieldsV1
    fieldsV1:
      f:spec:
        .: {}
        f:containerRuntimeConfig:
          .: {}
          f:overlaySize: {}
        f:machineConfigPoolSelector:
          .: {}
          f:matchLabels:
            .: {}
            f:custom-crio-overlay: {}
    manager: oc
    operation: Update
    time: "2021-02-01T07:54:07Z"
  name: overlay-size
  resourceVersion: "102303"
  selfLink: /apis/machineconfiguration.openshift.io/v1/containerruntimeconfigs/overlay-size
  uid: 08785247-0fdf-4de4-ade9-9c959be268cf
spec:
  containerRuntimeConfig:
    logSizeMax: "0"
    overlaySize: 10G
  machineConfigPoolSelector:
    matchLabels:
      custom-crio-overlay: overlay-size
status:
  conditions:
  - lastTransitionTime: "2021-02-01T07:54:07Z"
    message: Success
    status: "True"
    type: Success
  observedGeneration: 2

The ctr-3 description:

$ oc get containerruntimeconfig set-loglevel -o yaml
apiVersion: machineconfiguration.openshift.io/v1
kind: ContainerRuntimeConfig
metadata:
  creationTimestamp: "2021-02-01T08:12:36Z"
  finalizers:
  - 99-worker-generated-containerruntime-1
  generation: 1
  managedFields:
  - apiVersion: machineconfiguration.openshift.io/v1
    fieldsType: FieldsV1
    fieldsV1:
      f:metadata:
        f:finalizers:
          .: {}
          v:"99-worker-generated-containerruntime-1": {}
      f:spec:
        f:containerRuntimeConfig:
          f:logSizeMax: {}
          f:overlaySize: {}
      f:status:
        .: {}
        f:conditions: {}
        f:observedGeneration: {}
    manager: machine-config-controller
    operation: Update
    time: "2021-02-01T08:12:36Z"
  - apiVersion: machineconfiguration.openshift.io/v1
    fieldsType: FieldsV1
    fieldsV1:
      f:spec:
        .: {}
        f:containerRuntimeConfig:
          .: {}
          f:logLevel: {}
        f:machineConfigPoolSelector:
          .: {}
          f:matchLabels:
            .: {}
            f:custom-loglevel: {}
    manager: oc
    operation: Update
    time: "2021-02-01T08:12:36Z"
  name: set-loglevel
  resourceVersion: "110349"
  selfLink: /apis/machineconfiguration.openshift.io/v1/containerruntimeconfigs/set-loglevel
  uid: 3c8ef065-8ecc-4608-9b5b-6b053aca6347
spec:
  containerRuntimeConfig:
    logLevel: debug
  machineConfigPoolSelector:
    matchLabels:
      custom-loglevel: debug
status:
  conditions:
  - lastTransitionTime: "2021-02-01T08:12:36Z"
    message: Success
    status: "True"
    type: Success
  observedGeneration: 1

$ oc get mc
NAME                                               GENERATEDBYCONTROLLER                      IGNITIONVERSION   AGE
00-master                                          b5c5119de007945b6fe6fb215db3b8e2ceb12511   3.2.0             4h58m
00-worker                                          b5c5119de007945b6fe6fb215db3b8e2ceb12511   3.2.0             4h58m
01-master-container-runtime                        b5c5119de007945b6fe6fb215db3b8e2ceb12511   3.2.0             4h58m
01-master-kubelet                                  b5c5119de007945b6fe6fb215db3b8e2ceb12511   3.2.0             4h58m
01-worker-container-runtime                        b5c5119de007945b6fe6fb215db3b8e2ceb12511   3.2.0             4h58m
01-worker-kubelet                                  b5c5119de007945b6fe6fb215db3b8e2ceb12511   3.2.0             4h58m
99-master-generated-registries                     b5c5119de007945b6fe6fb215db3b8e2ceb12511   3.2.0             4h58m
99-master-ssh                                                                                 3.1.0             5h2m
99-worker-generated-containerruntime               b5c5119de007945b6fe6fb215db3b8e2ceb12511   3.2.0             57m
99-worker-generated-containerruntime-1             b5c5119de007945b6fe6fb215db3b8e2ceb12511   3.2.0             47m
99-worker-generated-registries                     b5c5119de007945b6fe6fb215db3b8e2ceb12511   3.2.0             4h58m
99-worker-ssh                                                                                 3.1.0             5h2m
rendered-master-026e7825f35e20927a8d21965cd9b231   b5c5119de007945b6fe6fb215db3b8e2ceb12511   3.2.0             4h58m
rendered-master-2af96245bd8bf03da524d65fbfc95d23   b5c5119de007945b6fe6fb215db3b8e2ceb12511   3.2.0             4h35m
rendered-worker-1bd8eaa86b66e4e141086c17f4cd439b   b5c5119de007945b6fe6fb215db3b8e2ceb12511   3.2.0             4h35m
rendered-worker-21352c2f273d829f15ab85dfe6a81acd   b5c5119de007945b6fe6fb215db3b8e2ceb12511   3.2.0             56m
rendered-worker-5f7846c808113744345e7ad4d09da393   b5c5119de007945b6fe6fb215db3b8e2ceb12511   3.2.0             47m
rendered-worker-9786b62fd62a373a02836c98d9e9dbdd   b5c5119de007945b6fe6fb215db3b8e2ceb12511   3.2.0             11m   (this one syncs the change from ctr-3)
rendered-worker-aed2e4875537b5df3396f31821a75b63   b5c5119de007945b6fe6fb215db3b8e2ceb12511   3.2.0             4h58m

machine-config-controller log:

I0201 07:44:27.431907 1 container_runtime_config_controller.go:617] Applied ContainerRuntimeConfig set-pids-limit-master on MachineConfigPool worker
I0201 07:44:32.478117 1 render_controller.go:498] Generated machineconfig rendered-worker-21352c2f273d829f15ab85dfe6a81acd from 6 configs: [{MachineConfig 00-worker machineconfiguration.openshift.io/v1 } {MachineConfig 01-worker-container-runtime machineconfiguration.openshift.io/v1 } {MachineConfig 01-worker-kubelet machineconfiguration.openshift.io/v1 } {MachineConfig 99-worker-generated-containerruntime machineconfiguration.openshift.io/v1 } {MachineConfig 99-worker-generated-registries machineconfiguration.openshift.io/v1 } {MachineConfig 99-worker-ssh machineconfiguration.openshift.io/v1 }]
I0201 07:44:32.492590 1 render_controller.go:522] Pool worker: now targeting: rendered-worker-21352c2f273d829f15ab85dfe6a81acd
I0201 07:44:37.492894 1 node_controller.go:414] Pool worker: 2 candidate nodes for update, capacity: 1
I0201 07:44:37.492921 1 node_controller.go:414] Pool worker: Setting node minmli020147-m2chk-worker-a-kcqq4.c.openshift-qe.internal target to rendered-worker-21352c2f273d829f15ab85dfe6a81acd
I0201 07:44:37.520361 1 node_controller.go:419] Pool worker: node minmli020147-m2chk-worker-a-kcqq4.c.openshift-qe.internal: changed annotation machineconfiguration.openshift.io/desiredConfig = rendered-worker-21352c2f273d829f15ab85dfe6a81acd
I0201 07:44:37.521047 1 event.go:282] Event(v1.ObjectReference{Kind:"MachineConfigPool", Namespace:"", Name:"worker", UID:"f44dec88-8c15-4293-9105-9fc3f4534e64", APIVersion:"machineconfiguration.openshift.io/v1", ResourceVersion:"97691", FieldPath:""}): type: 'Normal' reason: 'SetDesiredConfig' Targeted node minmli020147-m2chk-worker-a-kcqq4.c.openshift-qe.internal to config rendered-worker-21352c2f273d829f15ab85dfe6a81acd
E0201 07:44:37.565413 1 render_controller.go:460] Error updating MachineConfigPool worker: Operation cannot be fulfilled on machineconfigpools.machineconfiguration.openshift.io "worker": the object has been modified; please apply your changes to the latest version and try again
I0201 07:44:37.565449 1 render_controller.go:377] Error syncing machineconfigpool worker: Operation cannot be fulfilled on machineconfigpools.machineconfiguration.openshift.io "worker": the object has been modified; please apply your changes to the latest version and try again
I0201 07:44:38.537612 1 node_controller.go:419] Pool worker: node minmli020147-m2chk-worker-a-kcqq4.c.openshift-qe.internal: changed annotation machineconfiguration.openshift.io/state = Working
I0201 07:44:38.632649 1 node_controller.go:419] Pool worker: node minmli020147-m2chk-worker-a-kcqq4.c.openshift-qe.internal: Reporting unready: node minmli020147-m2chk-worker-a-kcqq4.c.openshift-qe.internal is reporting Unschedulable
I0201 07:46:32.952827 1 node_controller.go:419] Pool worker: node minmli020147-m2chk-worker-a-kcqq4.c.openshift-qe.internal: Reporting unready: node minmli020147-m2chk-worker-a-kcqq4.c.openshift-qe.internal is reporting OutOfDisk=Unknown
I0201 07:47:02.994360 1 node_controller.go:419] Pool worker: node minmli020147-m2chk-worker-a-kcqq4.c.openshift-qe.internal: Reporting unready: node minmli020147-m2chk-worker-a-kcqq4.c.openshift-qe.internal is reporting NotReady=False
I0201 07:47:12.133838 1 node_controller.go:419] Pool worker: node minmli020147-m2chk-worker-a-kcqq4.c.openshift-qe.internal: Completed update to rendered-worker-21352c2f273d829f15ab85dfe6a81acd
I0201 07:47:13.028507 1 node_controller.go:419] Pool worker: node minmli020147-m2chk-worker-a-kcqq4.c.openshift-qe.internal: Reporting ready
I0201 07:47:17.134255 1 node_controller.go:414] Pool worker: 1 candidate nodes for update, capacity: 1
I0201 07:47:17.134284 1 node_controller.go:414] Pool worker: Setting node minmli020147-m2chk-worker-b-8bvpl.c.openshift-qe.internal target to rendered-worker-21352c2f273d829f15ab85dfe6a81acd
I0201 07:47:17.158404 1 event.go:282] Event(v1.ObjectReference{Kind:"MachineConfigPool", Namespace:"", Name:"worker", UID:"f44dec88-8c15-4293-9105-9fc3f4534e64", APIVersion:"machineconfiguration.openshift.io/v1", ResourceVersion:"97782", FieldPath:""}): type: 'Normal' reason: 'SetDesiredConfig' Targeted node minmli020147-m2chk-worker-b-8bvpl.c.openshift-qe.internal to config rendered-worker-21352c2f273d829f15ab85dfe6a81acd
I0201 07:47:17.159903 1 node_controller.go:419] Pool worker: node minmli020147-m2chk-worker-b-8bvpl.c.openshift-qe.internal: changed annotation machineconfiguration.openshift.io/desiredConfig = rendered-worker-21352c2f273d829f15ab85dfe6a81acd
I0201 07:47:18.184495 1 node_controller.go:419] Pool worker: node minmli020147-m2chk-worker-b-8bvpl.c.openshift-qe.internal: changed annotation machineconfiguration.openshift.io/state = Working
I0201 07:47:18.274304 1 node_controller.go:419] Pool worker: node minmli020147-m2chk-worker-b-8bvpl.c.openshift-qe.internal: Reporting unready: node minmli020147-m2chk-worker-b-8bvpl.c.openshift-qe.internal is reporting Unschedulable
E0201 07:47:22.259284 1 render_controller.go:460] Error updating MachineConfigPool worker: Operation cannot be fulfilled on machineconfigpools.machineconfiguration.openshift.io "worker": the object has been modified; please apply your changes to the latest version and try again
I0201 07:47:22.259392 1 render_controller.go:377] Error syncing machineconfigpool worker: Operation cannot be fulfilled on machineconfigpools.machineconfiguration.openshift.io "worker": the object has been modified; please apply your changes to the latest version and try again
I0201 07:48:55.451535 1 container_runtime_config_controller.go:769] Applied ImageConfig cluster on MachineConfigPool master
I0201 07:48:55.511049 1 container_runtime_config_controller.go:769] Applied ImageConfig cluster on MachineConfigPool worker
I0201 07:49:43.731033 1 node_controller.go:419] Pool worker: node minmli020147-m2chk-worker-b-8bvpl.c.openshift-qe.internal: Reporting unready: node minmli020147-m2chk-worker-b-8bvpl.c.openshift-qe.internal is reporting OutOfDisk=Unknown
I0201 07:50:07.934037 1 node_controller.go:419] Pool worker: node minmli020147-m2chk-worker-b-8bvpl.c.openshift-qe.internal: Reporting unready: node minmli020147-m2chk-worker-b-8bvpl.c.openshift-qe.internal is reporting NotReady=False
I0201 07:50:17.495604 1 node_controller.go:419] Pool worker: node minmli020147-m2chk-worker-b-8bvpl.c.openshift-qe.internal: Completed update to rendered-worker-21352c2f273d829f15ab85dfe6a81acd
I0201 07:50:17.966983 1 node_controller.go:419] Pool worker: node minmli020147-m2chk-worker-b-8bvpl.c.openshift-qe.internal: Reporting ready
I0201 07:50:22.495873 1 status.go:90] Pool worker: All nodes are updated with rendered-worker-21352c2f273d829f15ab85dfe6a81acd
I0201 07:54:07.859784 1 container_runtime_config_controller.go:617] Applied ContainerRuntimeConfig overlay-size on MachineConfigPool worker
I0201 07:54:12.903079 1 render_controller.go:498] Generated machineconfig rendered-worker-5f7846c808113744345e7ad4d09da393 from 7 configs: [{MachineConfig 00-worker machineconfiguration.openshift.io/v1 } {MachineConfig 01-worker-container-runtime machineconfiguration.openshift.io/v1 } {MachineConfig 01-worker-kubelet machineconfiguration.openshift.io/v1 } {MachineConfig 99-worker-generated-containerruntime machineconfiguration.openshift.io/v1 } {MachineConfig 99-worker-generated-containerruntime-1 machineconfiguration.openshift.io/v1 } {MachineConfig 99-worker-generated-registries machineconfiguration.openshift.io/v1 } {MachineConfig 99-worker-ssh machineconfiguration.openshift.io/v1 }]
I0201 07:54:12.917685 1 render_controller.go:522] Pool worker: now targeting: rendered-worker-5f7846c808113744345e7ad4d09da393
I0201 07:54:17.917496 1 node_controller.go:414] Pool worker: 2 candidate nodes for update, capacity: 1
I0201 07:54:17.917523 1 node_controller.go:414] Pool worker: Setting node minmli020147-m2chk-worker-a-kcqq4.c.openshift-qe.internal target to rendered-worker-5f7846c808113744345e7ad4d09da393
I0201 07:54:17.942896 1 event.go:282] Event(v1.ObjectReference{Kind:"MachineConfigPool", Namespace:"", Name:"worker", UID:"f44dec88-8c15-4293-9105-9fc3f4534e64", APIVersion:"machineconfiguration.openshift.io/v1", ResourceVersion:"102324", FieldPath:""}): type: 'Normal' reason: 'SetDesiredConfig' Targeted node minmli020147-m2chk-worker-a-kcqq4.c.openshift-qe.internal to config rendered-worker-5f7846c808113744345e7ad4d09da393
I0201 07:54:17.948025 1 node_controller.go:419] Pool worker: node minmli020147-m2chk-worker-a-kcqq4.c.openshift-qe.internal: changed annotation machineconfiguration.openshift.io/desiredConfig = rendered-worker-5f7846c808113744345e7ad4d09da393
E0201 07:54:18.006364 1 render_controller.go:460] Error updating MachineConfigPool worker: Operation cannot be fulfilled on machineconfigpools.machineconfiguration.openshift.io "worker": the object has been modified; please apply your changes to the latest version and try again
I0201 07:54:18.006479 1 render_controller.go:377] Error syncing machineconfigpool worker: Operation cannot be fulfilled on machineconfigpools.machineconfiguration.openshift.io "worker": the object has been modified; please apply your changes to the latest version and try again
I0201 07:54:18.963520 1 node_controller.go:419] Pool worker: node minmli020147-m2chk-worker-a-kcqq4.c.openshift-qe.internal: changed annotation machineconfiguration.openshift.io/state = Working
I0201 07:54:19.050062 1 node_controller.go:419] Pool worker: node minmli020147-m2chk-worker-a-kcqq4.c.openshift-qe.internal: Reporting unready: node minmli020147-m2chk-worker-a-kcqq4.c.openshift-qe.internal is reporting Unschedulable
I0201 07:56:24.240933 1 node_controller.go:419] Pool worker: node minmli020147-m2chk-worker-a-kcqq4.c.openshift-qe.internal: Reporting unready: node minmli020147-m2chk-worker-a-kcqq4.c.openshift-qe.internal is reporting OutOfDisk=Unknown
I0201 07:56:49.031344 1 node_controller.go:419] Pool worker: node minmli020147-m2chk-worker-a-kcqq4.c.openshift-qe.internal: Reporting unready: node minmli020147-m2chk-worker-a-kcqq4.c.openshift-qe.internal is reporting NotReady=False
I0201 07:56:57.971240 1 node_controller.go:419] Pool worker: node minmli020147-m2chk-worker-a-kcqq4.c.openshift-qe.internal: Completed update to rendered-worker-5f7846c808113744345e7ad4d09da393
I0201 07:56:59.059012 1 node_controller.go:419] Pool worker: node minmli020147-m2chk-worker-a-kcqq4.c.openshift-qe.internal: Reporting ready
I0201 07:57:02.971714 1 node_controller.go:414] Pool worker: 1 candidate nodes for update, capacity: 1
I0201 07:57:02.971741 1 node_controller.go:414] Pool worker: Setting node minmli020147-m2chk-worker-b-8bvpl.c.openshift-qe.internal target to rendered-worker-5f7846c808113744345e7ad4d09da393
I0201 07:57:02.999368 1 event.go:282] Event(v1.ObjectReference{Kind:"MachineConfigPool", Namespace:"", Name:"worker", UID:"f44dec88-8c15-4293-9105-9fc3f4534e64", APIVersion:"machineconfiguration.openshift.io/v1", ResourceVersion:"102663", FieldPath:""}): type: 'Normal' reason: 'SetDesiredConfig' Targeted node minmli020147-m2chk-worker-b-8bvpl.c.openshift-qe.internal to config rendered-worker-5f7846c808113744345e7ad4d09da393
I0201 07:57:03.004447 1 node_controller.go:419] Pool worker: node minmli020147-m2chk-worker-b-8bvpl.c.openshift-qe.internal: changed annotation machineconfiguration.openshift.io/desiredConfig = rendered-worker-5f7846c808113744345e7ad4d09da393
I0201 07:57:04.024857 1 node_controller.go:419] Pool worker: node minmli020147-m2chk-worker-b-8bvpl.c.openshift-qe.internal: changed annotation machineconfiguration.openshift.io/state = Working
I0201 07:57:04.126107 1 node_controller.go:419] Pool worker: node minmli020147-m2chk-worker-b-8bvpl.c.openshift-qe.internal: Reporting unready: node minmli020147-m2chk-worker-b-8bvpl.c.openshift-qe.internal is reporting Unschedulable
E0201 07:57:08.176006 1 render_controller.go:460] Error updating MachineConfigPool worker: Operation cannot be fulfilled on machineconfigpools.machineconfiguration.openshift.io "worker": the object has been modified; please apply your changes to the latest version and try again
I0201 07:57:08.176047 1 render_controller.go:377] Error syncing machineconfigpool worker: Operation cannot be fulfilled on machineconfigpools.machineconfiguration.openshift.io "worker": the object has been modified; please apply your changes to the latest version and try again
I0201 07:59:19.692607 1 node_controller.go:419] Pool worker: node minmli020147-m2chk-worker-b-8bvpl.c.openshift-qe.internal: Reporting unready: node minmli020147-m2chk-worker-b-8bvpl.c.openshift-qe.internal is reporting OutOfDisk=Unknown
I0201 07:59:50.603328 1 node_controller.go:419] Pool worker: node minmli020147-m2chk-worker-b-8bvpl.c.openshift-qe.internal: Reporting unready: node minmli020147-m2chk-worker-b-8bvpl.c.openshift-qe.internal is reporting NotReady=False
I0201 07:59:59.338349 1 node_controller.go:419] Pool worker: node minmli020147-m2chk-worker-b-8bvpl.c.openshift-qe.internal: Completed update to rendered-worker-5f7846c808113744345e7ad4d09da393
I0201 08:00:00.637906 1 node_controller.go:419] Pool worker: node minmli020147-m2chk-worker-b-8bvpl.c.openshift-qe.internal: Reporting ready
I0201 08:00:04.338711 1 status.go:90] Pool worker: All nodes are updated with rendered-worker-5f7846c808113744345e7ad4d09da393
I0201 08:02:52.225053 1 container_runtime_config_controller.go:346] Error syncing containerruntimeconfig set-loglevel: could not find any MachineConfigPool set for ContainerRuntimeConfig set-loglevel
I0201 08:02:52.238149 1 container_runtime_config_controller.go:346] Error syncing containerruntimeconfig set-loglevel: could not find any MachineConfigPool set for ContainerRuntimeConfig set-loglevel
I0201 08:02:52.255877 1 container_runtime_config_controller.go:346] Error syncing containerruntimeconfig set-loglevel: could not find any MachineConfigPool set for ContainerRuntimeConfig set-loglevel
I0201 08:02:52.284233 1 container_runtime_config_controller.go:346] Error syncing containerruntimeconfig set-loglevel: could not find any MachineConfigPool set for ContainerRuntimeConfig set-loglevel
I0201 08:02:52.333317 1 container_runtime_config_controller.go:346] Error syncing containerruntimeconfig set-loglevel: could not find any MachineConfigPool set for ContainerRuntimeConfig set-loglevel
I0201 08:02:52.420960 1 container_runtime_config_controller.go:346] Error syncing containerruntimeconfig set-loglevel: could not find any MachineConfigPool set for ContainerRuntimeConfig set-loglevel
I0201 08:02:52.590602 1 container_runtime_config_controller.go:346] Error syncing containerruntimeconfig set-loglevel: could not find any MachineConfigPool set for ContainerRuntimeConfig set-loglevel
I0201 08:02:52.920007 1 container_runtime_config_controller.go:346] Error syncing containerruntimeconfig set-loglevel: could not find any MachineConfigPool set for ContainerRuntimeConfig set-loglevel
I0201 08:02:53.572082 1 container_runtime_config_controller.go:346] Error syncing containerruntimeconfig set-loglevel: could not find any MachineConfigPool set for ContainerRuntimeConfig set-loglevel
I0201 08:02:54.863409 1 container_runtime_config_controller.go:346] Error syncing containerruntimeconfig set-loglevel: could not find any MachineConfigPool set for ContainerRuntimeConfig set-loglevel
I0201 08:02:57.435071 1 container_runtime_config_controller.go:346] Error syncing containerruntimeconfig set-loglevel: could not find any MachineConfigPool set for ContainerRuntimeConfig set-loglevel
I0201 08:03:02.566978 1 container_runtime_config_controller.go:346] Error syncing containerruntimeconfig set-loglevel: could not find any MachineConfigPool set for ContainerRuntimeConfig set-loglevel
I0201 08:03:12.816869 1 container_runtime_config_controller.go:346] Error syncing containerruntimeconfig set-loglevel: could not find any MachineConfigPool set for ContainerRuntimeConfig set-loglevel
I0201 08:03:33.318292 1 container_runtime_config_controller.go:346] Error syncing containerruntimeconfig set-loglevel: could not find any MachineConfigPool set for ContainerRuntimeConfig set-loglevel
I0201 08:04:14.278564 1 container_runtime_config_controller.go:487] ContainerRuntimeConfig set-loglevel has been deleted
E0201 08:12:36.455829 1 render_controller.go:203] machineconfig has changed controller, not allowed.
I0201 08:12:36.476710 1 container_runtime_config_controller.go:617] Applied ContainerRuntimeConfig set-loglevel on MachineConfigPool worker
I0201 08:13:43.477939 1 container_runtime_config_controller.go:769] Applied ImageConfig cluster on MachineConfigPool master
I0201 08:13:43.536518 1 container_runtime_config_controller.go:769] Applied ImageConfig cluster on MachineConfigPool worker
I0201 08:25:11.826941 1 container_runtime_config_controller.go:769] Applied ImageConfig cluster on MachineConfigPool master
I0201 08:25:11.887347 1 container_runtime_config_controller.go:769] Applied ImageConfig cluster on MachineConfigPool worker
I0201 08:30:09.678226 1 render_controller.go:498] Generated machineconfig rendered-worker-9786b62fd62a373a02836c98d9e9dbdd from 7 configs: [{MachineConfig 00-worker machineconfiguration.openshift.io/v1 } {MachineConfig 01-worker-container-runtime machineconfiguration.openshift.io/v1 } {MachineConfig 01-worker-kubelet machineconfiguration.openshift.io/v1 } {MachineConfig 99-worker-generated-containerruntime machineconfiguration.openshift.io/v1 } {MachineConfig 99-worker-generated-containerruntime-1 machineconfiguration.openshift.io/v1 } {MachineConfig 99-worker-generated-registries machineconfiguration.openshift.io/v1 } {MachineConfig 99-worker-ssh machineconfiguration.openshift.io/v1 }]
I0201 08:30:09.692867 1 render_controller.go:522] Pool worker: now targeting: rendered-worker-9786b62fd62a373a02836c98d9e9dbdd
I0201 08:30:14.695273 1 node_controller.go:414] Pool worker: 2 candidate nodes for update, capacity: 1
I0201 08:30:14.695405 1 node_controller.go:414] Pool worker: Setting node minmli020147-m2chk-worker-a-kcqq4.c.openshift-qe.internal target to rendered-worker-9786b62fd62a373a02836c98d9e9dbdd
I0201 08:30:14.719956 1 node_controller.go:419] Pool worker: node minmli020147-m2chk-worker-a-kcqq4.c.openshift-qe.internal: changed annotation machineconfiguration.openshift.io/desiredConfig = rendered-worker-9786b62fd62a373a02836c98d9e9dbdd
I0201 08:30:14.723203 1 event.go:282] Event(v1.ObjectReference{Kind:"MachineConfigPool", Namespace:"", Name:"worker", UID:"f44dec88-8c15-4293-9105-9fc3f4534e64", APIVersion:"machineconfiguration.openshift.io/v1", ResourceVersion:"115301", FieldPath:""}): type: 'Normal' reason: 'SetDesiredConfig' Targeted node minmli020147-m2chk-worker-a-kcqq4.c.openshift-qe.internal to config rendered-worker-9786b62fd62a373a02836c98d9e9dbdd
E0201 08:30:14.774631 1 render_controller.go:460] Error updating MachineConfigPool worker: Operation cannot be fulfilled on machineconfigpools.machineconfiguration.openshift.io "worker": the object has been modified; please apply your changes to the latest version and try again
I0201 08:30:14.774688 1 render_controller.go:377] Error syncing machineconfigpool worker: Operation cannot be fulfilled on machineconfigpools.machineconfiguration.openshift.io "worker": the object has been modified; please apply your changes to the latest version and try again
I0201 08:30:15.411599 1 node_controller.go:419] Pool worker: node minmli020147-m2chk-worker-a-kcqq4.c.openshift-qe.internal: changed annotation machineconfiguration.openshift.io/state = Working
I0201 08:30:15.501899 1 node_controller.go:419] Pool worker: node minmli020147-m2chk-worker-a-kcqq4.c.openshift-qe.internal: Reporting unready: node minmli020147-m2chk-worker-a-kcqq4.c.openshift-qe.internal is reporting Unschedulable
I0201 08:32:20.412963 1 node_controller.go:419] Pool worker: node minmli020147-m2chk-worker-a-kcqq4.c.openshift-qe.internal: Reporting unready: node minmli020147-m2chk-worker-a-kcqq4.c.openshift-qe.internal is reporting OutOfDisk=Unknown
I0201 08:32:44.581308 1 node_controller.go:419] Pool worker: node minmli020147-m2chk-worker-a-kcqq4.c.openshift-qe.internal: Reporting unready: node minmli020147-m2chk-worker-a-kcqq4.c.openshift-qe.internal is reporting NotReady=False
I0201 08:32:53.672797 1 node_controller.go:419] Pool worker: node minmli020147-m2chk-worker-a-kcqq4.c.openshift-qe.internal: Completed update to rendered-worker-9786b62fd62a373a02836c98d9e9dbdd
I0201 08:32:54.617431 1 node_controller.go:419] Pool worker: node minmli020147-m2chk-worker-a-kcqq4.c.openshift-qe.internal: Reporting ready
I0201 08:32:58.673162 1 node_controller.go:414] Pool worker: 1 candidate nodes for update, capacity: 1
I0201 08:32:58.673192 1 node_controller.go:414] Pool worker: Setting node minmli020147-m2chk-worker-b-8bvpl.c.openshift-qe.internal target to rendered-worker-9786b62fd62a373a02836c98d9e9dbdd
I0201 08:32:58.698142 1 event.go:282] Event(v1.ObjectReference{Kind:"MachineConfigPool", Namespace:"", Name:"worker", UID:"f44dec88-8c15-4293-9105-9fc3f4534e64", APIVersion:"machineconfiguration.openshift.io/v1", ResourceVersion:"115312", FieldPath:""}): type: 'Normal' reason: 'SetDesiredConfig' Targeted node minmli020147-m2chk-worker-b-8bvpl.c.openshift-qe.internal to config rendered-worker-9786b62fd62a373a02836c98d9e9dbdd
I0201 08:32:58.699031 1 node_controller.go:419] Pool worker: node minmli020147-m2chk-worker-b-8bvpl.c.openshift-qe.internal: changed annotation machineconfiguration.openshift.io/desiredConfig = rendered-worker-9786b62fd62a373a02836c98d9e9dbdd
I0201 08:32:59.721398 1 node_controller.go:419] Pool worker: node minmli020147-m2chk-worker-b-8bvpl.c.openshift-qe.internal: changed annotation machineconfiguration.openshift.io/state = Working
I0201 08:32:59.819253 1 node_controller.go:419] Pool worker: node minmli020147-m2chk-worker-b-8bvpl.c.openshift-qe.internal: Reporting unready: node minmli020147-m2chk-worker-b-8bvpl.c.openshift-qe.internal is reporting Unschedulable
I0201 08:37:35.937394 1 node_controller.go:419] Pool worker: node minmli020147-m2chk-worker-b-8bvpl.c.openshift-qe.internal: Reporting unready: node minmli020147-m2chk-worker-b-8bvpl.c.openshift-qe.internal is reporting OutOfDisk=Unknown
I0201 08:38:07.935689 1 node_controller.go:419] Pool worker: node minmli020147-m2chk-worker-b-8bvpl.c.openshift-qe.internal: Reporting unready: node minmli020147-m2chk-worker-b-8bvpl.c.openshift-qe.internal is reporting NotReady=False
I0201 08:38:16.776181 1 node_controller.go:419] Pool worker: node minmli020147-m2chk-worker-b-8bvpl.c.openshift-qe.internal: Completed update to rendered-worker-9786b62fd62a373a02836c98d9e9dbdd
I0201 08:38:17.964040 1 node_controller.go:419] Pool worker: node minmli020147-m2chk-worker-b-8bvpl.c.openshift-qe.internal: Reporting ready
I0201 08:38:21.776592 1 status.go:90] Pool worker: All nodes are updated with rendered-worker-9786b62fd62a373a02836c98d9e9dbdd
I0201 08:44:01.727446 1 container_runtime_config_controller.go:769] Applied ImageConfig cluster on MachineConfigPool master
I0201 08:44:01.827633 1 container_runtime_config_controller.go:769] Applied ImageConfig cluster on MachineConfigPool worker

The line "E0201 08:12:36.455829 1 render_controller.go:203] machineconfig has changed controller, not allowed." matches the creation time of ctr-3. Btw, when I first created ctr-3 I had not set the correct MachineConfigPool label, which produced the "could not find any MachineConfigPool set for ContainerRuntimeConfig set-loglevel" errors above; so I deleted it, added the correct MachineConfigPool label, and created ctr-3 again.
Hi Min, I tried the order you mentioned, ctr-1 (pidsLimit: 2048), ctr-2 (overlaySize: 3G), ctr-3 (logLevel: debug), and the controller generated 3 MCs as expected.

99-worker-generated-containerruntime     a060922f97f16ea7c7d60efe4f92b34c2e5b4ec0   3.2.0   47m
99-worker-generated-containerruntime-1   a060922f97f16ea7c7d60efe4f92b34c2e5b4ec0   3.2.0   24m
99-worker-generated-containerruntime-2   a060922f97f16ea7c7d60efe4f92b34c2e5b4ec0   3.2.0   9s

Name:         ctr-test
Namespace:
Labels:       <none>
Annotations:  <none>
API Version:  machineconfiguration.openshift.io/v1
Kind:         ContainerRuntimeConfig
Metadata:
  Creation Timestamp:  2021-02-01T18:29:02Z
  Finalizers:
    99-worker-generated-containerruntime
  Generation:  1
  Managed Fields:
    API Version:  machineconfiguration.openshift.io/v1
    Fields Type:  FieldsV1
    fieldsV1:
      f:spec:
        .:
        f:containerRuntimeConfig:
          .:
          f:pidsLimit:
        f:machineConfigPoolSelector:
          .:
          f:matchLabels:
            .:
            f:custom-crio:
    Manager:      kubectl-create
    Operation:    Update
    Time:         2021-02-01T18:29:02Z
    API Version:  machineconfiguration.openshift.io/v1
    Fields Type:  FieldsV1
    fieldsV1:
      f:metadata:
        f:finalizers:
          .:
          v:"99-worker-generated-containerruntime":
      f:spec:
        f:containerRuntimeConfig:
          f:logSizeMax:
          f:overlaySize:
      f:status:
        .:
        f:conditions:
        f:observedGeneration:
    Manager:      machine-config-controller
    Operation:    Update
    Time:         2021-02-01T18:29:02Z
  Resource Version:  35013
  Self Link:         /apis/machineconfiguration.openshift.io/v1/containerruntimeconfigs/ctr-test
  UID:               ed246c1f-60b1-475f-a10f-1bcf36fb00e3
Spec:
  Container Runtime Config:
    Pids Limit:  2048
  Machine Config Pool Selector:
    Match Labels:
      Custom - Crio:  test
Status:
  Conditions:
    Last Transition Time:  2021-02-01T18:29:02Z
    Message:               Success
    Status:                True
    Type:                  Success
  Observed Generation:     1
Events:  <none>

Name:         ctr-test-1
Namespace:
Labels:       <none>
Annotations:  machineconfiguration.openshift.io/mc-name-suffix: 1
API Version:  machineconfiguration.openshift.io/v1
Kind:         ContainerRuntimeConfig
Metadata:
  Creation Timestamp:  2021-02-01T18:52:23Z
  Finalizers:
    99-worker-generated-containerruntime-1
  Generation:  2
  Managed Fields:
    API Version:  machineconfiguration.openshift.io/v1
    Fields Type:  FieldsV1
    fieldsV1:
      f:spec:
        .:
        f:containerRuntimeConfig:
          .:
          f:overlaySize:
        f:machineConfigPoolSelector:
          .:
          f:matchLabels:
            .:
            f:custom-crio:
    Manager:      kubectl-create
    Operation:    Update
    Time:         2021-02-01T18:52:23Z
    API Version:  machineconfiguration.openshift.io/v1
    Fields Type:  FieldsV1
    fieldsV1:
      f:metadata:
        f:annotations:
          .:
          f:machineconfiguration.openshift.io/mc-name-suffix:
        f:finalizers:
          .:
          v:"99-worker-generated-containerruntime-1":
      f:spec:
        f:containerRuntimeConfig:
          f:logSizeMax:
      f:status:
        .:
        f:conditions:
        f:observedGeneration:
    Manager:      machine-config-controller
    Operation:    Update
    Time:         2021-02-01T18:52:23Z
  Resource Version:  44349
  Self Link:         /apis/machineconfiguration.openshift.io/v1/containerruntimeconfigs/ctr-test-1
  UID:               e1ceeedd-d3e5-4cc8-892e-e0eda27f7600
Spec:
  Container Runtime Config:
    Log Size Max:  0
    Overlay Size:  2G
  Machine Config Pool Selector:
    Match Labels:
      Custom - Crio:  test
Status:
  Conditions:
    Last Transition Time:  2021-02-01T18:52:23Z
    Message:               Success
    Status:                True
    Type:                  Success
  Observed Generation:     2
Events:  <none>

Name:         ctr-test-2
Namespace:
Labels:       <none>
Annotations:  machineconfiguration.openshift.io/mc-name-suffix: 2
API Version:  machineconfiguration.openshift.io/v1
Kind:         ContainerRuntimeConfig
Metadata:
  Creation Timestamp:  2021-02-01T19:16:26Z
  Finalizers:
    99-worker-generated-containerruntime-2
  Generation:  2
  Managed Fields:
    API Version:  machineconfiguration.openshift.io/v1
    Fields Type:  FieldsV1
    fieldsV1:
      f:spec:
        .:
        f:containerRuntimeConfig:
          .:
          f:logLevel:
        f:machineConfigPoolSelector:
          .:
          f:matchLabels:
            .:
            f:custom-crio:
    Manager:      kubectl-create
    Operation:    Update
    Time:         2021-02-01T19:16:26Z
    API Version:  machineconfiguration.openshift.io/v1
    Fields Type:  FieldsV1
    fieldsV1:
      f:metadata:
        f:annotations:
          .:
          f:machineconfiguration.openshift.io/mc-name-suffix:
        f:finalizers:
          .:
          v:"99-worker-generated-containerruntime-2":
      f:spec:
        f:containerRuntimeConfig:
          f:logSizeMax:
          f:overlaySize:
      f:status:
        .:
        f:conditions:
        f:observedGeneration:
    Manager:      machine-config-controller
    Operation:    Update
    Time:         2021-02-01T19:16:27Z
  Resource Version:  53465
  Self Link:         /apis/machineconfiguration.openshift.io/v1/containerruntimeconfigs/ctr-test-2
  UID:               22803757-e4db-4ed6-8df4-a191a78e5739
Spec:
  Container Runtime Config:
    Log Level:     debug
    Log Size Max:  0
    Overlay Size:  0
  Machine Config Pool Selector:
    Match Labels:
      Custom - Crio:  test
Status:
  Conditions:
    Last Transition Time:  2021-02-01T19:16:27Z
    Message:               Success
    Status:                True
    Type:                  Success
  Observed Generation:     2
Events:  <none>

I am not entirely sure why it isn't working for you; maybe try a different overlaySize value?
Hi Urvashi, unfortunately I reproduced this issue on version 4.7.0-0.nightly-2021-02-02-223803. This time I used an overlaySize of 3G in ctr-2 and kept everything else the same, i.e. ctr-1 (pidsLimit: 4095) -> ctr-2 (overlaySize: 3G) -> ctr-3 (logLevel: debug), but ctr-3 still did not generate a matching MC (such as 99-worker-generated-containerruntime-2); instead a new rendered MC was generated to sync the change from ctr-3 to the nodes. And I think the following error message in the log is abnormal:

E0204 11:14:29.162268 1 render_controller.go:203] machineconfig has changed controller, not allowed.

You can refer to the whole log in Comment 9; they are similar.
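For context on that message, here is a minimal hedged sketch of the kind of guard the log line suggests exists in render_controller.go: an update to an existing generated MachineConfig is rejected when it would change which controller object owns the MC. The types and field names below are simplified stand-ins written for illustration, not the MCO's real implementation:

package main

import (
	"errors"
	"fmt"
)

// machineConfig stands in for a generated MC; ownerUID stands in for the
// controller owner reference recorded on the stored object.
type machineConfig struct {
	name     string
	ownerUID string
}

// validateUpdate mirrors the guard suggested by the log message
// "machineconfig has changed controller, not allowed.": a generated MC may
// not silently move from one owning ctrcfg to another.
func validateUpdate(stored, incoming machineConfig) error {
	if stored.ownerUID != incoming.ownerUID {
		return errors.New("machineconfig has changed controller, not allowed")
	}
	return nil
}

func main() {
	// In the failing scenario above, set-loglevel (ctr-3) was handed the same
	// MC name as overlay-size (ctr-2), so its update would trip this guard.
	stored := machineConfig{name: "99-worker-generated-containerruntime-1", ownerUID: "uid-of-overlay-size"}
	incoming := machineConfig{name: "99-worker-generated-containerruntime-1", ownerUID: "uid-of-set-loglevel"}
	fmt.Println(validateUpdate(stored, incoming))
}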
Hi Min, I am still unable to reproduce the issue you are seeing. I tried again on Server Version: 4.7.0-fc.5 and this time went a step further and created more ctrcfgs; I got an MC for each of them:

1) pidsLimit: 2048
2) overlaySize: 3G
3) logLevel: debug
4) overlaySize: 2G

99-worker-generated-containerruntime     a060922f97f16ea7c7d60efe4f92b34c2e5b4ec0   3.2.0   79m
99-worker-generated-containerruntime-1   a060922f97f16ea7c7d60efe4f92b34c2e5b4ec0   3.2.0   18m
99-worker-generated-containerruntime-2   a060922f97f16ea7c7d60efe4f92b34c2e5b4ec0   3.2.0   10m
99-worker-generated-containerruntime-3   a060922f97f16ea7c7d60efe4f92b34c2e5b4ec0   3.2.0   13s

I used 2 different labels this time and alternated them across the ctrcfgs; an MC was created for each ctrcfg. I also do not see the "E0204 11:14:29.162268 1 render_controller.go:203] machineconfig has changed controller, not allowed." error in my MCC logs. The controller should not be changing, so it is odd that you are seeing this error. Can you run this test again, and if you hit the issue, grab the must-gather logs and preserve the environment so we can poke around and see what is causing that controller change? My cluster is running on AWS; is there something specific or different about your environment?
Hi Urvashi, you said "I used 2 different labels as well this time and alternated the labels in the ctrcfg"; maybe that is the key difference. I used a unique label in each ctrcfg, so when I create 3 ctrcfgs I add 3 different labels to the MCP. I will try again and upload the must-gather log; you can also test with my method.
Created attachment 1755176 [details]
ctrcfg must-gather
Hi Urvashi, I am leaving my test cluster up for you to debug: https://mastern-jenkins-csb-openshift-qe.cloud.paas.psi.redhat.com/job/Launch%20Environment%20Flexy/137480/artifact/workdir/install-dir/auth/kubeconfig/*view*/ I am not sure when this cluster will be removed, so please debug it ASAP, thanks.
Hi Min, thanks for keeping that cluster around. I wasn't able to reproduce the issue on your cluster, and I tried with another cluster and couldn't reproduce it there either. I have created a release image with more logging in the ctrcfg controller; can you spin up a cluster with it and see if you hit the issue again? If you do, please preserve the MCC logs and, if possible, leave the cluster up for me to take a look. The release image is quay.io/umohnani8/4.7-with-logs. Thanks!
Hi Urvashi, I reproduced the issue with the image you provided; please refer to: https://mastern-jenkins-csb-openshift-qe.cloud.paas.psi.redhat.com/job/Launch%20Environment%20Flexy/139411/artifact/workdir/install-dir/auth/kubeconfig/*view*/
Created attachment 1758128 [details]
mcc log

Attaching the MCC log.
Verified on version 4.8.0-0.nightly-2021-03-21-224928.
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory (Moderate: OpenShift Container Platform 4.8.2 bug fix and security update), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHSA-2021:2438