Description of problem:
When performing a cluster upgrade with machine config pools configured as paused, the upgrade will stall while trying to upgrade the MCO. There is no indication in the events or logs associated with the MCO as to why the upgrade is not progressing.

Version-Release number of selected component (if applicable):
4.6

How reproducible:
On demand

Steps to Reproduce:
1. Configure the `worker` machine config pool to be 'Paused'
2. Update the `99-worker-ssh` machine config
3. Wait for the machine config pool to progress to 'Updating=true'

Actual results:
Machine config pool transitions to 'Updating=true' but never completes. Neither the machine-config-controller logs nor the events in the openshift-machine-config-operator namespace give any indication of this condition.

Expected results:
machine-config-controller should communicate that the 'Updating' state will never progress to completion.

Additional info:
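As a rough illustration of the condition described above, the following Python sketch flags pools that are paused yet not reporting Updated=True. It assumes the input is the parsed JSON of `oc get mcp -o json`, and it relies only on the `spec.paused` and `status.conditions` fields that also appear in the `oc describe` output in this report. The function name `paused_pools_not_updated` is hypothetical and is not part of the MCO or of `oc`.

```python
from typing import List


def paused_pools_not_updated(mcp_list: dict) -> List[str]:
    """Return names of pools that are paused but do not report Updated=True.

    Assumption: mcp_list is the parsed output of `oc get mcp -o json`,
    i.e. a dict with an "items" list of MachineConfigPool objects.
    """
    flagged = []
    for pool in mcp_list.get("items", []):
        paused = pool.get("spec", {}).get("paused", False)
        conditions = pool.get("status", {}).get("conditions", [])
        # The pool counts as updated only if an Updated condition is "True".
        updated = any(
            c.get("type") == "Updated" and c.get("status") == "True"
            for c in conditions
        )
        if paused and not updated:
            flagged.append(pool["metadata"]["name"])
    return flagged
```

A pool returned by this helper will stay in 'Updating' until it is unpaused, which is the state the report asks the controller to surface.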
Tentatively targeting 4.6, but given the low priority we might push this forward to a z release or 4.7.

Thanks for helping on this, much appreciated!
Verified on 4.6.0-0.nightly-2020-09-25-085318

$ oc get clusterversion
NAME      VERSION                             AVAILABLE   PROGRESSING   SINCE   STATUS
version   4.6.0-0.nightly-2020-09-25-085318   True        False         62m     Cluster version is 4.6.0-0.nightly-2020-09-25-085318

$ oc describe mcp/worker
Name:         worker
Namespace:
Labels:       machineconfiguration.openshift.io/mco-built-in=
              pools.operator.machineconfiguration.openshift.io/worker=
Annotations:  <none>
API Version:  machineconfiguration.openshift.io/v1
Kind:         MachineConfigPool
Metadata:
  Creation Timestamp:  2020-09-25T12:42:13Z
  Generation:          3
  Managed Fields:
    API Version:  machineconfiguration.openshift.io/v1
    Fields Type:  FieldsV1
    fieldsV1:
      f:metadata:
        f:labels:
          .:
          f:machineconfiguration.openshift.io/mco-built-in:
          f:pools.operator.machineconfiguration.openshift.io/worker:
      f:spec:
        .:
        f:configuration:
        f:machineConfigSelector:
          .:
          f:matchLabels:
            .:
            f:machineconfiguration.openshift.io/role:
        f:nodeSelector:
          .:
          f:matchLabels:
            .:
            f:node-role.kubernetes.io/worker:
        f:paused:
    Manager:      machine-config-operator
    Operation:    Update
    Time:         2020-09-25T12:42:13Z
    API Version:  machineconfiguration.openshift.io/v1
    Fields Type:  FieldsV1
    fieldsV1:
      f:spec:
        f:configuration:
          f:name:
          f:source:
      f:status:
        .:
        f:conditions:
        f:configuration:
          .:
          f:name:
          f:source:
        f:degradedMachineCount:
        f:machineCount:
        f:observedGeneration:
        f:readyMachineCount:
        f:unavailableMachineCount:
        f:updatedMachineCount:
    Manager:         machine-config-controller
    Operation:       Update
    Time:            2020-09-25T12:54:29Z
  Resource Version:  27091
  Self Link:         /apis/machineconfiguration.openshift.io/v1/machineconfigpools/worker
  UID:               0ef14384-761e-43fb-abe0-ebd88d2302dd
Spec:
  Configuration:
    Name:  rendered-worker-0825ccbd713febb2260f339945806b66
    Source:
      API Version:  machineconfiguration.openshift.io/v1
      Kind:         MachineConfig
      Name:         00-worker
      API Version:  machineconfiguration.openshift.io/v1
      Kind:         MachineConfig
      Name:         01-worker-container-runtime
      API Version:  machineconfiguration.openshift.io/v1
      Kind:         MachineConfig
      Name:         01-worker-kubelet
      API Version:  machineconfiguration.openshift.io/v1
      Kind:         MachineConfig
      Name:         99-worker-generated-registries
      API Version:  machineconfiguration.openshift.io/v1
      Kind:         MachineConfig
      Name:         99-worker-ssh
  Machine Config Selector:
    Match Labels:
      machineconfiguration.openshift.io/role:  worker
  Node Selector:
    Match Labels:
      node-role.kubernetes.io/worker:
  Paused:  false
Status:
  Conditions:
    Last Transition Time:  2020-09-25T12:44:22Z
    Message:
    Reason:
    Status:                False
    Type:                  NodeDegraded
    Last Transition Time:  2020-09-25T12:44:22Z
    Message:
    Reason:
    Status:                False
    Type:                  Degraded
    Last Transition Time:  2020-09-25T12:44:33Z
    Message:
    Reason:
    Status:                False
    Type:                  RenderDegraded
    Last Transition Time:  2020-09-25T12:54:29Z
    Message:               All nodes are updated with rendered-worker-0825ccbd713febb2260f339945806b66
    Reason:
    Status:                True
    Type:                  Updated
    Last Transition Time:  2020-09-25T12:54:29Z
    Message:
    Reason:
    Status:                False
    Type:                  Updating
  Configuration:
    Name:  rendered-worker-0825ccbd713febb2260f339945806b66
    Source:
      API Version:  machineconfiguration.openshift.io/v1
      Kind:         MachineConfig
      Name:         00-worker
      API Version:  machineconfiguration.openshift.io/v1
      Kind:         MachineConfig
      Name:         01-worker-container-runtime
      API Version:  machineconfiguration.openshift.io/v1
      Kind:         MachineConfig
      Name:         01-worker-kubelet
      API Version:  machineconfiguration.openshift.io/v1
      Kind:         MachineConfig
      Name:         99-worker-generated-registries
      API Version:  machineconfiguration.openshift.io/v1
      Kind:         MachineConfig
      Name:         99-worker-ssh
  Degraded Machine Count:     0
  Machine Count:              3
  Observed Generation:        3
  Ready Machine Count:        3
  Unavailable Machine Count:  0
  Updated Machine Count:      3
Events:                       <none>

$ oc edit mcp/worker
machineconfigpool.machineconfiguration.openshift.io/worker edited

$ oc describe mcp/worker
Name:         worker
Namespace:
Labels:       machineconfiguration.openshift.io/mco-built-in=
              pools.operator.machineconfiguration.openshift.io/worker=
Annotations:  <none>
API Version:  machineconfiguration.openshift.io/v1
Kind:         MachineConfigPool
Metadata:
  Creation Timestamp:  2020-09-25T12:42:13Z
  Generation:          4
  Managed Fields:
    API Version:  machineconfiguration.openshift.io/v1
    Fields Type:  FieldsV1
    fieldsV1:
      f:metadata:
        f:labels:
          .:
          f:machineconfiguration.openshift.io/mco-built-in:
          f:pools.operator.machineconfiguration.openshift.io/worker:
      f:spec:
        .:
        f:configuration:
        f:machineConfigSelector:
          .:
          f:matchLabels:
            .:
            f:machineconfiguration.openshift.io/role:
        f:nodeSelector:
          .:
          f:matchLabels:
            .:
            f:node-role.kubernetes.io/worker:
    Manager:      machine-config-operator
    Operation:    Update
    Time:         2020-09-25T12:42:13Z
    API Version:  machineconfiguration.openshift.io/v1
    Fields Type:  FieldsV1
    fieldsV1:
      f:spec:
        f:configuration:
          f:name:
          f:source:
      f:status:
        .:
        f:conditions:
        f:configuration:
          .:
          f:name:
          f:source:
        f:degradedMachineCount:
        f:machineCount:
        f:observedGeneration:
        f:readyMachineCount:
        f:unavailableMachineCount:
        f:updatedMachineCount:
    Manager:      machine-config-controller
    Operation:    Update
    Time:         2020-09-25T12:54:29Z
    API Version:  machineconfiguration.openshift.io/v1
    Fields Type:  FieldsV1
    fieldsV1:
      f:spec:
        f:paused:
    Manager:         oc
    Operation:       Update
    Time:            2020-09-25T14:13:37Z
  Resource Version:  93236
  Self Link:         /apis/machineconfiguration.openshift.io/v1/machineconfigpools/worker
  UID:               0ef14384-761e-43fb-abe0-ebd88d2302dd
Spec:
  Configuration:
    Name:  rendered-worker-0825ccbd713febb2260f339945806b66
    Source:
      API Version:  machineconfiguration.openshift.io/v1
      Kind:         MachineConfig
      Name:         00-worker
      API Version:  machineconfiguration.openshift.io/v1
      Kind:         MachineConfig
      Name:         01-worker-container-runtime
      API Version:  machineconfiguration.openshift.io/v1
      Kind:         MachineConfig
      Name:         01-worker-kubelet
      API Version:  machineconfiguration.openshift.io/v1
      Kind:         MachineConfig
      Name:         99-worker-generated-registries
      API Version:  machineconfiguration.openshift.io/v1
      Kind:         MachineConfig
      Name:         99-worker-ssh
  Machine Config Selector:
    Match Labels:
      machineconfiguration.openshift.io/role:  worker
  Node Selector:
    Match Labels:
      node-role.kubernetes.io/worker:
  Paused:  true
Status:
  Conditions:
    Last Transition Time:  2020-09-25T12:44:22Z
    Message:
    Reason:
    Status:                False
    Type:                  NodeDegraded
    Last Transition Time:  2020-09-25T12:44:22Z
    Message:
    Reason:
    Status:                False
    Type:                  Degraded
    Last Transition Time:  2020-09-25T12:44:33Z
    Message:
    Reason:
    Status:                False
    Type:                  RenderDegraded
    Last Transition Time:  2020-09-25T12:54:29Z
    Message:               All nodes are updated with rendered-worker-0825ccbd713febb2260f339945806b66
    Reason:
    Status:                True
    Type:                  Updated
    Last Transition Time:  2020-09-25T12:54:29Z
    Message:
    Reason:
    Status:                False
    Type:                  Updating
  Configuration:
    Name:  rendered-worker-0825ccbd713febb2260f339945806b66
    Source:
      API Version:  machineconfiguration.openshift.io/v1
      Kind:         MachineConfig
      Name:         00-worker
      API Version:  machineconfiguration.openshift.io/v1
      Kind:         MachineConfig
      Name:         01-worker-container-runtime
      API Version:  machineconfiguration.openshift.io/v1
      Kind:         MachineConfig
      Name:         01-worker-kubelet
      API Version:  machineconfiguration.openshift.io/v1
      Kind:         MachineConfig
      Name:         99-worker-generated-registries
      API Version:  machineconfiguration.openshift.io/v1
      Kind:         MachineConfig
      Name:         99-worker-ssh
  Degraded Machine Count:     0
  Machine Count:              3
  Observed Generation:        3
  Ready Machine Count:        3
  Unavailable Machine Count:  0
  Updated Machine Count:      3
Events:                       <none>

$ cp ../file.yaml .
$ cat file.yaml
apiVersion: machineconfiguration.openshift.io/v1
kind: MachineConfig
metadata:
  labels:
    machineconfiguration.openshift.io/role: worker
  name: test-file
spec:
  config:
    ignition:
      version: 2.2.0
    storage:
      files:
      - contents:
          source: data:text/plain;charset=utf;base64,c2VydmVyIGZvby5leGFtcGxlLm5ldCBtYXhkZWxheSAwLjQgb2ZmbGluZQpzZXJ2ZXIgYmFyLmV4YW1wbGUubmV0IG1heGRlbGF5IDAuNCBvZmZsaW5lCnNlcnZlciBiYXouZXhhbXBsZS5uZXQgbWF4ZGVsYXkgMC40IG9mZmxpbmUK
        filesystem: root
        mode: 0644
        path: /etc/test

$ oc create -f file.yaml
machineconfig.machineconfiguration.openshift.io/test-file created
2.2.0 4s

$ oc get mc
NAME                                               GENERATEDBYCONTROLLER                      IGNITIONVERSION   AGE
00-master                                          a3c9532c8e8f2efe9b0f739fbd761b32cc0bfa2b   3.1.0             92m
00-worker                                          a3c9532c8e8f2efe9b0f739fbd761b32cc0bfa2b   3.1.0             92m
01-master-container-runtime                        a3c9532c8e8f2efe9b0f739fbd761b32cc0bfa2b   3.1.0             92m
01-master-kubelet                                  a3c9532c8e8f2efe9b0f739fbd761b32cc0bfa2b   3.1.0             92m
01-worker-container-runtime                        a3c9532c8e8f2efe9b0f739fbd761b32cc0bfa2b   3.1.0             92m
01-worker-kubelet                                  a3c9532c8e8f2efe9b0f739fbd761b32cc0bfa2b   3.1.0             92m
99-master-generated-registries                     a3c9532c8e8f2efe9b0f739fbd761b32cc0bfa2b   3.1.0             92m
99-master-ssh                                                                                 3.1.0             102m
99-worker-generated-registries                     a3c9532c8e8f2efe9b0f739fbd761b32cc0bfa2b   3.1.0             92m
99-worker-ssh                                                                                 3.1.0             102m
rendered-master-ec4d762b46b2b709eb29fed299628864   a3c9532c8e8f2efe9b0f739fbd761b32cc0bfa2b   3.1.0             92m
rendered-worker-0825ccbd713febb2260f339945806b66   a3c9532c8e8f2efe9b0f739fbd761b32cc0bfa2b   3.1.0             92m
rendered-worker-669f746a17a568d6e4e3b34fe3e2ed7b   a3c9532c8e8f2efe9b0f739fbd761b32cc0bfa2b   3.1.0             4s
test-file                                                                                     2.2.0             9s

$ oc get mcp
NAME     CONFIG                                             UPDATED   UPDATING   DEGRADED   MACHINECOUNT   READYMACHINECOUNT   UPDATEDMACHINECOUNT   DEGRADEDMACHINECOUNT   AGE
master   rendered-master-ec4d762b46b2b709eb29fed299628864   True      False      False      3              3                   3                     0                      94m
worker   rendered-worker-0825ccbd713febb2260f339945806b66   False     True       False      3              0                   0                     0                      94m

$ oc -n openshift-machine-config-operator get pods
NAME                                         READY   STATUS    RESTARTS   AGE
machine-config-controller-5555dd5b85-zrjtz   1/1     Running   0          96m
machine-config-daemon-5ct9t                  2/2     Running   0          87m
machine-config-daemon-7s4fd                  2/2     Running   0          97m
machine-config-daemon-g6vtj                  2/2     Running   0          86m
machine-config-daemon-gpzp4                  2/2     Running   0          97m
machine-config-daemon-nxr7j                  2/2     Running   0          86m
machine-config-daemon-pcq79                  2/2     Running   0          97m
machine-config-operator-5749976cd6-m225p     1/1     Running   0          106m
machine-config-server-7k9h6                  1/1     Running   0          95m
machine-config-server-d42m2                  1/1     Running   0          95m
machine-config-server-dm4f6                  1/1     Running   0          95m

$ oc -n openshift-machine-config-operator logs machine-config-controller-5555dd5b85-zrjtz
...SNIP...
E0925 14:16:34.684307 1 render_controller.go:459] Error updating MachineConfigPool worker: Operation cannot be fulfilled on machineconfigpools.machineconfiguration.openshift.io "worker": the object has been modified; please apply your changes to the latest version and try again
I0925 14:16:34.684329 1 render_controller.go:376] Error syncing machineconfigpool worker: Operation cannot be fulfilled on machineconfigpools.machineconfiguration.openshift.io "worker": the object has been modified; please apply your changes to the latest version and try again
I0925 14:16:39.635872 1 node_controller.go:740] Pool worker is paused and will not update.
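
The verification above hinges on the final controller log line. A minimal Python sketch for checking saved controller logs for that message follows. It assumes the message wording matches the line quoted in this comment ("Pool <name> is paused and will not update."), which may differ in other releases. The helper name `paused_pools_in_log` is hypothetical.

```python
import re
from typing import List

# Assumption: the wording matches the node_controller.go line quoted above.
PAUSED_RE = re.compile(r"Pool (\S+) is paused and will not update\.")


def paused_pools_in_log(log_text: str) -> List[str]:
    """Return the sorted, de-duplicated pool names reported as paused."""
    return sorted({m.group(1) for m in PAUSED_RE.finditer(log_text)})
```

The log text can be captured with the same `oc -n openshift-machine-config-operator logs` command shown above and passed to the helper as a string.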
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory (OpenShift Container Platform 4.6 GA Images), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHBA-2020:4196