Description of problem:
The MCO is improperly reporting the status (both the state of the nodes and the target MachineConfig) in the extension field of its ClusterOperator.

Version-Release number of selected component (if applicable):
4.2.0-0.okd-2019-08-19-143649

How reproducible:
2/2

Steps to Reproduce:
1. Boot cluster
2. Create a MachineConfig (I only enabled FIPS in mine)

Actual results (and expected):
Before creating the extra MachineConfig, here's how the MachineConfigs looked:

$ oc get machineconfigs
NAME                                                        GENERATEDBYCONTROLLER                      IGNITIONVERSION   CREATED
00-master                                                   bea1c28437c25c31bc02b87d39004ac0479b5395   2.2.0             121m
00-worker                                                   bea1c28437c25c31bc02b87d39004ac0479b5395   2.2.0             121m
01-master-container-runtime                                 bea1c28437c25c31bc02b87d39004ac0479b5395   2.2.0             121m
01-master-kubelet                                           bea1c28437c25c31bc02b87d39004ac0479b5395   2.2.0             121m
01-worker-container-runtime                                 bea1c28437c25c31bc02b87d39004ac0479b5395   2.2.0             121m
01-worker-kubelet                                           bea1c28437c25c31bc02b87d39004ac0479b5395   2.2.0             121m
99-master-9e68df9c-c295-11e9-8f72-5254003ec71c-registries   bea1c28437c25c31bc02b87d39004ac0479b5395   2.2.0             121m
99-master-ssh                                                                                          2.2.0             121m
99-worker-9e6b6010-c295-11e9-8f72-5254003ec71c-registries   bea1c28437c25c31bc02b87d39004ac0479b5395   2.2.0             121m
99-worker-ssh                                                                                          2.2.0             121m
rendered-master-07a405ed1f29060653a45c236bf5fd6e            bea1c28437c25c31bc02b87d39004ac0479b5395   2.2.0             121m
rendered-worker-2fe2c50ca1b300646e59b6fad007d537            bea1c28437c25c31bc02b87d39004ac0479b5395   2.2.0             121

And after:

$ oc get machineconfigs
NAME                                                        GENERATEDBYCONTROLLER                      IGNITIONVERSION   CREATED
00-master                                                   bea1c28437c25c31bc02b87d39004ac0479b5395   2.2.0             121m
00-worker                                                   bea1c28437c25c31bc02b87d39004ac0479b5395   2.2.0             121m
01-master-container-runtime                                 bea1c28437c25c31bc02b87d39004ac0479b5395   2.2.0             121m
01-master-kubelet                                           bea1c28437c25c31bc02b87d39004ac0479b5395   2.2.0             121m
01-worker-container-runtime                                 bea1c28437c25c31bc02b87d39004ac0479b5395   2.2.0             121m
01-worker-kubelet                                           bea1c28437c25c31bc02b87d39004ac0479b5395   2.2.0             121m
99-master-9e68df9c-c295-11e9-8f72-5254003ec71c-registries   bea1c28437c25c31bc02b87d39004ac0479b5395   2.2.0             121m
99-master-ssh                                                                                          2.2.0             121m
99-worker-9e6b6010-c295-11e9-8f72-5254003ec71c-registries   bea1c28437c25c31bc02b87d39004ac0479b5395   2.2.0             121m
99-worker-fips                                                                                                           17s
99-worker-ssh                                                                                          2.2.0             121m
rendered-master-07a405ed1f29060653a45c236bf5fd6e            bea1c28437c25c31bc02b87d39004ac0479b5395   2.2.0             121m
rendered-worker-2fe2c50ca1b300646e59b6fad007d537            bea1c28437c25c31bc02b87d39004ac0479b5395   2.2.0             121m
rendered-worker-b3530b52778ea08aba671f744d5d536a            bea1c28437c25c31bc02b87d39004ac0479b5395   2.2.0             12s

So far, so good. But despite there being a change, the MCO did not report that it was progressing:

$ oc get co machine-config
NAME             VERSION                         AVAILABLE   PROGRESSING   DEGRADED   SINCE
machine-config   4.2.0-0.okd-2019-08-19-143649   True        False         False      118m

And looking more closely at the ClusterOperator, I see the following:

$ oc get co machine-config -o yaml | jq .status
status:
  conditions:
  - lastTransitionTime: "2019-08-19T15:30:33Z"
    message: Cluster has deployed 4.2.0-0.okd-2019-08-19-143649
    status: "True"
    type: Available
  - lastTransitionTime: "2019-08-19T15:26:54Z"
    message: Cluster version is 4.2.0-0.okd-2019-08-19-143649
    status: "False"
    type: Progressing
  - lastTransitionTime: "2019-08-19T15:30:32Z"
    status: "False"
    type: Degraded
  - lastTransitionTime: "2019-08-19T15:26:54Z"
    reason: AsExpected
    status: "True"
    type: Upgradeable
  extension:
    master: all 3 nodes are at latest configuration rendered-master-07a405ed1f29060653a45c236bf5fd6e
    worker: 0 (ready 0) out of 2 nodes are updating to latest configuration rendered-worker-2fe2c50ca1b300646e59b6fad007d537

So it says that none of the machines are updating (even though I could see the machine was rebooting) and it says that it's progressing toward rendered-worker-2fe2c50ca1b300646e59b6fad007d537 (when it should have said rendered-worker-b3530b52778ea08aba671f744d5d536a).

After that first node came back, I checked again:

$ oc get co machine-config -o yaml | jq .status
status:
  conditions:
  - lastTransitionTime: "2019-08-19T15:30:33Z"
    message: Cluster has deployed 4.2.0-0.okd-2019-08-19-143649
    status: "True"
    type: Available
  - lastTransitionTime: "2019-08-19T15:26:54Z"
    message: Cluster version is 4.2.0-0.okd-2019-08-19-143649
    status: "False"
    type: Progressing
  - lastTransitionTime: "2019-08-19T15:30:32Z"
    status: "False"
    type: Degraded
  - lastTransitionTime: "2019-08-19T15:26:54Z"
    reason: AsExpected
    status: "True"
    type: Upgradeable
  extension:
    master: all 3 nodes are at latest configuration rendered-master-07a405ed1f29060653a45c236bf5fd6e
    worker: 1 (ready 1) out of 2 nodes are updating to latest configuration rendered-worker-2fe2c50ca1b300646e59b6fad007d537

So again, it still says that it's progressing to rendered-worker-2fe2c50ca1b300646e59b6fad007d537.
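For anyone who wants to check the extension field mechanically rather than by eye, here is a minimal Python sketch. It assumes the two message shapes quoted in this report ("N (ready M) out of T nodes are updating to latest configuration X" and "all T nodes are at latest configuration X") are the only ones of interest. The names `PoolStatus` and `parse_extension` are made up for illustration and are not part of the MCO.

```python
import re
from typing import NamedTuple


class PoolStatus(NamedTuple):
    """Fields pulled out of one per-pool extension message (hypothetical type)."""
    updated: int
    ready: int
    total: int
    target: str


# Patterns copied from the message shapes seen in this report (an assumption:
# other MCO versions may word these messages differently).
_UPDATING = re.compile(
    r"(\d+) \(ready (\d+)\) out of (\d+) nodes are updating "
    r"to latest configuration (\S+)"
)
_DONE = re.compile(r"all (\d+) nodes are at latest configuration (\S+)")


def parse_extension(message: str) -> PoolStatus:
    """Parse one pool's extension message into counts and the reported target."""
    text = message.strip()
    m = _UPDATING.fullmatch(text)
    if m:
        return PoolStatus(int(m[1]), int(m[2]), int(m[3]), m[4])
    m = _DONE.fullmatch(text)
    if m:
        total = int(m[1])
        # "all N nodes" is treated as N updated and N ready.
        return PoolStatus(total, total, total, m[2])
    raise ValueError(f"unrecognized extension message: {message!r}")


if __name__ == "__main__":
    reported = parse_extension(
        "0 (ready 0) out of 2 nodes are updating to latest configuration "
        "rendered-worker-2fe2c50ca1b300646e59b6fad007d537"
    )
    expected = "rendered-worker-b3530b52778ea08aba671f744d5d536a"
    print("target matches newest rendered config:", reported.target == expected)
```

With the worker message from this report, the parsed target is the old rendered-worker-2fe2c50ca1b300646e59b6fad007d537, not the newly rendered config, which is the mismatch described above.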
Do you happen to have the must-gather for this cluster?
or MCD logs?
I believe that I've reproduced this issue locally with a non-FIPS MC... Will do more testing and dig into this further.
Can confirm that this affects masters as well:
```
master: 0 (ready 0) out of 3 nodes are updating to latest configuration rendered-master-a8f0ae377e3f2b0af5b93ca37709c2f0
```
https://github.com/openshift/machine-config-operator/pull/1066 - WIP, still need to test
However in my tests, I can confirm that when masters are updated we do see:
```
$ oc get mcp
NAME     CONFIG                                             UPDATED   UPDATING   DEGRADED
master   rendered-master-a8f0ae377e3f2b0af5b93ca37709c2f0   False     True       False
```
As a brief update: I fixed the output issue in my PR. However, to track the progress of a new MC we'd recommend using

$ oc get mcp

and watching the UPDATING column. I believe Progressing, as reflected in the CVO, represents an operator version change, which adding a new MC does not do:
https://github.com/openshift/machine-config-operator/blob/093e96ef4cdbd15ecda18323dadf4d552fcfd327/pkg/operator/status.go#L117

We are lacking clear documentation explaining this, and I will add a doc for this in a follow-on PR.
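If it helps anyone scripting around this, the UPDATING column can be picked out of the `oc get mcp` table with a few lines of Python. This is only a sketch: `updating_pools` is a made-up helper name, and it assumes the default column layout shown in this bug with no empty cells (an empty CONFIG cell would shift the whitespace-split columns).

```python
from typing import List


def updating_pools(table: str) -> List[str]:
    """Return names of pools whose UPDATING column reads "True".

    `table` is the plain-text output of `oc get mcp`. Columns are located by
    header name, so extra columns do not matter as long as no cell is empty.
    """
    lines = [line for line in table.strip().splitlines() if line.strip()]
    header = lines[0].split()
    name_idx = header.index("NAME")
    updating_idx = header.index("UPDATING")
    pools = []
    for line in lines[1:]:
        cols = line.split()
        if cols[updating_idx] == "True":
            pools.append(cols[name_idx])
    return pools


# Sample taken from the output quoted earlier in this bug.
SAMPLE = """\
NAME     CONFIG                                             UPDATED   UPDATING   DEGRADED
master   rendered-master-a8f0ae377e3f2b0af5b93ca37709c2f0   False     True       False
"""

if __name__ == "__main__":
    print(updating_pools(SAMPLE))
```

Parsing the table is fragile by nature; reading the pool's conditions from `oc get mcp -o json` would be sturdier, at the cost of a longer script.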
Verified on 4.2.0-0.nightly-2019-09-04-102339. oc get co machine-config now indicates the correct machineconfig in the co/machine-config status.

$ cat file-drop.yaml
apiVersion: machineconfiguration.openshift.io/v1
kind: MachineConfig
metadata:
  labels:
    machineconfiguration.openshift.io/role: master
  name: test-file
spec:
  config:
    ignition:
      version: 2.2.0
    storage:
      files:
      - contents:
          source: data:text/plain;charset=utf;base64,c2VydmVyIGZvby5leGFtcGxlLm5ldCBtYXhkZWxheSAwLjQgb2ZmbGluZQpzZXJ2ZXIgYmFyLmV4YW1wbGUubmV0IG1heGRlbGF5IDAuNCBvZmZsaW5lCnNlcnZlciBiYXouZXhhbXBsZS5uZXQgbWF4ZGVsYXkgMC40IG9mZmxpbmUK
        filesystem: root
        mode: 0644
        path: /etc/test

$ oc get mcp
NAME     CONFIG                                             UPDATED   UPDATING   DEGRADED
master   rendered-master-4fb70e93cdca2867233058ac88786760   True      False      False
worker   rendered-worker-be34dec52f9201d3ecaaa5874da6ccb6   True      False      False

$ oc get co machine-config -o yaml
apiVersion: config.openshift.io/v1
kind: ClusterOperator
metadata:
  creationTimestamp: "2019-09-04T17:13:44Z"
  generation: 1
  name: machine-config
  resourceVersion: "13923"
  selfLink: /apis/config.openshift.io/v1/clusteroperators/machine-config
  uid: 59c96d47-cf37-11e9-95b3-02d2aae96faa
spec: {}
status:
  conditions:
  - lastTransitionTime: "2019-09-04T17:14:28Z"
    message: Cluster has deployed 4.2.0-0.nightly-2019-09-04-102339
    status: "True"
    type: Available
  - lastTransitionTime: "2019-09-04T17:14:28Z"
    message: Cluster version is 4.2.0-0.nightly-2019-09-04-102339
    status: "False"
    type: Progressing
  - lastTransitionTime: "2019-09-04T17:13:44Z"
    status: "False"
    type: Degraded
  - lastTransitionTime: "2019-09-04T17:14:28Z"
    reason: AsExpected
    status: "True"
    type: Upgradeable
  extension:
    master: all 3 nodes are at latest configuration rendered-master-4fb70e93cdca2867233058ac88786760
    worker: all 3 nodes are at latest configuration rendered-worker-be34dec52f9201d3ecaaa5874da6ccb6
  relatedObjects:
  - group: ""
    name: openshift-machine-config-operator
    resource: namespaces
  - group: machineconfiguration.openshift.io
    name: master
    resource: machineconfigpools
  - group: machineconfiguration.openshift.io
    name: worker
    resource: machineconfigpools
  - group: machineconfiguration.openshift.io
    name: machine-config-controller
    resource: controllerconfigs
  versions:
  - name: operator
    version: 4.2.0-0.nightly-2019-09-04-102339

$ oc apply -f file-drop.yaml
machineconfig.machineconfiguration.openshift.io/test-file created

$ oc get mc
NAME                                                        GENERATEDBYCONTROLLER                      IGNITIONVERSION   CREATED
00-master                                                   3b375b425a3bf6ca4189206f8ea4c499376eb71c   2.2.0             13m
00-worker                                                   3b375b425a3bf6ca4189206f8ea4c499376eb71c   2.2.0             13m
01-master-container-runtime                                 3b375b425a3bf6ca4189206f8ea4c499376eb71c   2.2.0             13m
01-master-kubelet                                           3b375b425a3bf6ca4189206f8ea4c499376eb71c   2.2.0             13m
01-worker-container-runtime                                 3b375b425a3bf6ca4189206f8ea4c499376eb71c   2.2.0             13m
01-worker-kubelet                                           3b375b425a3bf6ca4189206f8ea4c499376eb71c   2.2.0             13m
99-master-5ce72f89-cf37-11e9-95b3-02d2aae96faa-registries   3b375b425a3bf6ca4189206f8ea4c499376eb71c   2.2.0             13m
99-master-ssh                                                                                          2.2.0             14m
99-worker-5ce8ba94-cf37-11e9-95b3-02d2aae96faa-registries   3b375b425a3bf6ca4189206f8ea4c499376eb71c   2.2.0             13m
99-worker-ssh                                                                                          2.2.0             14m
rendered-master-4fb70e93cdca2867233058ac88786760            3b375b425a3bf6ca4189206f8ea4c499376eb71c   2.2.0             13m
rendered-master-ab16e8a420ec8a6db44ab06d6e45ec93            3b375b425a3bf6ca4189206f8ea4c499376eb71c   2.2.0             0s
rendered-worker-be34dec52f9201d3ecaaa5874da6ccb6            3b375b425a3bf6ca4189206f8ea4c499376eb71c   2.2.0             13m
test-file                                                                                              2.2.0             5s

$ oc get mcp
NAME     CONFIG                                             UPDATED   UPDATING   DEGRADED
master   rendered-master-4fb70e93cdca2867233058ac88786760   False     True       False
worker   rendered-worker-be34dec52f9201d3ecaaa5874da6ccb6   True      False      False

$ oc get node
NAME                                         STATUS                     ROLES    AGE     VERSION
ip-10-0-133-250.us-west-2.compute.internal   Ready                      master   15m     v1.14.0+7fe5fb087
ip-10-0-133-56.us-west-2.compute.internal    Ready                      worker   8m38s   v1.14.0+7fe5fb087
ip-10-0-145-112.us-west-2.compute.internal   Ready                      master   15m     v1.14.0+7fe5fb087
ip-10-0-148-159.us-west-2.compute.internal   Ready                      worker   8m50s   v1.14.0+7fe5fb087
ip-10-0-169-149.us-west-2.compute.internal   Ready                      worker   8m42s   v1.14.0+7fe5fb087
ip-10-0-170-68.us-west-2.compute.internal    Ready,SchedulingDisabled   master   15m     v1.14.0+7fe5fb087

$ oc get co machine-config -o yaml
apiVersion: config.openshift.io/v1
kind: ClusterOperator
metadata:
  creationTimestamp: "2019-09-04T17:13:44Z"
  generation: 1
  name: machine-config
  resourceVersion: "16682"
  selfLink: /apis/config.openshift.io/v1/clusteroperators/machine-config
  uid: 59c96d47-cf37-11e9-95b3-02d2aae96faa
spec: {}
status:
  conditions:
  - lastTransitionTime: "2019-09-04T17:14:28Z"
    message: Cluster has deployed 4.2.0-0.nightly-2019-09-04-102339
    status: "True"
    type: Available
  - lastTransitionTime: "2019-09-04T17:14:28Z"
    message: Cluster version is 4.2.0-0.nightly-2019-09-04-102339
    status: "False"
    type: Progressing
  - lastTransitionTime: "2019-09-04T17:13:44Z"
    status: "False"
    type: Degraded
  - lastTransitionTime: "2019-09-04T17:14:28Z"
    reason: AsExpected
    status: "True"
    type: Upgradeable
  extension:
    lastSyncError: 'error pool master is not ready, retrying. Status: (pool degraded: false total: 3, ready 0, updated: 0, unavailable: 0)'
    master: 0 (ready 0) out of 3 nodes are updating to latest configuration rendered-master-ab16e8a420ec8a6db44ab06d6e45ec93
    worker: all 3 nodes are at latest configuration rendered-worker-be34dec52f9201d3ecaaa5874da6ccb6
  relatedObjects:
  - group: ""
    name: openshift-machine-config-operator
    resource: namespaces
  - group: machineconfiguration.openshift.io
    name: master
    resource: machineconfigpools
  - group: machineconfiguration.openshift.io
    name: worker
    resource: machineconfigpools
  - group: machineconfiguration.openshift.io
    name: machine-config-controller
    resource: controllerconfigs
  versions:
  - name: operator
    version: 4.2.0-0.nightly-2019-09-04-102339

$ oc get mcp master -o yaml
apiVersion: machineconfiguration.openshift.io/v1
kind: MachineConfigPool
metadata:
  creationTimestamp: "2019-09-04T17:13:50Z"
  generation: 3
  labels:
    machineconfiguration.openshift.io/mco-built-in: ""
    operator.machineconfiguration.openshift.io/required-for-upgrade: ""
  name: master
  resourceVersion: "16931"
  selfLink: /apis/machineconfiguration.openshift.io/v1/machineconfigpools/master
  uid: 5ce72f89-cf37-11e9-95b3-02d2aae96faa
spec:
  configuration:
    name: rendered-master-ab16e8a420ec8a6db44ab06d6e45ec93
    source:
    - apiVersion: machineconfiguration.openshift.io/v1
      kind: MachineConfig
      name: 00-master
    - apiVersion: machineconfiguration.openshift.io/v1
      kind: MachineConfig
      name: 01-master-container-runtime
    - apiVersion: machineconfiguration.openshift.io/v1
      kind: MachineConfig
      name: 01-master-kubelet
    - apiVersion: machineconfiguration.openshift.io/v1
      kind: MachineConfig
      name: 99-master-5ce72f89-cf37-11e9-95b3-02d2aae96faa-registries
    - apiVersion: machineconfiguration.openshift.io/v1
      kind: MachineConfig
      name: 99-master-ssh
    - apiVersion: machineconfiguration.openshift.io/v1
      kind: MachineConfig
      name: test-file
  machineConfigSelector:
    matchLabels:
      machineconfiguration.openshift.io/role: master
  nodeSelector:
    matchLabels:
      node-role.kubernetes.io/master: ""
  paused: false
status:
  conditions:
  - lastTransitionTime: "2019-09-04T17:14:22Z"
    message: ""
    reason: ""
    status: "False"
    type: RenderDegraded
  - lastTransitionTime: "2019-09-04T17:14:27Z"
    message: ""
    reason: ""
    status: "False"
    type: NodeDegraded
  - lastTransitionTime: "2019-09-04T17:14:27Z"
    message: ""
    reason: ""
    status: "False"
    type: Degraded
  - lastTransitionTime: "2019-09-04T17:28:07Z"
    message: ""
    reason: ""
    status: "False"
    type: Updated
  - lastTransitionTime: "2019-09-04T17:28:07Z"
    message: All nodes are updating to rendered-master-ab16e8a420ec8a6db44ab06d6e45ec93
    reason: ""
    status: "True"
    type: Updating
  configuration:
    name: rendered-master-4fb70e93cdca2867233058ac88786760
    source:
    - apiVersion: machineconfiguration.openshift.io/v1
      kind: MachineConfig
      name: 00-master
    - apiVersion: machineconfiguration.openshift.io/v1
      kind: MachineConfig
      name: 01-master-container-runtime
    - apiVersion: machineconfiguration.openshift.io/v1
      kind: MachineConfig
      name: 01-master-kubelet
    - apiVersion: machineconfiguration.openshift.io/v1
      kind: MachineConfig
      name: 99-master-5ce72f89-cf37-11e9-95b3-02d2aae96faa-registries
    - apiVersion: machineconfiguration.openshift.io/v1
      kind: MachineConfig
      name: 99-master-ssh
  degradedMachineCount: 0
  machineCount: 3
  observedGeneration: 3
  readyMachineCount: 0
  unavailableMachineCount: 1
  updatedMachineCount: 0
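The MachineConfigPool output above carries both names needed to cross-check the ClusterOperator extension: spec.configuration.name is the rendered config the pool is moving to, and status.configuration.name is the one it was last fully on. A minimal Python sketch of that comparison follows, assuming the object has already been loaded into a dict (for example from `oc get mcp master -o json`). The helper name `pool_targets` is hypothetical, and the sample dict is trimmed by hand from the YAML above.

```python
from typing import Any, Dict, Tuple


def pool_targets(mcp: Dict[str, Any]) -> Tuple[str, str, bool]:
    """Return (desired, current, in_progress) for a MachineConfigPool object.

    desired comes from spec.configuration.name, current from
    status.configuration.name; the pool is considered in progress
    whenever the two differ.
    """
    desired = mcp["spec"]["configuration"]["name"]
    current = mcp["status"]["configuration"]["name"]
    return desired, current, desired != current


# Trimmed by hand from the `oc get mcp master -o yaml` output above.
MASTER_POOL = {
    "metadata": {"name": "master"},
    "spec": {
        "configuration": {
            "name": "rendered-master-ab16e8a420ec8a6db44ab06d6e45ec93"
        }
    },
    "status": {
        "configuration": {
            "name": "rendered-master-4fb70e93cdca2867233058ac88786760"
        },
        "machineCount": 3,
        "updatedMachineCount": 0,
    },
}

if __name__ == "__main__":
    desired, current, in_progress = pool_targets(MASTER_POOL)
    print(f"desired={desired} current={current} in_progress={in_progress}")
```

Here the desired name matches the rendered-master-ab16e8a420ec8a6db44ab06d6e45ec93 shown in the extension field after the fix, which is the behavior this verification was checking.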
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHBA-2019:2922