SNO: upgrade gets stuck waiting on machine-config: error: cannot apply annotation for SSH access due to: unable to update node "nil": node <FQDN> not found

Version: 4.8.0-0.nightly-2021-04-01-072432

Attempted to upgrade to 4.8.0-0.nightly-2021-04-03-044912

Result: The upgrade gets stuck at 84%:

[kni@r640-u01 ~]$ oc get clusterversion
NAME      VERSION                             AVAILABLE   PROGRESSING   SINCE   STATUS
version   4.8.0-0.nightly-2021-04-01-072432   True        True          91m     Working towards 4.8.0-0.nightly-2021-04-03-044912: 567 of 675 done (84% complete)
[kni@r640-u01 ~]$
[kni@r640-u01 ~]$ oc get pod -A|grep -v Run|grep -v Comple;
NAMESPACE                           NAME                          READY   STATUS             RESTARTS   AGE
openshift-machine-config-operator   machine-config-daemon-n286c   1/2     CrashLoopBackOff   18         67m

[kni@r640-u01 ~]$ oc logs -n openshift-machine-config-operator machine-config-daemon-n286c -c machine-config-daemon
I0406 17:41:21.488360 544460 start.go:108] Version: v4.8.0-202104030047.p0-dirty (86270f3375f894ff1dc21eee74247f04790dd0e1)
I0406 17:41:21.490593 544460 start.go:121] Calling chroot("/rootfs")
I0406 17:41:21.490652 544460 rpm-ostree.go:258] Running captured: rpm-ostree status --json
I0406 17:41:21.552522 544460 daemon.go:219] Booted osImageURL: quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:0581b43ed5c3db21620c09711e102acf7837bacb99e22cd4dce1fc9c3800ec4c (48.83.202104010252-0)
I0406 17:41:21.577196 544460 start.go:97] Copied self to /run/bin/machine-config-daemon on host
I0406 17:41:21.578921 544460 metrics.go:105] Registering Prometheus metrics
I0406 17:41:21.579608 544460 metrics.go:110] Starting metrics listener on 127.0.0.1:8797
I0406 17:41:21.580261 544460 update.go:1851] Starting to manage node: openshift-master-0.qe1.kni.lab.eng.bos.redhat.com
I0406 17:41:21.583051 544460 rpm-ostree.go:258] Running captured: rpm-ostree status
I0406 17:41:21.585604 544460 daemon.go:669] Detected a new login session: New session 1 of user core.
I0406 17:41:21.585614 544460 daemon.go:670] Login access is discouraged! Applying annotation: machineconfiguration.openshift.io/ssh
I0406 17:41:21.612682 544460 daemon.go:851] State: idle
Deployments:
* pivot://quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:0581b43ed5c3db21620c09711e102acf7837bacb99e22cd4dce1fc9c3800ec4c
              CustomOrigin: Managed by machine-config-operator
                   Version: 48.83.202104010252-0 (2021-04-01T02:55:23Z)

  ostree://646a9832dd0dc9fe174a2fc005863a9582186518a5476522a0e9bdccc0e5252a
                   Version: 47.83.202102090044-0 (2021-02-09T00:47:36Z)
I0406 17:41:21.612703 544460 rpm-ostree.go:258] Running captured: journalctl --list-boots
I0406 17:41:21.618321 544460 daemon.go:858] journalctl --list-boots:
-1 be1322d40e1d4a599ad3b51ccd383f91 Sat 2021-04-03 16:27:18 UTC—Sat 2021-04-03 16:28:51 UTC
 0 ad6204a549f94f7da2a60ab4bff96f36 Sat 2021-04-03 16:30:11 UTC—Tue 2021-04-06 17:41:21 UTC
I0406 17:41:21.618334 544460 rpm-ostree.go:258] Running captured: systemctl list-units --state=failed --no-legend
I0406 17:41:21.623970 544460 daemon.go:871] systemctl --failed:
NetworkManager-wait-online.service loaded failed failed Network Manager Wait Online
I0406 17:41:21.623980 544460 daemon.go:607] Starting MachineConfigDaemon
I0406 17:41:21.623986 544460 daemon.go:577] Guarding against sigterm signal
I0406 17:41:21.623996 544460 daemon.go:614] Enabling Kubelet Healthz Monitor
W0406 17:41:21.624004 544460 daemon.go:635] Got an error from auxiliary tools: error: cannot apply annotation for SSH access due to: unable to update node "nil": node "openshift-master-0.qe1.kni.lab.eng.bos.redhat.com" not found
I0406 17:41:21.624013 544460 daemon.go:636] Shutting down MachineConfigDaemon
F0406 17:41:21.624051 544460 helpers.go:147] error: cannot apply annotation for SSH access due to: unable to update node "nil": node "openshift-master-0.qe1.kni.lab.eng.bos.redhat.com" not found
[kni@r640-u01 ~]$

[kni@r640-u01 ~]$ oc get mcp
NAME     CONFIG                                             UPDATED   UPDATING   DEGRADED   MACHINECOUNT   READYMACHINECOUNT   UPDATEDMACHINECOUNT   DEGRADEDMACHINECOUNT   AGE
master   rendered-master-81fbf9897a1ee0bf6cce5faf66c9f24f   False     True       False      1              0                   0                     0                      3d1h
worker   rendered-worker-f91909329df6dca62a07551ac0a530f3   True      False      False      0              0                   0                     0                      3d1h

[kni@r640-u01 ~]$ oc describe mcp master
Name:         master
Namespace:
Labels:       machineconfiguration.openshift.io/mco-built-in=
              operator.machineconfiguration.openshift.io/required-for-upgrade=
              pools.operator.machineconfiguration.openshift.io/master=
Annotations:  <none>
API Version:  machineconfiguration.openshift.io/v1
Kind:         MachineConfigPool
Metadata:
  Creation Timestamp:  2021-04-03T16:36:51Z
  Generation:          5
  Managed Fields:
    API Version:  machineconfiguration.openshift.io/v1
    Fields Type:  FieldsV1
    fieldsV1:
      f:metadata:
        f:labels:
          .:
          f:machineconfiguration.openshift.io/mco-built-in:
          f:operator.machineconfiguration.openshift.io/required-for-upgrade:
          f:pools.operator.machineconfiguration.openshift.io/master:
      f:spec:
        .:
        f:configuration:
        f:machineConfigSelector:
          .:
          f:matchLabels:
            .:
            f:machineconfiguration.openshift.io/role:
        f:nodeSelector:
          .:
          f:matchLabels:
            .:
            f:node-role.kubernetes.io/master:
        f:paused:
    Manager:      machine-config-operator
    Operation:    Update
    Time:         2021-04-03T16:36:51Z
    API Version:  machineconfiguration.openshift.io/v1
    Fields Type:  FieldsV1
    fieldsV1:
      f:spec:
        f:configuration:
          f:name:
          f:source:
      f:status:
        .:
        f:conditions:
        f:configuration:
          .:
          f:name:
          f:source:
        f:degradedMachineCount:
        f:machineCount:
        f:observedGeneration:
        f:readyMachineCount:
        f:unavailableMachineCount:
        f:updatedMachineCount:
    Manager:         machine-config-controller
    Operation:       Update
    Time:            2021-04-03T16:40:46Z
  Resource Version:  995781
  UID:               4e4c15c5-bedb-4012-ad26-0dfc0357b742
Spec:
  Configuration:
    Name:  rendered-master-064c2e13dbbd9d85f2614bb0979c31c4
    Source:
      API Version:  machineconfiguration.openshift.io/v1
      Kind:         MachineConfig
      Name:         00-master
      API Version:  machineconfiguration.openshift.io/v1
      Kind:         MachineConfig
      Name:         01-master-container-runtime
      API Version:  machineconfiguration.openshift.io/v1
      Kind:         MachineConfig
      Name:         01-master-kubelet
      API Version:  machineconfiguration.openshift.io/v1
      Kind:         MachineConfig
      Name:         10-hostname
      API Version:  machineconfiguration.openshift.io/v1
      Kind:         MachineConfig
      Name:         10-static-ips
      API Version:  machineconfiguration.openshift.io/v1
      Kind:         MachineConfig
      Name:         99-master-generated-registries
      API Version:  machineconfiguration.openshift.io/v1
      Kind:         MachineConfig
      Name:         99-master-ssh
  Machine Config Selector:
    Match Labels:
      machineconfiguration.openshift.io/role:  master
  Node Selector:
    Match Labels:
      node-role.kubernetes.io/master:
  Paused:  false
Status:
  Conditions:
    Last Transition Time:  2021-04-03T16:40:46Z
    Message:
    Reason:
    Status:                False
    Type:                  NodeDegraded
    Last Transition Time:  2021-04-03T16:40:46Z
    Message:
    Reason:
    Status:                False
    Type:                  Degraded
    Last Transition Time:  2021-04-03T16:40:47Z
    Message:
    Reason:
    Status:                False
    Type:                  RenderDegraded
    Last Transition Time:  2021-04-06T16:36:42Z
    Message:
    Reason:
    Status:                False
    Type:                  Updated
    Last Transition Time:  2021-04-06T16:36:42Z
    Message:               All nodes are updating to rendered-master-064c2e13dbbd9d85f2614bb0979c31c4
    Reason:
    Status:                True
    Type:                  Updating
  Configuration:
    Name:  rendered-master-81fbf9897a1ee0bf6cce5faf66c9f24f
    Source:
      API Version:  machineconfiguration.openshift.io/v1
      Kind:         MachineConfig
      Name:         00-master
      API Version:  machineconfiguration.openshift.io/v1
      Kind:         MachineConfig
      Name:         01-master-container-runtime
      API Version:  machineconfiguration.openshift.io/v1
      Kind:         MachineConfig
      Name:         01-master-kubelet
      API Version:  machineconfiguration.openshift.io/v1
      Kind:         MachineConfig
      Name:         10-hostname
      API Version:  machineconfiguration.openshift.io/v1
      Kind:         MachineConfig
      Name:         10-static-ips
      API Version:  machineconfiguration.openshift.io/v1
      Kind:         MachineConfig
      Name:         99-master-generated-registries
      API Version:  machineconfiguration.openshift.io/v1
      Kind:         MachineConfig
      Name:         99-master-ssh
  Degraded Machine Count:     0
  Machine Count:              1
  Observed Generation:        5
  Ready Machine Count:        0
  Unavailable Machine Count:  1
  Updated Machine Count:      0
Events:
  Type    Reason            Age   From                                    Message
  ----    ------            ----  ----                                    -------
  Normal  SetDesiredConfig  67m   machineconfigcontroller-nodecontroller  Targeted node openshift-master-0.qe1.kni.lab.eng.bos.redhat.com to config rendered-master-064c2e13dbbd9d85f2614bb0979c31c4
  Normal  AnnotationChange  67m   machineconfigcontroller-nodecontroller  Node openshift-master-0.qe1.kni.lab.eng.bos.redhat.com now has machineconfiguration.openshift.io/desiredConfig=rendered-master-064c2e13dbbd9d85f2614bb0979c31c4
[kni@r640-u01 ~]$

*** This bug has been marked as a duplicate of bug 1933772 ***

Workaround (successfully tested): SSH to the SNO node and run:

journalctl --flush
rm -rf /var/log/journal/*
systemctl restart systemd-journald

Then restart the affected pods (those in CrashLoopBackOff) and verify that they come up correctly.
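
The last step of the workaround (restarting the pods in CrashLoopBackOff) could be scripted roughly as below. This is only a sketch, not part of the tested workaround: it assumes `oc` is already logged in to the cluster and that deleting a pod is an acceptable way to restart it (the machine-config-daemon pod belongs to a DaemonSet, so it is recreated). The function names `crashlooping_pods` and `restart_crashlooping_pods` are hypothetical helpers introduced for illustration.

```shell
#!/bin/sh
# Sketch only: helper names are hypothetical, not part of oc or OpenShift.

# Reads `oc get pod -A --no-headers` output on stdin and prints
# "<namespace> <pod>" for each pod whose STATUS column is CrashLoopBackOff.
# Column order with -A: NAMESPACE NAME READY STATUS RESTARTS AGE.
crashlooping_pods() {
    awk '$4 == "CrashLoopBackOff" { print $1, $2 }'
}

# Deletes every crashlooping pod so that its controller recreates it.
# Assumption: deletion is an acceptable way to "restart" these pods.
restart_crashlooping_pods() {
    oc get pod -A --no-headers | crashlooping_pods | while read -r ns pod; do
        oc delete pod -n "$ns" "$pod"
    done
}
```

After the journal cleanup on the node, one would source this file and call `restart_crashlooping_pods`, then re-run `oc get pod -A` to check that the pods reach Running.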