Description of problem:

If a file managed by the MCO has (mistakenly) been edited manually, subsequent changes to a machine config fail, and the following, not very informative, error message is shown:

# oc -n openshift-machine-config-operator logs -f pods/machine-config-daemon-fg4f9 machine-config-daemon
...
E1027 11:59:00.136374 3741 daemon.go:1403] mode mismatch for file: "/etc/sysconfig/network-scripts/ifcfg-bond0"; expected: --w----r--; received: --w----r--
E1027 11:59:00.136412 3741 writer.go:135] Marking Degraded due to: unexpected on-disk state validating against rendered-worker-679db1aec71c1b90263381d5dcf2d515

Do note how the "expected" and "received" fields are identical. Some more information on what actually is amiss would be appreciated...

Version-Release number of selected component (if applicable):

[root@infravm ~]# oc get co/machine-config
NAME             VERSION   AVAILABLE   PROGRESSING   DEGRADED   SINCE
machine-config   4.5.11    True        False         False      5d
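For context on how "expected" and "received" can come out identical even though the check failed: Go's os.FileMode.String() only has symbols for the permission bits and a fixed set of mode flags, so any other bit in the mode value is silently dropped from the printed string. The sketch below (illustrative only, not the MCO source, and not necessarily what happened on this particular node) reproduces the exact "--w----r--" vs "--w----r--" output with two modes that are in fact different:

// Illustrative sketch (not the MCO source): os.FileMode.String() silently drops
// bits it has no symbol for, so two unequal modes can render as the same string.
// For example, a mode written as decimal 644 (i.e. octal 01204) and an on-disk
// mode of 0204 both print as "--w----r--".
package main

import (
	"fmt"
	"os"
)

func main() {
	expected := os.FileMode(644)  // decimal 644 == octal 01204
	received := os.FileMode(0204) // what such a file ends up with on disk once the unnamed bit is dropped

	fmt.Println(expected == received)                              // false: the modes differ
	fmt.Printf("expected: %v; received: %v\n", expected, received) // both print as --w----r--
	fmt.Printf("expected: %#o; received: %#o\n",
		uint32(expected), uint32(received)) // 01204 vs 0204 makes the difference visible
}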
Verified on 4.7.0-0.nightly-2021-01-19-095812

$ oc get clusterversion
NAME      VERSION                             AVAILABLE   PROGRESSING   SINCE   STATUS
version   4.7.0-0.nightly-2021-01-19-095812   True        False         78m     Cluster version is 4.7.0-0.nightly-2021-01-19-095812

$ cat file.yaml
apiVersion: machineconfiguration.openshift.io/v1
kind: MachineConfig
metadata:
  labels:
    machineconfiguration.openshift.io/role: worker
  name: test-file
spec:
  config:
    ignition:
      version: 3.1.0
    storage:
      files:
      - contents:
          source: data:text/plain;charset=utf;base64,c2VydmVyIGZvby5leGFtcGxlLm5ldCBtYXhkZWxheSAwLjQgb2ZmbGluZQpzZXJ2ZXIgYmFyLmV4YW1wbGUubmV0IG1heGRlbGF5IDAuNCBvZmZsaW5lCnNlcnZlciBiYXouZXhhbXBsZS5uZXQgbWF4ZGVsYXkgMC40IG9mZmxpbmUK
        mode: 0644
        path: /etc/test

$ oc create -f file.yaml
machineconfig.machineconfiguration.openshift.io/test-file created

$ oc get mcp/worker
NAME     CONFIG                                             UPDATED   UPDATING   DEGRADED   MACHINECOUNT   READYMACHINECOUNT   UPDATEDMACHINECOUNT   DEGRADEDMACHINECOUNT   AGE
worker   rendered-worker-b8df3d134de8245b16227eec2c44b003   True      False      False      3              3                   3                     0                      3h36m

$ oc get mc
NAME                                               GENERATEDBYCONTROLLER                      IGNITIONVERSION   AGE
00-master                                          82929b72f6901650c5849ba4afe7ccdbed48b17c   3.2.0             3h34m
00-worker                                          82929b72f6901650c5849ba4afe7ccdbed48b17c   3.2.0             3h34m
01-master-container-runtime                        82929b72f6901650c5849ba4afe7ccdbed48b17c   3.2.0             3h34m
01-master-kubelet                                  82929b72f6901650c5849ba4afe7ccdbed48b17c   3.2.0             3h34m
01-worker-container-runtime                        82929b72f6901650c5849ba4afe7ccdbed48b17c   3.2.0             3h34m
01-worker-kubelet                                  82929b72f6901650c5849ba4afe7ccdbed48b17c   3.2.0             3h34m
99-master-generated-registries                     82929b72f6901650c5849ba4afe7ccdbed48b17c   3.2.0             3h34m
99-master-ssh                                                                                 3.1.0             3h41m
99-worker-generated-registries                     82929b72f6901650c5849ba4afe7ccdbed48b17c   3.2.0             3h34m
99-worker-realtime                                                                                              3h41m
99-worker-ssh                                                                                 3.1.0             3h41m
rendered-master-34dc347728dd226856851aa3b2efdc96   eb9778355a9020673e8ce9aee092cb98d80cde5e   3.1.0             3h34m
rendered-master-e7970f3bfa5fb5e620cb40b3bc75bc3a   82929b72f6901650c5849ba4afe7ccdbed48b17c   3.2.0             98m
rendered-worker-b8df3d134de8245b16227eec2c44b003   82929b72f6901650c5849ba4afe7ccdbed48b17c   3.2.0             98m
rendered-worker-c651389fdb0057a8941007d3e5b15f4b   82929b72f6901650c5849ba4afe7ccdbed48b17c   3.2.0             4s
rendered-worker-fca111ae774c0766b04dfd13759446a7   eb9778355a9020673e8ce9aee092cb98d80cde5e   3.1.0             3h34m
test-file                                                                                     3.1.0             9s

$ watch oc get mcp/worker

$ oc get nodes
NAME                                         STATUS   ROLES    AGE     VERSION
ip-10-0-128-71.us-west-2.compute.internal    Ready    master   3h47m   v1.20.0+d9c52cc
ip-10-0-149-13.us-west-2.compute.internal    Ready    worker   3h34m   v1.20.0+d9c52cc
ip-10-0-175-147.us-west-2.compute.internal   Ready    master   3h43m   v1.20.0+d9c52cc
ip-10-0-175-54.us-west-2.compute.internal    Ready    worker   3h38m   v1.20.0+d9c52cc
ip-10-0-213-88.us-west-2.compute.internal    Ready    worker   3h34m   v1.20.0+d9c52cc
ip-10-0-220-222.us-west-2.compute.internal   Ready    master   3h43m   v1.20.0+d9c52cc

$ oc debug node/ip-10-0-149-13.us-west-2.compute.internal
Starting pod/ip-10-0-149-13us-west-2computeinternal-debug ...
To use host binaries, run `chroot /host`
If you don't see a command prompt, try pressing enter.
sh-4.2# chroot /host
sh-4.4# cd /etc
sh-4.4# ls -la test
-rw-r--r--. 1 root root 132 Jan 19 21:50 test
sh-4.4# chmod 1644 test
sh-4.4# ls -la test
-rw-r--r-T. 1 root root 132 Jan 19 21:50 test
sh-4.4# exit
exit
sh-4.2# exit
exit
Removing debug pod ...
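For reference, the base64 payload in the `source` data URL above decodes to three "server ... maxdelay 0.4 offline" lines, 132 bytes in total, which matches the size shown by `ls -la test` in the debug session; the `chmod 1644` then adds the sticky bit (the trailing "T" in the second `ls` output) on top of the requested 0644. A small standalone sketch to decode the payload (not MCO code; the data-URL handling here is simplified to "everything after base64,"):

// Small standalone sketch (not MCO code): decode the base64 payload from the
// MachineConfig's Ignition data URL to see what should land in /etc/test.
package main

import (
	"encoding/base64"
	"fmt"
	"strings"
)

func main() {
	source := "data:text/plain;charset=utf;base64," +
		"c2VydmVyIGZvby5leGFtcGxlLm5ldCBtYXhkZWxheSAwLjQgb2ZmbGluZQpzZXJ2ZXIgYmFyLmV4YW1wbGUubmV0IG1heGRlbGF5IDAuNCBvZmZsaW5lCnNlcnZlciBiYXouZXhhbXBsZS5uZXQgbWF4ZGVsYXkgMC40IG9mZmxpbmUK"

	// Simplified parsing: take everything after the "base64," marker.
	payload := source[strings.Index(source, "base64,")+len("base64,"):]
	decoded, err := base64.StdEncoding.DecodeString(payload)
	if err != nil {
		panic(err)
	}
	fmt.Print(string(decoded))
	// Prints three "server ... maxdelay 0.4 offline" lines, 132 bytes in total.
}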
$ oc get pods -A --field-selector spec.nodeName=ip-10-0-149-13.us-west-2.compute.internal
NAMESPACE                                NAME                                  READY   STATUS    RESTARTS   AGE
openshift-cluster-csi-drivers            aws-ebs-csi-driver-node-tlp78         3/3     Running   0          132m
openshift-cluster-node-tuning-operator   tuned-9xsrq                           1/1     Running   0          134m
openshift-console                        downloads-7ffdbf746f-n2zgr            1/1     Running   0          10m
openshift-dns                            dns-default-2p6ft                     3/3     Running   0          118m
openshift-image-registry                 image-registry-6979dcbcdb-lch8w       1/1     Running   0          10m
openshift-image-registry                 node-ca-z796r                         1/1     Running   0          132m
openshift-ingress-canary                 ingress-canary-npccs                  1/1     Running   0          135m
openshift-ingress                        router-default-56f657765b-bkprb       1/1     Running   0          10m
openshift-machine-config-operator        machine-config-daemon-f9whr           2/2     Running   0          117m
openshift-marketplace                    certified-operators-9xplr             1/1     Running   0          7m20s
openshift-marketplace                    community-operators-v6kbk             1/1     Running   0          9m51s
openshift-marketplace                    redhat-marketplace-2x4qd              1/1     Running   0          9m52s
openshift-marketplace                    redhat-operators-hl6ld                1/1     Running   0          9m51s
openshift-monitoring                     alertmanager-main-0                   5/5     Running   0          9m51s
openshift-monitoring                     alertmanager-main-2                   5/5     Running   0          9m51s
openshift-monitoring                     node-exporter-s8csj                   2/2     Running   0          132m
openshift-monitoring                     prometheus-adapter-5c6c9845f6-btdmr   1/1     Running   0          10m
openshift-monitoring                     prometheus-k8s-0                      7/7     Running   1          9m51s
openshift-monitoring                     thanos-querier-6996dfb5d5-2rwtw       5/5     Running   0          10m
openshift-multus                         multus-szqsg                          1/1     Running   0          124m
openshift-multus                         network-metrics-daemon-zjqtr          2/2     Running   0          129m
openshift-network-diagnostics            network-check-source-8b577f64-n24zz   1/1     Running   0          10m
openshift-network-diagnostics            network-check-target-xp6jn            1/1     Running   0          130m
openshift-sdn                            ovs-jm5q7                             1/1     Running   0          126m
openshift-sdn                            sdn-jmb6x                             2/2     Running   0          128m

$ oc -n openshift-machine-config-operator logs -f machine-config-daemon-f9whr -c machine-config-daemon
...SNIP...
E0119 21:53:34.627464 1861 reflector.go:138] github.com/openshift/machine-config-operator/pkg/generated/informers/externalversions/factory.go:101: Failed to watch *v1.MachineConfig: failed to list *v1.MachineConfig: Get "https://172.30.0.1:443/apis/machineconfiguration.openshift.io/v1/machineconfigs?limit=500&resourceVersion=0": dial tcp 172.30.0.1:443: connect: no route to host
I0119 21:53:37.093598 1861 daemon.go:401] Node ip-10-0-149-13.us-west-2.compute.internal is not labeled node-role.kubernetes.io/master
I0119 21:53:37.202319 1861 daemon.go:816] Current config: rendered-worker-b8df3d134de8245b16227eec2c44b003
I0119 21:53:37.202340 1861 daemon.go:817] Desired config: rendered-worker-c651389fdb0057a8941007d3e5b15f4b
I0119 21:53:37.209989 1861 update.go:1852] Disk currentConfig rendered-worker-c651389fdb0057a8941007d3e5b15f4b overrides node's currentConfig annotation rendered-worker-b8df3d134de8245b16227eec2c44b003
I0119 21:53:37.214066 1861 daemon.go:1099] Validating against pending config rendered-worker-c651389fdb0057a8941007d3e5b15f4b
I0119 21:53:37.233510 1861 daemon.go:1110] Validated on-disk state
I0119 21:53:37.249538 1861 daemon.go:1165] Completing pending config rendered-worker-c651389fdb0057a8941007d3e5b15f4b
I0119 21:53:37.259579 1861 update.go:1852] completed update for config rendered-worker-c651389fdb0057a8941007d3e5b15f4b
I0119 21:53:37.263088 1861 daemon.go:1181] In desired config rendered-worker-c651389fdb0057a8941007d3e5b15f4b
^C

$ oc -n openshift-machine-config-operator delete pod machine-config-daemon-f9whr
pod "machine-config-daemon-f9whr" deleted

$ oc get pods -A --field-selector spec.nodeName=ip-10-0-149-13.us-west-2.compute.internal
NAMESPACE                                NAME                                  READY   STATUS    RESTARTS   AGE
openshift-cluster-csi-drivers            aws-ebs-csi-driver-node-tlp78         3/3     Running   0          134m
openshift-cluster-node-tuning-operator   tuned-9xsrq                           1/1     Running   0          136m
openshift-console                        downloads-7ffdbf746f-n2zgr            1/1     Running   0          12m
openshift-dns                            dns-default-2p6ft                     3/3     Running   0          120m
openshift-image-registry                 image-registry-6979dcbcdb-lch8w       1/1     Running   0          12m
openshift-image-registry                 node-ca-z796r                         1/1     Running   0          134m
openshift-ingress-canary                 ingress-canary-npccs                  1/1     Running   0          137m
openshift-ingress                        router-default-56f657765b-bkprb       1/1     Running   0          12m
openshift-machine-config-operator        machine-config-daemon-vx2wm           2/2     Running   0          17s
openshift-marketplace                    certified-operators-9xplr             1/1     Running   0          9m11s
openshift-marketplace                    community-operators-v6kbk             1/1     Running   0          11m
openshift-marketplace                    redhat-marketplace-2x4qd              1/1     Running   0          11m
openshift-marketplace                    redhat-operators-hl6ld                1/1     Running   0          11m
openshift-monitoring                     alertmanager-main-0                   5/5     Running   0          11m
openshift-monitoring                     alertmanager-main-2                   5/5     Running   0          11m
openshift-monitoring                     node-exporter-s8csj                   2/2     Running   0          134m
openshift-monitoring                     prometheus-adapter-5c6c9845f6-btdmr   1/1     Running   0          12m
openshift-monitoring                     prometheus-k8s-0                      7/7     Running   1          11m
openshift-monitoring                     thanos-querier-6996dfb5d5-2rwtw       5/5     Running   0          12m
openshift-multus                         multus-szqsg                          1/1     Running   0          126m
openshift-multus                         network-metrics-daemon-zjqtr          2/2     Running   0          131m
openshift-network-diagnostics            network-check-source-8b577f64-n24zz   1/1     Running   0          12m
openshift-network-diagnostics            network-check-target-xp6jn            1/1     Running   0          131m
openshift-sdn                            ovs-jm5q7                             1/1     Running   0          128m
openshift-sdn                            sdn-jmb6x                             2/2     Running   0          130m

$ oc -n openshift-machine-config-operator logs -f machine-config-daemon-vx2wm -c machine-config-daemon
I0119 22:05:31.455555 29704 start.go:108] Version: v4.7.0-202101190826.p0-dirty (82929b72f6901650c5849ba4afe7ccdbed48b17c)
I0119 22:05:31.458807 29704 start.go:121] Calling chroot("/rootfs")
I0119 22:05:31.458894 29704 rpm-ostree.go:261] Running captured: rpm-ostree status --json
I0119 22:05:32.542155 29704 daemon.go:224] Booted osImageURL: quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:8672a3ce64788e4138bec0d2cafe701d96b55c35482314d393b52dd414e635a4 (47.83.202101171239-0)
I0119 22:05:33.084742 29704 daemon.go:231] Installed Ignition binary version: 2.9.0
I0119 22:05:33.123524 29704 start.go:97] Copied self to /run/bin/machine-config-daemon on host
I0119 22:05:33.129018 29704 update.go:1852] Starting to manage node: ip-10-0-149-13.us-west-2.compute.internal
I0119 22:05:33.130494 29704 metrics.go:105] Registering Prometheus metrics
I0119 22:05:33.130571 29704 metrics.go:110] Starting metrics listener on 127.0.0.1:8797
I0119 22:05:33.145313 29704 rpm-ostree.go:261] Running captured: rpm-ostree status
I0119 22:05:33.308288 29704 daemon.go:863] State: idle
Deployments:
* pivot://quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:8672a3ce64788e4138bec0d2cafe701d96b55c35482314d393b52dd414e635a4
              CustomOrigin: Managed by machine-config-operator
                   Version: 47.83.202101171239-0 (2021-01-17T12:42:48Z)
       RemovedBasePackages: kernel-core kernel-modules kernel kernel-modules-extra 4.18.0-240.10.1.el8_3
           LayeredPackages: kernel-rt-core kernel-rt-kvm kernel-rt-modules kernel-rt-modules-extra

  pivot://quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:eee4994c87fd8aef1da4676820522de94c82679b9a248578f828592d1c17cafa
              CustomOrigin: Managed by machine-config-operator
                   Version: 46.82.202012151054-0 (2020-12-15T10:57:15Z)
       RemovedBasePackages: kernel-core kernel-modules kernel kernel-modules-extra 4.18.0-193.37.1.el8_2
           LayeredPackages: kernel-rt-core kernel-rt-kvm kernel-rt-modules kernel-rt-modules-extra
I0119 22:05:33.308310 29704 rpm-ostree.go:261] Running captured: journalctl --list-boots
I0119 22:05:33.317335 29704 daemon.go:870] journalctl --list-boots:
-3 d537cd4071c540aea8c1289ad032a38a Tue 2021-01-19 18:16:36 UTC—Tue 2021-01-19 18:24:21 UTC
-2 558245af81c148c3af8a616dacd54cb5 Tue 2021-01-19 18:24:37 UTC—Tue 2021-01-19 20:23:58 UTC
-1 d9576b7e83db43bfba9ad9c5651e9561 Tue 2021-01-19 20:24:15 UTC—Tue 2021-01-19 21:51:05 UTC
 0 c3b56729f6d746d5a5553f3e8d21e2b0 Tue 2021-01-19 21:52:54 UTC—Tue 2021-01-19 22:05:33 UTC
I0119 22:05:33.317359 29704 rpm-ostree.go:261] Running captured: systemctl list-units --state=failed --no-legend
I0119 22:05:33.330508 29704 daemon.go:885] systemd service state: OK
I0119 22:05:33.330528 29704 daemon.go:617] Starting MachineConfigDaemon
I0119 22:05:33.330643 29704 daemon.go:624] Enabling Kubelet Healthz Monitor
I0119 22:05:39.231392 29704 daemon.go:401] Node ip-10-0-149-13.us-west-2.compute.internal is not labeled node-role.kubernetes.io/master
I0119 22:05:39.250356 29704 daemon.go:809] Current+desired config: rendered-worker-c651389fdb0057a8941007d3e5b15f4b
I0119 22:05:39.271534 29704 daemon.go:1102] Validating against current config rendered-worker-c651389fdb0057a8941007d3e5b15f4b
E0119 22:05:39.284789 29704 writer.go:135] Marking Degraded due to: unexpected on-disk state validating against rendered-worker-c651389fdb0057a8941007d3e5b15f4b: mode mismatch for file: "/etc/test"; expected: -rw-r--r--/420/0644; received: trw-r--r--/1048996/04000644
I0119 22:05:41.367319 29704 daemon.go:809] Current+desired config: rendered-worker-c651389fdb0057a8941007d3e5b15f4b
I0119 22:05:41.373858 29704 daemon.go:1102] Validating against current config rendered-worker-c651389fdb0057a8941007d3e5b15f4b
E0119 22:05:41.380791 29704 writer.go:135] Marking Degraded due to: unexpected on-disk state validating against rendered-worker-c651389fdb0057a8941007d3e5b15f4b: mode mismatch for file: "/etc/test"; expected: -rw-r--r--/420/0644; received: trw-r--r--/1048996/04000644
I0119 22:05:49.398285 29704 daemon.go:809] Current+desired config: rendered-worker-c651389fdb0057a8941007d3e5b15f4b
I0119 22:05:49.405959 29704 daemon.go:1102] Validating against current config rendered-worker-c651389fdb0057a8941007d3e5b15f4b
E0119 22:05:49.432608 29704 writer.go:135] Marking Degraded due to: unexpected on-disk state validating against rendered-worker-c651389fdb0057a8941007d3e5b15f4b: mode mismatch for file: "/etc/test"; expected: -rw-r--r--/420/0644; received: trw-r--r--/1048996/04000644
I0119 22:06:08.950911 29704 daemon.go:809] Current+desired config: rendered-worker-c651389fdb0057a8941007d3e5b15f4b
I0119 22:06:08.958261 29704 daemon.go:1102] Validating against current config rendered-worker-c651389fdb0057a8941007d3e5b15f4b
E0119 22:06:08.966246 29704 writer.go:135] Marking Degraded due to: unexpected on-disk state validating against rendered-worker-c651389fdb0057a8941007d3e5b15f4b: mode mismatch for file: "/etc/test"; expected: -rw-r--r--/420/0644; received: trw-r--r--/1048996/04000644
I0119 22:06:42.161728 29704 daemon.go:809] Current+desired config: rendered-worker-c651389fdb0057a8941007d3e5b15f4b
I0119 22:06:42.184905 29704 daemon.go:1102] Validating against current config rendered-worker-c651389fdb0057a8941007d3e5b15f4b
E0119 22:06:42.208143 29704 writer.go:135] Marking Degraded due to: unexpected on-disk state validating against rendered-worker-c651389fdb0057a8941007d3e5b15f4b: mode mismatch for file: "/etc/test"; expected: -rw-r--r--/420/0644; received: trw-r--r--/1048996/04000644
^C
$
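The new message prints the full mode in three forms. Assuming these are simply the Go os.FileMode rendered as its symbolic string, decimal value, and octal value (which is what the numbers above line up with), the two strings decode to 0644 versus 0644 plus Go's ModeSticky flag (1 << 20), i.e. exactly the sticky bit added by the manual `chmod 1644`. A minimal sketch reproducing both strings:

// Sketch only: decode the two mode strings from the fixed error message,
// assuming they are the Go os.FileMode rendered as string/decimal/octal.
// 0644 is what the MachineConfig requested; `chmod 1644` added the sticky
// bit, which in Go's FileMode is the ModeSticky flag (1 << 20).
package main

import (
	"fmt"
	"os"
)

func main() {
	expected := os.FileMode(0644)
	received := os.FileMode(0644) | os.ModeSticky

	for _, m := range []os.FileMode{expected, received} {
		fmt.Printf("%s/%d/%#o\n", m, uint32(m), uint32(m))
	}
	// Output:
	// -rw-r--r--/420/0644
	// trw-r--r--/1048996/04000644
}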
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory (Moderate: OpenShift Container Platform 4.7.0 security, bug fix, and enhancement update), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2020:5633