Bug 1891825 - Error message not very informative in case of mode mismatch
Summary: Error message not very informative in case of mode mismatch
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: Machine Config Operator
Version: 4.5
Hardware: Unspecified
OS: Unspecified
low
low
Target Milestone: ---
: 4.7.0
Assignee: Robin Cernin
QA Contact: Michael Nguyen
URL:
Whiteboard:
Depends On:
Blocks: 1897552 1918535
TreeView+ depends on / blocked
 
Reported: 2020-10-27 13:24 UTC by David Juran
Modified: 2021-02-24 15:29 UTC (History)
2 users (show)

Fixed In Version:
Doc Type: No Doc Update
Doc Text:
This change clarifies the error message in case of mode mismatch on files. It doesn't require any doc update as it doesn't impact any operations and only improves logging message.
Clone Of:
Environment:
Last Closed: 2021-02-24 15:28:35 UTC
Target Upstream Version:
Embargoed:


Attachments (Terms of Use)


Links
System ID Private Priority Status Summary Last Updated
Github openshift machine-config-operator pull 2340 0 None closed Bug 1891825: Mode mismatch fix for confusing strings 2021-02-15 04:31:52 UTC
Red Hat Product Errata RHSA-2020:5633 0 None None None 2021-02-24 15:29:09 UTC

Description David Juran 2020-10-27 13:24:00 UTC
Description of problem:
In the case a file managed by the MCO (mistakenly) has been edited manually, subsequent changes to a machine config will fail and the following not very informative error message will be shown:

# oc -n openshift-machine-config-operator logs -f pods/machine-config-daemon-fg4f9 machine-config-daemon 
...
E1027 11:59:00.136374    3741 daemon.go:1403] mode mismatch for file: "/etc/sysconfig/network-scripts/ifcfg-bond0"; expected: --w----r--; received: --w----r--
E1027 11:59:00.136412    3741 writer.go:135] Marking Degraded due to: unexpected on-disk state validating against rendered-worker-679db1aec71c1b90263381d5dcf2d515

Do note how the "expected" and "received" fields are identical. Some more information on what actually is amiss would be appreciated...

Version-Release number of selected component (if applicable):
root@infravm ~]# oc get co/machine-config
NAME VERSION AVAILABLE PROGRESSING DEGRADED SINCE
machine-config 4.5.11 True False False 5d

Comment 3 Michael Nguyen 2021-01-19 22:10:57 UTC
Verified on 4.7.0-0.nightly-2021-01-19-095812

$ oc get clusterversion
NAME      VERSION                             AVAILABLE   PROGRESSING   SINCE   STATUS
version   4.7.0-0.nightly-2021-01-19-095812   True        False         78m     Cluster version is 4.7.0-0.nightly-2021-01-19-095812
$ cat file.yaml 
apiVersion: machineconfiguration.openshift.io/v1
kind: MachineConfig
metadata:
  labels:
    machineconfiguration.openshift.io/role: worker
  name: test-file
spec:
  config:
    ignition:
      version: 3.1.0
    storage:
      files:
      - contents:
          source: data:text/plain;charset=utf;base64,c2VydmVyIGZvby5leGFtcGxlLm5ldCBtYXhkZWxheSAwLjQgb2ZmbGluZQpzZXJ2ZXIgYmFyLmV4YW1wbGUubmV0IG1heGRlbGF5IDAuNCBvZmZsaW5lCnNlcnZlciBiYXouZXhhbXBsZS5uZXQgbWF4ZGVsYXkgMC40IG9mZmxpbmUK
        mode: 0644
        path: /etc/test
$ oc create -f file.yaml 
machineconfig.machineconfiguration.openshift.io/test-file created
$ oc get mcp/worker
NAME     CONFIG                                             UPDATED   UPDATING   DEGRADED   MACHINECOUNT   READYMACHINECOUNT   UPDATEDMACHINECOUNT   DEGRADEDMACHINECOUNT   AGE
worker   rendered-worker-b8df3d134de8245b16227eec2c44b003   True      False      False      3              3                   3                     0                      3h36m
$ oc get mc
NAME                                               GENERATEDBYCONTROLLER                      IGNITIONVERSION   AGE
00-master                                          82929b72f6901650c5849ba4afe7ccdbed48b17c   3.2.0             3h34m
00-worker                                          82929b72f6901650c5849ba4afe7ccdbed48b17c   3.2.0             3h34m
01-master-container-runtime                        82929b72f6901650c5849ba4afe7ccdbed48b17c   3.2.0             3h34m
01-master-kubelet                                  82929b72f6901650c5849ba4afe7ccdbed48b17c   3.2.0             3h34m
01-worker-container-runtime                        82929b72f6901650c5849ba4afe7ccdbed48b17c   3.2.0             3h34m
01-worker-kubelet                                  82929b72f6901650c5849ba4afe7ccdbed48b17c   3.2.0             3h34m
99-master-generated-registries                     82929b72f6901650c5849ba4afe7ccdbed48b17c   3.2.0             3h34m
99-master-ssh                                                                                 3.1.0             3h41m
99-worker-generated-registries                     82929b72f6901650c5849ba4afe7ccdbed48b17c   3.2.0             3h34m
99-worker-realtime                                                                                              3h41m
99-worker-ssh                                                                                 3.1.0             3h41m
rendered-master-34dc347728dd226856851aa3b2efdc96   eb9778355a9020673e8ce9aee092cb98d80cde5e   3.1.0             3h34m
rendered-master-e7970f3bfa5fb5e620cb40b3bc75bc3a   82929b72f6901650c5849ba4afe7ccdbed48b17c   3.2.0             98m
rendered-worker-b8df3d134de8245b16227eec2c44b003   82929b72f6901650c5849ba4afe7ccdbed48b17c   3.2.0             98m
rendered-worker-c651389fdb0057a8941007d3e5b15f4b   82929b72f6901650c5849ba4afe7ccdbed48b17c   3.2.0             4s
rendered-worker-fca111ae774c0766b04dfd13759446a7   eb9778355a9020673e8ce9aee092cb98d80cde5e   3.1.0             3h34m
test-file                                                                                     3.1.0             9s
$ watch oc get mcp/worker
$ oc get nodes
NAME                                         STATUS   ROLES    AGE     VERSION
ip-10-0-128-71.us-west-2.compute.internal    Ready    master   3h47m   v1.20.0+d9c52cc
ip-10-0-149-13.us-west-2.compute.internal    Ready    worker   3h34m   v1.20.0+d9c52cc
ip-10-0-175-147.us-west-2.compute.internal   Ready    master   3h43m   v1.20.0+d9c52cc
ip-10-0-175-54.us-west-2.compute.internal    Ready    worker   3h38m   v1.20.0+d9c52cc
ip-10-0-213-88.us-west-2.compute.internal    Ready    worker   3h34m   v1.20.0+d9c52cc
ip-10-0-220-222.us-west-2.compute.internal   Ready    master   3h43m   v1.20.0+d9c52cc
$ oc debug node/ip-10-0-149-13.us-west-2.compute.internal
Starting pod/ip-10-0-149-13us-west-2computeinternal-debug ...
To use host binaries, run `chroot /host`
If you don't see a command prompt, try pressing enter.
sh-4.2# chroot /host
sh-4.4# cd /etc
sh-4.4# ls -la test
-rw-r--r--. 1 root root 132 Jan 19 21:50 test
sh-4.4# chmod  1644 test
sh-4.4# ls -la test
-rw-r--r-T. 1 root root 132 Jan 19 21:50 test
sh-4.4# exit
exit
sh-4.2# exit
exit

Removing debug pod ...
$ oc get pods -A --field-selector spec.nodeName=ip-10-0-149-13.us-west-2.compute.internal
NAMESPACE                                NAME                                  READY   STATUS    RESTARTS   AGE
openshift-cluster-csi-drivers            aws-ebs-csi-driver-node-tlp78         3/3     Running   0          132m
openshift-cluster-node-tuning-operator   tuned-9xsrq                           1/1     Running   0          134m
openshift-console                        downloads-7ffdbf746f-n2zgr            1/1     Running   0          10m
openshift-dns                            dns-default-2p6ft                     3/3     Running   0          118m
openshift-image-registry                 image-registry-6979dcbcdb-lch8w       1/1     Running   0          10m
openshift-image-registry                 node-ca-z796r                         1/1     Running   0          132m
openshift-ingress-canary                 ingress-canary-npccs                  1/1     Running   0          135m
openshift-ingress                        router-default-56f657765b-bkprb       1/1     Running   0          10m
openshift-machine-config-operator        machine-config-daemon-f9whr           2/2     Running   0          117m
openshift-marketplace                    certified-operators-9xplr             1/1     Running   0          7m20s
openshift-marketplace                    community-operators-v6kbk             1/1     Running   0          9m51s
openshift-marketplace                    redhat-marketplace-2x4qd              1/1     Running   0          9m52s
openshift-marketplace                    redhat-operators-hl6ld                1/1     Running   0          9m51s
openshift-monitoring                     alertmanager-main-0                   5/5     Running   0          9m51s
openshift-monitoring                     alertmanager-main-2                   5/5     Running   0          9m51s
openshift-monitoring                     node-exporter-s8csj                   2/2     Running   0          132m
openshift-monitoring                     prometheus-adapter-5c6c9845f6-btdmr   1/1     Running   0          10m
openshift-monitoring                     prometheus-k8s-0                      7/7     Running   1          9m51s
openshift-monitoring                     thanos-querier-6996dfb5d5-2rwtw       5/5     Running   0          10m
openshift-multus                         multus-szqsg                          1/1     Running   0          124m
openshift-multus                         network-metrics-daemon-zjqtr          2/2     Running   0          129m
openshift-network-diagnostics            network-check-source-8b577f64-n24zz   1/1     Running   0          10m
openshift-network-diagnostics            network-check-target-xp6jn            1/1     Running   0          130m
openshift-sdn                            ovs-jm5q7                             1/1     Running   0          126m
openshift-sdn                            sdn-jmb6x                             2/2     Running   0          128m
$ oc -n openshift-machine-config-operator logs -f machine-config-daemon-f9whr -c machine-config-daemon
...SNIP...
E0119 21:53:34.627464    1861 reflector.go:138] github.com/openshift/machine-config-operator/pkg/generated/informers/externalversions/factory.go:101: Failed to watch *v1.MachineConfig: failed to list *v1.MachineConfig: Get "https://172.30.0.1:443/apis/machineconfiguration.openshift.io/v1/machineconfigs?limit=500&resourceVersion=0": dial tcp 172.30.0.1:443: connect: no route to host
I0119 21:53:37.093598    1861 daemon.go:401] Node ip-10-0-149-13.us-west-2.compute.internal is not labeled node-role.kubernetes.io/master
I0119 21:53:37.202319    1861 daemon.go:816] Current config: rendered-worker-b8df3d134de8245b16227eec2c44b003
I0119 21:53:37.202340    1861 daemon.go:817] Desired config: rendered-worker-c651389fdb0057a8941007d3e5b15f4b
I0119 21:53:37.209989    1861 update.go:1852] Disk currentConfig rendered-worker-c651389fdb0057a8941007d3e5b15f4b overrides node's currentConfig annotation rendered-worker-b8df3d134de8245b16227eec2c44b003
I0119 21:53:37.214066    1861 daemon.go:1099] Validating against pending config rendered-worker-c651389fdb0057a8941007d3e5b15f4b
I0119 21:53:37.233510    1861 daemon.go:1110] Validated on-disk state
I0119 21:53:37.249538    1861 daemon.go:1165] Completing pending config rendered-worker-c651389fdb0057a8941007d3e5b15f4b
I0119 21:53:37.259579    1861 update.go:1852] completed update for config rendered-worker-c651389fdb0057a8941007d3e5b15f4b
I0119 21:53:37.263088    1861 daemon.go:1181] In desired config rendered-worker-c651389fdb0057a8941007d3e5b15f4b
^C
$ oc -n openshift-machine-config-operator delete pod machine-config-daemon-f9whr 
pod "machine-config-daemon-f9whr" deleted
$ oc get pods -A --field-selector spec.nodeName=ip-10-0-149-13.us-west-2.compute.internal
NAMESPACE                                NAME                                  READY   STATUS    RESTARTS   AGE
openshift-cluster-csi-drivers            aws-ebs-csi-driver-node-tlp78         3/3     Running   0          134m
openshift-cluster-node-tuning-operator   tuned-9xsrq                           1/1     Running   0          136m
openshift-console                        downloads-7ffdbf746f-n2zgr            1/1     Running   0          12m
openshift-dns                            dns-default-2p6ft                     3/3     Running   0          120m
openshift-image-registry                 image-registry-6979dcbcdb-lch8w       1/1     Running   0          12m
openshift-image-registry                 node-ca-z796r                         1/1     Running   0          134m
openshift-ingress-canary                 ingress-canary-npccs                  1/1     Running   0          137m
openshift-ingress                        router-default-56f657765b-bkprb       1/1     Running   0          12m
openshift-machine-config-operator        machine-config-daemon-vx2wm           2/2     Running   0          17s
openshift-marketplace                    certified-operators-9xplr             1/1     Running   0          9m11s
openshift-marketplace                    community-operators-v6kbk             1/1     Running   0          11m
openshift-marketplace                    redhat-marketplace-2x4qd              1/1     Running   0          11m
openshift-marketplace                    redhat-operators-hl6ld                1/1     Running   0          11m
openshift-monitoring                     alertmanager-main-0                   5/5     Running   0          11m
openshift-monitoring                     alertmanager-main-2                   5/5     Running   0          11m
openshift-monitoring                     node-exporter-s8csj                   2/2     Running   0          134m
openshift-monitoring                     prometheus-adapter-5c6c9845f6-btdmr   1/1     Running   0          12m
openshift-monitoring                     prometheus-k8s-0                      7/7     Running   1          11m
openshift-monitoring                     thanos-querier-6996dfb5d5-2rwtw       5/5     Running   0          12m
openshift-multus                         multus-szqsg                          1/1     Running   0          126m
openshift-multus                         network-metrics-daemon-zjqtr          2/2     Running   0          131m
openshift-network-diagnostics            network-check-source-8b577f64-n24zz   1/1     Running   0          12m
openshift-network-diagnostics            network-check-target-xp6jn            1/1     Running   0          131m
openshift-sdn                            ovs-jm5q7                             1/1     Running   0          128m
openshift-sdn                            sdn-jmb6x                             2/2     Running   0          130m
$ oc -n openshift-machine-config-operator logs -f machine-config-daemon-vx2wm -c machine-config-daemon
I0119 22:05:31.455555   29704 start.go:108] Version: v4.7.0-202101190826.p0-dirty (82929b72f6901650c5849ba4afe7ccdbed48b17c)
I0119 22:05:31.458807   29704 start.go:121] Calling chroot("/rootfs")
I0119 22:05:31.458894   29704 rpm-ostree.go:261] Running captured: rpm-ostree status --json
I0119 22:05:32.542155   29704 daemon.go:224] Booted osImageURL: quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:8672a3ce64788e4138bec0d2cafe701d96b55c35482314d393b52dd414e635a4 (47.83.202101171239-0)
I0119 22:05:33.084742   29704 daemon.go:231] Installed Ignition binary version: 2.9.0
I0119 22:05:33.123524   29704 start.go:97] Copied self to /run/bin/machine-config-daemon on host
I0119 22:05:33.129018   29704 update.go:1852] Starting to manage node: ip-10-0-149-13.us-west-2.compute.internal
I0119 22:05:33.130494   29704 metrics.go:105] Registering Prometheus metrics
I0119 22:05:33.130571   29704 metrics.go:110] Starting metrics listener on 127.0.0.1:8797
I0119 22:05:33.145313   29704 rpm-ostree.go:261] Running captured: rpm-ostree status
I0119 22:05:33.308288   29704 daemon.go:863] State: idle
Deployments:
* pivot://quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:8672a3ce64788e4138bec0d2cafe701d96b55c35482314d393b52dd414e635a4
              CustomOrigin: Managed by machine-config-operator
                   Version: 47.83.202101171239-0 (2021-01-17T12:42:48Z)
       RemovedBasePackages: kernel-core kernel-modules kernel kernel-modules-extra 4.18.0-240.10.1.el8_3
           LayeredPackages: kernel-rt-core kernel-rt-kvm kernel-rt-modules
                            kernel-rt-modules-extra

  pivot://quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:eee4994c87fd8aef1da4676820522de94c82679b9a248578f828592d1c17cafa
              CustomOrigin: Managed by machine-config-operator
                   Version: 46.82.202012151054-0 (2020-12-15T10:57:15Z)
       RemovedBasePackages: kernel-core kernel-modules kernel kernel-modules-extra 4.18.0-193.37.1.el8_2
           LayeredPackages: kernel-rt-core kernel-rt-kvm kernel-rt-modules
                            kernel-rt-modules-extra
I0119 22:05:33.308310   29704 rpm-ostree.go:261] Running captured: journalctl --list-boots
I0119 22:05:33.317335   29704 daemon.go:870] journalctl --list-boots:
-3 d537cd4071c540aea8c1289ad032a38a Tue 2021-01-19 18:16:36 UTC—Tue 2021-01-19 18:24:21 UTC
-2 558245af81c148c3af8a616dacd54cb5 Tue 2021-01-19 18:24:37 UTC—Tue 2021-01-19 20:23:58 UTC
-1 d9576b7e83db43bfba9ad9c5651e9561 Tue 2021-01-19 20:24:15 UTC—Tue 2021-01-19 21:51:05 UTC
 0 c3b56729f6d746d5a5553f3e8d21e2b0 Tue 2021-01-19 21:52:54 UTC—Tue 2021-01-19 22:05:33 UTC
I0119 22:05:33.317359   29704 rpm-ostree.go:261] Running captured: systemctl list-units --state=failed --no-legend
I0119 22:05:33.330508   29704 daemon.go:885] systemd service state: OK
I0119 22:05:33.330528   29704 daemon.go:617] Starting MachineConfigDaemon
I0119 22:05:33.330643   29704 daemon.go:624] Enabling Kubelet Healthz Monitor
I0119 22:05:39.231392   29704 daemon.go:401] Node ip-10-0-149-13.us-west-2.compute.internal is not labeled node-role.kubernetes.io/master
I0119 22:05:39.250356   29704 daemon.go:809] Current+desired config: rendered-worker-c651389fdb0057a8941007d3e5b15f4b
I0119 22:05:39.271534   29704 daemon.go:1102] Validating against current config rendered-worker-c651389fdb0057a8941007d3e5b15f4b
E0119 22:05:39.284789   29704 writer.go:135] Marking Degraded due to: unexpected on-disk state validating against rendered-worker-c651389fdb0057a8941007d3e5b15f4b: mode mismatch for file: "/etc/test"; expected: -rw-r--r--/420/0644; received: trw-r--r--/1048996/04000644
I0119 22:05:41.367319   29704 daemon.go:809] Current+desired config: rendered-worker-c651389fdb0057a8941007d3e5b15f4b
I0119 22:05:41.373858   29704 daemon.go:1102] Validating against current config rendered-worker-c651389fdb0057a8941007d3e5b15f4b
E0119 22:05:41.380791   29704 writer.go:135] Marking Degraded due to: unexpected on-disk state validating against rendered-worker-c651389fdb0057a8941007d3e5b15f4b: mode mismatch for file: "/etc/test"; expected: -rw-r--r--/420/0644; received: trw-r--r--/1048996/04000644
I0119 22:05:49.398285   29704 daemon.go:809] Current+desired config: rendered-worker-c651389fdb0057a8941007d3e5b15f4b
I0119 22:05:49.405959   29704 daemon.go:1102] Validating against current config rendered-worker-c651389fdb0057a8941007d3e5b15f4b
E0119 22:05:49.432608   29704 writer.go:135] Marking Degraded due to: unexpected on-disk state validating against rendered-worker-c651389fdb0057a8941007d3e5b15f4b: mode mismatch for file: "/etc/test"; expected: -rw-r--r--/420/0644; received: trw-r--r--/1048996/04000644
I0119 22:06:08.950911   29704 daemon.go:809] Current+desired config: rendered-worker-c651389fdb0057a8941007d3e5b15f4b
I0119 22:06:08.958261   29704 daemon.go:1102] Validating against current config rendered-worker-c651389fdb0057a8941007d3e5b15f4b
E0119 22:06:08.966246   29704 writer.go:135] Marking Degraded due to: unexpected on-disk state validating against rendered-worker-c651389fdb0057a8941007d3e5b15f4b: mode mismatch for file: "/etc/test"; expected: -rw-r--r--/420/0644; received: trw-r--r--/1048996/04000644
I0119 22:06:42.161728   29704 daemon.go:809] Current+desired config: rendered-worker-c651389fdb0057a8941007d3e5b15f4b
I0119 22:06:42.184905   29704 daemon.go:1102] Validating against current config rendered-worker-c651389fdb0057a8941007d3e5b15f4b
E0119 22:06:42.208143   29704 writer.go:135] Marking Degraded due to: unexpected on-disk state validating against rendered-worker-c651389fdb0057a8941007d3e5b15f4b: mode mismatch for file: "/etc/test"; expected: -rw-r--r--/420/0644; received: trw-r--r--/1048996/04000644
^C
$

Comment 7 errata-xmlrpc 2021-02-24 15:28:35 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Moderate: OpenShift Container Platform 4.7.0 security, bug fix, and enhancement update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2020:5633


Note You need to log in before you can comment on or make changes to this bug.