Bug 1766515 - MCD could shadow errors since the force validation file isn't deleted once used [NEEDINFO]
Summary: MCD could shadow errors since the force validation file isn't deleted once used
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: Machine Config Operator
Version: 4.1.z
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: unspecified
Target Milestone: ---
Target Release: 4.1.z
Assignee: Antonio Murdaca
QA Contact: Michael Nguyen
URL:
Whiteboard:
Depends On: 1766513 1766514
Blocks:
Reported: 2019-10-29 10:29 UTC by Antonio Murdaca
Modified: 2020-02-13 06:14 UTC (History)

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of: 1766514
Environment:
Last Closed: 2020-02-13 06:14:02 UTC
Target Upstream Version:
mnguyen: needinfo? (amurdaca)
amurdaca: needinfo? (walters)


Links
System | ID | Priority | Status | Summary | Last Updated
Github | openshift machine-config-operator pull 1147 | 'None' | closed | Bug 1766515: pkg/daemon: remove force validation file if it exists | 2020-06-08 10:02:42 UTC
Red Hat Product Errata | RHBA-2020:0399 | None | None | None | 2020-02-13 06:14:11 UTC

Description Antonio Murdaca 2019-10-29 10:29:33 UTC
+++ This bug was initially created as a clone of Bug #1766514 +++

+++ This bug was initially created as a clone of Bug #1766513 +++

Description of problem:

MCO added a "force overwrite" file to allow people to fix things when they know they want to synchronize to the current state. However, leaving that file on disk after it has been used was never explicitly intended: as long as it persists, it silently masks later errors.

The flag is much better as a "one time" intervention.
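
A minimal sketch of the "one time" behaviour described above, written as a shell function rather than the daemon's actual Go code. The function name `validate_on_disk_state`, its three arguments, and the use of `cmp` are assumptions made for illustration; only the idea of removing the force file once it has been honoured comes from this report.

```sh
# Hypothetical sketch: honour a force file once, then remove it so that
# later validation failures are reported again.
# Arguments (all illustrative): force file path, expected file, actual file.
validate_on_disk_state() {
    force_file=$1
    expected=$2
    actual=$3

    if [ -e "$force_file" ]; then
        # One-time intervention: consume the flag before skipping validation.
        rm -f "$force_file"
        echo "skipping on-disk validation; $force_file was present and has been removed"
        return 0
    fi

    # No force file: compare byte for byte and report any mismatch.
    if ! cmp -s "$expected" "$actual"; then
        echo "content mismatch for file $actual" >&2
        return 1
    fi
    echo "validated on-disk state"
}
```

With this shape, a second call against the same skewed file fails again, because the first call already removed the flag.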

Version-Release number of selected component (if applicable):

from >=4.1


How reproducible:

always


Steps to Reproduce:
1. Create a skew in configs (for example, as happens during certificate recovery)
2. Fix the skew using the force validation file
3. Re-create the skew and notice that the error is not bubbled up

Actual results:

errors being shadowed

Expected results:

errors properly bubbled up

Additional info:

Comment 2 Michael Nguyen 2020-01-27 22:01:16 UTC
I was unable to verify this on 4.1.0-0.nightly-2020-01-24-051818.

It looks like the `/run/machine-config-daemon-force` file does get deleted, but the presence of the file does not get rid of the `unexpected on-disk state validating against rendered-worker-<hash>` messages in the machine config daemon logs.
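
One way to follow the sequence in a long daemon log is to filter it down to the validation outcomes. The following is a minimal sketch that reads the log on standard input; the function name `mcd_validation_events` is hypothetical, and the patterns are simply the message fragments that appear in the logs in this comment.

```sh
# Hypothetical helper: keep only the log lines that record a validation
# outcome (forced skip, success, or degraded), in their original order.
mcd_validation_events() {
    grep -E 'Skipping on-disk validation|Validated on-disk state|Marking Degraded due to'
}

# Example (pod name taken from the transcript; adjust for your cluster):
#   oc -n openshift-machine-config-operator logs machine-config-daemon-9tjcb | mcd_validation_events
```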

---

$ cat <<EOF> file.yaml
> apiVersion: machineconfiguration.openshift.io/v1
> kind: MachineConfig
> metadata:
>   labels:
>     machineconfiguration.openshift.io/role: worker
>   name: test-file
> spec:
>   config:
>     ignition:
>       version: 2.2.0
>     storage:
>       files:
>       - contents:
>           source: data:text/plain;charset=utf;base64,c2VydmVyIGZvby5leGFtcGxlLm5ldCBtYXhkZWxheSAwLjQgb2ZmbGluZQpzZXJ2ZXIgYmFyLmV4YW1wbGUubmV0IG1heGRlbGF5IDAuNCBvZmZsaW5lCnNlcnZlciBiYXouZXhhbXBsZS5uZXQgbWF4ZGVsYXkgMC40IG9mZmxpbmUK
>         filesystem: root
>         mode: 420
>         path: /etc/test
> EOF
$ oc apply -f file.yaml 
machineconfig.machineconfiguration.openshift.io/test-file created
$ oc get machineconfig
NAME                                                        GENERATEDBYCONTROLLER                      IGNITIONVERSION   CREATED
00-master                                                   f56d736e74af8fb0dc85c4b1ee3cc8d1d1f6600b   2.2.0             35m
00-worker                                                   f56d736e74af8fb0dc85c4b1ee3cc8d1d1f6600b   2.2.0             35m
01-master-container-runtime                                 f56d736e74af8fb0dc85c4b1ee3cc8d1d1f6600b   2.2.0             35m
01-master-kubelet                                           f56d736e74af8fb0dc85c4b1ee3cc8d1d1f6600b   2.2.0             35m
01-worker-container-runtime                                 f56d736e74af8fb0dc85c4b1ee3cc8d1d1f6600b   2.2.0             35m
01-worker-kubelet                                           f56d736e74af8fb0dc85c4b1ee3cc8d1d1f6600b   2.2.0             35m
99-master-83fb059a-4141-11ea-9470-021dc71b55f4-registries   f56d736e74af8fb0dc85c4b1ee3cc8d1d1f6600b   2.2.0             35m
99-master-ssh                                                                                          2.2.0             36m
99-worker-83fcabf5-4141-11ea-9470-021dc71b55f4-registries   f56d736e74af8fb0dc85c4b1ee3cc8d1d1f6600b   2.2.0             35m
99-worker-ssh                                                                                          2.2.0             36m
rendered-master-11cbe9816f24ece46bcf6deebaa97aab            f56d736e74af8fb0dc85c4b1ee3cc8d1d1f6600b   2.2.0             35m
rendered-worker-7431c125d6269142bc764c08498b0f51            f56d736e74af8fb0dc85c4b1ee3cc8d1d1f6600b   2.2.0             35m
rendered-worker-8f5df6f2750ac53ad1853b79af6ed3a2            f56d736e74af8fb0dc85c4b1ee3cc8d1d1f6600b   2.2.0             12s
test-file                                                                                              2.2.0             17s
$ oc get machineconfigpool
NAME     CONFIG                                             UPDATED   UPDATING   DEGRADED
master   rendered-master-11cbe9816f24ece46bcf6deebaa97aab   True      False      False
worker   rendered-worker-7431c125d6269142bc764c08498b0f51   False     True       False
$ oc get node
NAME                                         STATUS                     ROLES    AGE   VERSION
ip-10-0-128-38.us-east-2.compute.internal    Ready                      master   38m   v1.13.4+66712a2cb
ip-10-0-129-85.us-east-2.compute.internal    Ready                      worker   33m   v1.13.4+66712a2cb
ip-10-0-132-111.us-east-2.compute.internal   Ready,SchedulingDisabled   worker   33m   v1.13.4+66712a2cb
ip-10-0-140-59.us-east-2.compute.internal    Ready                      master   38m   v1.13.4+66712a2cb
ip-10-0-150-154.us-east-2.compute.internal   Ready                      master   38m   v1.13.4+66712a2cb
ip-10-0-155-193.us-east-2.compute.internal   Ready                      worker   33m   v1.13.4+66712a2cb
$ oc get node
NAME                                         STATUS   ROLES    AGE   VERSION
ip-10-0-128-38.us-east-2.compute.internal    Ready    master   47m   v1.13.4+66712a2cb
ip-10-0-129-85.us-east-2.compute.internal    Ready    worker   42m   v1.13.4+66712a2cb
ip-10-0-132-111.us-east-2.compute.internal   Ready    worker   42m   v1.13.4+66712a2cb
ip-10-0-140-59.us-east-2.compute.internal    Ready    master   46m   v1.13.4+66712a2cb
ip-10-0-150-154.us-east-2.compute.internal   Ready    master   47m   v1.13.4+66712a2cb
ip-10-0-155-193.us-east-2.compute.internal   Ready    worker   42m   v1.13.4+66712a2cb
$ oc debug node/ip-10-0-132-111.us-east-2.compute.internal
Starting pod/ip-10-0-132-111us-east-2computeinternal-debug ...
To use host binaries, run `chroot /host`
If you don't see a command prompt, try pressing enter.
sh-4.2# chroot /host
sh-4.4# ls
bin  boot  dev	etc  home  lib	lib64  media  mnt  opt	ostree	proc  root  run  sbin  srv  sys  sysroot  tmp  usr  var
sh-4.4# vi /etc/test
sh-4.4# exit
exit
sh-4.2# exit
exit

Removing debug pod ...
$ oc get pods -n openshift-machine-config-operator -l k8s-app=machine-config-daemon --field-selector spec.nodeName=ip-10-0-132-111.us-east-2.compute.internal
NAME                          READY   STATUS    RESTARTS   AGE
machine-config-daemon-gb5hx   1/1     Running   1          43m
$ oc  -n openshift-machine-config-operator logs  machine-config-daemon-gb5hx 
I0127 20:51:44.247947    2454 start.go:67] Version: 4.1.33-202001240309-dirty (f56d736e74af8fb0dc85c4b1ee3cc8d1d1f6600b)
I0127 20:51:44.250504    2454 start.go:100] Starting node writer
I0127 20:51:44.255466    2454 run.go:22] Running captured: chroot /rootfs rpm-ostree status --json
I0127 20:51:44.318067    2454 daemon.go:200] Booted osImageURL: quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:c97112885ec4102014f8cd56825d7170f92dbcc9d17ab243566843040ab33aa2 (410.81.20200114.0)
I0127 20:51:44.342819    2454 start.go:196] Calling chroot("/rootfs")
I0127 20:51:44.342938    2454 start.go:206] Starting MachineConfigDaemon
I0127 20:51:44.343395    2454 update.go:848] Starting to manage node: ip-10-0-132-111.us-east-2.compute.internal
I0127 20:51:44.350349    2454 run.go:22] Running captured: rpm-ostree status
I0127 20:51:44.422768    2454 daemon.go:740] State: idle
AutomaticUpdates: disabled
Deployments:
* pivot://quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:c97112885ec4102014f8cd56825d7170f92dbcc9d17ab243566843040ab33aa2
              CustomOrigin: Managed by pivot tool
                   Version: 410.81.20200114.0 (2020-01-14T04:56:15Z)

  pivot://docker-registry-default.cloud.registry.upshift.redhat.com/redhat-coreos/ootpa@sha256:683a6a866a8ec789fedb5da63b6a2ff68c1b0788ec90e7def778f0c4c13197a4
              CustomOrigin: Provisioned from oscontainer
                   Version: 410.8.20190520.0 (2019-05-20T20:10:04Z)
I0127 20:51:44.422911    2454 run.go:22] Running captured: journalctl --list-boots
I0127 20:51:44.430904    2454 daemon.go:747] journalctl --list-boots:
-2 319aa23a945d474ab8654abe721645a2 Mon 2020-01-27 20:15:11 UTC—Mon 2020-01-27 20:16:34 UTC
-1 4dd8d4aef72341ac86370ebe4f7b1f4d Mon 2020-01-27 20:17:00 UTC—Mon 2020-01-27 20:51:03 UTC
 0 d771bbdd87354e15be5e235ba67bf0c8 Mon 2020-01-27 20:51:29 UTC—Mon 2020-01-27 20:51:44 UTC
I0127 20:51:44.430931    2454 daemon.go:494] Enabling Kubelet Healthz Monitor
E0127 20:51:50.528104    2454 reflector.go:134] k8s.io/client-go/informers/factory.go:132: Failed to list *v1.Node: Get https://172.30.0.1:443/api/v1/nodes?limit=500&resourceVersion=0: dial tcp 172.30.0.1:443: connect: no route to host
E0127 20:51:50.528114    2454 reflector.go:134] github.com/openshift/machine-config-operator/pkg/generated/informers/externalversions/factory.go:101: Failed to list *v1.MachineConfig: Get https://172.30.0.1:443/apis/machineconfiguration.openshift.io/v1/machineconfigs?limit=500&resourceVersion=0: dial tcp 172.30.0.1:443: connect: no route to host
I0127 20:51:54.752160    2454 update.go:737] logger doesn't support --jounald, grepping the journal
I0127 20:51:54.816608    2454 daemon.go:702] Current config: rendered-worker-7431c125d6269142bc764c08498b0f51
I0127 20:51:54.816631    2454 daemon.go:703] Desired config: rendered-worker-8f5df6f2750ac53ad1853b79af6ed3a2
I0127 20:51:54.816639    2454 update.go:848] using pending config same as desired
I0127 20:51:54.825700    2454 daemon.go:892] Using current config on disk rendered-worker-8f5df6f2750ac53ad1853b79af6ed3a2
I0127 20:51:54.825749    2454 update.go:848] Using current config on disk rendered-worker-8f5df6f2750ac53ad1853b79af6ed3a2
I0127 20:51:54.827752    2454 update.go:848] Validating against pending config rendered-worker-8f5df6f2750ac53ad1853b79af6ed3a2
I0127 20:51:54.830015    2454 daemon.go:909] Validating against pending config rendered-worker-8f5df6f2750ac53ad1853b79af6ed3a2
I0127 20:51:54.832973    2454 daemon.go:926] Validated on-disk state
I0127 20:51:54.842266    2454 update.go:813] logger doesn't support --jounald, logging json directly
I0127 20:51:54.845349    2454 daemon.go:960] Completing pending config rendered-worker-8f5df6f2750ac53ad1853b79af6ed3a2
I0127 20:51:54.851224    2454 update.go:848] completed update for config rendered-worker-8f5df6f2750ac53ad1853b79af6ed3a2
I0127 20:51:54.853423    2454 daemon.go:973] In desired config rendered-worker-8f5df6f2750ac53ad1853b79af6ed3a2
$ oc get machineconfigpool
NAME     CONFIG                                             UPDATED   UPDATING   DEGRADED
master   rendered-master-11cbe9816f24ece46bcf6deebaa97aab   True      False      False
worker   rendered-worker-8f5df6f2750ac53ad1853b79af6ed3a2   True      False      False
$ oc  -n openshift-machine-config-operator delete pod machine-config-daemon-gb5hx 
pod "machine-config-daemon-gb5hx" deleted
$ oc get pods -n openshift-machine-config-operator -l k8s-app=machine-config-daemon --field-selector spec.nodeName=ip-10-0-132-111.us-east-2.compute.internal
NAME                          READY   STATUS    RESTARTS   AGE
machine-config-daemon-9tjcb   1/1     Running   0          4s
$ oc  -n openshift-machine-config-operator log machine-config-daemon-9tjcb
E0127 21:05:27.866909   35396 writer.go:132] Marking Degraded due to: during bootstrap: unexpected on-disk state validating against rendered-worker-8f5df6f2750ac53ad1853b79af6ed3a2
I0127 21:05:30.387785   35396 update.go:737] logger doesn't support --jounald, grepping the journal
I0127 21:05:30.463014   35396 update.go:848] error loading pending config open /etc/machine-config-daemon/state.json: no such file or directory
I0127 21:05:30.465457   35396 daemon.go:695] Current+desired config: rendered-worker-8f5df6f2750ac53ad1853b79af6ed3a2
I0127 21:05:30.471620   35396 update.go:848] Validating against current config rendered-worker-8f5df6f2750ac53ad1853b79af6ed3a2
I0127 21:05:30.473786   35396 daemon.go:913] Validating against current config rendered-worker-8f5df6f2750ac53ad1853b79af6ed3a2
E0127 21:05:30.473874   35396 daemon.go:1298] content mismatch for file /etc/test: server foo.example.net maxdelay 0.4 offline
server bar.example.net maxdelay 0.4 offline


A: MY CHANGES HERE
server baz.example.net maxdelay 0.4 offline


B: server baz.example.net maxdelay 0.4 offline


E0127 21:05:30.473902   35396 writer.go:132] Marking Degraded due to: during bootstrap: unexpected on-disk state validating against rendered-worker-8f5df6f2750ac53ad1853b79af6ed3a2
I0127 21:05:30.689310   35396 update.go:737] logger doesn't support --jounald, grepping the journal
I0127 21:05:30.762203   35396 update.go:848] error loading pending config open /etc/machine-config-daemon/state.json: no such file or directory
I0127 21:05:30.764394   35396 daemon.go:695] Current+desired config: rendered-worker-8f5df6f2750ac53ad1853b79af6ed3a2
I0127 21:05:30.770157   35396 update.go:848] Validating against current config rendered-worker-8f5df6f2750ac53ad1853b79af6ed3a2
I0127 21:05:30.772270   35396 daemon.go:913] Validating against current config rendered-worker-8f5df6f2750ac53ad1853b79af6ed3a2
E0127 21:05:30.772385   35396 daemon.go:1298] content mismatch for file /etc/test: server foo.example.net maxdelay 0.4 offline
server bar.example.net maxdelay 0.4 offline


A: MY CHANGES HERE
server baz.example.net maxdelay 0.4 offline


B: server baz.example.net maxdelay 0.4 offline


E0127 21:05:30.772423   35396 writer.go:132] Marking Degraded due to: during bootstrap: unexpected on-disk state validating against rendered-worker-8f5df6f2750ac53ad1853b79af6ed3a2
$ oc debug node/ip-10-0-132-111.us-east-2.compute.internal
Starting pod/ip-10-0-132-111us-east-2computeinternal-debug ...
To use host binaries, run `chroot /host`
If you don't see a command prompt, try pressing enter.
sh-4.2# chroot /host
sh-4.4# touch /run/machine-config-daemon-force
sh-4.4# cat /etc/test
server foo.example.net maxdelay 0.4 offline
server bar.example.net maxdelay 0.4 offline
MY CHANGES HERE
server baz.example.net maxdelay 0.4 offline
sh-4.4# ls /run
NetworkManager		       containers  dmeventd-client  initctl	log    openshift-sdn  rpcbind	    sepermit  tmpfiles.d
agetty.reload		       crio	   dmeventd-server  initramfs	lvm    openvswitch    rpcbind.sock  setrans   udev
chrony			       criu	   faillock	    issue.d	mdadm  ostree-booted  runc-ctrs     sshd.pid  user
console			       cryptsetup  gluster	    kubernetes	media  podman	      samba	    sudo      utmp
console-login-helper-messages  dbus	   ignition.env     lock	mount  rpc.statd.pid  secrets	    systemd
sh-4.4# exit
exit
sh-4.2# exit
exit

Removing debug pod ...
$ oc  -n openshift-machine-config-operator logs machine-config-daemon-9tjcb
I0127 21:06:02.841042   35396 update.go:848] error loading pending config open /etc/machine-config-daemon/state.json: no such file or directory
I0127 21:06:02.843515   35396 daemon.go:695] Current+desired config: rendered-worker-8f5df6f2750ac53ad1853b79af6ed3a2
I0127 21:06:02.849476   35396 update.go:848] Validating against current config rendered-worker-8f5df6f2750ac53ad1853b79af6ed3a2
I0127 21:06:02.851572   35396 daemon.go:913] Validating against current config rendered-worker-8f5df6f2750ac53ad1853b79af6ed3a2
E0127 21:06:02.851701   35396 daemon.go:1298] content mismatch for file /etc/test: server foo.example.net maxdelay 0.4 offline
server bar.example.net maxdelay 0.4 offline


A: MY CHANGES HERE
server baz.example.net maxdelay 0.4 offline


B: server baz.example.net maxdelay 0.4 offline


E0127 21:06:02.851742   35396 writer.go:132] Marking Degraded due to: during bootstrap: unexpected on-disk state validating against rendered-worker-8f5df6f2750ac53ad1853b79af6ed3a2
I0127 21:06:05.846792   35396 update.go:737] logger doesn't support --jounald, grepping the journal
I0127 21:06:05.911678   35396 update.go:848] error loading pending config open /etc/machine-config-daemon/state.json: no such file or directory
I0127 21:06:05.914175   35396 daemon.go:695] Current+desired config: rendered-worker-8f5df6f2750ac53ad1853b79af6ed3a2
I0127 21:06:05.919966   35396 update.go:848] Validating against current config rendered-worker-8f5df6f2750ac53ad1853b79af6ed3a2
I0127 21:06:05.922040   35396 daemon.go:913] Validating against current config rendered-worker-8f5df6f2750ac53ad1853b79af6ed3a2
E0127 21:06:05.922166   35396 daemon.go:1298] content mismatch for file /etc/test: server foo.example.net maxdelay 0.4 offline
server bar.example.net maxdelay 0.4 offline


A: MY CHANGES HERE
server baz.example.net maxdelay 0.4 offline


B: server baz.example.net maxdelay 0.4 offline


E0127 21:06:05.922207   35396 writer.go:132] Marking Degraded due to: during bootstrap: unexpected on-disk state validating against rendered-worker-8f5df6f2750ac53ad1853b79af6ed3a2
I0127 21:06:55.584571   35396 update.go:737] logger doesn't support --jounald, grepping the journal
I0127 21:06:55.663204   35396 update.go:848] error loading pending config open /etc/machine-config-daemon/state.json: no such file or directory
I0127 21:06:55.665652   35396 daemon.go:695] Current+desired config: rendered-worker-8f5df6f2750ac53ad1853b79af6ed3a2
I0127 21:06:55.672101   35396 update.go:848] Validating against current config rendered-worker-8f5df6f2750ac53ad1853b79af6ed3a2
I0127 21:06:55.674815   35396 daemon.go:913] Validating against current config rendered-worker-8f5df6f2750ac53ad1853b79af6ed3a2
I0127 21:06:55.674852   35396 daemon.go:921] Skipping on-disk validation; /run/machine-config-daemon-force present
I0127 21:06:55.674873   35396 daemon.go:926] Validated on-disk state
I0127 21:06:55.682895   35396 daemon.go:973] In desired config rendered-worker-8f5df6f2750ac53ad1853b79af6ed3a2
$ oc  -n openshift-machine-config-operator delete pod machine-config-daemon-9tjcb
pod "machine-config-daemon-9tjcb" deleted
$ oc get pods -n openshift-machine-config-operator -l k8s-app=machine-config-daemon --field-selector spec.nodeName=ip-10-0-132-111.us-east-2.compute.internal
NAME                          READY   STATUS    RESTARTS   AGE
machine-config-daemon-rrbhq   1/1     Running   0          2s
$ oc  -n openshift-machine-config-operator logs machine-config-daemon-rrbhq
A: MY CHANGES HERE
server baz.example.net maxdelay 0.4 offline


B: server baz.example.net maxdelay 0.4 offline


E0127 21:08:20.683225   42142 writer.go:132] Marking Degraded due to: during bootstrap: unexpected on-disk state validating against rendered-worker-8f5df6f2750ac53ad1853b79af6ed3a2
I0127 21:08:20.921972   42142 update.go:737] logger doesn't support --jounald, grepping the journal
I0127 21:08:20.989107   42142 update.go:848] error loading pending config open /etc/machine-config-daemon/state.json: no such file or directory
I0127 21:08:20.991413   42142 daemon.go:695] Current+desired config: rendered-worker-8f5df6f2750ac53ad1853b79af6ed3a2
I0127 21:08:20.997469   42142 update.go:848] Validating against current config rendered-worker-8f5df6f2750ac53ad1853b79af6ed3a2
I0127 21:08:20.999538   42142 daemon.go:913] Validating against current config rendered-worker-8f5df6f2750ac53ad1853b79af6ed3a2
E0127 21:08:20.999655   42142 daemon.go:1298] content mismatch for file /etc/test: server foo.example.net maxdelay 0.4 offline
server bar.example.net maxdelay 0.4 offline


A: MY CHANGES HERE
server baz.example.net maxdelay 0.4 offline


B: server baz.example.net maxdelay 0.4 offline


E0127 21:08:20.999697   42142 writer.go:132] Marking Degraded due to: during bootstrap: unexpected on-disk state validating against rendered-worker-8f5df6f2750ac53ad1853b79af6ed3a2
I0127 21:08:23.989606   42142 update.go:737] logger doesn't support --jounald, grepping the journal
I0127 21:08:24.058663   42142 update.go:848] error loading pending config open /etc/machine-config-daemon/state.json: no such file or directory
I0127 21:08:24.061127   42142 daemon.go:695] Current+desired config: rendered-worker-8f5df6f2750ac53ad1853b79af6ed3a2
I0127 21:08:24.067693   42142 update.go:848] Validating against current config rendered-worker-8f5df6f2750ac53ad1853b79af6ed3a2
I0127 21:08:24.069987   42142 daemon.go:913] Validating against current config rendered-worker-8f5df6f2750ac53ad1853b79af6ed3a2
E0127 21:08:24.070098   42142 daemon.go:1298] content mismatch for file /etc/test: server foo.example.net maxdelay 0.4 offline
server bar.example.net maxdelay 0.4 offline


A: MY CHANGES HERE
server baz.example.net maxdelay 0.4 offline


B: server baz.example.net maxdelay 0.4 offline


E0127 21:08:24.070136   42142 writer.go:132] Marking Degraded due to: during bootstrap: unexpected on-disk state validating against rendered-worker-8f5df6f2750ac53ad1853b79af6ed3a2

$ oc get node
NAME                                         STATUS   ROLES    AGE   VERSION
ip-10-0-128-38.us-east-2.compute.internal    Ready    master   56m   v1.13.4+66712a2cb
ip-10-0-129-85.us-east-2.compute.internal    Ready    worker   51m   v1.13.4+66712a2cb
ip-10-0-132-111.us-east-2.compute.internal   Ready    worker   51m   v1.13.4+66712a2cb
ip-10-0-140-59.us-east-2.compute.internal    Ready    master   55m   v1.13.4+66712a2cb
ip-10-0-150-154.us-east-2.compute.internal   Ready    master   55m   v1.13.4+66712a2cb
ip-10-0-155-193.us-east-2.compute.internal   Ready    worker   51m   v1.13.4+66712a2cb
$ oc debug node/ip-10-0-132-111.us-east-2.compute.internal
Starting pod/ip-10-0-132-111us-east-2computeinternal-debug ...
To use host binaries, run `chroot /host`
If you don't see a command prompt, try pressing enter.
sh-4.2# chroot /host
sh-4.4# ls
bin  boot  dev	etc  home  lib	lib64  media  mnt  opt	ostree	proc  root  run  sbin  srv  sys  sysroot  tmp  usr  var
sh-4.4# cd /run
sh-4.4# ls
NetworkManager		       containers  dmeventd-client  initctl	log    openshift-sdn  rpcbind	    sepermit  tmpfiles.d
agetty.reload		       crio	   dmeventd-server  initramfs	lvm    openvswitch    rpcbind.sock  setrans   udev
chrony			       criu	   faillock	    issue.d	mdadm  ostree-booted  runc-ctrs     sshd.pid  user
console			       cryptsetup  gluster	    kubernetes	media  podman	      samba	    sudo      utmp
console-login-helper-messages  dbus	   ignition.env     lock	mount  rpc.statd.pid  secrets	    systemd
sh-4.4# touch machine-config-daemon-force
sh-4.4# ls
NetworkManager		       crio		faillock      kubernetes		   media	  rpc.statd.pid  sepermit    udev
agetty.reload		       criu		gluster       lock			   mount	  rpcbind	 setrans     user
chrony			       cryptsetup	ignition.env  log			   openshift-sdn  rpcbind.sock	 sshd.pid    utmp
console			       dbus		initctl       lvm			   openvswitch	  runc-ctrs	 sudo
console-login-helper-messages  dmeventd-client	initramfs     machine-config-daemon-force  ostree-booted  samba		 systemd
containers		       dmeventd-server	issue.d       mdadm			   podman	  secrets	 tmpfiles.d
sh-4.4# exit
exit
sh-4.2# exit
exit

Removing debug pod ...
$ oc get pods -n openshift-machine-config-operator -l k8s-app=machine-config-daemon --field-selector spec.nodeName=ip-10-0-132-111.us-east-2.compute.internal
NAME                          READY   STATUS    RESTARTS   AGE
machine-config-daemon-rrbhq   1/1     Running   0          98s
$ oc  -n openshift-machine-config-operator delete pod machine-config-daemon-rrbhq
pod "machine-config-daemon-rrbhq" deleted
$ oc get pods -n openshift-machine-config-operator -l k8s-app=machine-config-daemon --field-selector spec.nodeName=ip-10-0-132-111.us-east-2.compute.internal
NAME                          READY   STATUS    RESTARTS   AGE
machine-config-daemon-hgkvf   1/1     Running   0          3s
$ oc  -n openshift-machine-config-operator logs machine-config-daemon-hgkvf


A: MY CHANGES HERE
server baz.example.net maxdelay 0.4 offline


B: server baz.example.net maxdelay 0.4 offline


E0127 21:10:02.718849   46701 writer.go:132] Marking Degraded due to: during bootstrap: unexpected on-disk state validating against rendered-worker-8f5df6f2750ac53ad1853b79af6ed3a2
I0127 21:10:02.981285   46701 update.go:737] logger doesn't support --jounald, grepping the journal
I0127 21:10:03.067560   46701 update.go:848] error loading pending config open /etc/machine-config-daemon/state.json: no such file or directory
I0127 21:10:03.070014   46701 daemon.go:695] Current+desired config: rendered-worker-8f5df6f2750ac53ad1853b79af6ed3a2
I0127 21:10:03.075781   46701 update.go:848] Validating against current config rendered-worker-8f5df6f2750ac53ad1853b79af6ed3a2
I0127 21:10:03.077876   46701 daemon.go:913] Validating against current config rendered-worker-8f5df6f2750ac53ad1853b79af6ed3a2
E0127 21:10:03.077985   46701 daemon.go:1298] content mismatch for file /etc/test: server foo.example.net maxdelay 0.4 offline
server bar.example.net maxdelay 0.4 offline


A: MY CHANGES HERE
server baz.example.net maxdelay 0.4 offline


B: server baz.example.net maxdelay 0.4 offline


E0127 21:10:03.078017   46701 writer.go:132] Marking Degraded due to: during bootstrap: unexpected on-disk state validating against rendered-worker-8f5df6f2750ac53ad1853b79af6ed3a2
I0127 21:10:03.709709   46701 update.go:737] logger doesn't support --jounald, grepping the journal
I0127 21:10:03.780747   46701 update.go:848] error loading pending config open /etc/machine-config-daemon/state.json: no such file or directory
I0127 21:10:03.783188   46701 daemon.go:695] Current+desired config: rendered-worker-8f5df6f2750ac53ad1853b79af6ed3a2
I0127 21:10:03.790528   46701 update.go:848] Validating against current config rendered-worker-8f5df6f2750ac53ad1853b79af6ed3a2
I0127 21:10:03.792886   46701 daemon.go:913] Validating against current config rendered-worker-8f5df6f2750ac53ad1853b79af6ed3a2
E0127 21:10:03.792986   46701 daemon.go:1298] content mismatch for file /etc/test: server foo.example.net maxdelay 0.4 offline
server bar.example.net maxdelay 0.4 offline


A: MY CHANGES HERE
server baz.example.net maxdelay 0.4 offline


B: server baz.example.net maxdelay 0.4 offline


E0127 21:10:03.793015   46701 writer.go:132] Marking Degraded due to: during bootstrap: unexpected on-disk state validating against rendered-worker-8f5df6f2750ac53ad1853b79af6ed3a2
I0127 21:10:03.797959   46701 update.go:737] logger doesn't support --jounald, grepping the journal
I0127 21:10:03.866250   46701 update.go:848] error loading pending config open /etc/machine-config-daemon/state.json: no such file or directory
I0127 21:10:03.868901   46701 daemon.go:695] Current+desired config: rendered-worker-8f5df6f2750ac53ad1853b79af6ed3a2
I0127 21:10:03.877906   46701 update.go:848] Validating against current config rendered-worker-8f5df6f2750ac53ad1853b79af6ed3a2
I0127 21:10:03.881820   46701 daemon.go:913] Validating against current config rendered-worker-8f5df6f2750ac53ad1853b79af6ed3a2
E0127 21:10:03.885180   46701 daemon.go:1298] content mismatch for file /etc/test: server foo.example.net maxdelay 0.4 offline
server bar.example.net maxdelay 0.4 offline


A: MY CHANGES HERE
server baz.example.net maxdelay 0.4 offline


B: server baz.example.net maxdelay 0.4 offline


E0127 21:10:03.885231   46701 writer.go:132] Marking Degraded due to: during bootstrap: unexpected on-disk state validating against rendered-worker-8f5df6f2750ac53ad1853b79af6ed3a2
I0127 21:10:03.894068   46701 update.go:737] logger doesn't support --jounald, grepping the journal
I0127 21:10:03.966770   46701 update.go:848] error loading pending config open /etc/machine-config-daemon/state.json: no such file or directory
I0127 21:10:03.969136   46701 daemon.go:695] Current+desired config: rendered-worker-8f5df6f2750ac53ad1853b79af6ed3a2
I0127 21:10:03.975347   46701 update.go:848] Validating against current config rendered-worker-8f5df6f2750ac53ad1853b79af6ed3a2
I0127 21:10:03.977550   46701 daemon.go:913] Validating against current config rendered-worker-8f5df6f2750ac53ad1853b79af6ed3a2
E0127 21:10:03.977654   46701 daemon.go:1298] content mismatch for file /etc/test: server foo.example.net maxdelay 0.4 offline
server bar.example.net maxdelay 0.4 offline


A: MY CHANGES HERE
server baz.example.net maxdelay 0.4 offline


B: server baz.example.net maxdelay 0.4 offline


E0127 21:10:03.977682   46701 writer.go:132] Marking Degraded due to: during bootstrap: unexpected on-disk state validating against rendered-worker-8f5df6f2750ac53ad1853b79af6ed3a2
$ oc  -n openshift-machine-config-operator logs machine-config-daemon-hgkvf
E0127 21:10:05.892959   46701 writer.go:132] Marking Degraded due to: during bootstrap: unexpected on-disk state validating against rendered-worker-8f5df6f2750ac53ad1853b79af6ed3a2
I0127 21:10:05.898238   46701 update.go:737] logger doesn't support --jounald, grepping the journal
I0127 21:10:05.972224   46701 update.go:848] error loading pending config open /etc/machine-config-daemon/state.json: no such file or directory
I0127 21:10:05.974479   46701 daemon.go:695] Current+desired config: rendered-worker-8f5df6f2750ac53ad1853b79af6ed3a2
I0127 21:10:05.980012   46701 update.go:848] Validating against current config rendered-worker-8f5df6f2750ac53ad1853b79af6ed3a2
I0127 21:10:05.982187   46701 daemon.go:913] Validating against current config rendered-worker-8f5df6f2750ac53ad1853b79af6ed3a2
E0127 21:10:05.982275   46701 daemon.go:1298] content mismatch for file /etc/test: server foo.example.net maxdelay 0.4 offline
server bar.example.net maxdelay 0.4 offline


A: MY CHANGES HERE
server baz.example.net maxdelay 0.4 offline


B: server baz.example.net maxdelay 0.4 offline


E0127 21:10:05.982304   46701 writer.go:132] Marking Degraded due to: during bootstrap: unexpected on-disk state validating against rendered-worker-8f5df6f2750ac53ad1853b79af6ed3a2
I0127 21:10:06.010701   46701 update.go:737] logger doesn't support --jounald, grepping the journal
I0127 21:10:06.087374   46701 update.go:848] error loading pending config open /etc/machine-config-daemon/state.json: no such file or directory
I0127 21:10:06.089545   46701 daemon.go:695] Current+desired config: rendered-worker-8f5df6f2750ac53ad1853b79af6ed3a2
I0127 21:10:06.095036   46701 update.go:848] Validating against current config rendered-worker-8f5df6f2750ac53ad1853b79af6ed3a2
I0127 21:10:06.097207   46701 daemon.go:913] Validating against current config rendered-worker-8f5df6f2750ac53ad1853b79af6ed3a2
E0127 21:10:06.097288   46701 daemon.go:1298] content mismatch for file /etc/test: server foo.example.net maxdelay 0.4 offline
server bar.example.net maxdelay 0.4 offline


A: MY CHANGES HERE
server baz.example.net maxdelay 0.4 offline


B: server baz.example.net maxdelay 0.4 offline


E0127 21:10:06.097315   46701 writer.go:132] Marking Degraded due to: during bootstrap: unexpected on-disk state validating against rendered-worker-8f5df6f2750ac53ad1853b79af6ed3a2
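
The `content mismatch` entries above compare `/etc/test` on the node with the contents embedded in the `test-file` MachineConfig. As a sketch, the expected contents can be recovered by decoding the base64 payload of the `source:` data URL, and the decimal `mode: 420` can be printed in the more familiar octal form. The function names are hypothetical, and the decode step assumes a `base64` that accepts `-d`, as GNU coreutils does.

```sh
# Base64 payload copied from the `source:` data URL in file.yaml.
payload='c2VydmVyIGZvby5leGFtcGxlLm5ldCBtYXhkZWxheSAwLjQgb2ZmbGluZQpzZXJ2ZXIgYmFyLmV4YW1wbGUubmV0IG1heGRlbGF5IDAuNCBvZmZsaW5lCnNlcnZlciBiYXouZXhhbXBsZS5uZXQgbWF4ZGVsYXkgMC40IG9mZmxpbmUK'

# Decode the contents the MachineConfig expects to find in /etc/test.
expected_contents() {
    printf '%s' "$payload" | base64 -d
}

# The file mode in the MachineConfig is a decimal integer; print it in octal.
mode_octal() {
    printf '%o\n' "$1"
}

expected_contents
mode_octal 420
# Output:
#   server foo.example.net maxdelay 0.4 offline
#   server bar.example.net maxdelay 0.4 offline
#   server baz.example.net maxdelay 0.4 offline
#   644
```

The decoded text has no `MY CHANGES HERE` line, which is the line added by hand with `vi /etc/test` earlier in the transcript.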

Comment 6 Michael Nguyen 2020-01-28 20:18:12 UTC
$ oc get clusterversion
NAME      VERSION                             AVAILABLE   PROGRESSING   SINCE   STATUS
version   4.1.0-0.nightly-2020-01-24-051818   True        False         53m     Cluster version is 4.1.0-0.nightly-2020-01-24-051818

$ cat file.yaml 
apiVersion: machineconfiguration.openshift.io/v1
kind: MachineConfig
metadata:
  labels:
    machineconfiguration.openshift.io/role: worker
  name: test-file
spec:
  config:
    ignition:
      version: 2.2.0
    storage:
      files:
      - contents:
          source: data:text/plain;charset=utf;base64,c2VydmVyIGZvby5leGFtcGxlLm5ldCBtYXhkZWxheSAwLjQgb2ZmbGluZQpzZXJ2ZXIgYmFyLmV4YW1wbGUubmV0IG1heGRlbGF5IDAuNCBvZmZsaW5lCnNlcnZlciBiYXouZXhhbXBsZS5uZXQgbWF4ZGVsYXkgMC40IG9mZmxpbmUK
        filesystem: root
        mode: 420
        path: /etc/test
$ oc apply -f file.yaml 
machineconfig.machineconfiguration.openshift.io/test-file created
$ oc get nodes
NAME                                         STATUS                     ROLES    AGE   VERSION
ip-10-0-128-214.us-west-2.compute.internal   Ready                      master   23m   v1.13.4+66712a2cb
ip-10-0-135-85.us-west-2.compute.internal    Ready                      worker   16m   v1.13.4+66712a2cb
ip-10-0-141-8.us-west-2.compute.internal     Ready                      worker   16m   v1.13.4+66712a2cb
ip-10-0-142-188.us-west-2.compute.internal   Ready                      master   24m   v1.13.4+66712a2cb
ip-10-0-159-116.us-west-2.compute.internal   Ready,SchedulingDisabled   worker   16m   v1.13.4+66712a2cb
ip-10-0-159-207.us-west-2.compute.internal   Ready                      master   24m   v1.13.4+66712a2cb
$ oc get machineconfig
NAME                                                        GENERATEDBYCONTROLLER                      IGNITIONVERSION   CREATED
00-master                                                   f56d736e74af8fb0dc85c4b1ee3cc8d1d1f6600b   2.2.0             22m
00-worker                                                   f56d736e74af8fb0dc85c4b1ee3cc8d1d1f6600b   2.2.0             22m
01-master-container-runtime                                 f56d736e74af8fb0dc85c4b1ee3cc8d1d1f6600b   2.2.0             22m
01-master-kubelet                                           f56d736e74af8fb0dc85c4b1ee3cc8d1d1f6600b   2.2.0             22m
01-worker-container-runtime                                 f56d736e74af8fb0dc85c4b1ee3cc8d1d1f6600b   2.2.0             22m
01-worker-kubelet                                           f56d736e74af8fb0dc85c4b1ee3cc8d1d1f6600b   2.2.0             22m
99-master-0709f2ef-4201-11ea-b538-02f6b1bc85c8-registries   f56d736e74af8fb0dc85c4b1ee3cc8d1d1f6600b   2.2.0             22m
99-master-ssh                                                                                          2.2.0             22m
99-worker-070b87d3-4201-11ea-b538-02f6b1bc85c8-registries   f56d736e74af8fb0dc85c4b1ee3cc8d1d1f6600b   2.2.0             22m
99-worker-ssh                                                                                          2.2.0             22m
rendered-master-6bbd9797ae186414e76ecf06e886a149            f56d736e74af8fb0dc85c4b1ee3cc8d1d1f6600b   2.2.0             22m
rendered-worker-110129aa1ce6e1f794d39ec37480180e            f56d736e74af8fb0dc85c4b1ee3cc8d1d1f6600b   2.2.0             19s
rendered-worker-4af35b7c2fddf5d4ff850e47d8dad58b            f56d736e74af8fb0dc85c4b1ee3cc8d1d1f6600b   2.2.0             22m
test-file                                                                                              2.2.0             24s
$ oc get machineconfigpool
NAME     CONFIG                                             UPDATED   UPDATING   DEGRADED
master   rendered-master-6bbd9797ae186414e76ecf06e886a149   True      False      False
worker   rendered-worker-4af35b7c2fddf5d4ff850e47d8dad58b   False     True       False
$ oc get node
NAME                                         STATUS   ROLES    AGE   VERSION
ip-10-0-128-214.us-west-2.compute.internal   Ready    master   28m   v1.13.4+66712a2cb
ip-10-0-135-85.us-west-2.compute.internal    Ready    worker   21m   v1.13.4+66712a2cb
ip-10-0-141-8.us-west-2.compute.internal     Ready    worker   22m   v1.13.4+66712a2cb
ip-10-0-142-188.us-west-2.compute.internal   Ready    master   29m   v1.13.4+66712a2cb
ip-10-0-159-116.us-west-2.compute.internal   Ready    worker   22m   v1.13.4+66712a2cb
ip-10-0-159-207.us-west-2.compute.internal   Ready    master   29m   v1.13.4+66712a2cb
$ oc get machineconfigpool
NAME     CONFIG                                             UPDATED   UPDATING   DEGRADED
master   rendered-master-6bbd9797ae186414e76ecf06e886a149   True      False      False
worker   rendered-worker-110129aa1ce6e1f794d39ec37480180e   True      False      False

$ oc debug node/ip-10-0-135-85.us-west-2.compute.internal
Starting pod/ip-10-0-135-85us-west-2computeinternal-debug ...
To use host binaries, run `chroot /host`
If you don't see a command prompt, try pressing enter.
sh-4.2# chroot /host
sh-4.4# cat /etc/test
server foo.example.net maxdelay 0.4 offline
server bar.example.net maxdelay 0.4 offline
server baz.example.net maxdelay 0.4 offline
sh-4.4# exit
exit
sh-4.2# exit
exit

Removing debug pod ...
$ oc get pods -n openshift-machine-config-operator -l k8s-app=machine-config-daemon --field-selector spec.nodeName=ip-10-0-135-85.us-west-2.compute.internal
NAME                          READY   STATUS    RESTARTS   AGE
machine-config-daemon-dmld8   1/1     Running   1          22m
$ oc -n openshift-machine-config-operator log -f ip-10-0-135-85.us-west-2.compute.internal
Error from server (NotFound): pods "ip-10-0-135-85.us-west-2.compute.internal" not found
$ oc -n openshift-machine-config-operator log -f machine-config-daemon-dmld8
I0128 19:31:10.192177    2499 start.go:67] Version: 4.1.33-202001240309-dirty (f56d736e74af8fb0dc85c4b1ee3cc8d1d1f6600b)
I0128 19:31:10.193418    2499 start.go:100] Starting node writer
I0128 19:31:10.198475    2499 run.go:22] Running captured: chroot /rootfs rpm-ostree status --json
I0128 19:31:10.298067    2499 daemon.go:200] Booted osImageURL: quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:c97112885ec4102014f8cd56825d7170f92dbcc9d17ab243566843040ab33aa2 (410.81.20200114.0)
I0128 19:31:10.312937    2499 start.go:196] Calling chroot("/rootfs")
I0128 19:31:10.313047    2499 start.go:206] Starting MachineConfigDaemon
I0128 19:31:10.313831    2499 update.go:848] Starting to manage node: ip-10-0-135-85.us-west-2.compute.internal
I0128 19:31:10.324413    2499 run.go:22] Running captured: rpm-ostree status
I0128 19:31:10.402438    2499 daemon.go:740] State: idle
AutomaticUpdates: disabled
Deployments:
* pivot://quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:c97112885ec4102014f8cd56825d7170f92dbcc9d17ab243566843040ab33aa2
              CustomOrigin: Managed by pivot tool
                   Version: 410.81.20200114.0 (2020-01-14T04:56:15Z)

  pivot://docker-registry-default.cloud.registry.upshift.redhat.com/redhat-coreos/ootpa@sha256:683a6a866a8ec789fedb5da63b6a2ff68c1b0788ec90e7def778f0c4c13197a4
              CustomOrigin: Provisioned from oscontainer
                   Version: 410.8.20190520.0 (2019-05-20T20:10:04Z)
I0128 19:31:10.402578    2499 run.go:22] Running captured: journalctl --list-boots
I0128 19:31:10.409642    2499 daemon.go:747] journalctl --list-boots:
-2 70ecf3ac3f39470498ef88286fc264d4 Tue 2020-01-28 19:06:36 UTC—Tue 2020-01-28 19:08:32 UTC
-1 047eeb1626d14f57ab32088b4bc19d4c Tue 2020-01-28 19:08:59 UTC—Tue 2020-01-28 19:30:22 UTC
 0 a77073bf32d845e89e9ebdcc28f8a669 Tue 2020-01-28 19:30:54 UTC—Tue 2020-01-28 19:31:10 UTC
I0128 19:31:10.409668    2499 daemon.go:494] Enabling Kubelet Healthz Monitor
E0128 19:31:15.933015    2499 reflector.go:134] k8s.io/client-go/informers/factory.go:132: Failed to list *v1.Node: Get https://172.30.0.1:443/api/v1/nodes?limit=500&resourceVersion=0: dial tcp 172.30.0.1:443: connect: no route to host
E0128 19:31:15.933016    2499 reflector.go:134] github.com/openshift/machine-config-operator/pkg/generated/informers/externalversions/factory.go:101: Failed to list *v1.MachineConfig: Get https://172.30.0.1:443/apis/machineconfiguration.openshift.io/v1/machineconfigs?limit=500&resourceVersion=0: dial tcp 172.30.0.1:443: connect: no route to host
I0128 19:31:19.553639    2499 update.go:737] logger doesn't support --jounald, grepping the journal
I0128 19:31:19.605842    2499 daemon.go:702] Current config: rendered-worker-4af35b7c2fddf5d4ff850e47d8dad58b
I0128 19:31:19.605864    2499 daemon.go:703] Desired config: rendered-worker-110129aa1ce6e1f794d39ec37480180e
I0128 19:31:19.605874    2499 update.go:848] using pending config same as desired
I0128 19:31:19.615323    2499 daemon.go:892] Using current config on disk rendered-worker-110129aa1ce6e1f794d39ec37480180e
I0128 19:31:19.615347    2499 update.go:848] Using current config on disk rendered-worker-110129aa1ce6e1f794d39ec37480180e
I0128 19:31:19.617388    2499 update.go:848] Validating against pending config rendered-worker-110129aa1ce6e1f794d39ec37480180e
I0128 19:31:19.619557    2499 daemon.go:909] Validating against pending config rendered-worker-110129aa1ce6e1f794d39ec37480180e
I0128 19:31:19.621896    2499 daemon.go:926] Validated on-disk state
I0128 19:31:19.630533    2499 update.go:813] logger doesn't support --jounald, logging json directly
I0128 19:31:19.632718    2499 daemon.go:960] Completing pending config rendered-worker-110129aa1ce6e1f794d39ec37480180e
I0128 19:31:19.642474    2499 update.go:848] completed update for config rendered-worker-110129aa1ce6e1f794d39ec37480180e
I0128 19:31:19.644790    2499 daemon.go:973] In desired config rendered-worker-110129aa1ce6e1f794d39ec37480180e

$ oc get pods -n openshift-machine-config-operator -l k8s-app=machine-config-daemon --field-selector spec.nodeName=ip-10-0-135-85.us-west-2.compute.internal
NAME                          READY   STATUS    RESTARTS   AGE
machine-config-daemon-dmld8   1/1     Running   1          30m
$ oc -n openshift-machine-config-operator delete pod machine-config-daemon-dmld8
pod "machine-config-daemon-dmld8" deleted
$ oc get pods -n openshift-machine-config-operator -l k8s-app=machine-config-daemon --field-selector spec.nodeName=ip-10-0-135-85.us-west-2.compute.internal
NAME                          READY   STATUS    RESTARTS   AGE
machine-config-daemon-zfr5s   1/1     Running   0          4s
$ oc -n openshift-machine-config-operator log -f machine-config-daemon-zfr5s
E0128 19:42:58.999590   14755 writer.go:132] Marking Degraded due to: during bootstrap: unexpected on-disk state validating against rendered-worker-110129aa1ce6e1f794d39ec37480180e
I0128 19:43:00.865991   14755 update.go:737] logger doesn't support --jounald, grepping the journal
I0128 19:43:00.914288   14755 update.go:848] error loading pending config open /etc/machine-config-daemon/state.json: no such file or directory
I0128 19:43:00.916897   14755 daemon.go:695] Current+desired config: rendered-worker-110129aa1ce6e1f794d39ec37480180e
I0128 19:43:00.922716   14755 update.go:848] Validating against current config rendered-worker-110129aa1ce6e1f794d39ec37480180e
I0128 19:43:00.924887   14755 daemon.go:913] Validating against current config rendered-worker-110129aa1ce6e1f794d39ec37480180e
E0128 19:43:00.924998   14755 daemon.go:1298] content mismatch for file /etc/test: 

A: aserver foo.example.net maxdelay 0.4 offline
server bar.example.net maxdelay 0.4 offline
server baz.example.net maxdelay 0.4 offline


B: server foo.example.net maxdelay 0.4 offline
server bar.example.net maxdelay 0.4 offline
server baz.example.net maxdelay 0.4 offline


E0128 19:43:00.925030   14755 writer.go:132] Marking Degraded due to: during bootstrap: unexpected on-disk state validating against rendered-worker-110129aa1ce6e1f794d39ec37480180e

$ oc debug node/ip-10-0-135-85.us-west-2.compute.internal
Starting pod/ip-10-0-135-85us-west-2computeinternal-debug ...
To use host binaries, run `chroot /host`
If you don't see a command prompt, try pressing enter.
sh-4.2# chroot /host
sh-4.4# touch /run/machine-config-daemon-force && ls /run
NetworkManager	console-login-helper-messages  cryptsetup	faillock      initramfs   log			       media	      ostree-booted  rpcbind.sock  sepermit  systemd	 utmp
agetty.reload	containers		       dbus		gluster       issue.d	  lvm			       mount	      podman	     runc-ctrs	   setrans   tmpfiles.d
chrony		crio			       dmeventd-client	ignition.env  kubernetes  machine-config-daemon-force  openshift-sdn  rpc.statd.pid  samba	   sshd.pid  udev
console		criu			       dmeventd-server	initctl       lock	  mdadm			       openvswitch    rpcbind	     secrets	   sudo      user
sh-4.4# ls run
NetworkManager	console-login-helper-messages  cryptsetup	faillock      initramfs   log			       media	      ostree-booted  rpcbind.sock  sepermit  systemd	 utmp
agetty.reload	containers		       dbus		gluster       issue.d	  lvm			       mount	      podman	     runc-ctrs	   setrans   tmpfiles.d
chrony		crio			       dmeventd-client	ignition.env  kubernetes  machine-config-daemon-force  openshift-sdn  rpc.statd.pid  samba	   sshd.pid  udev
console		criu			       dmeventd-server	initctl       lock	  mdadm			       openvswitch    rpcbind	     secrets	   sudo      user
sh-4.4# ls /run
NetworkManager	console-login-helper-messages  cryptsetup	faillock      initramfs   log	 mount		podman	       runc-ctrs  setrans   tmpfiles.d
agetty.reload	containers		       dbus		gluster       issue.d	  lvm	 openshift-sdn	rpc.statd.pid  samba	  sshd.pid  udev
chrony		crio			       dmeventd-client	ignition.env  kubernetes  mdadm  openvswitch	rpcbind        secrets	  sudo	    user
console		criu			       dmeventd-server	initctl       lock	  media  ostree-booted	rpcbind.sock   sepermit   systemd   utmp
sh-4.4# exit
exit
sh-4.2# exit
exit

Removing debug pod ...
$ oc -n openshift-machine-config-operator log -f machine-config-daemon-zfr5s
E0128 19:44:39.196371   14755 writer.go:132] Marking Degraded due to: during bootstrap: unexpected on-disk state validating against rendered-worker-110129aa1ce6e1f794d39ec37480180e
I0128 19:44:41.067155   14755 update.go:737] logger doesn't support --jounald, grepping the journal
I0128 19:44:41.125626   14755 update.go:848] error loading pending config open /etc/machine-config-daemon/state.json: no such file or directory
I0128 19:44:41.128094   14755 daemon.go:695] Current+desired config: rendered-worker-110129aa1ce6e1f794d39ec37480180e
I0128 19:44:41.134812   14755 update.go:848] Validating against current config rendered-worker-110129aa1ce6e1f794d39ec37480180e
I0128 19:44:41.137025   14755 daemon.go:913] Validating against current config rendered-worker-110129aa1ce6e1f794d39ec37480180e
E0128 19:44:41.137129   14755 daemon.go:1298] content mismatch for file /etc/test: 

A: aserver foo.example.net maxdelay 0.4 offline
server bar.example.net maxdelay 0.4 offline
server baz.example.net maxdelay 0.4 offline


B: server foo.example.net maxdelay 0.4 offline
server bar.example.net maxdelay 0.4 offline
server baz.example.net maxdelay 0.4 offline


E0128 19:44:41.137159   14755 writer.go:132] Marking Degraded due to: during bootstrap: unexpected on-disk state validating against rendered-worker-110129aa1ce6e1f794d39ec37480180e
I0128 19:45:13.733535   14755 update.go:737] logger doesn't support --jounald, grepping the journal
I0128 19:45:13.782534   14755 update.go:848] error loading pending config open /etc/machine-config-daemon/state.json: no such file or directory
I0128 19:45:13.785213   14755 daemon.go:695] Current+desired config: rendered-worker-110129aa1ce6e1f794d39ec37480180e
I0128 19:45:13.791112   14755 update.go:848] Validating against current config rendered-worker-110129aa1ce6e1f794d39ec37480180e
I0128 19:45:13.793269   14755 daemon.go:913] Validating against current config rendered-worker-110129aa1ce6e1f794d39ec37480180e
I0128 19:45:13.793306   14755 daemon.go:921] Skipping on-disk validation; /run/machine-config-daemon-force present
I0128 19:45:13.793326   14755 daemon.go:926] Validated on-disk state
I0128 19:45:13.800341   14755 daemon.go:973] In desired config rendered-worker-110129aa1ce6e1f794d39ec37480180e
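
The last log lines above show validation being skipped because /run/machine-config-daemon-force was present. A minimal shell sketch of the "one time" behaviour this bug asks for follows. It is not the MCD's Go implementation: the function name validate_on_disk, the FORCE_FILE variable, and the exact point at which the file is removed are assumptions made for illustration. Only the file path and the log messages come from this report.

```shell
#!/bin/sh
# Hypothetical sketch of one-time force-file handling; not the MCD's code.
# FORCE_FILE defaults to the path shown in the daemon log above.
FORCE_FILE="${FORCE_FILE:-/run/machine-config-daemon-force}"

# validate_on_disk EXPECTED ACTUAL
# Returns 0 if validation passes or is skipped, 1 on a content mismatch.
validate_on_disk() {
    if [ -e "$FORCE_FILE" ]; then
        echo "Skipping on-disk validation; $FORCE_FILE present"
        # Consume the flag so the next run validates again instead of
        # silently masking later drift.
        rm -f "$FORCE_FILE"
        return 0
    fi
    if cmp -s "$1" "$2"; then
        echo "Validated on-disk state"
        return 0
    fi
    echo "content mismatch for file $2" >&2
    return 1
}
```

With this shape, a second call made after the flag has been consumed reports the mismatch again, which is the behaviour step 3 of the reproducer checks for.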

$ cat<< EOF |base64 -w0
> aserver foo.example.net maxdelay 0.4 offline
> server bar.example.net maxdelay 0.4 offline
> server baz.example.net maxdelay 0.4 offline
> EOF
YXNlcnZlciBmb28uZXhhbXBsZS5uZXQgbWF4ZGVsYXkgMC40IG9mZmxpbmUKc2VydmVyIGJhci5leGFtcGxlLm5ldCBtYXhkZWxheSAwLjQgb2ZmbGluZQpzZXJ2ZXIgYmF6LmV4YW1wbGUubmV0IG1heGRlbGF5IDAuNCBvZmZsaW5lCg==
$ echo YXNlcnZlciBmb28uZXhhbXBsZS5uZXQgbWF4ZGVsYXkgMC40IG9mZmxpbmUKc2VydmVyIGJhci5leGFtcGxlLm5ldCBtYXhkZWxheSAwLjQgb2ZmbGluZQpzZXJ2ZXIgYmF6LmV4YW1wbGUubmV0IG1heGRlbGF5IDAuNCBvZmZsaW5lCg== |base64 -d
aserver foo.example.net maxdelay 0.4 offline
server bar.example.net maxdelay 0.4 offline
server baz.example.net maxdelay 0.4 offline
$ vi file2.yaml 
$ cat file2.yaml 
apiVersion: machineconfiguration.openshift.io/v1
kind: MachineConfig
metadata:
  labels:
    machineconfiguration.openshift.io/role: worker
  name: test-file
spec:
  config:
    ignition:
      version: 2.2.0
    storage:
      files:
      - contents:
          source: data:text/plain;charset=utf;base64,YXNlcnZlciBmb28uZXhhbXBsZS5uZXQgbWF4ZGVsYXkgMC40IG9mZmxpbmUKc2VydmVyIGJhci5leGFtcGxlLm5ldCBtYXhkZWxheSAwLjQgb2ZmbGluZQpzZXJ2ZXIgYmF6LmV4YW1wbGUubmV0IG1heGRlbGF5IDAuNCBvZmZsaW5lCg==
        filesystem: root
        mode: 420
        path: /etc/test
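
The source field above is a base64 data URL. To check what a MachineConfig will actually write before applying it, the payload can be decoded locally. The helper below is a small sketch; the name decode_data_url is hypothetical, and it assumes a base64 that accepts -d, as used earlier in this comment.

```shell
#!/bin/sh
# decode_data_url URL
# Prints the decoded payload of a "data:...;base64,<payload>" URL.
decode_data_url() {
    # Strip everything up to and including "base64," and decode the rest.
    printf '%s' "${1#*base64,}" | base64 -d
}

# Usage with a short illustrative payload (not taken from this bug):
decode_data_url 'data:text/plain;charset=utf;base64,aGVsbG8K'   # prints: hello
```

Passing the source value from file2.yaml to the same function should print the three server lines shown by the base64 -d command above.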

$ oc apply -f file2.yaml 
machineconfig.machineconfiguration.openshift.io/test-file configured
$ oc get machineconfig
NAME                                                        GENERATEDBYCONTROLLER                      IGNITIONVERSION   CREATED
00-master                                                   f56d736e74af8fb0dc85c4b1ee3cc8d1d1f6600b   2.2.0             45m
00-worker                                                   f56d736e74af8fb0dc85c4b1ee3cc8d1d1f6600b   2.2.0             45m
01-master-container-runtime                                 f56d736e74af8fb0dc85c4b1ee3cc8d1d1f6600b   2.2.0             45m
01-master-kubelet                                           f56d736e74af8fb0dc85c4b1ee3cc8d1d1f6600b   2.2.0             45m
01-worker-container-runtime                                 f56d736e74af8fb0dc85c4b1ee3cc8d1d1f6600b   2.2.0             45m
01-worker-kubelet                                           f56d736e74af8fb0dc85c4b1ee3cc8d1d1f6600b   2.2.0             45m
99-master-0709f2ef-4201-11ea-b538-02f6b1bc85c8-registries   f56d736e74af8fb0dc85c4b1ee3cc8d1d1f6600b   2.2.0             45m
99-master-ssh                                                                                          2.2.0             45m
99-worker-070b87d3-4201-11ea-b538-02f6b1bc85c8-registries   f56d736e74af8fb0dc85c4b1ee3cc8d1d1f6600b   2.2.0             45m
99-worker-ssh                                                                                          2.2.0             45m
rendered-master-6bbd9797ae186414e76ecf06e886a149            f56d736e74af8fb0dc85c4b1ee3cc8d1d1f6600b   2.2.0             45m
rendered-worker-09cd02b3b9f86900b08dd56a9be1f247            f56d736e74af8fb0dc85c4b1ee3cc8d1d1f6600b   2.2.0             1s
rendered-worker-110129aa1ce6e1f794d39ec37480180e            f56d736e74af8fb0dc85c4b1ee3cc8d1d1f6600b   2.2.0             22m
rendered-worker-4af35b7c2fddf5d4ff850e47d8dad58b            f56d736e74af8fb0dc85c4b1ee3cc8d1d1f6600b   2.2.0             45m
test-file                                                                                              2.2.0             22m
$ oc get machineconfigpool
NAME     CONFIG                                             UPDATED   UPDATING   DEGRADED
master   rendered-master-6bbd9797ae186414e76ecf06e886a149   True      False      False
worker   rendered-worker-110129aa1ce6e1f794d39ec37480180e   False     True       False
$ oc get machineconfig/test-file -o yaml
apiVersion: machineconfiguration.openshift.io/v1
kind: MachineConfig
metadata:
  annotations:
    kubectl.kubernetes.io/last-applied-configuration: |
      {"apiVersion":"machineconfiguration.openshift.io/v1","kind":"MachineConfig","metadata":{"annotations":{},"labels":{"machineconfiguration.openshift.io/role":"worker"},"name":"test-file"},"spec":{"config":{"ignition":{"version":"2.2.0"},"storage":{"files":[{"contents":{"source":"data:text/plain;charset=utf;base64,YXNlcnZlciBmb28uZXhhbXBsZS5uZXQgbWF4ZGVsYXkgMC40IG9mZmxpbmUKc2VydmVyIGJhci5leGFtcGxlLm5ldCBtYXhkZWxheSAwLjQgb2ZmbGluZQpzZXJ2ZXIgYmF6LmV4YW1wbGUubmV0IG1heGRlbGF5IDAuNCBvZmZsaW5lCg=="},"filesystem":"root","mode":420,"path":"/etc/test"}]}}}}
  creationTimestamp: "2020-01-28T19:27:09Z"
  generation: 2
  labels:
    machineconfiguration.openshift.io/role: worker
  name: test-file
  resourceVersion: "26295"
  selfLink: /apis/machineconfiguration.openshift.io/v1/machineconfigs/test-file
  uid: 2d70a384-4204-11ea-a370-02ba63c26eac
spec:
  config:
    ignition:
      version: 2.2.0
    storage:
      files:
      - contents:
          source: data:text/plain;charset=utf;base64,YXNlcnZlciBmb28uZXhhbXBsZS5uZXQgbWF4ZGVsYXkgMC40IG9mZmxpbmUKc2VydmVyIGJhci5leGFtcGxlLm5ldCBtYXhkZWxheSAwLjQgb2ZmbGluZQpzZXJ2ZXIgYmF6LmV4YW1wbGUubmV0IG1heGRlbGF5IDAuNCBvZmZsaW5lCg==
        filesystem: root
        mode: 420
        path: /etc/test
$ oc get node
NAME                                         STATUS                     ROLES    AGE   VERSION
ip-10-0-128-214.us-west-2.compute.internal   Ready                      master   46m   v1.13.4+66712a2cb
ip-10-0-135-85.us-west-2.compute.internal    Ready                      worker   39m   v1.13.4+66712a2cb
ip-10-0-141-8.us-west-2.compute.internal     Ready                      worker   40m   v1.13.4+66712a2cb
ip-10-0-142-188.us-west-2.compute.internal   Ready                      master   47m   v1.13.4+66712a2cb
ip-10-0-159-116.us-west-2.compute.internal   Ready,SchedulingDisabled   worker   40m   v1.13.4+66712a2cb
ip-10-0-159-207.us-west-2.compute.internal   Ready                      master   47m   v1.13.4+66712a2cb
$ oc get machineconfigpool
NAME     CONFIG                                             UPDATED   UPDATING   DEGRADED
master   rendered-master-6bbd9797ae186414e76ecf06e886a149   True      False      False
worker   rendered-worker-09cd02b3b9f86900b08dd56a9be1f247   True      False      False
$ oc get nodes
NAME                                         STATUS   ROLES    AGE   VERSION
ip-10-0-128-214.us-west-2.compute.internal   Ready    master   70m   v1.13.4+66712a2cb
ip-10-0-135-85.us-west-2.compute.internal    Ready    worker   62m   v1.13.4+66712a2cb
ip-10-0-141-8.us-west-2.compute.internal     Ready    worker   63m   v1.13.4+66712a2cb
ip-10-0-142-188.us-west-2.compute.internal   Ready    master   70m   v1.13.4+66712a2cb
ip-10-0-159-116.us-west-2.compute.internal   Ready    worker   63m   v1.13.4+66712a2cb
ip-10-0-159-207.us-west-2.compute.internal   Ready    master   70m   v1.13.4+66712a2cb
$ oc debug node/ip-10-0-135-85.us-west-2.compute.internal
Starting pod/ip-10-0-135-85us-west-2computeinternal-debug ...
To use host binaries, run `chroot /host`
If you don't see a command prompt, try pressing enter.
sh-4.2# chroot /host
sh-4.4# cat /etc/test
aserver foo.example.net maxdelay 0.4 offline
server bar.example.net maxdelay 0.4 offline
server baz.example.net maxdelay 0.4 offline
sh-4.4# exit
exit
sh-4.2# exit
exit

Removing debug pod ...
$ oc get pods -n openshift-machine-config-operator -l k8s-app=machine-config-daemon --field-selector spec.nodeName=ip-10-0-135-85.us-west-2.compute.internal
NAME                          READY   STATUS    RESTARTS   AGE
machine-config-daemon-zfr5s   1/1     Running   1          31m
$ oc -n openshift-machine-config-operator log -f machine-config-daemon-zfr5s
I0128 19:53:37.839835    3146 start.go:67] Version: 4.1.33-202001240309-dirty (f56d736e74af8fb0dc85c4b1ee3cc8d1d1f6600b)
I0128 19:53:37.840640    3146 start.go:100] Starting node writer
I0128 19:53:37.846260    3146 run.go:22] Running captured: chroot /rootfs rpm-ostree status --json
I0128 19:53:37.951298    3146 daemon.go:200] Booted osImageURL: quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:c97112885ec4102014f8cd56825d7170f92dbcc9d17ab243566843040ab33aa2 (410.81.20200114.0)
I0128 19:53:37.978462    3146 start.go:196] Calling chroot("/rootfs")
I0128 19:53:37.978508    3146 start.go:206] Starting MachineConfigDaemon
I0128 19:53:37.978667    3146 update.go:848] Starting to manage node: ip-10-0-135-85.us-west-2.compute.internal
I0128 19:53:37.985613    3146 run.go:22] Running captured: rpm-ostree status
I0128 19:53:38.081412    3146 daemon.go:740] State: idle
AutomaticUpdates: disabled
Deployments:
* pivot://quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:c97112885ec4102014f8cd56825d7170f92dbcc9d17ab243566843040ab33aa2
              CustomOrigin: Managed by pivot tool
                   Version: 410.81.20200114.0 (2020-01-14T04:56:15Z)

  pivot://docker-registry-default.cloud.registry.upshift.redhat.com/redhat-coreos/ootpa@sha256:683a6a866a8ec789fedb5da63b6a2ff68c1b0788ec90e7def778f0c4c13197a4
              CustomOrigin: Provisioned from oscontainer
                   Version: 410.8.20190520.0 (2019-05-20T20:10:04Z)
I0128 19:53:38.081534    3146 run.go:22] Running captured: journalctl --list-boots
I0128 19:53:38.092263    3146 daemon.go:747] journalctl --list-boots:
-3 70ecf3ac3f39470498ef88286fc264d4 Tue 2020-01-28 19:06:36 UTC—Tue 2020-01-28 19:08:32 UTC
-2 047eeb1626d14f57ab32088b4bc19d4c Tue 2020-01-28 19:08:59 UTC—Tue 2020-01-28 19:30:22 UTC
-1 a77073bf32d845e89e9ebdcc28f8a669 Tue 2020-01-28 19:30:54 UTC—Tue 2020-01-28 19:52:54 UTC
 0 1c7b9beb9eea4fca864bccf62effde87 Tue 2020-01-28 19:53:22 UTC—Tue 2020-01-28 19:53:38 UTC
I0128 19:53:38.092363    3146 daemon.go:494] Enabling Kubelet Healthz Monitor
E0128 19:53:42.094209    3146 reflector.go:134] github.com/openshift/machine-config-operator/pkg/generated/informers/externalversions/factory.go:101: Failed to list *v1.MachineConfig: Get https://172.30.0.1:443/apis/machineconfiguration.openshift.io/v1/machineconfigs?limit=500&resourceVersion=0: dial tcp 172.30.0.1:443: connect: no route to host
E0128 19:53:42.094209    3146 reflector.go:134] k8s.io/client-go/informers/factory.go:132: Failed to list *v1.Node: Get https://172.30.0.1:443/api/v1/nodes?limit=500&resourceVersion=0: dial tcp 172.30.0.1:443: connect: no route to host
I0128 19:53:44.140603    3146 update.go:737] logger doesn't support --jounald, grepping the journal
I0128 19:53:44.226170    3146 daemon.go:702] Current config: rendered-worker-110129aa1ce6e1f794d39ec37480180e
I0128 19:53:44.226204    3146 daemon.go:703] Desired config: rendered-worker-09cd02b3b9f86900b08dd56a9be1f247
I0128 19:53:44.226216    3146 update.go:848] using pending config same as desired
I0128 19:53:44.235981    3146 daemon.go:892] Using current config on disk rendered-worker-09cd02b3b9f86900b08dd56a9be1f247
I0128 19:53:44.236001    3146 update.go:848] Using current config on disk rendered-worker-09cd02b3b9f86900b08dd56a9be1f247
I0128 19:53:44.238148    3146 update.go:848] Validating against pending config rendered-worker-09cd02b3b9f86900b08dd56a9be1f247
I0128 19:53:44.240312    3146 daemon.go:909] Validating against pending config rendered-worker-09cd02b3b9f86900b08dd56a9be1f247
I0128 19:53:44.243386    3146 daemon.go:926] Validated on-disk state
I0128 19:53:44.251049    3146 update.go:813] logger doesn't support --jounald, logging json directly
I0128 19:53:44.253240    3146 daemon.go:960] Completing pending config rendered-worker-09cd02b3b9f86900b08dd56a9be1f247
I0128 19:53:44.258589    3146 update.go:848] completed update for config rendered-worker-09cd02b3b9f86900b08dd56a9be1f247
I0128 19:53:44.261200    3146 daemon.go:973] In desired config rendered-worker-09cd02b3b9f86900b08dd56a9be1f247

Comment 11 Michael Nguyen 2020-01-30 13:48:54 UTC
Verified. See comment 6.

Comment 13 errata-xmlrpc 2020-02-13 06:14:02 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2020:0399

