In 4.11 we added daemon validation to the MCD via https://github.com/openshift/machine-config-operator/pull/3105. While trying to test another failure, we encountered the following error:

Last Transition Time: 2022-07-07T15:05:36Z
Message: Node ip-10-0-63-89.us-east-2.compute.internal is reporting: "unexpected on-disk state validating against rendered-worker-5a883f3d21fbb30631f333f2923a19c2: error running rpm-ostree kargs: exit status 1\nJob for rpm-ostreed.service failed.\nSee \"systemctl status rpm-ostreed.service\" and \"journalctl -xe\" for details.\nerror: exit status: 1\n"
Reason: 1 nodes are reporting degraded status on sync
Status: True
Type: NodeDegraded

Matching timestamps in the MCD log:

E0707 15:05:33.409582 2466 writer.go:200] Marking Degraded due to: unexpected on-disk state validating against rendered-worker-5a883f3d21fbb30631f333f2923a19c2: error running rpm-ostree kargs: exit status 1
Job for rpm-ostreed.service failed.
See "systemctl status rpm-ostreed.service" and "journalctl -xe" for details.
error: exit status: 1

and correspondingly in rpm-ostreed:

Jul 07 15:05:31 ip-10-0-63-89 rpm-ostree[2631]: In idle state; will auto-exit in 61 seconds
Jul 07 15:05:33 ip-10-0-63-89 systemd[1]: rpm-ostreed.service: Start request repeated too quickly.
Jul 07 15:05:33 ip-10-0-63-89 systemd[1]: rpm-ostreed.service: Failed with result 'start-limit-hit'.
Jul 07 15:05:33 ip-10-0-63-89 systemd[1]: Failed to start rpm-ostree System Management Daemon.
Jul 07 15:05:33 ip-10-0-63-89 systemd[1]: rpm-ostreed.service: Consumed 98ms CPU time

So it looks like rpm-ostreed had not started yet (though it eventually did start successfully), the validator ran rpm-ostree commands very early, and the MCD failed without retrying and degraded as a result.
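For context on the `start-limit-hit` above: systemd rate-limits unit start requests via `StartLimitIntervalSec=` and `StartLimitBurst=` in the `[Unit]` section, and repeated early `systemctl start rpm-ostreed` calls can trip that limit. A drop-in like the following sketch would relax it; the values here are illustrative only, not a vetted recommendation, and this does not address the MCD retry/reporting questions:

```ini
# /etc/systemd/system/rpm-ostreed.service.d/10-start-limit.conf
# Illustrative drop-in: allow more start requests per interval so repeated
# early "systemctl start rpm-ostreed" calls don't hit 'start-limit-hit'.
[Unit]
StartLimitIntervalSec=30
StartLimitBurst=10
```

After placing the drop-in, `systemctl daemon-reload` (and `systemctl reset-failed rpm-ostreed` if the unit is already in the failed state) would be needed for it to take effect.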
The MCD does continue:

I0707 15:06:33.427468 2466 daemon.go:1165] Current+desired config: rendered-worker-5a883f3d21fbb30631f333f2923a19c2
I0707 15:06:33.427487 2466 daemon.go:1175] state: Degraded
I0707 15:06:33.430410 2466 daemon.go:1463] Validating against pending config rendered-worker-5a883f3d21fbb30631f333f2923a19c2
I0707 15:06:33.430427 2466 rpm-ostree.go:324] Running captured: rpm-ostree kargs
E0707 15:06:33.968530 2466 writer.go:200] Marking Degraded due to: unexpected on-disk state validating against rendered-worker-5a883f3d21fbb30631f333f2923a19c2: mode mismatch for file: "/etc/crio/crio.conf.d/01-ctrcfg-pidsLimit"; expected: -rw-r--r--/420/0644; received: ----------/0/0
I0707 15:07:33.985443 2466 daemon.go:1165] Current+desired config: rendered-worker-5a883f3d21fbb30631f333f2923a19c2
I0707 15:07:33.985470 2466 daemon.go:1175] state: Degraded

but this subsequent error doesn't overwrite the original one, so we are left in the old state reporting the wrong error. A few questions then:
1. Why is rpm-ostreed failing like that, and is it expected?
2. Should we be retrying the operation? (It looks like it got past it on the next sync, so maybe not an issue.)
3. Why doesn't the MCO correctly update the error state?
Full daemon logs:

I0707 15:05:29.238914 2466 start.go:112] Version: v4.11.0-202206250809.p0.g4aa21da.assembly.stream-dirty (4aa21da3efc9b49ff4d8fec0dfe5a8598a19332a)
I0707 15:05:29.243232 2466 start.go:125] Calling chroot("/rootfs")
I0707 15:05:29.245594 2466 update.go:1962] Running: systemctl start rpm-ostreed
I0707 15:05:30.039148 2466 rpm-ostree.go:324] Running captured: rpm-ostree status --json
I0707 15:05:30.116122 2466 rpm-ostree.go:324] Running captured: rpm-ostree status --json
I0707 15:05:30.169454 2466 daemon.go:236] Booted osImageURL: quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:02eec9efc74b23a730588ce6a236706b5aeeef5f52010ba25720a611cf1d929c (411.86.202206291708-0)
I0707 15:05:30.290695 2466 start.go:101] Copied self to /run/bin/machine-config-daemon on host
I0707 15:05:30.292230 2466 start.go:189] overriding kubernetes api to https://api-int.sregidor-ssh2.qe.devcluster.openshift.com:6443
I0707 15:05:30.293259 2466 metrics.go:106] Registering Prometheus metrics
I0707 15:05:30.293328 2466 metrics.go:111] Starting metrics listener on 127.0.0.1:8797
I0707 15:05:30.296343 2466 writer.go:93] NodeWriter initialized with credentials from /var/lib/kubelet/kubeconfig
I0707 15:05:30.296475 2466 update.go:1977] Starting to manage node: ip-10-0-63-89.us-east-2.compute.internal
I0707 15:05:30.335927 2466 rpm-ostree.go:324] Running captured: rpm-ostree status
I0707 15:05:30.404093 2466 daemon.go:1220] State: idle
Deployments:
* pivot://quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:02eec9efc74b23a730588ce6a236706b5aeeef5f52010ba25720a611cf1d929c
  CustomOrigin: Managed by machine-config-operator
  Version: 411.86.202206291708-0 (2022-06-29T17:11:39Z)
  f9d88d07921009f524c39773d0935a7d1642a02bd37e0d621696bf4f766a0540
  Version: 45.82.202008010929-0 (2020-08-01T09:33:23Z)
I0707 15:05:30.404613 2466 coreos.go:95] CoreOS aleph version: mtime=2020-08-01 09:35:48.964 +0000 UTC build=45.82.202008010929-0 imgid=rhcos-45.82.202008010929-0-qemu.x86_64.qcow2
No /etc/.ignition-result.json found
I0707 15:05:30.404699 2466 rpm-ostree.go:324] Running captured: journalctl --list-boots
I0707 15:05:30.408951 2466 daemon.go:1229] journalctl --list-boots:
-1 7f00b2ad240347b198afc64b62f0a281 Thu 2022-07-07 15:02:18 UTC—Thu 2022-07-07 15:04:37 UTC
 0 6b64211bdd9f4837af505ea3541c4e4d Thu 2022-07-07 15:04:49 UTC—Thu 2022-07-07 15:05:30 UTC
I0707 15:05:30.409005 2466 rpm-ostree.go:324] Running captured: systemctl list-units --state=failed --no-legend
I0707 15:05:30.417923 2466 daemon.go:1244] systemd service state: OK
I0707 15:05:30.417959 2466 daemon.go:909] Starting MachineConfigDaemon
I0707 15:05:30.418067 2466 daemon.go:916] Enabling Kubelet Healthz Monitor
I0707 15:05:31.332825 2466 daemon.go:451] Node ip-10-0-63-89.us-east-2.compute.internal is not labeled node-role.kubernetes.io/master
I0707 15:05:31.332938 2466 node.go:24] No machineconfiguration.openshift.io/currentConfig annotation on node ip-10-0-63-89.us-east-2.compute.internal: map[cloud.network.openshift.io/egress-ipconfig:[{"interface":"eni-0f535398b33d51315","ifaddr":{"ipv4":"10.0.48.0/20"},"capacity":{"ipv4":14,"ipv6":15}}] machine.openshift.io/machine:openshift-machine-api/sregidor-ssh2-dlnh9-worker-us-east-2a-mod-pflkc machineconfiguration.openshift.io/controlPlaneTopology:HighlyAvailable volumes.kubernetes.io/controller-managed-attach-detach:true], in cluster bootstrap, loading initial node annotation from /etc/machine-config-daemon/node-annotations.json
I0707 15:05:31.333435 2466 node.go:45] Setting initial node config: rendered-worker-5a883f3d21fbb30631f333f2923a19c2
I0707 15:05:31.352098 2466 daemon.go:1137] In bootstrap mode
I0707 15:05:31.352268 2466 daemon.go:1165] Current+desired config: rendered-worker-5a883f3d21fbb30631f333f2923a19c2
I0707 15:05:31.352313 2466 daemon.go:1175] state: Done
I0707 15:05:31.356550 2466 daemon.go:1425] No bootstrap pivot required; unlinking bootstrap node annotations
I0707 15:05:31.356635 2466 daemon.go:1463] Validating against pending config rendered-worker-5a883f3d21fbb30631f333f2923a19c2
I0707 15:05:31.356662 2466 rpm-ostree.go:324] Running captured: rpm-ostree kargs
E0707 15:05:31.429554 2466 writer.go:200] Marking Degraded due to: unexpected on-disk state validating against rendered-worker-5a883f3d21fbb30631f333f2923a19c2: mode mismatch for file: "/etc/crio/crio.conf.d/01-ctrcfg-pidsLimit"; expected: -rw-r--r--/420/0644; received: ----------/0/0
I0707 15:05:33.352033 2466 daemon.go:500] Transitioned from state: Done -> Degraded
I0707 15:05:33.355528 2466 daemon.go:1165] Current+desired config: rendered-worker-5a883f3d21fbb30631f333f2923a19c2
I0707 15:05:33.355544 2466 daemon.go:1175] state: Degraded
I0707 15:05:33.359275 2466 daemon.go:1463] Validating against pending config rendered-worker-5a883f3d21fbb30631f333f2923a19c2
I0707 15:05:33.359294 2466 rpm-ostree.go:324] Running captured: rpm-ostree kargs
E0707 15:05:33.409582 2466 writer.go:200] Marking Degraded due to: unexpected on-disk state validating against rendered-worker-5a883f3d21fbb30631f333f2923a19c2: error running rpm-ostree kargs: exit status 1
Job for rpm-ostreed.service failed.
See "systemctl status rpm-ostreed.service" and "journalctl -xe" for details.
error: exit status: 1
I0707 15:06:33.427468 2466 daemon.go:1165] Current+desired config: rendered-worker-5a883f3d21fbb30631f333f2923a19c2
I0707 15:06:33.427487 2466 daemon.go:1175] state: Degraded
I0707 15:06:33.430410 2466 daemon.go:1463] Validating against pending config rendered-worker-5a883f3d21fbb30631f333f2923a19c2
I0707 15:06:33.430427 2466 rpm-ostree.go:324] Running captured: rpm-ostree kargs
E0707 15:06:33.968530 2466 writer.go:200] Marking Degraded due to: unexpected on-disk state validating against rendered-worker-5a883f3d21fbb30631f333f2923a19c2: mode mismatch for file: "/etc/crio/crio.conf.d/01-ctrcfg-pidsLimit"; expected: -rw-r--r--/420/0644; received: ----------/0/0
I0707 15:07:33.985443 2466 daemon.go:1165] Current+desired config: rendered-worker-5a883f3d21fbb30631f333f2923a19c2
I0707 15:07:33.985470 2466 daemon.go:1175] state: Degraded
I0707 15:07:33.988396 2466 daemon.go:1463] Validating against pending config rendered-worker-5a883f3d21fbb30631f333f2923a19c2
I0707 15:07:33.988413 2466 rpm-ostree.go:324] Running captured: rpm-ostree kargs
E0707 15:07:34.020335 2466 writer.go:200] Marking Degraded due to: unexpected on-disk state validating against rendered-worker-5a883f3d21fbb30631f333f2923a19c2: mode mismatch for file: "/etc/crio/crio.conf.d/01-ctrcfg-pidsLimit"; expected: -rw-r--r--/420/0644; received: ----------/0/0
I0707 15:08:34.035048 2466 daemon.go:1165] Current+desired config: rendered-worker-5a883f3d21fbb30631f333f2923a19c2
I0707 15:08:34.035066 2466 daemon.go:1175] state: Degraded
I0707 15:08:34.037822 2466 daemon.go:1463] Validating against pending config rendered-worker-5a883f3d21fbb30631f333f2923a19c2
I0707 15:08:34.037836 2466 rpm-ostree.go:324] Running captured: rpm-ostree kargs
E0707 15:08:34.071001 2466 writer.go:200] Marking Degraded due to: unexpected on-disk state validating against rendered-worker-5a883f3d21fbb30631f333f2923a19c2: mode mismatch for file: "/etc/crio/crio.conf.d/01-ctrcfg-pidsLimit"; expected: -rw-r--r--/420/0644; received: ----------/0/0
I0707 15:09:34.081452 2466 daemon.go:1165] Current+desired config: rendered-worker-5a883f3d21fbb30631f333f2923a19c2
I0707 15:09:34.081468 2466 daemon.go:1175] state: Degraded
I0707 15:09:34.084325 2466 daemon.go:1463] Validating against pending config rendered-worker-5a883f3d21fbb30631f333f2923a19c2
I0707 15:09:34.084339 2466 rpm-ostree.go:324] Running captured: rpm-ostree kargs
E0707 15:09:34.117176 2466 writer.go:200] Marking Degraded due to: unexpected on-disk state validating against rendered-worker-5a883f3d21fbb30631f333f2923a19c2: mode mismatch for file: "/etc/crio/crio.conf.d/01-ctrcfg-pidsLimit"; expected: -rw-r--r--/420/0644; received: ----------/0/0
I0707 15:10:34.135071 2466 daemon.go:1165] Current+desired config: rendered-worker-5a883f3d21fbb30631f333f2923a19c2
I0707 15:10:34.135088 2466 daemon.go:1175] state: Degraded
I0707 15:10:34.137784 2466 daemon.go:1463] Validating against pending config rendered-worker-5a883f3d21fbb30631f333f2923a19c2
I0707 15:10:34.137797 2466 rpm-ostree.go:324] Running captured: rpm-ostree kargs
E0707 15:10:34.169218 2466 writer.go:200] Marking Degraded due to: unexpected on-disk state validating against rendered-worker-5a883f3d21fbb30631f333f2923a19c2: mode mismatch for file: "/etc/crio/crio.conf.d/01-ctrcfg-pidsLimit"; expected: -rw-r--r--/420/0644; received: ----------/0/0
I0707 15:11:34.185006 2466 daemon.go:1165] Current+desired config: rendered-worker-5a883f3d21fbb30631f333f2923a19c2
I0707 15:11:34.185029 2466 daemon.go:1175] state: Degraded
I0707 15:11:34.187940 2466 daemon.go:1463] Validating against pending config rendered-worker-5a883f3d21fbb30631f333f2923a19c2
I0707 15:11:34.187954 2466 rpm-ostree.go:324] Running captured: rpm-ostree kargs
E0707 15:11:34.220806 2466 writer.go:200] Marking Degraded due to: unexpected on-disk state validating against rendered-worker-5a883f3d21fbb30631f333f2923a19c2: mode mismatch for file: "/etc/crio/crio.conf.d/01-ctrcfg-pidsLimit"; expected: -rw-r--r--/420/0644; received: ----------/0/0
I0707 15:12:34.241692 2466 daemon.go:1165] Current+desired config: rendered-worker-5a883f3d21fbb30631f333f2923a19c2
I0707 15:12:34.241708 2466 daemon.go:1175] state: Degraded
I0707 15:12:34.244578 2466 daemon.go:1463] Validating against pending config rendered-worker-5a883f3d21fbb30631f333f2923a19c2
I0707 15:12:34.244591 2466 rpm-ostree.go:324] Running captured: rpm-ostree kargs
E0707 15:12:34.277219 2466 writer.go:200] Marking Degraded due to: unexpected on-disk state validating against rendered-worker-5a883f3d21fbb30631f333f2923a19c2: mode mismatch for file: "/etc/crio/crio.conf.d/01-ctrcfg-pidsLimit"; expected: -rw-r--r--/420/0644; received: ----------/0/0
I0707 15:13:34.290973 2466 daemon.go:1165] Current+desired config: rendered-worker-5a883f3d21fbb30631f333f2923a19c2
I0707 15:13:34.290997 2466 daemon.go:1175] state: Degraded
I0707 15:13:34.293871 2466 daemon.go:1463] Validating against pending config rendered-worker-5a883f3d21fbb30631f333f2923a19c2
I0707 15:13:34.293884 2466 rpm-ostree.go:324] Running captured: rpm-ostree kargs
E0707 15:13:34.325982 2466 writer.go:200] Marking Degraded due to: unexpected on-disk state validating against rendered-worker-5a883f3d21fbb30631f333f2923a19c2: mode mismatch for file: "/etc/crio/crio.conf.d/01-ctrcfg-pidsLimit"; expected: -rw-r--r--/420/0644; received: ----------/0/0
I0707 15:14:34.341341 2466 daemon.go:1165] Current+desired config: rendered-worker-5a883f3d21fbb30631f333f2923a19c2
I0707 15:14:34.341359 2466 daemon.go:1175] state: Degraded
I0707 15:14:34.344093 2466 daemon.go:1463] Validating against pending config rendered-worker-5a883f3d21fbb30631f333f2923a19c2
I0707 15:14:34.344107 2466 rpm-ostree.go:324] Running captured: rpm-ostree kargs
E0707 15:14:34.375638 2466 writer.go:200] Marking Degraded due to: unexpected on-disk state validating against rendered-worker-5a883f3d21fbb30631f333f2923a19c2: mode mismatch for file: "/etc/crio/crio.conf.d/01-ctrcfg-pidsLimit"; expected: -rw-r--r--/420/0644; received: ----------/0/0
I0707 15:15:34.391092 2466 daemon.go:1165] Current+desired config: rendered-worker-5a883f3d21fbb30631f333f2923a19c2
I0707 15:15:34.391108 2466 daemon.go:1175] state: Degraded
I0707 15:15:34.394009 2466 daemon.go:1463] Validating against pending config rendered-worker-5a883f3d21fbb30631f333f2923a19c2
I0707 15:15:34.394026 2466 rpm-ostree.go:324] Running captured: rpm-ostree kargs
E0707 15:15:35.007586 2466 writer.go:200] Marking Degraded due to: unexpected on-disk state validating against rendered-worker-5a883f3d21fbb30631f333f2923a19c2: mode mismatch for file: "/etc/crio/crio.conf.d/01-ctrcfg-pidsLimit"; expected: -rw-r--r--/420/0644; received: ----------/0/0
I0707 15:16:35.027338 2466 daemon.go:1165] Current+desired config: rendered-worker-5a883f3d21fbb30631f333f2923a19c2
I0707 15:16:35.027355 2466 daemon.go:1175] state: Degraded
I0707 15:16:35.030243 2466 daemon.go:1463] Validating against pending config rendered-worker-5a883f3d21fbb30631f333f2923a19c2
I0707 15:16:35.030257 2466 rpm-ostree.go:324] Running captured: rpm-ostree kargs
E0707 15:16:35.062333 2466 writer.go:200] Marking Degraded due to: unexpected on-disk state validating against rendered-worker-5a883f3d21fbb30631f333f2923a19c2: mode mismatch for file: "/etc/crio/crio.conf.d/01-ctrcfg-pidsLimit"; expected: -rw-r--r--/420/0644; received: ----------/0/0
I0707 15:17:35.074459 2466 daemon.go:1165] Current+desired config: rendered-worker-5a883f3d21fbb30631f333f2923a19c2
I0707 15:17:35.074478 2466 daemon.go:1175] state: Degraded

Full rpm-ostreed logs:

sh-4.4# journalctl -u rpm-ostreed
-- Logs begin at Thu 2022-07-07 15:02:18 UTC, end at Thu 2022-07-07 15:24:35 UTC. --
Jul 07 15:03:17 ip-10-0-63-89 systemd[1]: Starting rpm-ostree System Management Daemon...
Jul 07 15:03:18 ip-10-0-63-89 rpm-ostree[2199]: Reading config file '/etc/rpm-ostreed.conf'
Jul 07 15:03:19 ip-10-0-63-89 rpm-ostree[2199]: In idle state; will auto-exit in 64 seconds
Jul 07 15:03:19 ip-10-0-63-89 systemd[1]: Started rpm-ostree System Management Daemon.
Jul 07 15:03:19 ip-10-0-63-89 rpm-ostree[2199]: client(id:cli dbus:1.22 unit:machine-config-daemon-firstboot.service uid:0) added; new total=1
Jul 07 15:03:19 ip-10-0-63-89 rpm-ostree[2199]: client(id:cli dbus:1.22 unit:machine-config-daemon-firstboot.service uid:0) vanished; remaining=0
Jul 07 15:03:19 ip-10-0-63-89 rpm-ostree[2199]: In idle state; will auto-exit in 60 seconds
Jul 07 15:03:19 ip-10-0-63-89 rpm-ostree[2199]: client(id:cli dbus:1.23 unit:machine-config-daemon-firstboot.service uid:0) added; new total=1
Jul 07 15:03:19 ip-10-0-63-89 rpm-ostree[2199]: client(id:cli dbus:1.23 unit:machine-config-daemon-firstboot.service uid:0) vanished; remaining=0
Jul 07 15:03:19 ip-10-0-63-89 rpm-ostree[2199]: In idle state; will auto-exit in 62 seconds
Jul 07 15:03:44 ip-10-0-63-89 rpm-ostree[2199]: client(id:cli dbus:1.24 unit:machine-config-daemon-firstboot.service uid:0) added; new total=1
Jul 07 15:03:44 ip-10-0-63-89 rpm-ostree[2199]: client(id:cli dbus:1.24 unit:machine-config-daemon-firstboot.service uid:0) vanished; remaining=0
Jul 07 15:03:44 ip-10-0-63-89 rpm-ostree[2199]: In idle state; will auto-exit in 62 seconds
Jul 07 15:03:44 ip-10-0-63-89 rpm-ostree[2199]: client(id:cli dbus:1.25 unit:machine-config-daemon-firstboot.service uid:0) added; new total=1
Jul 07 15:03:44 ip-10-0-63-89 rpm-ostree[2199]: Initiated txn UpdateDeployment for client(id:cli dbus:1.25 unit:machine-config-daemon-firstboot.service uid:0): /org/projectatomic/rpmostree1/rhcos
Jul 07 15:04:05 ip-10-0-63-89 rpm-ostree[2199]: Created new deployment /ostree/deploy/rhcos/deploy/cff8e8dfd35b4a8e42f3bd4eb3afbbe9cd2d4fb3896c782585596372a8d059e4.0
Jul 07 15:04:05 ip-10-0-63-89 rpm-ostree[2199]: sanitycheck(/usr/bin/true) successful
Jul 07 15:04:18 ip-10-0-63-89 rpm-ostree[2199]: Txn UpdateDeployment on /org/projectatomic/rpmostree1/rhcos successful
Jul 07 15:04:18 ip-10-0-63-89 rpm-ostree[2199]: client(id:cli dbus:1.25 unit:machine-config-daemon-firstboot.service uid:0) vanished; remaining=0
Jul 07 15:04:18 ip-10-0-63-89 rpm-ostree[2199]: In idle state; will auto-exit in 60 seconds
Jul 07 15:04:19 ip-10-0-63-89 systemd[1]: Stopping rpm-ostree System Management Daemon...
Jul 07 15:04:19 ip-10-0-63-89 systemd[1]: Stopped rpm-ostree System Management Daemon.
Jul 07 15:04:19 ip-10-0-63-89 systemd[1]: rpm-ostreed.service: Consumed 12.887s CPU time
-- Reboot --
Jul 07 15:05:29 ip-10-0-63-89 systemd[1]: Starting rpm-ostree System Management Daemon...
Jul 07 15:05:29 ip-10-0-63-89 rpm-ostree[2631]: Reading config file '/etc/rpm-ostreed.conf'
Jul 07 15:05:30 ip-10-0-63-89 rpm-ostree[2631]: In idle state; will auto-exit in 64 seconds
Jul 07 15:05:30 ip-10-0-63-89 systemd[1]: Started rpm-ostree System Management Daemon.
Jul 07 15:05:30 ip-10-0-63-89 rpm-ostree[2631]: client(id:machine-config-operator dbus:1.85 unit:crio-70a105f9061333d7098020fd9a27314dcbcdc2df057fb2da4fd5862fedd67ac5.scope uid:0) added; new total=1
Jul 07 15:05:30 ip-10-0-63-89 rpm-ostree[2631]: client(id:machine-config-operator dbus:1.85 unit:crio-70a105f9061333d7098020fd9a27314dcbcdc2df057fb2da4fd5862fedd67ac5.scope uid:0) vanished; remaining=0
Jul 07 15:05:30 ip-10-0-63-89 rpm-ostree[2631]: In idle state; will auto-exit in 64 seconds
Jul 07 15:05:30 ip-10-0-63-89 rpm-ostree[2631]: client(id:machine-config-operator dbus:1.86 unit:crio-70a105f9061333d7098020fd9a27314dcbcdc2df057fb2da4fd5862fedd67ac5.scope uid:0) added; new total=1
Jul 07 15:05:30 ip-10-0-63-89 rpm-ostree[2631]: client(id:machine-config-operator dbus:1.86 unit:crio-70a105f9061333d7098020fd9a27314dcbcdc2df057fb2da4fd5862fedd67ac5.scope uid:0) vanished; remaining=0
Jul 07 15:05:30 ip-10-0-63-89 rpm-ostree[2631]: In idle state; will auto-exit in 63 seconds
Jul 07 15:05:30 ip-10-0-63-89 rpm-ostree[2631]: client(id:machine-config-operator dbus:1.87 unit:crio-70a105f9061333d7098020fd9a27314dcbcdc2df057fb2da4fd5862fedd67ac5.scope uid:0) added; new total=1
Jul 07 15:05:30 ip-10-0-63-89 rpm-ostree[2631]: client(id:machine-config-operator dbus:1.87 unit:crio-70a105f9061333d7098020fd9a27314dcbcdc2df057fb2da4fd5862fedd67ac5.scope uid:0) vanished; remaining=0
Jul 07 15:05:30 ip-10-0-63-89 rpm-ostree[2631]: In idle state; will auto-exit in 63 seconds
Jul 07 15:05:31 ip-10-0-63-89 rpm-ostree[2631]: client(id:machine-config-operator dbus:1.94 unit:crio-70a105f9061333d7098020fd9a27314dcbcdc2df057fb2da4fd5862fedd67ac5.scope uid:0) added; new total=1
Jul 07 15:05:31 ip-10-0-63-89 rpm-ostree[2631]: client(id:machine-config-operator dbus:1.94 unit:crio-70a105f9061333d7098020fd9a27314dcbcdc2df057fb2da4fd5862fedd67ac5.scope uid:0) vanished; remaining=0
Jul 07 15:05:31 ip-10-0-63-89 rpm-ostree[2631]: In idle state; will auto-exit in 61 seconds
Jul 07 15:05:33 ip-10-0-63-89 systemd[1]: rpm-ostreed.service: Start request repeated too quickly.
Jul 07 15:05:33 ip-10-0-63-89 systemd[1]: rpm-ostreed.service: Failed with result 'start-limit-hit'.
Jul 07 15:05:33 ip-10-0-63-89 systemd[1]: Failed to start rpm-ostree System Management Daemon.
Jul 07 15:05:33 ip-10-0-63-89 systemd[1]: rpm-ostreed.service: Consumed 98ms CPU time
Jul 07 15:05:41 ip-10-0-63-89 systemd[1]: rpm-ostreed.service: Got notification message from PID 2631, but reception only permitted for main PID which is currently not known
Jul 07 15:05:51 ip-10-0-63-89 systemd[1]: rpm-ostreed.service: Got notification message from PID 2631, but reception only permitted for main PID which is currently not known
Jul 07 15:06:01 ip-10-0-63-89 systemd[1]: rpm-ostreed.service: Got notification message from PID 2631, but reception only permitted for main PID which is currently not known
Jul 07 15:06:11 ip-10-0-63-89 systemd[1]: rpm-ostreed.service: Got notification message from PID 2631, but reception only permitted for main PID which is currently not known
Jul 07 15:06:21 ip-10-0-63-89 systemd[1]: rpm-ostreed.service: Got notification message from PID 2631, but reception only permitted for main PID which is currently not known
Jul 07 15:06:31 ip-10-0-63-89 systemd[1]: rpm-ostreed.service: Got notification message from PID 2631, but reception only permitted for main PID which is currently not known
Jul 07 15:06:32 ip-10-0-63-89 systemd[1]: rpm-ostreed.service: Got notification message from PID 2631, but reception only permitted for main PID which is currently not known
Jul 07 15:06:32 ip-10-0-63-89 systemd[1]: rpm-ostreed.service: Got notification message from PID 2631, but reception only permitted for main PID which is currently not known
Jul 07 15:06:33 ip-10-0-63-89 systemd[1]: Starting rpm-ostree System Management Daemon...
Jul 07 15:06:33 ip-10-0-63-89 rpm-ostree[5215]: Reading config file '/etc/rpm-ostreed.conf'
Jul 07 15:06:33 ip-10-0-63-89 rpm-ostree[5215]: In idle state; will auto-exit in 60 seconds
Jul 07 15:06:33 ip-10-0-63-89 systemd[1]: Started rpm-ostree System Management Daemon.
Jul 07 15:06:33 ip-10-0-63-89 rpm-ostree[5215]: client(id:machine-config-operator dbus:1.149 unit:crio-70a105f9061333d7098020fd9a27314dcbcdc2df057fb2da4fd5862fedd67ac5.scope uid:0) added; new total=1
Jul 07 15:06:33 ip-10-0-63-89 rpm-ostree[5215]: client(id:machine-config-operator dbus:1.149 unit:crio-70a105f9061333d7098020fd9a27314dcbcdc2df057fb2da4fd5862fedd67ac5.scope uid:0) vanished; remaining=0
Jul 07 15:06:33 ip-10-0-63-89 rpm-ostree[5215]: In idle state; will auto-exit in 63 seconds
Jul 07 15:07:34 ip-10-0-63-89 rpm-ostree[5215]: client(id:machine-config-operator dbus:1.150 unit:crio-70a105f9061333d7098020fd9a27314dcbcdc2df057fb2da4fd5862fedd67ac5.scope uid:0) added; new total=1
Jul 07 15:07:34 ip-10-0-63-89 rpm-ostree[5215]: client(id:machine-config-operator dbus:1.150 unit:crio-70a105f9061333d7098020fd9a27314dcbcdc2df057fb2da4fd5862fedd67ac5.scope uid:0) vanished; remaining=0
Jul 07 15:07:34 ip-10-0-63-89 rpm-ostree[5215]: In idle state; will auto-exit in 62 seconds
Jul 07 15:08:34 ip-10-0-63-89 rpm-ostree[5215]: client(id:machine-config-operator dbus:1.151 unit:crio-70a105f9061333d7098020fd9a27314dcbcdc2df057fb2da4fd5862fedd67ac5.scope uid:0) added; new total=1
Jul 07 15:08:34 ip-10-0-63-89 rpm-ostree[5215]: client(id:machine-config-operator dbus:1.151 unit:crio-70a105f9061333d7098020fd9a27314dcbcdc2df057fb2da4fd5862fedd67ac5.scope uid:0) vanished; remaining=0
Jul 07 15:08:34 ip-10-0-63-89 rpm-ostree[5215]: In idle state; will auto-exit in 61 seconds
Jul 07 15:09:34 ip-10-0-63-89 rpm-ostree[5215]: client(id:machine-config-operator dbus:1.156 unit:crio-70a105f9061333d7098020fd9a27314dcbcdc2df057fb2da4fd5862fedd67ac5.scope uid:0) added; new total=1
Jul 07 15:09:34 ip-10-0-63-89 rpm-ostree[5215]: client(id:machine-config-operator dbus:1.156 unit:crio-70a105f9061333d7098020fd9a27314dcbcdc2df057fb2da4fd5862fedd67ac5.scope uid:0) vanished; remaining=0
Jul 07 15:09:34 ip-10-0-63-89 rpm-ostree[5215]: In idle state; will auto-exit in 63 seconds
Jul 07 15:10:34 ip-10-0-63-89 rpm-ostree[5215]: client(id:machine-config-operator dbus:1.161 unit:crio-70a105f9061333d7098020fd9a27314dcbcdc2df057fb2da4fd5862fedd67ac5.scope uid:0) added; new total=1
Jul 07 15:10:34 ip-10-0-63-89 rpm-ostree[5215]: client(id:machine-config-operator dbus:1.161 unit:crio-70a105f9061333d7098020fd9a27314dcbcdc2df057fb2da4fd5862fedd67ac5.scope uid:0) vanished; remaining=0
Jul 07 15:10:34 ip-10-0-63-89 rpm-ostree[5215]: In idle state; will auto-exit in 64 seconds
Jul 07 15:11:34 ip-10-0-63-89 rpm-ostree[5215]: client(id:machine-config-operator dbus:1.162 unit:crio-70a105f9061333d7098020fd9a27314dcbcdc2df057fb2da4fd5862fedd67ac5.scope uid:0) added; new total=1
Jul 07 15:11:34 ip-10-0-63-89 rpm-ostree[5215]: client(id:machine-config-operator dbus:1.162 unit:crio-70a105f9061333d7098020fd9a27314dcbcdc2df057fb2da4fd5862fedd67ac5.scope uid:0) vanished; remaining=0
Jul 07 15:11:34 ip-10-0-63-89 rpm-ostree[5215]: In idle state; will auto-exit in 60 seconds
Jul 07 15:12:34 ip-10-0-63-89 rpm-ostree[5215]: client(id:machine-config-operator dbus:1.163 unit:crio-70a105f9061333d7098020fd9a27314dcbcdc2df057fb2da4fd5862fedd67ac5.scope uid:0) added; new total=1
Jul 07 15:12:34 ip-10-0-63-89 rpm-ostree[5215]: client(id:machine-config-operator dbus:1.163 unit:crio-70a105f9061333d7098020fd9a27314dcbcdc2df057fb2da4fd5862fedd67ac5.scope uid:0) vanished; remaining=0
Jul 07 15:12:34 ip-10-0-63-89 rpm-ostree[5215]: In idle state; will auto-exit in 61 seconds
Jul 07 15:13:34 ip-10-0-63-89 rpm-ostree[5215]: client(id:machine-config-operator dbus:1.164 unit:crio-70a105f9061333d7098020fd9a27314dcbcdc2df057fb2da4fd5862fedd67ac5.scope uid:0) added; new total=1
Jul 07 15:13:34 ip-10-0-63-89 rpm-ostree[5215]: client(id:machine-config-operator dbus:1.164 unit:crio-70a105f9061333d7098020fd9a27314dcbcdc2df057fb2da4fd5862fedd67ac5.scope uid:0) vanished; remaining=0
Jul 07 15:13:34 ip-10-0-63-89 rpm-ostree[5215]: In idle state; will auto-exit in 60 seconds
Jul 07 15:14:34 ip-10-0-63-89 rpm-ostree[5215]: client(id:machine-config-operator dbus:1.165 unit:crio-70a105f9061333d7098020fd9a27314dcbcdc2df057fb2da4fd5862fedd67ac5.scope uid:0) added; new total=1
Jul 07 15:14:34 ip-10-0-63-89 rpm-ostree[5215]: client(id:machine-config-operator dbus:1.165 unit:crio-70a105f9061333d7098020fd9a27314dcbcdc2df057fb2da4fd5862fedd67ac5.scope uid:0) vanished; remaining=0
Jul 07 15:14:34 ip-10-0-63-89 rpm-ostree[5215]: In idle state; will auto-exit in 60 seconds
Jul 07 15:15:34 ip-10-0-63-89 rpm-ostree[5215]: In idle state; will auto-exit in 62 seconds
Jul 07 15:15:34 ip-10-0-63-89 systemd[1]: rpm-ostreed.service: Succeeded.
Jul 07 15:15:34 ip-10-0-63-89 systemd[1]: rpm-ostreed.service: Consumed 130ms CPU time
Jul 07 15:15:34 ip-10-0-63-89 systemd[1]: Starting rpm-ostree System Management Daemon...
Jul 07 15:15:34 ip-10-0-63-89 rpm-ostree[11450]: Reading config file '/etc/rpm-ostreed.conf'
Jul 07 15:15:34 ip-10-0-63-89 rpm-ostree[11450]: In idle state; will auto-exit in 64 seconds
Jul 07 15:15:34 ip-10-0-63-89 systemd[1]: Started rpm-ostree System Management Daemon.
Jul 07 15:15:34 ip-10-0-63-89 rpm-ostree[11450]: client(id:machine-config-operator dbus:1.167 unit:crio-70a105f9061333d7098020fd9a27314dcbcdc2df057fb2da4fd5862fedd67ac5.scope uid:0) added; new total=1
Jul 07 15:15:34 ip-10-0-63-89 rpm-ostree[11450]: client(id:machine-config-operator dbus:1.167 unit:crio-70a105f9061333d7098020fd9a27314dcbcdc2df057fb2da4fd5862fedd67ac5.scope uid:0) vanished; remaining=0
Jul 07 15:15:34 ip-10-0-63-89 rpm-ostree[11450]: In idle state; will auto-exit in 61 seconds
Jul 07 15:16:35 ip-10-0-63-89 rpm-ostree[11450]: client(id:machine-config-operator dbus:1.168 unit:crio-70a105f9061333d7098020fd9a27314dcbcdc2df057fb2da4fd5862fedd67ac5.scope uid:0) added; new total=1
Jul 07 15:16:35 ip-10-0-63-89 rpm-ostree[11450]: client(id:machine-config-operator dbus:1.168 unit:crio-70a105f9061333d7098020fd9a27314dcbcdc2df057fb2da4fd5862fedd67ac5.scope uid:0) vanished; remaining=0
Jul 07 15:16:35 ip-10-0-63-89 rpm-ostree[11450]: In idle state; will auto-exit in 63 seconds
Jul 07 15:17:35 ip-10-0-63-89 rpm-ostree[11450]: client(id:machine-config-operator dbus:1.169 unit:crio-70a105f9061333d7098020fd9a27314dcbcdc2df057fb2da4fd5862fedd67ac5.scope uid:0) added; new total=1
Jul 07 15:17:35 ip-10-0-63-89 rpm-ostree[11450]: client(id:machine-config-operator dbus:1.169 unit:crio-70a105f9061333d7098020fd9a27314dcbcdc2df057fb2da4fd5862fedd67ac5.scope uid:0) vanished; remaining=0
Jul 07 15:17:35 ip-10-0-63-89 rpm-ostree[11450]: In idle state; will auto-exit in 61 seconds
Jul 07 15:18:35 ip-10-0-63-89 rpm-ostree[11450]: client(id:machine-config-operator dbus:1.170 unit:crio-70a105f9061333d7098020fd9a27314dcbcdc2df057fb2da4fd5862fedd67ac5.scope uid:0) added; new total=1
Jul 07 15:18:35 ip-10-0-63-89 rpm-ostree[11450]: client(id:machine-config-operator dbus:1.170 unit:crio-70a105f9061333d7098020fd9a27314dcbcdc2df057fb2da4fd5862fedd67ac5.scope uid:0) vanished; remaining=0
Jul 07 15:18:35 ip-10-0-63-89 rpm-ostree[11450]: In idle state; will auto-exit in 61 seconds
Jul 07 15:19:35 ip-10-0-63-89 rpm-ostree[11450]: client(id:machine-config-operator dbus:1.171 unit:crio-70a105f9061333d7098020fd9a27314dcbcdc2df057fb2da4fd5862fedd67ac5.scope uid:0) added; new total=1
Jul 07 15:19:35 ip-10-0-63-89 rpm-ostree[11450]: client(id:machine-config-operator dbus:1.171 unit:crio-70a105f9061333d7098020fd9a27314dcbcdc2df057fb2da4fd5862fedd67ac5.scope uid:0) vanished; remaining=0
Jul 07 15:19:35 ip-10-0-63-89 rpm-ostree[11450]: In idle state; will auto-exit in 62 seconds
Jul 07 15:20:35 ip-10-0-63-89 rpm-ostree[11450]: client(id:machine-config-operator dbus:1.172 unit:crio-70a105f9061333d7098020fd9a27314dcbcdc2df057fb2da4fd5862fedd67ac5.scope uid:0) added; new total=1
Jul 07 15:20:35 ip-10-0-63-89 rpm-ostree[11450]: client(id:machine-config-operator dbus:1.172 unit:crio-70a105f9061333d7098020fd9a27314dcbcdc2df057fb2da4fd5862fedd67ac5.scope uid:0) vanished; remaining=0
Jul 07 15:20:35 ip-10-0-63-89 rpm-ostree[11450]: In idle state; will auto-exit in 63 seconds
Jul 07 15:21:35 ip-10-0-63-89 rpm-ostree[11450]: client(id:machine-config-operator dbus:1.173 unit:crio-70a105f9061333d7098020fd9a27314dcbcdc2df057fb2da4fd5862fedd67ac5.scope uid:0) added; new total=1
Jul 07 15:21:35 ip-10-0-63-89 rpm-ostree[11450]: client(id:machine-config-operator dbus:1.173 unit:crio-70a105f9061333d7098020fd9a27314dcbcdc2df057fb2da4fd5862fedd67ac5.scope uid:0) vanished; remaining=0
Jul 07 15:21:35 ip-10-0-63-89 rpm-ostree[11450]: In idle state; will auto-exit in 63 seconds
Jul 07 15:22:35 ip-10-0-63-89 rpm-ostree[11450]: client(id:machine-config-operator dbus:1.174 unit:crio-70a105f9061333d7098020fd9a27314dcbcdc2df057fb2da4fd5862fedd67ac5.scope uid:0) added; new total=1
Jul 07 15:22:35 ip-10-0-63-89 rpm-ostree[11450]: client(id:machine-config-operator dbus:1.174 unit:crio-70a105f9061333d7098020fd9a27314dcbcdc2df057fb2da4fd5862fedd67ac5.scope uid:0) vanished; remaining=0
Jul 07 15:22:35 ip-10-0-63-89 rpm-ostree[11450]: In idle state; will auto-exit in 64 seconds
Jul 07 15:23:35 ip-10-0-63-89 rpm-ostree[11450]: client(id:machine-config-operator dbus:1.175 unit:crio-70a105f9061333d7098020fd9a27314dcbcdc2df057fb2da4fd5862fedd67ac5.scope uid:0) added; new total=1
Jul 07 15:23:35 ip-10-0-63-89 rpm-ostree[11450]: client(id:machine-config-operator dbus:1.175 unit:crio-70a105f9061333d7098020fd9a27314dcbcdc2df057fb2da4fd5862fedd67ac5.scope uid:0) vanished; remaining=0
Jul 07 15:23:35 ip-10-0-63-89 rpm-ostree[11450]: In idle state; will auto-exit in 63 seconds
Jul 07 15:24:35 ip-10-0-63-89 rpm-ostree[11450]: client(id:machine-config-operator dbus:1.176 unit:crio-70a105f9061333d7098020fd9a27314dcbcdc2df057fb2da4fd5862fedd67ac5.scope uid:0) added; new total=1
Jul 07 15:24:35 ip-10-0-63-89 rpm-ostree[11450]: client(id:machine-config-operator dbus:1.176 unit:crio-70a105f9061333d7098020fd9a27314dcbcdc2df057fb2da4fd5862fedd67ac5.scope uid:0) vanished; remaining=0
Jul 07 15:24:35 ip-10-0-63-89 rpm-ostree[11450]: In idle state; will auto-exit in 63 seconds
> but this subsequent error doesn't overwrite the original issue and as such we are left in the old state reporting the wrong error.

That part looks like an MCO bug. The underlying triggering problem, though, is https://bugzilla.redhat.com/show_bug.cgi?id=2108320
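For the triggering rpm-ostreed failure, the journal in the original report shows the unit failing with result 'start-limit-hit'. As a hedged diagnostic sketch (the `check_start_limit` helper is hypothetical, not part of any product tooling), one can detect that condition from `systemctl show` key=value output and then clear it:

```shell
#!/usr/bin/env bash
# check_start_limit: read `systemctl show <unit> -p Result` style
# key=value output on stdin and succeed only if the unit failed
# because it hit systemd's start rate limit.
check_start_limit() {
  grep -q '^Result=start-limit-hit$'
}

# Illustrative use on an affected node (requires root):
#   systemctl show rpm-ostreed.service -p Result | check_start_limit \
#     && systemctl reset-failed rpm-ostreed.service
```

`systemctl reset-failed` clears the failed state and the start-rate counter so a subsequent start request is not refused; whether the MCD should ever do this itself is a separate design question.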
Ok, renaming this to track the other issue. The issue here is that even after you fix one error, if another error occurs, the MCD won't make progress and ALSO won't report the new error up the stack; it shows up only in the MCD logs, which often causes confusion.
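On question 2 (should we retry): one option, sketched here as a generic shell helper rather than as the MCD's actual Go code, is to retry the rpm-ostree invocation a few times before marking the node degraded. The `retry_cmd` name and the attempt/delay values are illustrative assumptions:

```shell
#!/usr/bin/env bash
# retry_cmd ATTEMPTS DELAY CMD...: run CMD up to ATTEMPTS times,
# sleeping DELAY seconds after each failure; return 0 on the first
# success, 1 if every attempt fails.
retry_cmd() {
  local attempts=$1 delay=$2 i
  shift 2
  for ((i = 1; i <= attempts; i++)); do
    "$@" && return 0
    sleep "$delay"
  done
  return 1
}

# Illustrative use against the failing call from the report:
#   retry_cmd 3 5 rpm-ostree kargs
```

This would paper over a daemon that is merely slow to start, at the cost of delaying a genuine degrade by attempts × delay seconds.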
Verified using IPI on AWS version:

$ oc get clusterversion
NAME      VERSION                              AVAILABLE   PROGRESSING   SINCE   STATUS
version   4.13.0-0.nightly-2023-02-07-064924   True        False         178m    Cluster version is 4.13.0-0.nightly-2023-02-07-064924

Steps:

1. Deploy 2 MCs to create 2 test files in the nodes:

apiVersion: machineconfiguration.openshift.io/v1
kind: MachineConfig
metadata:
  labels:
    machineconfiguration.openshift.io/role: worker
  name: test-machine-config
spec:
  config:
    ignition:
      security: {}
      version: 3.1.0
    storage:
      files:
      - contents:
          source: data:text/plain;charset=utf-8;base64,dGVzdA==
        mode: 420
        path: /etc/test-file.test

and

apiVersion: machineconfiguration.openshift.io/v1
kind: MachineConfig
metadata:
  labels:
    machineconfiguration.openshift.io/role: worker
  name: test-machine-config-2
spec:
  config:
    ignition:
      security: {}
      version: 3.1.0
    storage:
      files:
      - contents:
          source: data:text/plain;charset=utf-8;base64,dGVzdA==
        mode: 420
        path: /etc/test-file-2.test

2. Edit file 1 and check the degraded reason in the MCP:

- lastTransitionTime: "2023-02-08T13:09:54Z"
  message: 'Node ip-10-0-130-16.us-east-2.compute.internal is reporting: "content mismatch for file \"/etc/test-file.test\""'
  reason: 1 nodes are reporting degraded status on sync
  status: "True"
  type: NodeDegraded

3. Fix file 1:

- lastTransitionTime: "2023-02-08T13:18:09Z"
  message: ""
  reason: ""
  status: "False"
  type: NodeDegraded

4. Edit file 2 and check the degraded reason in the MCP:

- lastTransitionTime: "2023-02-08T13:19:18Z"
  message: 'Node ip-10-0-130-16.us-east-2.compute.internal is reporting: "content mismatch for file \"/etc/test-file-2.test\""'
  reason: 1 nodes are reporting degraded status on sync
  status: "True"
  type: NodeDegraded

5. Edit file 1 again and check the degraded reason in the MCP:

- lastTransitionTime: "2023-02-08T13:19:18Z"
  message: 'Node ip-10-0-130-16.us-east-2.compute.internal is reporting: "content mismatch for file \"/etc/test-file.test\""'
  reason: 1 nodes are reporting degraded status on sync
  status: "True"
  type: NodeDegraded

We move the status to VERIFIED.
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory (Important: OpenShift Container Platform 4.13.0 security update), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHSA-2023:1326