+++ This bug was initially created as a clone of Bug #1895360 +++

Steps To Reproduce:
1. Create a systemd dropin via a file (a storage.files entry).
2. Realize that it's suboptimal and convert it into a unit dropin.
3. Ignition will rewrite the file, but MCD's deleteStaleData will remove it, since it's no longer in .Storage.Files.
4. The machine reboots and MCD complains that the file is not found.

This is blocking the OKD 4.5 -> 4.6 upgrade: in 4.5 we placed the kubelet MCO dropin for proxy setup via storage.files, and in 4.6 it's replaced by a unit dropin.
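For illustration, the failure in step 3 can be sketched in a few lines of Python. This is not the MCD's actual code (the MCO is written in Go, and deleteStaleData does more than this). The function names and the set-difference model are assumptions, made only to show why the path disappears and one way the comparison could account for unit dropins. The example path is borrowed from the verification steps in this bug.

```python
def dropin_path(unit, dropin):
    """Path of an /etc drop-in file for the given systemd unit."""
    return f"/etc/systemd/system/{unit}.d/{dropin}"


def stale_files_naive(old_files, new_files):
    """Model of the bug: anything missing from the new storage.files list is stale."""
    return sorted(set(old_files) - set(new_files))


def stale_files_dropin_aware(old_files, new_files, new_dropins):
    """Hypothetical variant: paths owned by unit drop-ins in the new config stay managed."""
    still_managed = set(new_files) | {dropin_path(u, d) for u, d in new_dropins}
    return sorted(set(old_files) - still_managed)


path = "/etc/systemd/system/machine-config-daemon-firstboot.service.d/override.conf"

old_files = [path]  # old config: drop-in shipped via storage.files
new_files = []      # new config: the storage.files entry is gone...
# ...and the same path is now declared as a unit drop-in instead
new_dropins = [("machine-config-daemon-firstboot.service", "override.conf")]

print(stale_files_naive(old_files, new_files))
print(stale_files_dropin_aware(old_files, new_files, new_dropins))
```

The naive comparison reports the override.conf path as stale even though the new config still writes it, which matches the "file is not found" symptom after reboot. The dropin-aware comparison reports nothing to delete.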
Verified on 4.6.0-0.nightly-2020-11-28-204928

$ cat << EOF | oc create -f -
> apiVersion: machineconfiguration.openshift.io/v1
> kind: MachineConfig
> metadata:
>   labels:
>     machineconfiguration.openshift.io/role: worker
>   name: drop-in-file
> spec:
>   config:
>     ignition:
>       version: 3.1.0
>     storage:
>       files:
>       - contents:
>           source: data:text/plain;charset=utf;base64,W1VuaXRdCg==
>         filesystem: root
>         mode: 0644
>         path: /etc/systemd/system/machine-config-daemon-firstboot.service.d/override.conf
> EOF
machineconfig.machineconfiguration.openshift.io/drop-in-file created
$ oc get mc
NAME                                               GENERATEDBYCONTROLLER                      IGNITIONVERSION   AGE
00-master                                          5a9e6b4eedaf72ecc2534355173843e011f19765   3.1.0             3h54m
00-worker                                          5a9e6b4eedaf72ecc2534355173843e011f19765   3.1.0             3h54m
01-master-container-runtime                        5a9e6b4eedaf72ecc2534355173843e011f19765   3.1.0             3h54m
01-master-kubelet                                  5a9e6b4eedaf72ecc2534355173843e011f19765   3.1.0             3h54m
01-worker-container-runtime                        5a9e6b4eedaf72ecc2534355173843e011f19765   3.1.0             3h54m
01-worker-kubelet                                  5a9e6b4eedaf72ecc2534355173843e011f19765   3.1.0             3h54m
99-master-generated-registries                     5a9e6b4eedaf72ecc2534355173843e011f19765   3.1.0             3h54m
99-master-ssh                                                                                 3.1.0             4h4m
99-worker-generated-registries                     5a9e6b4eedaf72ecc2534355173843e011f19765   3.1.0             3h54m
99-worker-ssh                                                                                 3.1.0             4h4m
drop-in-file                                                                                  3.1.0             3s
rendered-master-68cfb783ca9fea66121a140662b1eecd   5a9e6b4eedaf72ecc2534355173843e011f19765   3.1.0             3h54m
rendered-worker-e2825f140e84a200a54950101efa3790   5a9e6b4eedaf72ecc2534355173843e011f19765   3.1.0             20m
rendered-worker-eb2b3c116a90d6d1cc32431a7d060211   5a9e6b4eedaf72ecc2534355173843e011f19765   3.1.0             3h54m
test-file                                                                                     3.1.0             20m
$ oc get mcp/worker
NAME     CONFIG                                             UPDATED   UPDATING   DEGRADED   MACHINECOUNT   READYMACHINECOUNT   UPDATEDMACHINECOUNT   DEGRADEDMACHINECOUNT   AGE
worker   rendered-worker-e2825f140e84a200a54950101efa3790   False     True       False      4              0                   0                     0                      3h56m
$ oc -n openshift-machine-api get machinesets
NAME                                DESIRED   CURRENT   READY   AVAILABLE   AGE
mnguyen46-mqt96-worker-us-west-2a   2         2         1       1           4h5m
mnguyen46-mqt96-worker-us-west-2b   1         1         1       1           4h5m
mnguyen46-mqt96-worker-us-west-2c   1         1         1       1           4h5m
mnguyen46-mqt96-worker-us-west-2d   0         0                             4h5m
$ oc -n openshift-machine-api scale --replicas=1 machinesets/mnguyen46-mqt96-worker-us-west-2a
machineset.machine.openshift.io/mnguyen46-mqt96-worker-us-west-2a scaled
$ watch oc get mcp/worker
$ oc get nodes
NAME                                         STATUS   ROLES    AGE     VERSION
ip-10-0-128-247.us-west-2.compute.internal   Ready    worker   23m     v1.19.0+6d3423a
ip-10-0-137-98.us-west-2.compute.internal    Ready    master   4h7m    v1.19.0+6d3423a
ip-10-0-166-100.us-west-2.compute.internal   Ready    worker   3h57m   v1.19.0+6d3423a
ip-10-0-183-163.us-west-2.compute.internal   Ready    master   4h6m    v1.19.0+6d3423a
ip-10-0-207-100.us-west-2.compute.internal   Ready    master   4h7m    v1.19.0+6d3423a
ip-10-0-219-169.us-west-2.compute.internal   Ready    worker   3h58m   v1.19.0+6d3423a
$ oc debug node/ip-10-0-128-247.us-west-2.compute.internal
Starting pod/ip-10-0-128-247us-west-2computeinternal-debug ...
To use host binaries, run `chroot /host`
If you don't see a command prompt, try pressing enter.
sh-4.2# chroot /host
sh-4.4# cat /etc/systemd/system/machine-config-daemon-firstboot.service.d/override.conf
[Unit]
sh-4.4# exit
exit
sh-4.2# exit
exit
Removing debug pod ...
$ oc edit mc/drop-in-file
Edit cancelled, no changes made.
$ oc get mc/drop-in-file -o yaml
apiVersion: machineconfiguration.openshift.io/v1
kind: MachineConfig
metadata:
  creationTimestamp: "2020-11-30T22:32:06Z"
  generation: 1
  labels:
    machineconfiguration.openshift.io/role: worker
  managedFields:
  - apiVersion: machineconfiguration.openshift.io/v1
    fieldsType: FieldsV1
    fieldsV1:
      f:metadata:
        f:labels:
          .: {}
          f:machineconfiguration.openshift.io/role: {}
      f:spec:
        .: {}
        f:config:
          .: {}
          f:ignition:
            .: {}
            f:version: {}
          f:storage:
            .: {}
            f:files: {}
    manager: oc
    operation: Update
    time: "2020-11-30T22:32:06Z"
  name: drop-in-file
  resourceVersion: "89283"
  selfLink: /apis/machineconfiguration.openshift.io/v1/machineconfigs/drop-in-file
  uid: 2c72ab85-7185-403d-b454-4c99552af2ce
spec:
  config:
    ignition:
      version: 3.1.0
    storage:
      files:
      - contents:
          source: data:text/plain;charset=utf;base64,W1VuaXRdCg==
        filesystem: root
        mode: 420
        path: /etc/systemd/system/machine-config-daemon-firstboot.service.d/override.conf
$ oc edit mc/drop-in-file
machineconfig.machineconfiguration.openshift.io/drop-in-file edited
$ oc get mcp/worker
NAME     CONFIG                                             UPDATED   UPDATING   DEGRADED   MACHINECOUNT   READYMACHINECOUNT   UPDATEDMACHINECOUNT   DEGRADEDMACHINECOUNT   AGE
worker   rendered-worker-684407152e08796d1e7460d0f6f3d5e3   True      False      False      3              3                   3                     0                      4h7m
$ oc get mcp/worker
NAME     CONFIG                                             UPDATED   UPDATING   DEGRADED   MACHINECOUNT   READYMACHINECOUNT   UPDATEDMACHINECOUNT   DEGRADEDMACHINECOUNT   AGE
worker   rendered-worker-684407152e08796d1e7460d0f6f3d5e3   True      False      False      3              3                   3                     0                      4h7m
$ watch oc get mcp/worker
$ oc get mcp/worker
NAME     CONFIG                                             UPDATED   UPDATING   DEGRADED   MACHINECOUNT   READYMACHINECOUNT   UPDATEDMACHINECOUNT   DEGRADEDMACHINECOUNT   AGE
worker   rendered-worker-684407152e08796d1e7460d0f6f3d5e3   False     True       False      3              0                   0                     0                      4h7m
$ watch oc get mcp/worker
$ oc get mcp/worker
NAME     CONFIG                                             UPDATED   UPDATING   DEGRADED   MACHINECOUNT   READYMACHINECOUNT   UPDATEDMACHINECOUNT   DEGRADEDMACHINECOUNT   AGE
worker   rendered-worker-e92d2a9987b13a846852737d77c0d1bb   True      False      False      3              3                   3                     0                      4h23m
$ oc debug node/ip-10-0-128-247.us-west-2.compute.internal
Starting pod/ip-10-0-128-247us-west-2computeinternal-debug ...
To use host binaries, run `chroot /host`
If you don't see a command prompt, try pressing enter.
sh-4.2# chroot /host
sh-4.4# cat /etc/systemd/system/machine-config-daemon-firstboot.service.d/override.conf
[Unit]
sh-4.4# exit
exit
sh-4.2# exit
exit
Removing debug pod ...
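As a cross-check on the transcript, the `source` data URL in the MachineConfig should decode to the `[Unit]` line that `cat` prints on the node, and `mode: 0644` in the submitted YAML should correspond to `mode: 420` in the `oc get ... -o yaml` output (octal versus decimal). A minimal Python sketch of both checks follows; `decode_data_url` is a hypothetical helper written for this note, not part of any OpenShift or Ignition tooling, and it handles only the simple data URL forms used here.

```python
import base64
from urllib.parse import unquote_to_bytes


def decode_data_url(source):
    """Decode the payload of a simple 'data:[<mediatype>][;base64],<data>' URL."""
    header, sep, payload = source.partition(",")
    if not sep or not header.startswith("data:"):
        raise ValueError("not a data URL")
    if header.endswith(";base64"):
        return base64.b64decode(payload)
    # Without ';base64' the payload is percent-encoded text
    return unquote_to_bytes(payload)


source = "data:text/plain;charset=utf;base64,W1VuaXRdCg=="
print(decode_data_url(source))  # b'[Unit]\n'

# 0644 octal, as typed in the heredoc, is 420 decimal, as shown by `oc get -o yaml`
print(0o644)  # 420
```

Both values agree with what the node shows before and after the MachineConfig edit.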
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory (Moderate: OpenShift Container Platform 4.6.8 security and bug fix update), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHSA-2020:5259