Bug 1899735 - Machine Config Daemon removes a file although its defined in the dropin
Summary: Machine Config Daemon removes a file although its defined in the dropin
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: Machine Config Operator
Version: 4.7
Hardware: Unspecified
OS: Unspecified
Priority: high
Severity: high
Target Milestone: ---
Target Release: 4.6.z
Assignee: Vadim Rutkovsky
QA Contact: Michael Nguyen
URL:
Whiteboard:
Depends On: 1895360
Blocks:
Reported: 2020-11-19 21:15 UTC by OpenShift BugZilla Robot
Modified: 2020-12-14 13:51 UTC

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2020-12-14 13:50:58 UTC
Target Upstream Version:
Embargoed:



Links
System ID Private Priority Status Summary Last Updated
Github openshift machine-config-operator pull 2246 0 None closed [release-4.6] Bug 1899735: pkg/daemon: don't delete a file if its replaced with a dropin 2020-11-30 22:30:20 UTC
Red Hat Product Errata RHSA-2020:5259 0 None None None 2020-12-14 13:51:16 UTC

Description OpenShift BugZilla Robot 2020-11-19 21:15:50 UTC
+++ This bug was initially created as a clone of Bug #1895360 +++

Steps To Reproduce:

1. Create a systemd dropin via a file (an entry in storage.files).
2. Realize that it's suboptimal and convert it into a unit dropin.
3. Ignition will rewrite the file, but MCD's deleteStaleData will then remove the file, since it's no longer in .Storage.Files.
4. The machine reboots and MCD complains that the file is not found.

This is blocking the OKD 4.5 -> 4.6 upgrade: in 4.5 we placed the kubelet MCO dropin for proxy setup via storage.files, and in 4.6 it's replaced by a unit dropin.

Comment 3 Michael Nguyen 2020-11-30 23:02:07 UTC
Verified on 4.6.0-0.nightly-2020-11-28-204928

$ cat << EOF | oc create -f -
> apiVersion: machineconfiguration.openshift.io/v1
> kind: MachineConfig
> metadata:
>   labels:
>     machineconfiguration.openshift.io/role: worker
>   name: drop-in-file
> spec:
>   config:
>     ignition:
>       version: 3.1.0
>     storage:
>       files:
>       - contents:
>           source: data:text/plain;charset=utf;base64,W1VuaXRdCg==
>         filesystem: root
>         mode: 0644
>         path: /etc/systemd/system/machine-config-daemon-firstboot.service.d/override.conf
> EOF
machineconfig.machineconfiguration.openshift.io/drop-in-file created
$ oc get mc
NAME                                               GENERATEDBYCONTROLLER                      IGNITIONVERSION   AGE
00-master                                          5a9e6b4eedaf72ecc2534355173843e011f19765   3.1.0             3h54m
00-worker                                          5a9e6b4eedaf72ecc2534355173843e011f19765   3.1.0             3h54m
01-master-container-runtime                        5a9e6b4eedaf72ecc2534355173843e011f19765   3.1.0             3h54m
01-master-kubelet                                  5a9e6b4eedaf72ecc2534355173843e011f19765   3.1.0             3h54m
01-worker-container-runtime                        5a9e6b4eedaf72ecc2534355173843e011f19765   3.1.0             3h54m
01-worker-kubelet                                  5a9e6b4eedaf72ecc2534355173843e011f19765   3.1.0             3h54m
99-master-generated-registries                     5a9e6b4eedaf72ecc2534355173843e011f19765   3.1.0             3h54m
99-master-ssh                                                                                 3.1.0             4h4m
99-worker-generated-registries                     5a9e6b4eedaf72ecc2534355173843e011f19765   3.1.0             3h54m
99-worker-ssh                                                                                 3.1.0             4h4m
drop-in-file                                                                                  3.1.0             3s
rendered-master-68cfb783ca9fea66121a140662b1eecd   5a9e6b4eedaf72ecc2534355173843e011f19765   3.1.0             3h54m
rendered-worker-e2825f140e84a200a54950101efa3790   5a9e6b4eedaf72ecc2534355173843e011f19765   3.1.0             20m
rendered-worker-eb2b3c116a90d6d1cc32431a7d060211   5a9e6b4eedaf72ecc2534355173843e011f19765   3.1.0             3h54m
test-file                                                                                     3.1.0             20m
$ oc get mcp/worker
NAME     CONFIG                                             UPDATED   UPDATING   DEGRADED   MACHINECOUNT   READYMACHINECOUNT   UPDATEDMACHINECOUNT   DEGRADEDMACHINECOUNT   AGE
worker   rendered-worker-e2825f140e84a200a54950101efa3790   False     True       False      4              0                   0                     0                      3h56m
$ oc -n openshift-machine-api get machinesets
NAME                                DESIRED   CURRENT   READY   AVAILABLE   AGE
mnguyen46-mqt96-worker-us-west-2a   2         2         1       1           4h5m
mnguyen46-mqt96-worker-us-west-2b   1         1         1       1           4h5m
mnguyen46-mqt96-worker-us-west-2c   1         1         1       1           4h5m
mnguyen46-mqt96-worker-us-west-2d   0         0                             4h5m
$ oc -n openshift-machine-api scale --replicas=1 machinesets/mnguyen46-mqt96-worker-us-west-2a
machineset.machine.openshift.io/mnguyen46-mqt96-worker-us-west-2a scaled
$ watch oc get mcp/worker
$ oc get nodes
NAME                                         STATUS   ROLES    AGE     VERSION
ip-10-0-128-247.us-west-2.compute.internal   Ready    worker   23m     v1.19.0+6d3423a
ip-10-0-137-98.us-west-2.compute.internal    Ready    master   4h7m    v1.19.0+6d3423a
ip-10-0-166-100.us-west-2.compute.internal   Ready    worker   3h57m   v1.19.0+6d3423a
ip-10-0-183-163.us-west-2.compute.internal   Ready    master   4h6m    v1.19.0+6d3423a
ip-10-0-207-100.us-west-2.compute.internal   Ready    master   4h7m    v1.19.0+6d3423a
ip-10-0-219-169.us-west-2.compute.internal   Ready    worker   3h58m   v1.19.0+6d3423a
$ oc debug node/ip-10-0-128-247.us-west-2.compute.internal
Starting pod/ip-10-0-128-247us-west-2computeinternal-debug ...
To use host binaries, run `chroot /host`
If you don't see a command prompt, try pressing enter.
sh-4.2# chroot /host
sh-4.4# cat /etc/systemd/system/machine-config-daemon-firstboot.service.d/override.conf 
[Unit]
sh-4.4# exit
exit
sh-4.2# exit
exit

Removing debug pod ...
$ oc edit mc/drop-in-file
Edit cancelled, no changes made.
$ oc get mc/drop-in-file -o yaml
apiVersion: machineconfiguration.openshift.io/v1
kind: MachineConfig
metadata:
  creationTimestamp: "2020-11-30T22:32:06Z"
  generation: 1
  labels:
    machineconfiguration.openshift.io/role: worker
  managedFields:
  - apiVersion: machineconfiguration.openshift.io/v1
    fieldsType: FieldsV1
    fieldsV1:
      f:metadata:
        f:labels:
          .: {}
          f:machineconfiguration.openshift.io/role: {}
      f:spec:
        .: {}
        f:config:
          .: {}
          f:ignition:
            .: {}
            f:version: {}
          f:storage:
            .: {}
            f:files: {}
    manager: oc
    operation: Update
    time: "2020-11-30T22:32:06Z"
  name: drop-in-file
  resourceVersion: "89283"
  selfLink: /apis/machineconfiguration.openshift.io/v1/machineconfigs/drop-in-file
  uid: 2c72ab85-7185-403d-b454-4c99552af2ce
spec:
  config:
    ignition:
      version: 3.1.0
    storage:
      files:
      - contents:
          source: data:text/plain;charset=utf;base64,W1VuaXRdCg==
        filesystem: root
        mode: 420
        path: /etc/systemd/system/machine-config-daemon-firstboot.service.d/override.conf
$ oc edit mc/drop-in-file
machineconfig.machineconfiguration.openshift.io/drop-in-file edited
$ oc get mcp/worker
NAME     CONFIG                                             UPDATED   UPDATING   DEGRADED   MACHINECOUNT   READYMACHINECOUNT   UPDATEDMACHINECOUNT   DEGRADEDMACHINECOUNT   AGE
worker   rendered-worker-684407152e08796d1e7460d0f6f3d5e3   True      False      False      3              3                   3                     0                      4h7m
$ oc get mcp/worker
NAME     CONFIG                                             UPDATED   UPDATING   DEGRADED   MACHINECOUNT   READYMACHINECOUNT   UPDATEDMACHINECOUNT   DEGRADEDMACHINECOUNT   AGE
worker   rendered-worker-684407152e08796d1e7460d0f6f3d5e3   True      False      False      3              3                   3                     0                      4h7m
$ watch oc get mcp/worker
$ oc get mcp/worker
NAME     CONFIG                                             UPDATED   UPDATING   DEGRADED   MACHINECOUNT   READYMACHINECOUNT   UPDATEDMACHINECOUNT   DEGRADEDMACHINECOUNT   AGE
worker   rendered-worker-684407152e08796d1e7460d0f6f3d5e3   False     True       False      3              0                   0                     0                      4h7m
$ watch oc get mcp/worker
$ oc get mcp/worker
NAME     CONFIG                                             UPDATED   UPDATING   DEGRADED   MACHINECOUNT   READYMACHINECOUNT   UPDATEDMACHINECOUNT   DEGRADEDMACHINECOUNT   AGE
worker   rendered-worker-e92d2a9987b13a846852737d77c0d1bb   True      False      False      3              3                   3                     0                      4h23m
$ oc debug node/ip-10-0-128-247.us-west-2.compute.internal
Starting pod/ip-10-0-128-247us-west-2computeinternal-debug ...
To use host binaries, run `chroot /host`
If you don't see a command prompt, try pressing enter.
sh-4.2# chroot /host
sh-4.4# cat /etc/systemd/system/machine-config-daemon-firstboot.service.d/override.conf 
[Unit]
sh-4.4# exit
exit
sh-4.2# exit
exit

Removing debug pod ...

Comment 5 errata-xmlrpc 2020-12-14 13:50:58 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Moderate: OpenShift Container Platform 4.6.8 security and bug fix update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2020:5259

