Bug 1853890 - Machine config daemon does not preserve kernel arguments order
Summary: Machine config daemon does not preserve kernel arguments order
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: Machine Config Operator
Version: 4.6
Hardware: x86_64
OS: Linux
Priority: high
Severity: high
Target Milestone: ---
Target Release: 4.6.0
Assignee: Antonio Murdaca
QA Contact: Michael Nguyen
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2020-07-05 09:49 UTC by Artyom
Modified: 2020-10-27 16:12 UTC (History)

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2020-10-27 16:12:20 UTC
Target Upstream Version:
Embargoed:



Links
System | ID | Private | Priority | Status | Summary | Last Updated
Github | openshift machine-config-operator pull 1899 | 0 | None | closed | Bug 1853890: pkg/daemon: preserve kargs order | 2020-08-05 20:03:16 UTC
Red Hat Product Errata | RHBA-2020:4196 | 0 | None | None | None | 2020-10-27 16:12:53 UTC

Description Artyom 2020-07-05 09:49:20 UTC
Description of problem:
I have a MachineConfig CR with the following kernel arguments section:
"kernelArguments": [
                    "skew_tick=1",
                    "nohz=on",
                    "rcu_nocbs=1-3",
                    "tuned.non_isolcpus=00000001",
                    "intel_pstate=disable",
                    "nosoftlockup",
                    "tsc=nowatchdog",
                    "intel_iommu=on",
                    "iommu=pt",
                    "systemd.cpu_affinity=0",
                    "default_hugepagesz=1G",
                    "nmi_watchdog=0",
                    "audit=0",
                    "mce=off",
                    "processor.max_cstate=1",
                    "idle=poll",
                    "intel_idle.max_cstate=0"
                ],
but once the machine-config daemon runs the rpm-ostree kargs command, the order of the arguments is not preserved:

kargs --append=skew_tick=1 --append=tuned.non_isolcpus=00000001 --append=tsc=nowatchdog --append=processor.max_cstate=1 --append=rcu_nocbs=1-3 --append=intel_iommu=on --append=systemd.cpu_affinity=0 --append=nmi_watchdog=0 --append=mce=off --append=nohz=on --append=intel_pstate=disable --append=nosoftlockup --append=iommu=pt --append=default_hugepagesz=1G --append=audit=0 --append=idle=poll --append=intel_idle.max_cstate=0

This happens because the code builds maps from the kernel arguments (https://github.com/openshift/machine-config-operator/blob/7cc08753e1bfd10503df5bb7a730c8a977a5d204/pkg/daemon/update.go#L670), and a Go map does not preserve insertion order when it is iterated.
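
The ordering effect can be reproduced outside the daemon. The following is a minimal sketch, not the MCO code: `appendFlags` is a hypothetical helper, and the three arguments are taken from the list above. It contrasts ranging over a map, whose iteration order Go leaves unspecified, with ranging over the original slice.

```go
package main

import (
	"fmt"
	"strings"
)

// appendFlags is a hypothetical helper (not taken from the MCO source).
// It turns kernel arguments into rpm-ostree style --append flags,
// following the order of the input slice.
func appendFlags(kargs []string) []string {
	flags := make([]string, 0, len(kargs))
	for _, k := range kargs {
		flags = append(flags, "--append="+k)
	}
	return flags
}

func main() {
	kargs := []string{"skew_tick=1", "nohz=on", "rcu_nocbs=1-3"}

	// Copy the arguments into a map, as a stand-in for a map-based implementation.
	set := make(map[string]bool)
	for _, k := range kargs {
		set[k] = true
	}
	// Map iteration order is unspecified, so this line can differ between runs.
	for k := range set {
		fmt.Print("--append=", k, " ")
	}
	fmt.Println()

	// Ranging over the slice always follows the declared order.
	fmt.Println(strings.Join(appendFlags(kargs), " "))
}
```

The second printed line always lists the flags in the order they were declared; the first one may not.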



Version-Release number of selected component (if applicable):
master

How reproducible:
Always

Steps to Reproduce:
1. Create the machineconfig with the above kernel arguments and attach it to worker MCP.
2. Give the MCP some time to update.
3. Check the rpm-ostree command in the relevant machine-config daemon log.

Actual results:
The order is not preserved

Expected results:
The order should be preserved

Additional info:
**IMPORTANT**: I see another problem with the current map implementation: it can drop arguments when several of them share the same name but have different values. For example, you can specify multiple hugepage sizes:
hugepagesz=1G hugepages=4 hugepagesz=2M hugepages=1024
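
Whether arguments are actually lost depends on what the map uses as its key. As a hedged illustration only, the sketch below assumes a map keyed by the text before the first `=`; `byName` is a hypothetical function, not something from the MCO source. Under that assumption, a later value overwrites an earlier one with the same name.

```go
package main

import (
	"fmt"
	"strings"
)

// byName is a hypothetical illustration: it indexes kernel arguments by the
// text before the first "=", so a repeated name keeps only its last value.
func byName(kargs []string) map[string]string {
	m := make(map[string]string)
	for _, k := range kargs {
		parts := strings.SplitN(k, "=", 2)
		value := ""
		if len(parts) == 2 {
			value = parts[1]
		}
		m[parts[0]] = value
	}
	return m
}

func main() {
	kargs := strings.Fields("hugepagesz=1G hugepages=4 hugepagesz=2M hugepages=1024")
	fmt.Println(len(kargs), "arguments in the slice")
	fmt.Println(len(byName(kargs)), "entries once keyed by name")
}
```

With the four hugepage arguments above, the slice keeps all four entries while the name-keyed map ends up with two.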

Comment 1 Antonio Murdaca 2020-07-06 13:29:35 UTC
> Additional info:
> **IMPORTANT**: I see another problem with the current map implementation, it
> can miss some arguments in the case when you have arguments with the same
> name, but with the different value, for example you can specify multiple
> sizes of hugepages.
> hugepagesz=1G hugepages=4 hugepagesz=2M hugepages=1024

I'm going to check this, as I think we already allow same-key args: if you look at the map, we use the whole karg as the key, so duplicates should be allowed and there may be nothing to fix there (the ordering issue I do get).
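
The point about keys can be sketched as follows, assuming a set keyed by the full `name=value` string. `distinctKargs` is a hypothetical name used for illustration and is not taken from the MCO source. Arguments that share a name but differ in value stay distinct, while an exact repeat collapses into one key.

```go
package main

import "fmt"

// distinctKargs is a hypothetical helper: it records each full "name=value"
// string as a set key and returns the number of distinct keys.
func distinctKargs(kargs []string) int {
	seen := make(map[string]struct{})
	for _, k := range kargs {
		seen[k] = struct{}{}
	}
	return len(seen)
}

func main() {
	// Same names, different values: every string is its own key.
	fmt.Println(distinctKargs([]string{"hugepagesz=1G", "hugepages=4", "hugepagesz=2M", "hugepages=1024"}))
	// An exact repeat is stored only once.
	fmt.Println(distinctKargs([]string{"console=tty0", "console=tty0"}))
}
```

Even when no argument is dropped, a set like this still carries no ordering information.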

Comment 2 Antonio Murdaca 2020-07-06 13:33:56 UTC
(In reply to Antonio Murdaca from comment #1)
> > Additional info:
> > **IMPORTANT**: I see another problem with the current map implementation, it
> > can miss some arguments in the case when you have arguments with the same
> > name, but with the different value, for example you can specify multiple
> > sizes of hugepages.
> > hugepagesz=1G hugepages=4 hugepagesz=2M hugepages=1024
> 
> I'm gonna check this as I think we're allowing same-key args - if you look
> > at the map, we're using the whole karg as the key so we should allow
> duplicates so that may not be something to fix (ordering I get it instead)

Unless something else is removing duplicates, the MCO does run --append with duplicate keys, so I'll check further.

Comment 3 Artyom 2020-07-06 13:46:49 UTC
Thanks for the check.

Comment 8 Michael Nguyen 2020-07-14 13:30:32 UTC
Verified on 4.6.0-0.nightly-2020-07-14-092216. The order of kargs is preserved, and same-name kargs are preserved, using both styles of specifying kernel arguments.

$ oc get clusterversion
NAME      VERSION                             AVAILABLE   PROGRESSING   SINCE   STATUS
version   4.6.0-0.nightly-2020-07-14-092216   True        False         15m     Cluster version is 4.6.0-0.nightly-2020-07-14-092216
$ cat kargs.yaml 
apiVersion: machineconfiguration.openshift.io/v1
kind: MachineConfig
metadata:
  labels:
    machineconfiguration.openshift.io/role: worker
  name: 99-openshift-machineconfig-worker-kargs
spec:
  kernelArguments:
  - 'z=10'
  - 'y=9'
  - 'x=8'
  - 'x=7'
  - 'w=6'
  - 'w=5'
  - 'v=4'
  - 'u=3'
  - 't=2'
  - 's=1'
  - 'r=0'
$ oc create -f kargs.yaml 
machineconfig.machineconfiguration.openshift.io/99-openshift-machineconfig-worker-kargs created
$ 
$ oc get mc
NAME                                               GENERATEDBYCONTROLLER                      IGNITIONVERSION   AGE
00-master                                          8814f927dea5c74b3eabba6a097d6fbae4001b31   3.1.0             32m
00-worker                                          8814f927dea5c74b3eabba6a097d6fbae4001b31   3.1.0             32m
01-master-container-runtime                        8814f927dea5c74b3eabba6a097d6fbae4001b31   3.1.0             32m
01-master-kubelet                                  8814f927dea5c74b3eabba6a097d6fbae4001b31   3.1.0             32m
01-worker-container-runtime                        8814f927dea5c74b3eabba6a097d6fbae4001b31   3.1.0             32m
01-worker-kubelet                                  8814f927dea5c74b3eabba6a097d6fbae4001b31   3.1.0             32m
99-master-generated-registries                     8814f927dea5c74b3eabba6a097d6fbae4001b31   3.1.0             32m
99-master-ssh                                                                                 2.2.0             41m
99-openshift-machineconfig-worker-kargs                                                                         13s
99-worker-generated-registries                     8814f927dea5c74b3eabba6a097d6fbae4001b31   3.1.0             32m
99-worker-ssh                                                                                 2.2.0             41m
rendered-master-42fe71a9854e52c79b44825e7a405182   8814f927dea5c74b3eabba6a097d6fbae4001b31   3.1.0             32m
rendered-worker-8a90ad4c0b62799d96c63f36ac854048   8814f927dea5c74b3eabba6a097d6fbae4001b31   3.1.0             32m
rendered-worker-d1a40648c9dfc284485eef27547c80a8   8814f927dea5c74b3eabba6a097d6fbae4001b31   3.1.0             8s
$ oc get mc/99-openshift-machineconfig-worker-kargs -o yaml
apiVersion: machineconfiguration.openshift.io/v1
kind: MachineConfig
metadata:
  creationTimestamp: "2020-07-14T12:59:26Z"
  generation: 1
  labels:
    machineconfiguration.openshift.io/role: worker
  managedFields:
  - apiVersion: machineconfiguration.openshift.io/v1
    fieldsType: FieldsV1
    fieldsV1:
      f:metadata:
        f:labels:
          .: {}
          f:machineconfiguration.openshift.io/role: {}
      f:spec:
        .: {}
        f:kernelArguments: {}
    manager: oc
    operation: Update
    time: "2020-07-14T12:59:26Z"
  name: 99-openshift-machineconfig-worker-kargs
  resourceVersion: "37573"
  selfLink: /apis/machineconfiguration.openshift.io/v1/machineconfigs/99-openshift-machineconfig-worker-kargs
  uid: a16af0dd-0014-47aa-a1cc-9135e8ca3b00
spec:
  kernelArguments:
  - z=10
  - y=9
  - x=8
  - x=7
  - w=6
  - w=5
  - v=4
  - u=3
  - t=2
  - s=1
  - r=0
$ oc get mcp
NAME     CONFIG                                             UPDATED   UPDATING   DEGRADED   MACHINECOUNT   READYMACHINECOUNT   UPDATEDMACHINECOUNT   DEGRADEDMACHINECOUNT   AGE
master   rendered-master-42fe71a9854e52c79b44825e7a405182   True      False      False      3              3                   3                     0                      33m
worker   rendered-worker-8a90ad4c0b62799d96c63f36ac854048   False     True       False      3              0                   0                     0                      33m
$ watch oc get node
$ oc get nodes
NAME                                         STATUS                     ROLES    AGE   VERSION
ip-10-0-131-12.us-west-2.compute.internal    Ready                      master   41m   v1.18.3+b9ac23f
ip-10-0-136-229.us-west-2.compute.internal   Ready                      worker   31m   v1.18.3+b9ac23f
ip-10-0-182-165.us-west-2.compute.internal   Ready                      master   41m   v1.18.3+b9ac23f
ip-10-0-186-178.us-west-2.compute.internal   Ready                      worker   31m   v1.18.3+b9ac23f
ip-10-0-192-116.us-west-2.compute.internal   Ready,SchedulingDisabled   worker   31m   v1.18.3+b9ac23f
ip-10-0-210-115.us-west-2.compute.internal   Ready                      master   41m   v1.18.3+b9ac23f
$ oc debug node/ip-10-0-186-178.us-west-2.compute.internal 
Starting pod/ip-10-0-186-178us-west-2computeinternal-debug ...
To use host binaries, run `chroot /host`
If you don't see a command prompt, try pressing enter.
sh-4.2# chroot /host
sh-4.4# cat /proc/cmdline 
BOOT_IMAGE=(hd0,gpt1)/ostree/rhcos-e669223ccd2fad834876e719daca00f5fbf518850ff25e9551b1008b252cebf3/vmlinuz-4.18.0-211.el8.x86_64 rhcos.root=crypt_rootfs random.trust_cpu=on console=tty0 console=ttyS0,115200n8 rd.luks.options=discard ostree=/ostree/boot.1/rhcos/e669223ccd2fad834876e719daca00f5fbf518850ff25e9551b1008b252cebf3/0 ignition.platform.id=aws z=10 y=9 x=8 x=7 w=6 w=5 v=4 u=3 t=2 s=1 r=0
sh-4.4# exit


$ oc get mcp
NAME     CONFIG                                             UPDATED   UPDATING   DEGRADED   MACHINECOUNT   READYMACHINECOUNT   UPDATEDMACHINECOUNT   DEGRADEDMACHINECOUNT   AGE
master   rendered-master-42fe71a9854e52c79b44825e7a405182   True      False      False      3              3                   3                     0                      44m
worker   rendered-worker-d1a40648c9dfc284485eef27547c80a8   True      False      False      3              3                   3                     0                      44m
$ oc get mc
NAME                                               GENERATEDBYCONTROLLER                      IGNITIONVERSION   AGE
00-master                                          8814f927dea5c74b3eabba6a097d6fbae4001b31   3.1.0             43m
00-worker                                          8814f927dea5c74b3eabba6a097d6fbae4001b31   3.1.0             43m
01-master-container-runtime                        8814f927dea5c74b3eabba6a097d6fbae4001b31   3.1.0             43m
01-master-kubelet                                  8814f927dea5c74b3eabba6a097d6fbae4001b31   3.1.0             43m
01-worker-container-runtime                        8814f927dea5c74b3eabba6a097d6fbae4001b31   3.1.0             43m
01-worker-kubelet                                  8814f927dea5c74b3eabba6a097d6fbae4001b31   3.1.0             43m
99-master-generated-registries                     8814f927dea5c74b3eabba6a097d6fbae4001b31   3.1.0             43m
99-master-ssh                                                                                 2.2.0             52m
99-openshift-machineconfig-worker-kargs                                                                         11m
99-worker-generated-registries                     8814f927dea5c74b3eabba6a097d6fbae4001b31   3.1.0             43m
99-worker-ssh                                                                                 2.2.0             52m
rendered-master-42fe71a9854e52c79b44825e7a405182   8814f927dea5c74b3eabba6a097d6fbae4001b31   3.1.0             42m
rendered-worker-8a90ad4c0b62799d96c63f36ac854048   8814f927dea5c74b3eabba6a097d6fbae4001b31   3.1.0             42m
rendered-worker-d1a40648c9dfc284485eef27547c80a8   8814f927dea5c74b3eabba6a097d6fbae4001b31   3.1.0             11m
$ oc delete mc/99-openshift-machineconfig-worker-kargs
machineconfig.machineconfiguration.openshift.io "99-openshift-machineconfig-worker-kargs" deleted

$ cat kargs2.yaml 
apiVersion: machineconfiguration.openshift.io/v1
kind: MachineConfig
metadata:
  labels:
    machineconfiguration.openshift.io/role: worker
  name: 99-openshift-machineconfig-worker-kargs
spec:
  kernelArguments:
  - z=10 z=9 y=8 x=7 w=6 w=5 v=4 =u=3 t=2 s=1 r=0
$ oc create -f kargs2.yaml 
machineconfig.machineconfiguration.openshift.io/99-openshift-machineconfig-worker-kargs created
$ oc get mc
NAME                                               GENERATEDBYCONTROLLER                      IGNITIONVERSION   AGE
00-master                                          8814f927dea5c74b3eabba6a097d6fbae4001b31   3.1.0             54m
00-worker                                          8814f927dea5c74b3eabba6a097d6fbae4001b31   3.1.0             54m
01-master-container-runtime                        8814f927dea5c74b3eabba6a097d6fbae4001b31   3.1.0             54m
01-master-kubelet                                  8814f927dea5c74b3eabba6a097d6fbae4001b31   3.1.0             54m
01-worker-container-runtime                        8814f927dea5c74b3eabba6a097d6fbae4001b31   3.1.0             54m
01-worker-kubelet                                  8814f927dea5c74b3eabba6a097d6fbae4001b31   3.1.0             54m
99-master-generated-registries                     8814f927dea5c74b3eabba6a097d6fbae4001b31   3.1.0             54m
99-master-ssh                                                                                 2.2.0             63m
99-openshift-machineconfig-worker-kargs                                                                         4s
99-worker-generated-registries                     8814f927dea5c74b3eabba6a097d6fbae4001b31   3.1.0             54m
99-worker-ssh                                                                                 2.2.0             63m
rendered-master-42fe71a9854e52c79b44825e7a405182   8814f927dea5c74b3eabba6a097d6fbae4001b31   3.1.0             54m
rendered-worker-8a90ad4c0b62799d96c63f36ac854048   8814f927dea5c74b3eabba6a097d6fbae4001b31   3.1.0             54m
rendered-worker-d1a40648c9dfc284485eef27547c80a8   8814f927dea5c74b3eabba6a097d6fbae4001b31   3.1.0             22m
$ oc get mc/99-openshift-machineconfig-worker-kargs -o yaml
apiVersion: machineconfiguration.openshift.io/v1
kind: MachineConfig
metadata:
  creationTimestamp: "2020-07-14T13:21:30Z"
  generation: 1
  labels:
    machineconfiguration.openshift.io/role: worker
  managedFields:
  - apiVersion: machineconfiguration.openshift.io/v1
    fieldsType: FieldsV1
    fieldsV1:
      f:metadata:
        f:labels:
          .: {}
          f:machineconfiguration.openshift.io/role: {}
      f:spec:
        .: {}
        f:kernelArguments: {}
    manager: oc
    operation: Update
    time: "2020-07-14T13:21:30Z"
  name: 99-openshift-machineconfig-worker-kargs
  resourceVersion: "57478"
  selfLink: /apis/machineconfiguration.openshift.io/v1/machineconfigs/99-openshift-machineconfig-worker-kargs
  uid: 7c55b87f-3624-43b2-9d89-9deb2ed6d4b7
spec:
  kernelArguments:
  - z=10 z=9 y=8 x=7 w=6 w=5 v=4 =u=3 t=2 s=1 r=0
$ oc get mcp
NAME     CONFIG                                             UPDATED   UPDATING   DEGRADED   MACHINECOUNT   READYMACHINECOUNT   UPDATEDMACHINECOUNT   DEGRADEDMACHINECOUNT   AGE
master   rendered-master-42fe71a9854e52c79b44825e7a405182   True      False      False      3              3                   3                     0                      55m
worker   rendered-worker-8a90ad4c0b62799d96c63f36ac854048   False     True       False      3              0                   0                     0                      55m
$ oc get node
NAME                                         STATUS                     ROLES    AGE   VERSION
ip-10-0-131-12.us-west-2.compute.internal    Ready                      master   57m   v1.18.3+b9ac23f
ip-10-0-136-229.us-west-2.compute.internal   Ready,SchedulingDisabled   worker   46m   v1.18.3+b9ac23f
ip-10-0-182-165.us-west-2.compute.internal   Ready                      master   57m   v1.18.3+b9ac23f
ip-10-0-186-178.us-west-2.compute.internal   Ready                      worker   46m   v1.18.3+b9ac23f
ip-10-0-192-116.us-west-2.compute.internal   Ready                      worker   46m   v1.18.3+b9ac23f
ip-10-0-210-115.us-west-2.compute.internal   Ready                      master   57m   v1.18.3+b9ac23f

$ watch oc get node
$ oc debug node/ip-10-0-136-229.us-west-2.compute.internal -- chroot /host cat /proc/cmdline
Starting pod/ip-10-0-136-229us-west-2computeinternal-debug ...
To use host binaries, run `chroot /host`
BOOT_IMAGE=(hd0,gpt1)/ostree/rhcos-e669223ccd2fad834876e719daca00f5fbf518850ff25e9551b1008b252cebf3/vmlinuz-4.18.0-211.el8.x86_64 rhcos.root=crypt_rootfs random.trust_cpu=on console=tty0 console=ttyS0,115200n8 rd.luks.options=discard ostree=/ostree/boot.1/rhcos/e669223ccd2fad834876e719daca00f5fbf518850ff25e9551b1008b252cebf3/0 ignition.platform.id=aws z=10 z=9 y=8 x=7 w=6 w=5 v=4 =u=3 t=2 s=1 r=0
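
The manual comparison of /proc/cmdline against the MachineConfig could also be scripted. The sketch below is an assumption-laden illustration: `hasKargsSuffix` is a hypothetical helper, and it assumes the configured arguments appear as the last tokens of the kernel command line, as they do in the outputs above.

```go
package main

import (
	"fmt"
	"strings"
)

// hasKargsSuffix is a hypothetical helper. It reports whether the
// whitespace-separated tokens of cmdline end with exactly the tokens in
// want, in the same order and with repeats intact.
func hasKargsSuffix(cmdline string, want []string) bool {
	got := strings.Fields(cmdline)
	if len(want) > len(got) {
		return false
	}
	tail := got[len(got)-len(want):]
	for i := range want {
		if tail[i] != want[i] {
			return false
		}
	}
	return true
}

func main() {
	// Shortened command line; the trailing tokens match the first test above.
	cmdline := "ignition.platform.id=aws z=10 y=9 x=8 x=7 w=6 w=5 v=4 u=3 t=2 s=1 r=0"
	// strings.Fields also splits a single space-separated entry into tokens.
	want := strings.Fields("z=10 y=9 x=8 x=7 w=6 w=5 v=4 u=3 t=2 s=1 r=0")
	fmt.Println(hasKargsSuffix(cmdline, want))
}
```

Because the expected list is split with `strings.Fields`, the same check works whether the arguments were given one per list entry or as a single space-separated entry.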

Comment 10 errata-xmlrpc 2020-10-27 16:12:20 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (OpenShift Container Platform 4.6 GA Images), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2020:4196

