Bug 1853890
| Summary: | Machine config daemon does not preserve kernel arguments order | | |
|---|---|---|---|
| Product: | OpenShift Container Platform | Reporter: | Artyom <alukiano> |
| Component: | Machine Config Operator | Assignee: | Antonio Murdaca <amurdaca> |
| Status: | CLOSED ERRATA | QA Contact: | Michael Nguyen <mnguyen> |
| Severity: | high | Docs Contact: | |
| Priority: | high | | |
| Version: | 4.6 | | |
| Target Milestone: | --- | | |
| Target Release: | 4.6.0 | | |
| Hardware: | x86_64 | | |
| OS: | Linux | | |
| Whiteboard: | | | |
| Fixed In Version: | | Doc Type: | If docs needed, set a value |
| Doc Text: | | Story Points: | --- |
| Clone Of: | | Environment: | |
| Last Closed: | 2020-10-27 16:12:20 UTC | Type: | Bug |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | | Category: | --- |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | | | |
> Additional info:
> **IMPORTANT**: I see another problem with the current map implementation, it
> can miss some arguments in the case when you have arguments with the same
> name, but with the different value, for example you can specify multiple
> sizes of hugepages.
> hugepagesz=1G hugepages=4 hugepagesz=2M hugepages=1024
I'm going to check this, as I think we're already allowing same-key args: if you look at the map, we're using the whole karg as the key, so duplicates should be allowed. That part may not be something to fix (the ordering issue I do understand).
(In reply to Antonio Murdaca from comment #1)
> > Additional info:
> > **IMPORTANT**: I see another problem with the current map implementation, it can miss some arguments in the case when you have arguments with the same name, but with the different value, for example you can specify multiple sizes of hugepages.
> > hugepagesz=1G hugepages=4 hugepagesz=2M hugepages=1024
>
> I'm gonna check this as I think we're allowing same-key args - if you look at the map, we're using the whole karg as the key so we should allow duplicates so that may not be something to fix (ordering I get it instead)

Unless something else is removing duplicates, the MCO does run --append with duplicate keys, so I'll check further.

Thanks for the check.

Verified on 4.6.0-0.nightly-2020-07-14-092216. The order of kargs is preserved, and same-name kargs are preserved using both styles of specifying kernel arguments.
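For illustration, here is a minimal Go sketch of the slice-based approach the fix requires (this is not the actual MCO code; `buildAppendArgs` is a hypothetical helper): appending to a slice keeps both the MachineConfig ordering and exact duplicates, whereas iterating a Go map yields keys in unspecified order.

```go
package main

import "fmt"

// buildAppendArgs converts an ordered kernel-argument list into
// rpm-ostree --append flags. Appending to a slice, rather than
// collecting into a map, preserves both the original order and
// any exact duplicates.
func buildAppendArgs(kargs []string) []string {
	out := make([]string, 0, len(kargs))
	for _, k := range kargs {
		out = append(out, "--append="+k)
	}
	return out
}

func main() {
	kargs := []string{"hugepagesz=1G", "hugepages=4", "hugepagesz=2M", "hugepages=1024"}
	fmt.Println(buildAppendArgs(kargs))
}
```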
$ oc get clusterversion
NAME VERSION AVAILABLE PROGRESSING SINCE STATUS
version 4.6.0-0.nightly-2020-07-14-092216 True False 15m Cluster version is 4.6.0-0.nightly-2020-07-14-092216
$ cat kargs.yaml
apiVersion: machineconfiguration.openshift.io/v1
kind: MachineConfig
metadata:
labels:
machineconfiguration.openshift.io/role: worker
name: 99-openshift-machineconfig-worker-kargs
spec:
kernelArguments:
- 'z=10'
- 'y=9'
- 'x=8'
- 'x=7'
- 'w=6'
- 'w=5'
- 'v=4'
- 'u=3'
- 't=2'
- 's=1'
- 'r=0'
$ oc create -f kargs.yaml
machineconfig.machineconfiguration.openshift.io/99-openshift-machineconfig-worker-kargs created
$
$ oc get mc
NAME GENERATEDBYCONTROLLER IGNITIONVERSION AGE
00-master 8814f927dea5c74b3eabba6a097d6fbae4001b31 3.1.0 32m
00-worker 8814f927dea5c74b3eabba6a097d6fbae4001b31 3.1.0 32m
01-master-container-runtime 8814f927dea5c74b3eabba6a097d6fbae4001b31 3.1.0 32m
01-master-kubelet 8814f927dea5c74b3eabba6a097d6fbae4001b31 3.1.0 32m
01-worker-container-runtime 8814f927dea5c74b3eabba6a097d6fbae4001b31 3.1.0 32m
01-worker-kubelet 8814f927dea5c74b3eabba6a097d6fbae4001b31 3.1.0 32m
99-master-generated-registries 8814f927dea5c74b3eabba6a097d6fbae4001b31 3.1.0 32m
99-master-ssh 2.2.0 41m
99-openshift-machineconfig-worker-kargs 13s
99-worker-generated-registries 8814f927dea5c74b3eabba6a097d6fbae4001b31 3.1.0 32m
99-worker-ssh 2.2.0 41m
rendered-master-42fe71a9854e52c79b44825e7a405182 8814f927dea5c74b3eabba6a097d6fbae4001b31 3.1.0 32m
rendered-worker-8a90ad4c0b62799d96c63f36ac854048 8814f927dea5c74b3eabba6a097d6fbae4001b31 3.1.0 32m
rendered-worker-d1a40648c9dfc284485eef27547c80a8 8814f927dea5c74b3eabba6a097d6fbae4001b31 3.1.0 8s
$ oc get mc/99-openshift-machineconfig-worker-kargs -o yaml
apiVersion: machineconfiguration.openshift.io/v1
kind: MachineConfig
metadata:
creationTimestamp: "2020-07-14T12:59:26Z"
generation: 1
labels:
machineconfiguration.openshift.io/role: worker
managedFields:
- apiVersion: machineconfiguration.openshift.io/v1
fieldsType: FieldsV1
fieldsV1:
f:metadata:
f:labels:
.: {}
f:machineconfiguration.openshift.io/role: {}
f:spec:
.: {}
f:kernelArguments: {}
manager: oc
operation: Update
time: "2020-07-14T12:59:26Z"
name: 99-openshift-machineconfig-worker-kargs
resourceVersion: "37573"
selfLink: /apis/machineconfiguration.openshift.io/v1/machineconfigs/99-openshift-machineconfig-worker-kargs
uid: a16af0dd-0014-47aa-a1cc-9135e8ca3b00
spec:
kernelArguments:
- z=10
- y=9
- x=8
- x=7
- w=6
- w=5
- v=4
- u=3
- t=2
- s=1
- r=0
$ oc get mcp
NAME CONFIG UPDATED UPDATING DEGRADED MACHINECOUNT READYMACHINECOUNT UPDATEDMACHINECOUNT DEGRADEDMACHINECOUNT AGE
master rendered-master-42fe71a9854e52c79b44825e7a405182 True False False 3 3 3 0 33m
worker rendered-worker-8a90ad4c0b62799d96c63f36ac854048 False True False 3 0 0 0 33m
$ watch oc get node
$ oc get nodes
NAME STATUS ROLES AGE VERSION
ip-10-0-131-12.us-west-2.compute.internal Ready master 41m v1.18.3+b9ac23f
ip-10-0-136-229.us-west-2.compute.internal Ready worker 31m v1.18.3+b9ac23f
ip-10-0-182-165.us-west-2.compute.internal Ready master 41m v1.18.3+b9ac23f
ip-10-0-186-178.us-west-2.compute.internal Ready worker 31m v1.18.3+b9ac23f
ip-10-0-192-116.us-west-2.compute.internal Ready,SchedulingDisabled worker 31m v1.18.3+b9ac23f
ip-10-0-210-115.us-west-2.compute.internal Ready master 41m v1.18.3+b9ac23f
$ oc debug node/ip-10-0-186-178.us-west-2.compute.internal
Starting pod/ip-10-0-186-178us-west-2computeinternal-debug ...
To use host binaries, run `chroot /host`
If you don't see a command prompt, try pressing enter.
sh-4.2# chroot /host
sh-4.4# cat /proc/cmdline
BOOT_IMAGE=(hd0,gpt1)/ostree/rhcos-e669223ccd2fad834876e719daca00f5fbf518850ff25e9551b1008b252cebf3/vmlinuz-4.18.0-211.el8.x86_64 rhcos.root=crypt_rootfs random.trust_cpu=on console=tty0 console=ttyS0,115200n8 rd.luks.options=discard ostree=/ostree/boot.1/rhcos/e669223ccd2fad834876e719daca00f5fbf518850ff25e9551b1008b252cebf3/0 ignition.platform.id=aws z=10 y=9 x=8 x=7 w=6 w=5 v=4 u=3 t=2 s=1 r=0
sh-4.4# exit
$ oc get mcp
NAME CONFIG UPDATED UPDATING DEGRADED MACHINECOUNT READYMACHINECOUNT UPDATEDMACHINECOUNT DEGRADEDMACHINECOUNT AGE
master rendered-master-42fe71a9854e52c79b44825e7a405182 True False False 3 3 3 0 44m
worker rendered-worker-d1a40648c9dfc284485eef27547c80a8 True False False 3 3 3 0 44m
$ oc get mc
NAME GENERATEDBYCONTROLLER IGNITIONVERSION AGE
00-master 8814f927dea5c74b3eabba6a097d6fbae4001b31 3.1.0 43m
00-worker 8814f927dea5c74b3eabba6a097d6fbae4001b31 3.1.0 43m
01-master-container-runtime 8814f927dea5c74b3eabba6a097d6fbae4001b31 3.1.0 43m
01-master-kubelet 8814f927dea5c74b3eabba6a097d6fbae4001b31 3.1.0 43m
01-worker-container-runtime 8814f927dea5c74b3eabba6a097d6fbae4001b31 3.1.0 43m
01-worker-kubelet 8814f927dea5c74b3eabba6a097d6fbae4001b31 3.1.0 43m
99-master-generated-registries 8814f927dea5c74b3eabba6a097d6fbae4001b31 3.1.0 43m
99-master-ssh 2.2.0 52m
99-openshift-machineconfig-worker-kargs 11m
99-worker-generated-registries 8814f927dea5c74b3eabba6a097d6fbae4001b31 3.1.0 43m
99-worker-ssh 2.2.0 52m
rendered-master-42fe71a9854e52c79b44825e7a405182 8814f927dea5c74b3eabba6a097d6fbae4001b31 3.1.0 42m
rendered-worker-8a90ad4c0b62799d96c63f36ac854048 8814f927dea5c74b3eabba6a097d6fbae4001b31 3.1.0 42m
rendered-worker-d1a40648c9dfc284485eef27547c80a8 8814f927dea5c74b3eabba6a097d6fbae4001b31 3.1.0 11m
$ oc delete mc/99-openshift-machineconfig-worker-kargs
machineconfig.machineconfiguration.openshift.io "99-openshift-machineconfig-worker-kargs" deleted
$ cat kargs2.yaml
apiVersion: machineconfiguration.openshift.io/v1
kind: MachineConfig
metadata:
labels:
machineconfiguration.openshift.io/role: worker
name: 99-openshift-machineconfig-worker-kargs
spec:
kernelArguments:
- z=10 z=9 y=8 x=7 w=6 w=5 v=4 =u=3 t=2 s=1 r=0
$ oc create -f kargs2.yaml
machineconfig.machineconfiguration.openshift.io/99-openshift-machineconfig-worker-kargs created
$ oc get mc
NAME GENERATEDBYCONTROLLER IGNITIONVERSION AGE
00-master 8814f927dea5c74b3eabba6a097d6fbae4001b31 3.1.0 54m
00-worker 8814f927dea5c74b3eabba6a097d6fbae4001b31 3.1.0 54m
01-master-container-runtime 8814f927dea5c74b3eabba6a097d6fbae4001b31 3.1.0 54m
01-master-kubelet 8814f927dea5c74b3eabba6a097d6fbae4001b31 3.1.0 54m
01-worker-container-runtime 8814f927dea5c74b3eabba6a097d6fbae4001b31 3.1.0 54m
01-worker-kubelet 8814f927dea5c74b3eabba6a097d6fbae4001b31 3.1.0 54m
99-master-generated-registries 8814f927dea5c74b3eabba6a097d6fbae4001b31 3.1.0 54m
99-master-ssh 2.2.0 63m
99-openshift-machineconfig-worker-kargs 4s
99-worker-generated-registries 8814f927dea5c74b3eabba6a097d6fbae4001b31 3.1.0 54m
99-worker-ssh 2.2.0 63m
rendered-master-42fe71a9854e52c79b44825e7a405182 8814f927dea5c74b3eabba6a097d6fbae4001b31 3.1.0 54m
rendered-worker-8a90ad4c0b62799d96c63f36ac854048 8814f927dea5c74b3eabba6a097d6fbae4001b31 3.1.0 54m
rendered-worker-d1a40648c9dfc284485eef27547c80a8 8814f927dea5c74b3eabba6a097d6fbae4001b31 3.1.0 22m
$ oc get mc/99-openshift-machineconfig-worker-kargs -o yaml
apiVersion: machineconfiguration.openshift.io/v1
kind: MachineConfig
metadata:
creationTimestamp: "2020-07-14T13:21:30Z"
generation: 1
labels:
machineconfiguration.openshift.io/role: worker
managedFields:
- apiVersion: machineconfiguration.openshift.io/v1
fieldsType: FieldsV1
fieldsV1:
f:metadata:
f:labels:
.: {}
f:machineconfiguration.openshift.io/role: {}
f:spec:
.: {}
f:kernelArguments: {}
manager: oc
operation: Update
time: "2020-07-14T13:21:30Z"
name: 99-openshift-machineconfig-worker-kargs
resourceVersion: "57478"
selfLink: /apis/machineconfiguration.openshift.io/v1/machineconfigs/99-openshift-machineconfig-worker-kargs
uid: 7c55b87f-3624-43b2-9d89-9deb2ed6d4b7
spec:
kernelArguments:
- z=10 z=9 y=8 x=7 w=6 w=5 v=4 =u=3 t=2 s=1 r=0
$ oc get mcp
NAME CONFIG UPDATED UPDATING DEGRADED MACHINECOUNT READYMACHINECOUNT UPDATEDMACHINECOUNT DEGRADEDMACHINECOUNT AGE
master rendered-master-42fe71a9854e52c79b44825e7a405182 True False False 3 3 3 0 55m
worker rendered-worker-8a90ad4c0b62799d96c63f36ac854048 False True False 3 0 0 0 55m
$ oc get node
NAME STATUS ROLES AGE VERSION
ip-10-0-131-12.us-west-2.compute.internal Ready master 57m v1.18.3+b9ac23f
ip-10-0-136-229.us-west-2.compute.internal Ready,SchedulingDisabled worker 46m v1.18.3+b9ac23f
ip-10-0-182-165.us-west-2.compute.internal Ready master 57m v1.18.3+b9ac23f
ip-10-0-186-178.us-west-2.compute.internal Ready worker 46m v1.18.3+b9ac23f
ip-10-0-192-116.us-west-2.compute.internal Ready worker 46m v1.18.3+b9ac23f
ip-10-0-210-115.us-west-2.compute.internal Ready master 57m v1.18.3+b9ac23f
$ watch oc get node
$ oc debug node/ip-10-0-136-229.us-west-2.compute.internal -- chroot /host cat /proc/cmdline
Starting pod/ip-10-0-136-229us-west-2computeinternal-debug ...
To use host binaries, run `chroot /host`
BOOT_IMAGE=(hd0,gpt1)/ostree/rhcos-e669223ccd2fad834876e719daca00f5fbf518850ff25e9551b1008b252cebf3/vmlinuz-4.18.0-211.el8.x86_64 rhcos.root=crypt_rootfs random.trust_cpu=on console=tty0 console=ttyS0,115200n8 rd.luks.options=discard ostree=/ostree/boot.1/rhcos/e669223ccd2fad834876e719daca00f5fbf518850ff25e9551b1008b252cebf3/0 ignition.platform.id=aws z=10 z=9 y=8 x=7 w=6 w=5 v=4 =u=3 t=2 s=1 r=0
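The second style above passes every karg in a single whitespace-separated string. A sketch of how such an entry can be split while preserving order and duplicates (`splitKargs` is a hypothetical helper, not the MCO's own function; Go's `strings.Fields` handles runs of whitespace safely):

```go
package main

import (
	"fmt"
	"strings"
)

// splitKargs splits a space-separated kernelArguments entry into
// individual arguments. strings.Fields collapses runs of whitespace
// and keeps both the original order and any duplicate arguments.
func splitKargs(entry string) []string {
	return strings.Fields(entry)
}

func main() {
	entry := "z=10 z=9 y=8 x=7 w=6 w=5 v=4 =u=3 t=2 s=1 r=0"
	fmt.Println(splitKargs(entry))
}
```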
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA.

For information on the advisory (OpenShift Container Platform 4.6 GA Images), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2020:4196
Description of problem:

I have a machineconfig CR with the kernel arguments section:

"kernelArguments": [ "skew_tick=1", "nohz=on", "rcu_nocbs=1-3", "tuned.non_isolcpus=00000001", "intel_pstate=disable", "nosoftlockup", "tsc=nowatchdog", "intel_iommu=on", "iommu=pt", "systemd.cpu_affinity=0", "default_hugepagesz=1G", "nmi_watchdog=0", "audit=0", "mce=off", "processor.max_cstate=1", "idle=poll", "intel_idle.max_cstate=0" ]

but once the machine-config daemon runs the rpm-ostree kargs command, the order of the arguments is not preserved:

kargs --append=skew_tick=1 --append=tuned.non_isolcpus=00000001 --append=tsc=nowatchdog --append=processor.max_cstate=1 --append=rcu_nocbs=1-3 --append=intel_iommu=on --append=systemd.cpu_affinity=0 --append=nmi_watchdog=0 --append=mce=off --append=nohz=on --append=intel_pstate=disable --append=nosoftlockup --append=iommu=pt --append=default_hugepagesz=1G --append=audit=0 --append=idle=poll --append=intel_idle.max_cstate=0

This happens because the code generates maps from the kernel arguments (https://github.com/openshift/machine-config-operator/blob/7cc08753e1bfd10503df5bb7a730c8a977a5d204/pkg/daemon/update.go#L670), and a Go map does not preserve order.

Version-Release number of selected component (if applicable): master

How reproducible: Always

Steps to Reproduce:
1. Create the machineconfig with the above kernel arguments and attach it to the worker MCP.
2. Give the MCP some time to update.
3. Verify the rpm-ostree command in the relevant machine-config daemon log.

Actual results: The order is not preserved.

Expected results: The order should be preserved.

Additional info:
**IMPORTANT**: I see another problem with the current map implementation: it can miss some arguments when you have arguments with the same name but different values, for example when specifying multiple hugepage sizes:
hugepagesz=1G hugepages=4 hugepagesz=2M hugepages=1024
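The duplicate-argument concern can be demonstrated with a small Go sketch (illustrative only; the real code lives in pkg/daemon/update.go, and `collapseByName` is a hypothetical function): keying a map by argument *name* silently drops repeated arguments such as a second hugepages= entry.

```go
package main

import (
	"fmt"
	"strings"
)

// collapseByName builds a name->value map from kernel arguments.
// This mirrors the failure mode described above: repeated names
// (e.g. two hugepages= entries) overwrite each other, so arguments
// are silently lost, and map iteration order is unspecified anyway.
func collapseByName(kargs []string) map[string]string {
	byName := make(map[string]string)
	for _, k := range kargs {
		parts := strings.SplitN(k, "=", 2)
		value := ""
		if len(parts) == 2 {
			value = parts[1]
		}
		byName[parts[0]] = value
	}
	return byName
}

func main() {
	kargs := []string{"hugepagesz=1G", "hugepages=4", "hugepagesz=2M", "hugepages=1024"}
	m := collapseByName(kargs)
	fmt.Println(len(m)) // only 2 entries survive out of 4 arguments
}
```

Keying the map on the whole karg string (as noted in comment #1) keeps distinct values but still loses ordering and exact duplicates; only an ordered slice preserves everything.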