Bug 1812649
Summary: | Deletion of a MachineConfig with multiple kernelArguments as a single string causes a Degraded MCP | ||
---|---|---|---|
Product: | OpenShift Container Platform | Reporter: | Jiří Mencák <jmencak> |
Component: | Machine Config Operator | Assignee: | Antonio Murdaca <amurdaca> |
Status: | CLOSED ERRATA | QA Contact: | Michael Nguyen <mnguyen> |
Severity: | high | Docs Contact: | |
Priority: | medium | ||
Version: | 4.4 | CC: | amurdaca, msluiter, nstielau, smilner |
Target Milestone: | --- | ||
Target Release: | 4.6.0 | ||
Hardware: | Unspecified | ||
OS: | Unspecified | ||
Whiteboard: | |||
Fixed In Version: | Doc Type: | Bug Fix | |
Doc Text: |
Cause:
Previously, kernel arguments specified in MachineConfigs needed to be split out into individual argument strings in the array. These kargs were not validated before being concatenated into an rpm-ostree command.
Consequence:
Multiple kernel arguments joined by spaces in a single entry, as is allowed on the kernel command line, produced an invalid rpm-ostree command.
Fix:
The MachineConfigController now parses each kernelArguments item in a manner similar to the kernel.
Result:
Users can supply multiple arguments concatenated via a space without errors.
|
Story Points: | --- |
Clone Of: | Environment: | ||
Last Closed: | 2020-10-27 15:57:04 UTC | Type: | Bug |
Description
Jiří Mencák
2020-03-11 18:37:21 UTC
What I believe is happening here is that we are creating an invalid rpm-ostree command: e.g. `rpm-ostree kargs --append=a=1 b=2` instead of `rpm-ostree kargs --append=a=1 --append=b=2`. That causes it to go degraded when it tries to apply them. We could attempt to parse the kargs better, such as by using a simple string split on whitespace. That may cause other issues with fancier kargs that use quotes or something.

> We could attempt to parse the kargs better

I think clarifying your expectation on what you want users to do with that field is a better approach than trying to parse whatever users try to do. You could improve documentation by adding corresponding comments on (not only this) field in the Go code and the CRD, and extending the example on https://github.com/openshift/machine-config-operator/blob/master/docs/MachineConfiguration.md#kernelarguments with multiple kargs :)

(In reply to Marc Sluiter from comment #2)
> > We could attempt to parse the kargs better
>
> I think clarifying your expectation on what you want users to do with that
> field is a better approach than trying to parse whatever users try to do.
> You could improve documentation by adding corresponding comments on (not
> only this) field in the Go code and the CRD, and extending the example on
> https://github.com/openshift/machine-config-operator/blob/master/docs/MachineConfiguration.md#kernelarguments
> with multiple kargs :)

I agree with the "better documentation" part, but I think the code should also be more robust to handle this, otherwise this is probably not the last BZ they see about this. The tuned daemon, for example, is preparing kernel boot parameters in /etc/tuned/bootcmdline as a single string with multiple kernel parameters. Something will have to do the parsing. Looking at kernel code the parsing shouldn't hopefully be all that difficult. As a workaround, I'm looking at writing some golang parameter parsing code for the node tuning operator.
Verified on 4.6.0-0.nightly-2020-07-07-083718. Deletion of kargs in a single line does not cause degraded MCP.
$ oc get clusterversion
NAME VERSION AVAILABLE PROGRESSING SINCE STATUS
version 4.6.0-0.nightly-2020-07-07-083718 True False 3h16m Cluster version is 4.6.0-0.nightly-2020-07-07-083718
$ oc create -f - <<EOF
> apiVersion: machineconfiguration.openshift.io/v1
> kind: MachineConfig
> metadata:
>   labels:
>     machineconfiguration.openshift.io/role: worker
>   name: 50-worker-custom
> spec:
>   kernelArguments:
>     - a=1 b=2
> EOF
machineconfig.machineconfiguration.openshift.io/50-worker-custom created
$
$
$ oc get mc
NAME GENERATEDBYCONTROLLER IGNITIONVERSION AGE
00-master 34d03f09dc395269de06ab290a0422a8274b8bd9 2.2.0 3h10m
00-worker 34d03f09dc395269de06ab290a0422a8274b8bd9 2.2.0 3h10m
01-master-container-runtime 34d03f09dc395269de06ab290a0422a8274b8bd9 2.2.0 3h10m
01-master-kubelet 34d03f09dc395269de06ab290a0422a8274b8bd9 2.2.0 3h10m
01-worker-container-runtime 34d03f09dc395269de06ab290a0422a8274b8bd9 2.2.0 3h10m
01-worker-kubelet 34d03f09dc395269de06ab290a0422a8274b8bd9 2.2.0 3h10m
50-worker-custom 10s
99-master-generated-registries 34d03f09dc395269de06ab290a0422a8274b8bd9 2.2.0 3h10m
99-master-ssh 2.2.0 3h16m
99-worker-generated-registries 34d03f09dc395269de06ab290a0422a8274b8bd9 2.2.0 3h10m
99-worker-ssh 2.2.0 3h16m
rendered-master-c1aacf3a48a81966a24864e984ea37bc 34d03f09dc395269de06ab290a0422a8274b8bd9 2.2.0 3h10m
rendered-worker-c01ae91040c73f0a3bf642a626d6a237 34d03f09dc395269de06ab290a0422a8274b8bd9 2.2.0 5s
rendered-worker-f2257ee53059916ea51a9652971094c5 34d03f09dc395269de06ab290a0422a8274b8bd9 2.2.0 3h10m
$ oc get mcp/worker
NAME CONFIG UPDATED UPDATING DEGRADED MACHINECOUNT READYMACHINECOUNT UPDATEDMACHINECOUNT DEGRADEDMACHINECOUNT AGE
worker rendered-worker-f2257ee53059916ea51a9652971094c5 False True False 3 0 0 0 3h11m
$ watch oc get node
$ oc get mcp/worker
NAME CONFIG UPDATED UPDATING DEGRADED MACHINECOUNT READYMACHINECOUNT UPDATEDMACHINECOUNT DEGRADEDMACHINECOUNT AGE
worker rendered-worker-c01ae91040c73f0a3bf642a626d6a237 True False False 3 3 3 0 3h26m
$ oc delete mc/50-worker-custom
machineconfig.machineconfiguration.openshift.io "50-worker-custom" deleted
$ oc get mcp/worker
NAME CONFIG UPDATED UPDATING DEGRADED MACHINECOUNT READYMACHINECOUNT UPDATEDMACHINECOUNT DEGRADEDMACHINECOUNT AGE
worker rendered-worker-c01ae91040c73f0a3bf642a626d6a237 False True False 3 0 0 0 3h27m
$ watch oc get node
$ oc get mcp/worker
NAME CONFIG UPDATED UPDATING DEGRADED MACHINECOUNT READYMACHINECOUNT UPDATEDMACHINECOUNT DEGRADEDMACHINECOUNT AGE
worker rendered-worker-f2257ee53059916ea51a9652971094c5 True False False 3 3 3 0 3h38m
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory (OpenShift Container Platform 4.6 GA Images), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHBA-2020:4196