Bug 1944986 - Clarify the ContainerRuntimeConfiguration cr description on the validation
Summary: Clarify the ContainerRuntimeConfiguration cr description on the validation
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: Node
Version: 4.8
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: low
Target Milestone: ---
Target Release: 4.8.0
Assignee: Harshal Patil
QA Contact: MinLi
URL:
Whiteboard:
Depends On:
Blocks:
 
Reported: 2021-03-31 07:12 UTC by Harshal Patil
Modified: 2021-07-27 22:57 UTC (History)

Fixed In Version:
Doc Type: No Doc Update
Doc Text:
Clone Of:
Environment:
Last Closed: 2021-07-27 22:56:48 UTC
Target Upstream Version:
Embargoed:




Links
System | ID | Private | Priority | Status | Summary | Last Updated
Github | openshift machine-config-operator pull 2503 | 0 | None | open | Bug 1944986: Doc fix for ContainerRuntimeConfig CR | 2021-03-31 10:45:46 UTC
Red Hat Product Errata | RHSA-2021:2438 | 0 | None | None | None | 2021-07-27 22:57:03 UTC

Description Harshal Patil 2021-03-31 07:12:06 UTC
While attempting to fix https://bugzilla.redhat.com/show_bug.cgi?id=1930636#c3, we discovered that it's pretty simple to do basic validation of ContainerRuntimeConfiguration values within MCO.

As long as the value can be parsed as int64, the validation code within the controller responsible for ContainerRuntimeConfiguration gets invoked and we do some basic validation, such as boundary checks and sign checks for negative numbers.

However, if the input cannot be parsed into an int64 at all, e.g. "9asadG", the execution flow fails even before it reaches the validation code for ContainerRuntimeConfiguration.

W0330 08:03:49.665463       1 reflector.go:436] github.com/openshift/machine-config-operator/pkg/generated/informers/externalversions/factory.go:101: watch of *v1.ContainerRuntimeConfig ended with: an error on the server ("unable to decode an event from the watch stream: unable to decode watch event: v1.ContainerRuntimeConfig.Spec: v1.ContainerRuntimeConfigSpec.MachineConfigPoolSelector: ContainerRuntimeConfig: v1.ContainerRuntimeConfiguration.OverlaySize: unmarshalerDecoder: quantities must match the regular expression '^([+-]?[0-9.]+)([eEinumkKMGTP]*[-+]?[0-9]*)$', error found in #10 byte of ...|\":\"9asadG\"},\"machine|..., bigger context ...|:{\"containerRuntimeConfig\":{\"overlaySize\":\"9asadG\"},\"machineConfigPoolSelector\":{\"matchLabels\":{\"cus|...") has prevented the request from succeeding
E0330 08:03:50.810155       1 reflector.go:138] github.com/openshift/machine-config-operator/pkg/generated/informers/externalversions/factory.go:101: Failed to watch *v1.ContainerRuntimeConfig: failed to list *v1.ContainerRuntimeConfig: v1.ContainerRuntimeConfigList.Items: []v1.ContainerRuntimeConfig: v1.ContainerRuntimeConfig.Spec: v1.ContainerRuntimeConfigSpec.MachineConfigPoolSelector: ContainerRuntimeConfig: v1.ContainerRuntimeConfiguration.OverlaySize: unmarshalerDecoder: quantities must match the regular expression '^([+-]?[0-9.]+)([eEinumkKMGTP]*[-+]?[0-9]*)$', error found in #10 byte of ...|":"9asadG"},"machine|..., bigger context ...|:{"containerRuntimeConfig":{"overlaySize":"9asadG"},"machineConfigPoolSelector":{"matchLabels":{"cus|...


As you can see, for an input that's not int64, the failure occurs in reflector.go in the upstream k8s Go client, which is trying to read the YAML file submitted by the user for updating ContainerRuntimeConfiguration. In the past, we could handle validation of such input for KubeletConfig in MCO because it is defined as type &runtime.RawExtension. The upstream Go client will happily read whatever is thrown at it if the type is &runtime.RawExtension, which is why the execution flow could reach the KubeletConfig controller, which could then do the validation.
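
To illustrate why a raw field lets bad input reach the controller, here is a minimal sketch using only the Go standard library. It is not MCO code: rawSpec and decodeRaw are hypothetical names, json.RawMessage stands in for runtime.RawExtension, and podPidsLimit is used only as an example key.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// rawSpec is a hypothetical stand-in for a CR whose payload is stored as raw
// bytes, similar in spirit to a runtime.RawExtension field.
type rawSpec struct {
	Config json.RawMessage `json:"config"`
}

// decodeRaw only checks that the document is well-formed JSON. It does not
// interpret the contents of "config", so a nonsensical value is accepted here
// and can be validated (and rejected) later by controller code.
func decodeRaw(doc string) (rawSpec, error) {
	var s rawSpec
	err := json.Unmarshal([]byte(doc), &s)
	return s, err
}

func main() {
	s, err := decodeRaw(`{"config":{"podPidsLimit":"9asadG"}}`)
	fmt.Println("decode error:", err)
	fmt.Println("raw payload kept for later validation:", string(s.Config))
}
```

Because decoding succeeds regardless of the payload, the decision about what counts as a valid value is deferred to whoever interprets the raw bytes afterwards.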

But some of the fields of ContainerRuntimeConfiguration, such as LogSizeMax or OverlaySize, are of type resource.Quantity. This means that when reading the user input, the upstream Go client tries to make sure the input is indeed a valid resource.Quantity, and this is where it fails for input like "9asadG".
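
As a rough way to pre-check a value before submitting it, the sketch below applies the regular expression quoted in the error message above. It uses only the Go standard library; looksLikeQuantity is a hypothetical helper, not part of MCO or the Kubernetes client, and matching the pattern is a necessary syntax check that is not necessarily sufficient for the value to be accepted as a resource.Quantity.

```go
package main

import (
	"fmt"
	"regexp"
)

// quantityPattern is copied verbatim from the decoder error in the log.
var quantityPattern = regexp.MustCompile(`^([+-]?[0-9.]+)([eEinumkKMGTP]*[-+]?[0-9]*)$`)

// looksLikeQuantity reports whether s passes the syntactic pattern only.
// It performs no range or sign validation; that is left to the controller.
func looksLikeQuantity(s string) bool {
	return quantityPattern.MatchString(s)
}

func main() {
	for _, v := range []string{"10G", "8192", "-1", "9asadG"} {
		fmt.Printf("%q passes syntax check: %t\n", v, looksLikeQuantity(v))
	}
}
```

Note that "-1" passes the pattern: a negative value is syntactically a quantity, so it gets past decoding and is left for the controller's sign validation, whereas "9asadG" is rejected before the controller ever sees it.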

This clearly falls outside the domain of MCO, and there is little we can do there to improve the situation from MCO's point of view. Hence I am going to update the docs and the CRD description for ContainerRuntimeConfiguration to alert users that, while we do our best to validate the input, they will have to be vigilant too, and to tell them where to look for logs in case of failure.

Comment 3 MinLi 2021-04-09 09:20:51 UTC
Verified on version: 4.8.0-0.nightly-2021-04-09-000946

$ oc explain ContainerRuntimeConfig.spec
KIND:     ContainerRuntimeConfig
VERSION:  machineconfiguration.openshift.io/v1

RESOURCE: spec <Object>

DESCRIPTION:
     ContainerRuntimeConfigSpec defines the desired state of
     ContainerRuntimeConfig

FIELDS:
   containerRuntimeConfig	<Object> -required-
     ContainerRuntimeConfiguration defines the tuneables of the container
     runtime. It's important to note that, since the fields of the
     ContainerRuntimeConfiguration are directly read by the upstream kubernetes
     golang client, the validation of those values is handled directly by that
     golang client which is outside of the controller for
     ContainerRuntimeConfiguration. Please ensure the valid values are used for
     those fields as invalid values may render cluster nodes unusable.

   machineConfigPoolSelector	<Object>
     A label selector is a label query over a set of resources. The result of
     matchLabels and matchExpressions are ANDed. An empty label selector matches
     all objects. A null label selector matches no objects.

Comment 6 errata-xmlrpc 2021-07-27 22:56:48 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Moderate: OpenShift Container Platform 4.8.2 bug fix and security update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2021:2438

