Bug 1722887 - MCO is reporting a full text value for Reason - Reason must be a short constant
Summary: MCO is reporting a full text value for Reason - Reason must be a short constant
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: Machine Config Operator
Version: 4.1.z
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: high
Target Milestone: ---
Target Release: 4.1.z
Assignee: Christian Glombek
QA Contact: Micah Abbott
URL:
Whiteboard: 4.1.4
Depends On:
Blocks: 1722894
Reported: 2019-06-21 15:29 UTC by Clayton Coleman
Modified: 2019-07-04 09:01 UTC (History)

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Clones: 1722894
Environment:
Last Closed: 2019-07-04 09:01:41 UTC
Target Upstream Version:
Embargoed:


Links
System ID Private Priority Status Summary Last Updated
Red Hat Product Errata RHBA-2019:1635 0 None None None 2019-07-04 09:01:50 UTC

Description Clayton Coleman 2019-06-21 15:29:16 UTC
The machine-config cluster operator is reporting full text string values for reason, which is not what Reason is for.

For instance, a 4.1.0 cluster is reporting:

reason = "timed out waiting for the condition during waitForDeploymentRollout: Deployment machine-config-controller is not ready. status: (replicas: 1, updated: 1, ready: 0, unavailable: 1)"

That value should be in "message" - reason must be a camel-case constant with low cardinality like "WaitForRollout" or "Timeout".

Using messages in this field can cause Prometheus to report too many series, and the length of the value is unbounded, which could result in a failure to report metrics.

This is high severity because it could potentially bring down Prometheus due to size limits, and because it is the wrong value. It needs to be fixed in 4.1.3 or 4.1.4.
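
To illustrate the split between reason and message described above, here is a minimal sketch, not the operator's actual code. It maps a free-form error string onto a small fixed set of reason constants and keeps the full text in message. "WaitForRollout" and "Timeout" are the examples given in this report; the "Unknown" fallback, the Condition class, the "Degraded" condition type, and the helper names are assumptions made for illustration only.

```python
import re
from dataclasses import dataclass

# Low-cardinality reason constants. "WaitForRollout" and "Timeout" are the
# examples from this bug report; "Unknown" is a hypothetical fallback.
REASON_WAIT_FOR_ROLLOUT = "WaitForRollout"
REASON_TIMEOUT = "Timeout"
REASON_UNKNOWN = "Unknown"

# A reason must be a short camel-case token: no spaces, no punctuation.
_CAMEL_CASE = re.compile(r"[A-Z][A-Za-z0-9]*")


@dataclass(frozen=True)
class Condition:
    """Hypothetical stand-in for a cluster operator status condition."""
    type: str
    status: str
    reason: str   # short constant, safe to use as a metric label
    message: str  # unbounded human-readable detail


def classify(error_text: str) -> str:
    """Map free-form error text to one of a fixed set of reasons."""
    # The more specific check comes first, because a rollout error may
    # also contain the words "timed out".
    if "waitForDeploymentRollout" in error_text:
        return REASON_WAIT_FOR_ROLLOUT
    if "timed out" in error_text:
        return REASON_TIMEOUT
    return REASON_UNKNOWN


def degraded_condition(error_text: str) -> Condition:
    """Build a condition with a constant reason and the full text as message."""
    reason = classify(error_text)
    if not _CAMEL_CASE.fullmatch(reason):
        raise ValueError(f"reason is not a camel-case constant: {reason!r}")
    # "Degraded" is an illustrative condition type, not taken from this report.
    return Condition(type="Degraded", status="True",
                     reason=reason, message=error_text)


if __name__ == "__main__":
    text = ("timed out waiting for the condition during "
            "waitForDeploymentRollout: Deployment machine-config-controller "
            "is not ready. status: (replicas: 1, updated: 1, ready: 0, "
            "unavailable: 1)")
    cond = degraded_condition(text)
    print("reason  =", cond.reason)
    print("message =", cond.message)
```

Because every possible reason comes from a fixed set, the number of distinct label values a metrics system sees stays bounded no matter how varied the underlying error text is.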

Comment 4 Gaoyun Pei 2019-07-01 11:22:46 UTC
Verified this bug with the 4.1.4 stable payload.

# oc get clusterversion
NAME      VERSION   AVAILABLE   PROGRESSING   SINCE   STATUS
version   4.1.4     True        False         4h39m   Cluster version is 4.1.4

No such error was found in the machine-config-operator pod log or in the workers' kubelet service logs.

Comment 7 errata-xmlrpc 2019-07-04 09:01:41 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2019:1635

