Bug 1722887

Summary: MCO is reporting a full text value for Reason - Reason must be a short constant
Product: OpenShift Container Platform
Reporter: Clayton Coleman <ccoleman>
Component: Machine Config Operator
Assignee: Christian Glombek <cglombek>
Status: CLOSED ERRATA
QA Contact: Micah Abbott <miabbott>
Severity: high
Docs Contact:
Priority: unspecified
Version: 4.1.z
CC: amurdaca, gpei, lmeyer, xtian
Target Milestone: ---
Target Release: 4.1.z
Hardware: Unspecified
OS: Unspecified
Whiteboard: 4.1.4
Last Closed: 2019-07-04 09:01:41 UTC
Bug Depends On:    
Bug Blocks: 1722894    

Description Clayton Coleman 2019-06-21 15:29:16 UTC
The machine-config cluster operator is reporting full text string values for reason, which is not what Reason is for.

For instance, a 4.1.0 cluster is reporting:

reason = "timed out waiting for the condition during waitForDeploymentRollout: Deployment machine-config-controller is not ready. status: (replicas: 1, updated: 1, ready: 0, unavailable: 1)"

That value should be in "message" - reason must be a camel-case constant with low cardinality like "WaitForRollout" or "Timeout".

Using messages in this field can cause Prometheus to report too many series, and because the length of the value is unbounded it could also result in a failure to report metrics.
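The cardinality concern can be illustrated with a small Go sketch. It is a simplified model, assuming that each distinct reason string ends up as its own label value and therefore its own series; it does not use any Prometheus client code, and `CountSeries` is a hypothetical helper. The replica counts are made-up inputs.

```go
package main

import "fmt"

// CountSeries returns how many distinct series a metric would have if every
// distinct reason string became a separate label value.
func CountSeries(reasons []string) int {
	seen := make(map[string]struct{})
	for _, r := range reasons {
		seen[r] = struct{}{}
	}
	return len(seen)
}

func main() {
	var freeText, constant []string
	// Vary the rollout state the way a real deployment might over time.
	for ready := 0; ready < 3; ready++ {
		for unavailable := 1; unavailable <= 2; unavailable++ {
			freeText = append(freeText, fmt.Sprintf(
				"Deployment machine-config-controller is not ready. "+
					"status: (replicas: 3, updated: 3, ready: %d, unavailable: %d)",
				ready, unavailable))
			constant = append(constant, "WaitForRollout")
		}
	}
	fmt.Println("series with free-text reasons:", CountSeries(freeText))
	fmt.Println("series with constant reason:", CountSeries(constant))
}
```

With free text, every change in the embedded counts yields a new value, while the constant collapses all of those states into one.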

This is high severity because it could potentially bring down Prometheus due to size limits, and it is the wrong value in any case. It needs to be fixed in 4.1.3 or 4.1.4.

Comment 4 Gaoyun Pei 2019-07-01 11:22:46 UTC
Verified this bug with the 4.1.4 stable payload.

# oc get clusterversion
NAME      VERSION   AVAILABLE   PROGRESSING   SINCE   STATUS
version   4.1.4     True        False         4h39m   Cluster version is 4.1.4

No such error was found in the machine-config-operator pod log or in the workers' kubelet service log.

Comment 7 errata-xmlrpc 2019-07-04 09:01:41 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2019:1635