+++ This bug was initially created as a clone of Bug #1722887 +++

The machine-config cluster operator is reporting full free-text values in the "reason" field, which is not what reason is for. For instance, a 4.1.0 cluster is reporting:

reason = "timed out waiting for the condition during waitForDeploymentRollout: Deployment machine-config-controller is not ready. status: (replicas: 1, updated: 1, ready: 0, unavailable: 1)"

That value belongs in "message". The reason field must be a camel-case constant with low cardinality, such as "WaitForRollout" or "Timeout". Putting messages in this field can cause Prometheus to report too many series, and because the value is unbounded it could eventually cause metric reporting to fail. This is high severity because it could potentially bring down Prometheus due to size limits, and the value is simply wrong. Needs to be fixed in 4.1.3 or 4.1.4.
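The cardinality concern can be illustrated with a short sketch (the metric name and label values below are hypothetical, not the operator's actual metrics): Prometheus creates one time series per distinct label combination, so a free-text reason label mints a new series for every variant of the message, while a constant reason keeps the series count bounded.

```python
from collections import Counter

def series_count(samples):
    """Count distinct time series: one per unique (metric name, label set) pair."""
    return len(Counter((name, frozenset(labels.items())) for name, labels in samples))

# A constant reason keeps cardinality bounded no matter how often it is reported.
good = [("cluster_operator_conditions", {"reason": "WaitForRollout"})
        for _ in range(100)]

# Embedding the full status message makes nearly every report a new series.
bad = [("cluster_operator_conditions",
        {"reason": f"Deployment not ready. status: (replicas: 1, ready: {i})"})
       for i in range(100)]

print(series_count(good))  # 1
print(series_count(bad))   # 100
```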
https://github.com/openshift/machine-config-operator/pull/876
Verified on:

NAME      VERSION                             AVAILABLE   PROGRESSING   SINCE   STATUS
version   4.2.0-0.nightly-2019-06-27-041730   True        False         41m     Cluster version is 4.2.0-0.nightly-2019-06-27-041730

$ oc -n openshift-machine-config-operator get deployments/machine-config-operator -oyaml
apiVersion: extensions/v1beta1
kind: Deployment
metadata:
  annotations:
    deployment.kubernetes.io/revision: "1"
  creationTimestamp: "2019-06-27T18:46:58Z"
  generation: 1
  labels:
    k8s-app: machine-config-operator
  name: machine-config-operator
  namespace: openshift-machine-config-operator
  resourceVersion: "2259"
  selfLink: /apis/extensions/v1beta1/namespaces/openshift-machine-config-operator/deployments/machine-config-operator
  uid: f1636c85-990b-11e9-90af-025f5011fcca
spec:
  progressDeadlineSeconds: 600
  replicas: 1
  revisionHistoryLimit: 10
  selector:
    matchLabels:
      k8s-app: machine-config-operator
  strategy:
    rollingUpdate:
      maxSurge: 25%
      maxUnavailable: 25%
    type: RollingUpdate
  template:
    metadata:
      creationTimestamp: null
      labels:
        k8s-app: machine-config-operator
    spec:
      containers:
      - args:
        - start
        - --images-json=/etc/mco/images/images.json
        env:
        - name: RELEASE_VERSION
          value: 4.2.0-0.nightly-2019-06-27-041730
        image: quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:e8cc93fc366cec2dc915b1578a6cc49a210a920a3c2d9ccce23669f1e3db6a4b
        imagePullPolicy: IfNotPresent
        name: machine-config-operator
        resources:
          requests:
            cpu: 20m
            memory: 50Mi
        terminationMessagePath: /dev/termination-log
        terminationMessagePolicy: File
        volumeMounts:
        - mountPath: /etc/ssl/kubernetes/ca.crt
          name: root-ca
        - mountPath: /etc/ssl/etcd/ca.crt
          name: etcd-ca
        - mountPath: /etc/mco/images
          name: images
      dnsPolicy: ClusterFirst
      nodeSelector:
        node-role.kubernetes.io/master: ""
      priorityClassName: system-cluster-critical
      restartPolicy: Always
      schedulerName: default-scheduler
      securityContext:
        runAsNonRoot: true
        runAsUser: 65534
      terminationGracePeriodSeconds: 30
      tolerations:
      - effect: NoSchedule
        key: node-role.kubernetes.io/master
        operator: Exists
      - effect: NoExecute
        key: node.kubernetes.io/unreachable
        operator: Exists
        tolerationSeconds: 120
      - effect: NoExecute
        key: node.kubernetes.io/not-ready
        operator: Exists
        tolerationSeconds: 120
      volumes:
      - configMap:
          defaultMode: 420
          name: machine-config-operator-images
        name: images
      - hostPath:
          path: /etc/ssl/etcd/ca.crt
          type: ""
        name: etcd-ca
      - hostPath:
          path: /etc/kubernetes/ca.crt
          type: ""
        name: root-ca
status:
  availableReplicas: 1
  conditions:
  - lastTransitionTime: "2019-06-27T18:48:33Z"
    lastUpdateTime: "2019-06-27T18:48:33Z"
    message: Deployment has minimum availability.
    reason: MinimumReplicasAvailable
    status: "True"
    type: Available
  - lastTransitionTime: "2019-06-27T18:46:58Z"
    lastUpdateTime: "2019-06-27T18:48:33Z"
    message: ReplicaSet "machine-config-operator-66cf7c67d" has successfully progressed.
    reason: NewReplicaSetAvailable
    status: "True"
    type: Progressing
  observedGeneration: 1
  readyReplicas: 1
  replicas: 1
  updatedReplicas: 1
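The verification above shows condition reasons that are proper CamelCase constants (MinimumReplicasAvailable, NewReplicaSetAvailable). A rough way to check that a reported reason is a constant rather than free text is a pattern test like the following sketch (the regex is an assumption for illustration, not the operator's own validation):

```python
import re

# Heuristic: a reason constant is a single CamelCase identifier,
# with no spaces or punctuation.
REASON_RE = re.compile(r"^[A-Z][A-Za-z0-9]*$")

def is_constant_reason(reason):
    return bool(REASON_RE.match(reason))

# Reasons from the verified deployment status pass the check.
print(is_constant_reason("MinimumReplicasAvailable"))  # True
print(is_constant_reason("NewReplicaSetAvailable"))    # True

# The old free-text value from the bug description fails it.
print(is_constant_reason(
    "timed out waiting for the condition during waitForDeploymentRollout"))  # False
```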
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA.

For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2019:2922