Bug 1802214
Summary: | Upgrading from 4.2 to 4.3 creates new alerting issues | | |
---|---|---|---|
Product: | OpenShift Container Platform | Reporter: | Matt Woodson <mwoodson> |
Component: | Monitoring | Assignee: | Sergiusz Urbaniak <surbania> |
Status: | CLOSED NOTABUG | QA Contact: | Junqi Zhao <juzhao> |
Severity: | medium | Docs Contact: | |
Priority: | unspecified | | |
Version: | 4.3.0 | CC: | alegrand, anpicker, brad.williams, eparis, erooth, kakkoyun, lcosic, mloibl, pkrupa, surbania |
Target Milestone: | --- | | |
Target Release: | --- | | |
Hardware: | Unspecified | | |
OS: | Unspecified | | |
Whiteboard: | | | |
Fixed In Version: | | Doc Type: | If docs needed, set a value |
Doc Text: | | Story Points: | --- |
Clone Of: | | Environment: | |
Last Closed: | 2020-02-13 09:20:46 UTC | Type: | Bug |
Description
Matt Woodson
2020-02-12 15:53:26 UTC
This is the internal issue OSD is tracking: https://issues.redhat.com/browse/OSD-2861

After upgrading to 4.3, we observed these same alerts in the Starter clusters as well. The two that seem most concerning are:

Critical alert is firing: {u'state': u'firing', u'labels': {u'severity': u'critical', u'alertname': u'ClusterAutoscalerOperatorDown'}, u'annotations': {u'message': u'cluster-autoscaler-operator has disappeared from Prometheus target discovery.'}, u'value': u'1e+00', u'activeAt': u'2020-01-28T23:29:41.800595892Z'}

Alert is firing: {u'state': u'firing', u'labels': {u'job': u'cluster-autoscaler-operator', u'namespace': u'openshift-machine-api', u'alertname': u'TargetDown', u'service': u'cluster-autoscaler-operator', u'severity': u'warning'}, u'annotations': {u'message': u'100% of the cluster-autoscaler-operator targets in openshift-machine-api namespace are down.'}, u'value': u'1e+02', u'activeAt': u'2020-01-28T23:29:30.163677339Z'}

Following up here. These are the two bugs that are causing the issues:

Autoscaler: https://bugzilla.redhat.com/show_bug.cgi?id=1801300
Ingress Operator: https://bugzilla.redhat.com/show_bug.cgi?id=1802248

Closing as discussed with Matt, as all the bugzillas are already open.
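
For anyone triaging similar reports: the alert payloads quoted in the description are Python 2 style dict reprs (note the u'' prefixes), not JSON. The following is only a minimal sketch, assuming the payloads have been captured as plain strings, of how they could be parsed and reduced to the fields that matter for triage. The names RAW_ALERTS and summarize_alert are hypothetical and are not part of any OpenShift or Prometheus tooling.

```python
import ast

# The two payloads quoted in the description, copied as strings.
# RAW_ALERTS is a hypothetical name used only for this illustration.
RAW_ALERTS = [
    "{u'state': u'firing', u'labels': {u'severity': u'critical', "
    "u'alertname': u'ClusterAutoscalerOperatorDown'}, "
    "u'annotations': {u'message': u'cluster-autoscaler-operator has "
    "disappeared from Prometheus target discovery.'}, "
    "u'value': u'1e+00', u'activeAt': u'2020-01-28T23:29:41.800595892Z'}",
    "{u'state': u'firing', u'labels': {u'job': u'cluster-autoscaler-operator', "
    "u'namespace': u'openshift-machine-api', u'alertname': u'TargetDown', "
    "u'service': u'cluster-autoscaler-operator', u'severity': u'warning'}, "
    "u'annotations': {u'message': u'100% of the cluster-autoscaler-operator "
    "targets in openshift-machine-api namespace are down.'}, "
    "u'value': u'1e+02', u'activeAt': u'2020-01-28T23:29:30.163677339Z'}",
]


def summarize_alert(raw):
    """Parse one repr-style alert string and keep the triage-relevant fields.

    ast.literal_eval only evaluates Python literals, so it is safe to use on
    untrusted text; Python 3 accepts the u'' string prefix.
    """
    alert = ast.literal_eval(raw)
    labels = alert.get("labels", {})
    return {
        "alertname": labels.get("alertname"),
        "severity": labels.get("severity"),
        "namespace": labels.get("namespace"),  # absent on the first alert
        "state": alert.get("state"),
        "value": float(alert["value"]),  # the payload carries it as a string
        "active_at": alert.get("activeAt"),
    }


SUMMARIES = [summarize_alert(raw) for raw in RAW_ALERTS]

if __name__ == "__main__":
    for summary in SUMMARIES:
        print(summary["severity"], summary["alertname"], summary["active_at"])
```

Both alerts became active within seconds of each other and both name cluster-autoscaler-operator, which is consistent with them being tracked under the single Autoscaler bug linked above.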