Bug 1825137 - prometheus chart for alert on "Monitoring -> Alerting" won't be shown if configured Prometheus externalLabels setting
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: Management Console
Version: 4.4
Hardware: Unspecified
OS: Unspecified
Priority: low
Severity: low
Target Milestone: ---
Target Release: 4.6.0
Assignee: Andrew Pickering
QA Contact: Junqi Zhao
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2020-04-17 07:08 UTC by Junqi Zhao
Modified: 2020-10-27 15:58 UTC (History)

Fixed In Version:
Doc Type: No Doc Update
Doc Text:
Clone Of:
Environment:
Last Closed: 2020-10-27 15:58:26 UTC
Target Upstream Version:
Embargoed:


Attachments
No prometheus graph for AlertmanagerReceiversNotConfigured if configured prometheus externalLabels (91.00 KB, image/png)
2020-04-17 07:08 UTC, Junqi Zhao
remove the externalLabels settings would show the prometheus chart (99.39 KB, image/png)
2020-04-17 07:10 UTC, Junqi Zhao
prometheus chart is shown with externalLabels setting (88.41 KB, image/png)
2020-06-30 05:53 UTC, Junqi Zhao


Links
System | ID | Private | Priority | Status | Summary | Last Updated
Github | openshift console pull 5829 | 0 | None | closed | Bug 1825137: Monitoring: Fix alert details graph when externalLabels are defined | 2021-01-08 05:58:07 UTC
Red Hat Product Errata | RHBA-2020:4196 | 0 | None | None | None | 2020-10-27 15:58:41 UTC

Description Junqi Zhao 2020-04-17 07:08:32 UTC
Created attachment 1679579 [details]
No prometheus graph for AlertmanagerReceiversNotConfigured if configured prometheus externalLabels

Description of problem:
Configuring the Prometheus externalLabels setting prevents the Prometheus chart from being shown for alerts on "Monitoring -> Alerting". The exception is the Watchdog alert, since its expression is just "vector(1)".
Configure externalLabels for Prometheus via the cluster-monitoring-config ConfigMap:
*********************
# oc -n openshift-monitoring get cm cluster-monitoring-config -oyaml
apiVersion: v1
data:
  config.yaml: |
    prometheusK8s:
      externalLabels:
        region: us-east-2
        environment: AWS
kind: ConfigMap
metadata:
  creationTimestamp: "2020-04-17T05:25:31Z"
  name: cluster-monitoring-config
  namespace: openshift-monitoring
  resourceVersion: "164707"
  selfLink: /api/v1/namespaces/openshift-monitoring/configmaps/cluster-monitoring-config
  uid: 82d78eb3-88a0-44c6-b090-79b61d45b221
*********************
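
For anyone reproducing this, the dump above includes server-populated metadata (creationTimestamp, resourceVersion, selfLink, uid) that should not be re-applied. The following is a minimal sketch, not part of the original report, that builds a stripped-down manifest with the same externalLabels. The helper name build_config_map is hypothetical; the emitted JSON is assumed to be acceptable to `oc apply -f -`, and label values are written as double-quoted strings, which are valid YAML scalars.

```python
import json


def build_config_map(external_labels):
    """Build a cluster-monitoring-config ConfigMap manifest as a dict.

    Only the name, namespace, and config.yaml payload are set; fields the
    API server fills in (uid, resourceVersion, ...) are left out on purpose.
    """
    lines = ["prometheusK8s:", "  externalLabels:"]
    for name, value in external_labels.items():
        # json.dumps yields a double-quoted string, also valid as YAML.
        lines.append(f"    {name}: {json.dumps(value)}")
    return {
        "apiVersion": "v1",
        "kind": "ConfigMap",
        "metadata": {
            "name": "cluster-monitoring-config",
            "namespace": "openshift-monitoring",
        },
        "data": {"config.yaml": "\n".join(lines) + "\n"},
    }


manifest = build_config_map({"region": "us-east-2", "environment": "AWS"})
print(json.dumps(manifest, indent=2))
```

Passing an empty dict would still emit an empty externalLabels key, so to remove the setting it is probably simpler to edit the ConfigMap by hand.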
The alerts then carry the environment/region labels:
# token=`oc sa get-token prometheus-k8s -n openshift-monitoring`
# oc -n openshift-monitoring exec -c prometheus prometheus-k8s-0 -- curl -k -H "Authorization: Bearer $token" 'https://alertmanager-main.openshift-monitoring.svc:9094/api/v1/alerts' | jq '.data[].labels | {alertname,environment,region}'
{
  "alertname": "CustomResourceDetected",
  "environment": "AWS",
  "region": "us-east-2"
}
{
  "alertname": "ServiceCatalogAPIServerEnabled",
  "environment": "AWS",
  "region": "us-east-2"
}
{
  "alertname": "ServiceCatalogControllerManagerEnabled",
  "environment": "AWS",
  "region": "us-east-2"
}
{
  "alertname": "Watchdog",
  "environment": "AWS",
  "region": "us-east-2"
}
{
  "alertname": "AlertmanagerReceiversNotConfigured",
  "environment": "AWS",
  "region": "us-east-2"
}
{
  "alertname": "FailingOperator",
  "environment": "AWS",
  "region": "us-east-2"
}
{
  "alertname": "FailingOperator",
  "environment": "AWS",
  "region": "us-east-2"
}
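
The jq filter above keeps only three labels per alert. Where jq is not available, roughly the same projection can be sketched in Python. This is an illustration only: the function name project_labels and the sample payload are made up for this example, and the only thing assumed about the response is what the jq expression already implies, namely a top-level "data" array whose items have a "labels" object.

```python
import json


def project_labels(response_text, keys=("alertname", "environment", "region")):
    """Return one dict per alert containing only the requested labels.

    A label that is absent comes back as None, mirroring jq's null.
    """
    payload = json.loads(response_text)
    return [
        {key: alert.get("labels", {}).get(key) for key in keys}
        for alert in payload.get("data", [])
    ]


# Sample shaped like the output above; the "severity" label is illustrative.
sample = json.dumps({
    "data": [
        {"labels": {"alertname": "Watchdog", "environment": "AWS",
                    "region": "us-east-2", "severity": "none"}},
        {"labels": {"alertname": "FailingOperator"}},
    ]
})

for row in project_labels(sample):
    print(json.dumps(row))
```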

Take AlertmanagerReceiversNotConfigured as an example: the Prometheus chart is not shown on the alert details page of "Monitoring -> Alerting".
Remove the externalLabels settings and wait until the environment/region labels are null, then check AlertmanagerReceiversNotConfigured again: the Prometheus chart is now shown.

# oc -n openshift-monitoring exec -c prometheus prometheus-k8s-0 -- curl -k -H "Authorization: Bearer $token" 'https://alertmanager-main.openshift-monitoring.svc:9094/api/v1/alerts' | jq '.data[].labels | {alertname,environment,region}'
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100  6600    0  6600    0     0   1171      0 --:--:--  0:00:05 --:--:--  1600
{
  "alertname": "AlertmanagerReceiversNotConfigured",
  "environment": null,
  "region": null
}
{
  "alertname": "FailingOperator",
  "environment": null,
  "region": null
}
{
  "alertname": "ServiceCatalogControllerManagerEnabled",
  "environment": null,
  "region": null
}
{
  "alertname": "CustomResourceDetected",
  "environment": null,
  "region": null
}
{
  "alertname": "ServiceCatalogAPIServerEnabled",
  "environment": null,
  "region": null
}
{
  "alertname": "FailingOperator",
  "environment": null,
  "region": null
}
{
  "alertname": "Watchdog",
  "environment": null,
  "region": null
}
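
Waiting for the external labels to disappear from every alert is easy to get wrong by eye when the list is long. A small sketch of that check, assuming the input is the list of label objects printed above (the function name alerts_still_labelled is hypothetical):

```python
def alerts_still_labelled(alerts, external_labels=("environment", "region")):
    """Return the names of alerts that still carry any external label."""
    return [
        labels.get("alertname")
        for labels in alerts
        if any(labels.get(name) is not None for name in external_labels)
    ]


# Label objects as printed with and without externalLabels configured.
with_labels = [
    {"alertname": "Watchdog", "environment": "AWS", "region": "us-east-2"},
    {"alertname": "FailingOperator", "environment": "AWS", "region": "us-east-2"},
]
without_labels = [
    {"alertname": "Watchdog", "environment": None, "region": None},
    {"alertname": "FailingOperator", "environment": None, "region": None},
]

print(alerts_still_labelled(with_labels))
print(alerts_still_labelled(without_labels))
```

An empty result means no firing alert carries the labels any more, which is the state in which the chart was observed to render on the affected build.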


Version-Release number of selected component (if applicable):
4.4.0-0.nightly-2020-04-16-205909

How reproducible:
Always

Steps to Reproduce:
1. See the description
2.
3.

Actual results:


Expected results:


Additional info:

Comment 1 Junqi Zhao 2020-04-17 07:10:58 UTC
Created attachment 1679580 [details]
remove the externalLabels settings would show the prometheus chart

Comment 2 Junqi Zhao 2020-04-17 08:27:54 UTC
The same error occurs on 4.3, but not on 4.2. Maybe this is because from 4.3 on we use the APIs from Thanos.

Comment 3 Andrew Pickering 2020-04-17 09:09:32 UTC
Sounds like https://github.com/openshift/console/pull/3445

Comment 5 Samuel Padgett 2020-06-18 19:11:34 UTC
Marking this a duplicate of bug 1771843. Let us know if you're still seeing the issue.

*** This bug has been marked as a duplicate of bug 1771843 ***

Comment 6 Andrew Pickering 2020-06-25 04:47:01 UTC
I was able to reproduce this bug.

Comment 10 Junqi Zhao 2020-06-28 05:52:23 UTC
Blocked by bug 1851675

Comment 11 Junqi Zhao 2020-06-30 04:31:24 UTC
fixed with 4.6.0-0.nightly-2020-06-30-000342
# oc get co/console
NAME      VERSION                             AVAILABLE   PROGRESSING   DEGRADED   SINCE
console   4.6.0-0.nightly-2020-06-30-000342   True        False         False      133m
# oc -n openshift-console get pod
NAME                        READY   STATUS    RESTARTS   AGE
console-788b79dbcc-2zzgb    1/1     Running   0          133m
console-788b79dbcc-6mxnl    1/1     Running   0          133m
downloads-5cf6d8447-7ms5n   1/1     Running   0          143m
downloads-5cf6d8447-btgpx   1/1     Running   0          143m
# oc -n openshift-console logs console-788b79dbcc-6mxnl
2020-06-30T02:14:17Z cmd/main: cookies are secure!
2020-06-30T02:14:17Z cmd/main: Binding to [::]:8443...
2020-06-30T02:14:17Z cmd/main: using TLS

Comment 12 Junqi Zhao 2020-06-30 05:52:35 UTC
Ignore comment 11. Tested with 4.6.0-0.nightly-2020-06-30-000342: the Prometheus chart for alerts on "Monitoring -> Alerting" is shown correctly after configuring the Prometheus externalLabels setting; see the attached picture.
# oc -n openshift-monitoring exec -c prometheus prometheus-k8s-0 -- curl -k -H "Authorization: Bearer $token" 'https://alertmanager-main.openshift-monitoring.svc:9094/api/v1/alerts' | jq '.data[].labels | {alertname,environment,region}'
{
  "alertname": "CannotRetrieveUpdates",
  "environment": "AWS",
  "region": "us-east-2"
}
{
  "alertname": "Watchdog",
  "environment": "AWS",
  "region": "us-east-2"
}
{
  "alertname": "AlertmanagerReceiversNotConfigured",
  "environment": "AWS",
  "region": "us-east-2"
}
{
  "alertname": "CustomResourceDetected",
  "environment": "AWS",
  "region": "us-east-2"
}

Comment 13 Junqi Zhao 2020-06-30 05:53:29 UTC
Created attachment 1699237 [details]
prometheus chart is shown with externalLabels setting

Comment 15 errata-xmlrpc 2020-10-27 15:58:26 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (OpenShift Container Platform 4.6 GA Images), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2020:4196

