Bug 1825137 - prometheus chart for alert on "Monitoring -> Alerting" won't be shown if configured Prometheus externalLabels setting
Keywords:
Status: VERIFIED
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: Management Console
Version: 4.4
Hardware: Unspecified
OS: Unspecified
Priority: low
Severity: low
Target Milestone: ---
Target Release: 4.6.0
Assignee: Andrew Pickering
QA Contact: Junqi Zhao
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2020-04-17 07:08 UTC by Junqi Zhao
Modified: 2020-06-30 05:53 UTC (History)

Fixed In Version:
Doc Type: No Doc Update
Doc Text:
Clone Of:
Environment:
Last Closed: 2020-06-18 19:11:34 UTC
Target Upstream Version:


Attachments
No prometheus graph for AlertmanagerReceiversNotConfigured if configured prometheus externalLabels (91.00 KB, image/png)
2020-04-17 07:08 UTC, Junqi Zhao
remove the externalLabels settings would show the prometheus chart (99.39 KB, image/png)
2020-04-17 07:10 UTC, Junqi Zhao
prometheus chart is shown with externalLabels setting (88.41 KB, image/png)
2020-06-30 05:53 UTC, Junqi Zhao


Links
System | ID | Priority | Status | Summary | Last Updated
Github | openshift console pull 5829 | None | closed | Bug 1825137: Monitoring: Fix alert details graph when externalLabels are defined | 2020-07-15 21:18:59 UTC

Description Junqi Zhao 2020-04-17 07:08:32 UTC
Created attachment 1679579 [details]
No prometheus graph for AlertmanagerReceiversNotConfigured if configured prometheus externalLabels

Description of problem:
With Prometheus externalLabels configured, the Prometheus chart is not shown on the alert details pages under "Monitoring -> Alerting". The one exception is the Watchdog alert, whose expression is just "vector(1)".
Configure externalLabels for Prometheus via the cluster-monitoring-config ConfigMap:
*********************
# oc -n openshift-monitoring get cm cluster-monitoring-config -oyaml
apiVersion: v1
data:
  config.yaml: |
    prometheusK8s:
      externalLabels:
        region: us-east-2
        environment: AWS
kind: ConfigMap
metadata:
  creationTimestamp: "2020-04-17T05:25:31Z"
  name: cluster-monitoring-config
  namespace: openshift-monitoring
  resourceVersion: "164707"
  selfLink: /api/v1/namespaces/openshift-monitoring/configmaps/cluster-monitoring-config
  uid: 82d78eb3-88a0-44c6-b090-79b61d45b221
*********************
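For reproduction, the ConfigMap above can be recreated from a small script. This is only a sketch: the function name render_monitoring_config is made up for illustration and is not part of any OpenShift tooling, and the label values are the ones from this report.

```shell
#!/bin/sh
# Hypothetical helper (illustration only): print a cluster-monitoring-config
# ConfigMap that sets Prometheus externalLabels.
# $1 = value for the "region" label, $2 = value for the "environment" label.
render_monitoring_config() {
  cat <<EOF
apiVersion: v1
kind: ConfigMap
metadata:
  name: cluster-monitoring-config
  namespace: openshift-monitoring
data:
  config.yaml: |
    prometheusK8s:
      externalLabels:
        region: $1
        environment: $2
EOF
}

# Print the manifest with the values used in this report.
render_monitoring_config us-east-2 AWS
```

On a test cluster the output could be piped into `oc apply -f -`. Note that this replaces any existing config.yaml content in that ConfigMap.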
The alerts now carry the environment and region labels:
# token=`oc sa get-token prometheus-k8s -n openshift-monitoring`
# oc -n openshift-monitoring exec -c prometheus prometheus-k8s-0 -- curl -k -H "Authorization: Bearer $token" 'https://alertmanager-main.openshift-monitoring.svc:9094/api/v1/alerts' | jq '.data[].labels | {alertname,environment,region}'
{
  "alertname": "CustomResourceDetected",
  "environment": "AWS",
  "region": "us-east-2"
}
{
  "alertname": "ServiceCatalogAPIServerEnabled",
  "environment": "AWS",
  "region": "us-east-2"
}
{
  "alertname": "ServiceCatalogControllerManagerEnabled",
  "environment": "AWS",
  "region": "us-east-2"
}
{
  "alertname": "Watchdog",
  "environment": "AWS",
  "region": "us-east-2"
}
{
  "alertname": "AlertmanagerReceiversNotConfigured",
  "environment": "AWS",
  "region": "us-east-2"
}
{
  "alertname": "FailingOperator",
  "environment": "AWS",
  "region": "us-east-2"
}
{
  "alertname": "FailingOperator",
  "environment": "AWS",
  "region": "us-east-2"
}

Take AlertmanagerReceiversNotConfigured as an example: the Prometheus chart is not shown on the alert details page under "Monitoring -> Alerting".
After removing the externalLabels settings and waiting until the environment and region labels are null, the AlertmanagerReceiversNotConfigured details page shows the Prometheus chart again:

# oc -n openshift-monitoring exec -c prometheus prometheus-k8s-0 -- curl -k -H "Authorization: Bearer $token" 'https://alertmanager-main.openshift-monitoring.svc:9094/api/v1/alerts' | jq '.data[].labels | {alertname,environment,region}'
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100  6600    0  6600    0     0   1171      0 --:--:--  0:00:05 --:--:--  1600
{
  "alertname": "AlertmanagerReceiversNotConfigured",
  "environment": null,
  "region": null
}
{
  "alertname": "FailingOperator",
  "environment": null,
  "region": null
}
{
  "alertname": "ServiceCatalogControllerManagerEnabled",
  "environment": null,
  "region": null
}
{
  "alertname": "CustomResourceDetected",
  "environment": null,
  "region": null
}
{
  "alertname": "ServiceCatalogAPIServerEnabled",
  "environment": null,
  "region": null
}
{
  "alertname": "FailingOperator",
  "environment": null,
  "region": null
}
{
  "alertname": "Watchdog",
  "environment": null,
  "region": null
}
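
The wait for the labels to become null can be scripted instead of re-running the command by hand. The following is a rough sketch, not something used in this report: wait_until and external_labels_cleared are hypothetical helper names, and $token is assumed to be set as shown earlier.

```shell
#!/bin/sh
# Hypothetical helper: run a command up to $1 times, sleeping $2 seconds
# after each failed attempt, and succeed as soon as the command succeeds.
wait_until() {
  attempts=$1
  delay=$2
  shift 2
  i=0
  while [ "$i" -lt "$attempts" ]; do
    if "$@"; then
      return 0
    fi
    i=$((i + 1))
    sleep "$delay"
  done
  return 1
}

# Hypothetical check built from the command in this report; assumes $token
# is already set. jq -e exits 0 only when the filter yields true, that is,
# when no alert carries a region or environment label any more.
external_labels_cleared() {
  oc -n openshift-monitoring exec -c prometheus prometheus-k8s-0 -- \
    curl -sk -H "Authorization: Bearer $token" \
    'https://alertmanager-main.openshift-monitoring.svc:9094/api/v1/alerts' |
    jq -e 'all(.data[].labels; .region == null and .environment == null)' >/dev/null
}
```

For example, `wait_until 30 10 external_labels_cleared` would poll for roughly five minutes before giving up.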


Version-Release number of selected component (if applicable):
4.4.0-0.nightly-2020-04-16-205909

How reproducible:
Always

Steps to Reproduce:
1. See the description
2.
3.

Actual results:


Expected results:


Additional info:

Comment 1 Junqi Zhao 2020-04-17 07:10:58 UTC
Created attachment 1679580 [details]
remove the externalLabels settings would show the prometheus chart

Comment 2 Junqi Zhao 2020-04-17 08:27:54 UTC
The same error occurs on 4.3, but not on 4.2. Maybe this is because the console has used the Thanos APIs since 4.3.

Comment 3 Andrew Pickering 2020-04-17 09:09:32 UTC
Sounds like https://github.com/openshift/console/pull/3445

Comment 5 Samuel Padgett 2020-06-18 19:11:34 UTC
Marking this a duplicate of bug 1771843. Let us know if you're still seeing the issue.

*** This bug has been marked as a duplicate of bug 1771843 ***

Comment 6 Andrew Pickering 2020-06-25 04:47:01 UTC
I was able to reproduce this bug.

Comment 10 Junqi Zhao 2020-06-28 05:52:23 UTC
Blocked by bug 1851675

Comment 11 Junqi Zhao 2020-06-30 04:31:24 UTC
fixed with 4.6.0-0.nightly-2020-06-30-000342
# oc get co/console
NAME      VERSION                             AVAILABLE   PROGRESSING   DEGRADED   SINCE
console   4.6.0-0.nightly-2020-06-30-000342   True        False         False      133m
# oc -n openshift-console get pod
NAME                        READY   STATUS    RESTARTS   AGE
console-788b79dbcc-2zzgb    1/1     Running   0          133m
console-788b79dbcc-6mxnl    1/1     Running   0          133m
downloads-5cf6d8447-7ms5n   1/1     Running   0          143m
downloads-5cf6d8447-btgpx   1/1     Running   0          143m
# oc -n openshift-console logs console-788b79dbcc-6mxnl
2020-06-30T02:14:17Z cmd/main: cookies are secure!
2020-06-30T02:14:17Z cmd/main: Binding to [::]:8443...
2020-06-30T02:14:17Z cmd/main: using TLS

Comment 12 Junqi Zhao 2020-06-30 05:52:35 UTC
Ignore comment 11. Tested with 4.6.0-0.nightly-2020-06-30-000342: the Prometheus chart for alerts on "Monitoring -> Alerting" is shown correctly with the Prometheus externalLabels setting configured; see the attached picture.
# oc -n openshift-monitoring exec -c prometheus prometheus-k8s-0 -- curl -k -H "Authorization: Bearer $token" 'https://alertmanager-main.openshift-monitoring.svc:9094/api/v1/alerts' | jq '.data[].labels | {alertname,environment,region}'
{
  "alertname": "CannotRetrieveUpdates",
  "environment": "AWS",
  "region": "us-east-2"
}
{
  "alertname": "Watchdog",
  "environment": "AWS",
  "region": "us-east-2"
}
{
  "alertname": "AlertmanagerReceiversNotConfigured",
  "environment": "AWS",
  "region": "us-east-2"
}
{
  "alertname": "CustomResourceDetected",
  "environment": "AWS",
  "region": "us-east-2"
}

Comment 13 Junqi Zhao 2020-06-30 05:53:29 UTC
Created attachment 1699237 [details]
prometheus chart is shown with externalLabels setting

