1926598 – Duplicate alert rules are displayed on console because the thanos-querier API returns wrong results

Status:	CLOSED ERRATA
Alias:	None
Product:	OpenShift Container Platform
Classification:	Red Hat
Component:	Monitoring
Version:	4.7
Hardware:	Unspecified
OS:	Unspecified
Priority:	high
Severity:	medium
Target Milestone:	---
Target Release:	4.8.0
Assignee:	Sergiusz Urbaniak
QA Contact:	hongyan li
Duplicates (1):	1940882
Blocks:	1944575

Reported:	2021-02-09 07:45 UTC by hongyan li
Modified:	2021-07-27 22:43 UTC
CC List:	8 users
Doc Type:	No Doc Update
Last Closed:	2021-07-27 22:42:29 UTC

Attachments
screen shot for console with duplicate alert rule (239.68 KB, image/png) 2021-02-09 07:49 UTC, hongyan li

Links
System	ID	Priority	Status	Summary	Last Updated
Github	openshift thanos pull 51	None	closed	Bug 1926598: pkg/rules: fix deduplication of equal alerts with different labels	2021-03-30 13:54:29 UTC
Github	thanos-io thanos pull 3960	None	closed	pkg/rules: fix deduplication of equal alerts with different labels	2021-03-30 13:54:32 UTC
Red Hat Product Errata	RHSA-2021:2438	None	None	None	2021-07-27 22:43:31 UTC

Description hongyan li 2021-02-09 07:45:15 UTC

Description of problem:
Duplicate alert rules are displayed on the console.

Version-Release number of selected component (if applicable):
4.7.0-0.nightly-2021-02-08-191932

How reproducible:
always

Steps to Reproduce:
1. Open the console and click Monitoring -> Alerting.
2. Click the Alerting rules tab.
3. Duplicate alert rules are displayed, for example:
   ElasticsearchClusterNotHealthy: 4 items displayed, 2 actually defined
   ElasticsearchNodeDiskWatermarkReached: 6 items displayed, 3 actually defined
   etcdHighFsyncDurations: 4 items displayed, 2 actually defined


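Actual results:
The Prometheus rules API returns ElasticsearchClusterNotHealthy twice, while the thanos-querier rules API returns it four times:
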
# oc -n openshift-monitoring exec -c prometheus prometheus-k8s-0 -- curl -k -H "Authorization: Bearer $token" 'https://prometheus-k8s.openshift-monitoring.svc:9091/api/v1/rules' | jq '.data.groups[].rules[].name' | sort|grep ElasticsearchClusterNotHealthy
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100  191k    0  191k    0     0  5317k      0 --:--:-- --:--:-- --:--:-- 5317k
"ElasticsearchClusterNotHealthy"
"ElasticsearchClusterNotHealthy"

# oc -n openshift-monitoring exec -c prometheus prometheus-k8s-0 -- curl -k -H "Authorization: Bearer $token" 'https://thanos-querier.openshift-monitoring.svc:9091/api/v1/rules' | jq '.data.groups[].rules[].name' | sort|grep ElasticsearchClusterNotHealthy
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100  251k    0  251k    0     0  3110k      0 --:--:-- --:--:-- --:--:-- 3110k
"ElasticsearchClusterNotHealthy"
"ElasticsearchClusterNotHealthy"
"ElasticsearchClusterNotHealthy"
"ElasticsearchClusterNotHealthy"


Expected results:
Both APIs return the same results for platform alert rules.

Additional info:
User workload monitoring was not enabled.
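
The comparison in "Actual results" can also be done programmatically once both responses have been saved. The following is a minimal Python sketch, not part of the original report. It assumes only the response shape implied by the jq filter above (`.data.groups[].rules[].name`). The function names and the miniature sample responses are hypothetical and hand-made, not real cluster output.

```python
from collections import Counter


def rule_name_counts(response: dict) -> Counter:
    """Count how often each rule name appears in a /api/v1/rules response body."""
    return Counter(
        rule["name"]
        for group in response["data"]["groups"]
        for rule in group["rules"]
    )


def extra_rules(reference: dict, candidate: dict) -> dict:
    """Return rule names that occur more often in candidate than in reference."""
    # Counter subtraction keeps only names with a positive surplus.
    return dict(rule_name_counts(candidate) - rule_name_counts(reference))


# Hypothetical miniature responses that mimic the counts in this report.
rules = [
    {"name": "ElasticsearchClusterNotHealthy", "labels": {"severity": "critical"}},
    {"name": "ElasticsearchClusterNotHealthy", "labels": {"severity": "warning"}},
]
prometheus = {"data": {"groups": [{"name": "example", "rules": rules}]}}
thanos = {"data": {"groups": [{"name": "example", "rules": rules * 2}]}}

print(extra_rules(prometheus, thanos))
```

With real data, the two dictionaries would be loaded with `json.load` from the saved curl output of each endpoint. An empty result would mean neither rule name is over-reported by the second endpoint.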

Comment 1 hongyan li 2021-02-09 07:49:48 UTC

Created attachment 1755860
screen shot for console with duplicate alert rule

Comment 2 hongyan li 2021-02-09 08:08:50 UTC

# oc -n openshift-monitoring exec -c prometheus prometheus-k8s-0 -- curl -k -H "Authorization: Bearer $token" 'https://thanos-querier.openshift-monitoring.svc:9091/api/v1/rules' | jq . | grep -A20 ElasticsearchClusterNotHealthy
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100  251k    0  251k    0     0  4344k      0 --:--:-- --:--:-- --:--:-- 4344k
            "name": "ElasticsearchClusterNotHealthy",
            "query": "sum by(cluster) (es_cluster_status == 2)",
            "duration": 420,
            "labels": {
              "prometheus": "openshift-monitoring/k8s",
              "severity": "critical"
            },
            "annotations": {
              "message": "Cluster {{ $labels.cluster }} health status has been RED for at least 7m. Cluster does not accept writes, shards may be missing or master node hasn't been elected yet. For more information refer to https://github.com/openshift/elasticsearch-operator/blob/master/docs/alerts.md#Elasticsearch-Cluster-Health-is-Red",
              "summary": "Cluster health status is RED"
            },
            "alerts": [],
            "health": "ok",
            "evaluationTime": 0.000253482,
            "lastEvaluation": "2021-02-09T08:06:42.681095906Z",
            "type": "alerting"
          },
          {
            "state": "inactive",
            "name": "ElasticsearchClusterNotHealthy",
            "query": "sum by(cluster) (es_cluster_status == 1)",
            "duration": 1200,
            "labels": {
              "prometheus": "openshift-monitoring/k8s",
              "severity": "warning"
            },
            "annotations": {
              "message": "Cluster {{ $labels.cluster }} health status has been YELLOW for at least 20m. Some shard replicas are not allocated. For more information refer to https://github.com/openshift/elasticsearch-operator/blob/master/docs/alerts.md#Elasticsearch-Cluster-Healthy-is-Yellow",
              "summary": "Cluster health status is YELLOW"
            },
            "alerts": [],
            "health": "ok",
            "evaluationTime": 0.000109183,
            "lastEvaluation": "2021-02-09T08:06:42.68135068Z",
            "type": "alerting"
          },
          {
            "state": "inactive",
            "name": "ElasticsearchClusterNotHealthy",
            "query": "sum by(cluster) (es_cluster_status == 2)",
            "duration": 420,
            "labels": {
              "prometheus": "openshift-monitoring/k8s",
              "severity": "critical"
            },
            "annotations": {
              "message": "Cluster {{ $labels.cluster }} health status has been RED for at least 7m. Cluster does not accept writes, shards may be missing or master node hasn't been elected yet. For more information refer to https://github.com/openshift/elasticsearch-operator/blob/master/docs/alerts.md#Elasticsearch-Cluster-Health-is-Red",
              "summary": "Cluster health status is RED"
            },
            "alerts": [],
            "health": "ok",
            "evaluationTime": 0.000212364,
            "lastEvaluation": "2021-02-09T08:06:42.681140147Z",
            "type": "alerting"
          },
          {
            "state": "inactive",
            "name": "ElasticsearchClusterNotHealthy",
            "query": "sum by(cluster) (es_cluster_status == 1)",
            "duration": 1200,
            "labels": {
              "prometheus": "openshift-monitoring/k8s",
              "severity": "warning"
            },
            "annotations": {
              "message": "Cluster {{ $labels.cluster }} health status has been YELLOW for at least 20m. Some shard replicas are not allocated. For more information refer to https://github.com/openshift/elasticsearch-operator/blob/master/docs/alerts.md#Elasticsearch-Cluster-Healthy-is-Yellow",
              "summary": "Cluster health status is YELLOW"
            },
            "alerts": [],
            "health": "ok",
            "evaluationTime": 8.7608e-05,
            "lastEvaluation": "2021-02-09T08:06:42.681353788Z",
            "type": "alerting"
          },
          {
            "state": "inactive",
            "name": "ElasticsearchDiskSpaceRunningLow",
            "query": "sum(predict_linear(es_fs_path_available_bytes[6h], 6 * 3600)) < 0",

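In the output above, the four ElasticsearchClusterNotHealthy entries arrive interleaved (critical, warning, critical, warning), and each pair agrees on name, query, duration and labels. The Python sketch below illustrates why such an ordering would defeat a deduplication step that only compares neighbouring entries. This is an illustration under stated assumptions, not the Thanos implementation: the linked pull request title says only that deduplication of equal alerts with different labels was fixed, and the choice of identity fields, the function names and the adjacent-comparison strategy are all hypothetical.

```python
def identity(rule: dict) -> tuple:
    """Key under which two rules count as equal (assumed fields, see lead-in)."""
    return (
        rule["name"],
        rule["query"],
        rule["duration"],
        tuple(sorted(rule["labels"].items())),
    )


def drop_adjacent_duplicates(rules: list) -> list:
    """Drop a rule only when it equals the rule kept immediately before it."""
    kept = []
    for rule in rules:
        if not kept or identity(kept[-1]) != identity(rule):
            kept.append(rule)
    return kept


def make_rule(status: int, duration: int, severity: str) -> dict:
    """Build a rule with the fields shown in the output above."""
    return {
        "name": "ElasticsearchClusterNotHealthy",
        "query": f"sum by(cluster) (es_cluster_status == {status})",
        "duration": duration,
        "labels": {"prometheus": "openshift-monitoring/k8s", "severity": severity},
    }


critical = make_rule(2, 420, "critical")
warning = make_rule(1, 1200, "warning")

# Same order as in the thanos-querier output above.
interleaved = [critical, warning, critical, warning]

# Neighbours always differ, so nothing is removed.
print(len(drop_adjacent_duplicates(interleaved)))

# Sorting by the full identity first places equal rules next to each other.
print(len(drop_adjacent_duplicates(sorted(interleaved, key=identity))))
```

The point of the sketch is that a sort key which ignores labels or query can leave equal rules non-adjacent, so comparing on the full identity before collapsing neighbours is one way to reach the expected two entries.
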
Comment 3 hongyan li 2021-03-23 09:22:11 UTC

*** Bug 1940882 has been marked as a duplicate of this bug. ***

Comment 9 Junqi Zhao 2021-03-31 03:44:48 UTC

Tested with 4.8.0-0.nightly-2021-03-30-160509: searched the alerting rules under "Monitoring -> Alerting -> Alerting rules"; no duplicate alert rules are displayed now.

Comment 12 errata-xmlrpc 2021-07-27 22:42:29 UTC

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA.

For information on the advisory (Moderate: OpenShift Container Platform 4.8.2 bug fix and security update), and where to find the updated files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2021:2438
