Bug 1740559 - "many-to-many matching not allowed" for AlertmanagerConfigInconsistent rule
Summary: "many-to-many matching not allowed" for AlertmanagerConfigInconsistent rule
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: Monitoring
Version: 4.2.0
Hardware: Unspecified
OS: Unspecified
Priority: low
Severity: low
Target Milestone: ---
Target Release: 4.2.0
Assignee: Lili Cosic
QA Contact: Junqi Zhao
URL:
Whiteboard:
Depends On:
Blocks: 1745379
Reported: 2019-08-13 09:16 UTC by Junqi Zhao
Modified: 2019-10-16 06:36 UTC

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Clones: 1745379
Environment:
Last Closed: 2019-10-16 06:35:53 UTC
Target Upstream Version:
Embargoed:


Attachments
prometheus-k8s pod logs (160.76 KB, text/plain)
2019-08-13 09:16 UTC, Junqi Zhao


Links
System ID Private Priority Status Summary Last Updated
Github openshift cluster-monitoring-operator pull 445 0 None closed Bug 1740559: Bring in kube-prometheus changes 2020-11-26 19:06:05 UTC
Red Hat Product Errata RHBA-2019:2922 0 None None None 2019-10-16 06:36:06 UTC

Description Junqi Zhao 2019-08-13 09:16:35 UTC
Created attachment 1603252
prometheus-k8s pod logs

Description of problem:
# oc -n openshift-monitoring logs prometheus-k8s-0 -c prometheus | grep many
level=warn ts=2019-08-13T06:53:04.487Z caller=manager.go:513 component="rule manager" group=alertmanager.rules msg="Evaluating rule failed" rule="alert: AlertmanagerConfigInconsistent\nexpr: count_values by(service) (\"config_hash\", alertmanager_config_hash{job=\"alertmanager-main\",namespace=\"openshift-monitoring\"})\n  / on(service) group_left() label_replace(prometheus_operator_spec_replicas{controller=\"alertmanager\",job=\"prometheus-operator\",namespace=\"openshift-monitoring\"},\n  \"service\", \"alertmanager-$1\", \"name\", \"(.*)\") != 1\nfor: 5m\nlabels:\n  severity: critical\nannotations:\n  message: The configuration of the instances of the Alertmanager cluster `{{$labels.service}}`\n    are out of sync.\n" err="found duplicate series for the match group {service=\"alertmanager-main\"} on the right hand-side of the operation: [{__name__=\"prometheus_operator_spec_replicas\", controller=\"alertmanager\", endpoint=\"http\", instance=\"10.131.0.17:8080\", job=\"prometheus-operator\", name=\"main\", namespace=\"openshift-monitoring\", pod=\"prometheus-operator-7665c99b6f-njsgm\", service=\"alertmanager-main\"}, {__name__=\"prometheus_operator_spec_replicas\", controller=\"alertmanager\", endpoint=\"http\", instance=\"10.129.2.21:8080\", job=\"prometheus-operator\", name=\"main\", namespace=\"openshift-monitoring\", pod=\"prometheus-operator-57d45fcf98-8jslj\", service=\"alertmanager-main\"}];many-to-many matching not allowed: matching labels must be unique on one side"
*********************************************

alert: AlertmanagerConfigInconsistent
expr: count_values by(service) ("config_hash", alertmanager_config_hash{job="alertmanager-main",namespace="openshift-monitoring"})
  / on(service) group_left() label_replace(prometheus_operator_spec_replicas{controller="alertmanager",job="prometheus-operator",namespace="openshift-monitoring"},
  "service", "alertmanager-$1", "name", "(.*)") != 1
for: 5m
labels:
  severity: critical
annotations:
  message: The configuration of the instances of the Alertmanager cluster `{{$labels.service}}`
    are out of sync.

Note: I have not seen this error before, so it is a 4.2 regression, but it does not appear to affect functionality.
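
To illustrate why the rule evaluation fails, here is a minimal Python sketch of the matching that `/ on(service) group_left()` performs. It is a simplified model, not Prometheus code: the helper names (`match_groups`, `divide_group_left`, `max_by`), the `config_hash` value, and the sample values of 3 are all hypothetical. Only the label sets on the right-hand side are taken from the log above. The sketch shows that two `prometheus_operator_spec_replicas` series from different prometheus-operator pods land in the same `{service="alertmanager-main"}` match group, and that collapsing them with an aggregation (something like `max by (...)` in PromQL) would make the right-hand side unique again. Whether the actual fix takes this form is not stated in this report.

```python
from collections import defaultdict


def match_groups(series, on):
    """Group (labels, value) pairs by the values of the `on` labels."""
    groups = defaultdict(list)
    for labels, value in series:
        key = tuple(labels.get(name, "") for name in on)
        groups[key].append((labels, value))
    return groups


def divide_group_left(left, right, on):
    """Simplified model of `left / on(...) group_left() right`.

    Several left-hand series may share a match group, but each group
    must map to exactly one right-hand series.
    """
    right_groups = match_groups(right, on)
    for key, members in right_groups.items():
        if len(members) > 1:
            raise ValueError(
                f"found duplicate series for the match group {dict(zip(on, key))}"
            )
    results = []
    for labels, value in left:
        key = tuple(labels.get(name, "") for name in on)
        if key in right_groups:
            results.append((labels, value / right_groups[key][0][1]))
    return results


def max_by(series, by):
    """Simplified model of `max by (...)`: keep the `by` labels, take the max."""
    out = {}
    for labels, value in series:
        key = tuple(labels.get(name, "") for name in by)
        out[key] = max(value, out.get(key, value))
    return [(dict(zip(by, key)), value) for key, value in out.items()]


# Left side: result of count_values by(service); hash and count are hypothetical.
left = [({"service": "alertmanager-main", "config_hash": "abc"}, 3)]

# Right side after label_replace: same service, different operator pods.
right = [
    ({"service": "alertmanager-main", "name": "main",
      "pod": "prometheus-operator-7665c99b6f-njsgm"}, 3),
    ({"service": "alertmanager-main", "name": "main",
      "pod": "prometheus-operator-57d45fcf98-8jslj"}, 3),
]

try:
    divide_group_left(left, right, ["service"])
    failed = False
except ValueError:
    failed = True  # two right-hand series share one match group

# Collapsing the duplicates first leaves one series per service.
deduped = max_by(right, ["service"])
result = divide_group_left(left, deduped, ["service"])
```

With the duplicates collapsed, `result` holds a single series whose value is 3 / 3 = 1.0, so the `!= 1` comparison in the rule would not fire for this sample data.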

Version-Release number of selected component (if applicable):
4.2.0-0.nightly-2019-08-12-153437

How reproducible:
Always

Steps to Reproduce:
1.
2.
3.

Actual results:


Expected results:


Additional info:

Comment 6 Junqi Zhao 2019-08-19 02:31:22 UTC
The issue is fixed in 4.2.0-0.nightly-2019-08-18-222019.
For the verification steps, see Comment 3.

Comment 7 errata-xmlrpc 2019-10-16 06:35:53 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2019:2922
