Bug 1740559

Summary: "many-to-many matching not allowed" for AlertmanagerConfigInconsistent rule
Product: OpenShift Container Platform
Reporter: Junqi Zhao <juzhao>
Component: Monitoring
Assignee: Lili Cosic <lcosic>
Status: CLOSED ERRATA
QA Contact: Junqi Zhao <juzhao>
Severity: low
Docs Contact:
Priority: low
Version: 4.2.0
CC: alegrand, anpicker, erooth, lcosic, mloibl, pkrupa, surbania
Target Milestone: ---
Keywords: Regression
Target Release: 4.2.0
Hardware: Unspecified
OS: Unspecified
Whiteboard:
Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of:
: 1745379
Environment:
Last Closed: 2019-10-16 06:35:53 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:
Bug Depends On:    
Bug Blocks: 1745379    
Attachments:
Description Flags
prometheus-k8s pod logs none

Description Junqi Zhao 2019-08-13 09:16:35 UTC
Created attachment 1603252
prometheus-k8s pod logs

Description of problem:
# oc -n openshift-monitoring logs prometheus-k8s-0 -c prometheus | grep many
level=warn ts=2019-08-13T06:53:04.487Z caller=manager.go:513 component="rule manager" group=alertmanager.rules msg="Evaluating rule failed" rule="alert: AlertmanagerConfigInconsistent\nexpr: count_values by(service) (\"config_hash\", alertmanager_config_hash{job=\"alertmanager-main\",namespace=\"openshift-monitoring\"})\n  / on(service) group_left() label_replace(prometheus_operator_spec_replicas{controller=\"alertmanager\",job=\"prometheus-operator\",namespace=\"openshift-monitoring\"},\n  \"service\", \"alertmanager-$1\", \"name\", \"(.*)\") != 1\nfor: 5m\nlabels:\n  severity: critical\nannotations:\n  message: The configuration of the instances of the Alertmanager cluster `{{$labels.service}}`\n    are out of sync.\n" err="found duplicate series for the match group {service=\"alertmanager-main\"} on the right hand-side of the operation: [{__name__=\"prometheus_operator_spec_replicas\", controller=\"alertmanager\", endpoint=\"http\", instance=\"10.131.0.17:8080\", job=\"prometheus-operator\", name=\"main\", namespace=\"openshift-monitoring\", pod=\"prometheus-operator-7665c99b6f-njsgm\", service=\"alertmanager-main\"}, {__name__=\"prometheus_operator_spec_replicas\", controller=\"alertmanager\", endpoint=\"http\", instance=\"10.129.2.21:8080\", job=\"prometheus-operator\", name=\"main\", namespace=\"openshift-monitoring\", pod=\"prometheus-operator-57d45fcf98-8jslj\", service=\"alertmanager-main\"}];many-to-many matching not allowed: matching labels must be unique on one side"
*********************************************

alert: AlertmanagerConfigInconsistent
expr: count_values by(service) ("config_hash", alertmanager_config_hash{job="alertmanager-main",namespace="openshift-monitoring"})
  / on(service) group_left() label_replace(prometheus_operator_spec_replicas{controller="alertmanager",job="prometheus-operator",namespace="openshift-monitoring"},
  "service", "alertmanager-$1", "name", "(.*)") != 1
for: 5m
labels:
  severity: critical
annotations:
  message: The configuration of the instances of the Alertmanager cluster `{{$labels.service}}`
    are out of sync.

Note: I have not seen this error before, so it is a 4.2 regression, although it does not appear to affect functionality.
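
For readers unfamiliar with the error, here is a minimal Python sketch of why the rule fails. It is a simplified model of many-to-one matching with on(service) group_left(), not Prometheus's actual implementation. The pod names are taken from the log above; the sample values (3.0) and the config hash are hypothetical, and the helper names (divide_group_left, max_by) are made up for illustration. Aggregating the right-hand side, as done at the end, is only one possible way to make it unique and is not necessarily the fix that shipped.

```python
class ManyToManyError(Exception):
    """Raised when the 'one' side of a many-to-one match is not unique."""


def match_key(labels, names):
    # Build the match-group key from the chosen label names.
    return tuple(labels.get(name, "") for name in names)


def divide_group_left(left, right, on):
    """Model of `left / on(<on>) group_left() right`.

    Each series is a (labels, value) pair. The right side is the 'one'
    side, so every match group may contain at most one right-hand series.
    """
    one_side = {}
    for labels, value in right:
        key = match_key(labels, on)
        if key in one_side:
            group = dict(zip(on, key))
            raise ManyToManyError(
                f"found duplicate series for the match group {group} "
                "on the right hand-side of the operation"
            )
        one_side[key] = value
    return [
        (labels, value / one_side[match_key(labels, on)])
        for labels, value in left
        if match_key(labels, on) in one_side
    ]


def max_by(series, by):
    """Model of `max by(<by>) (...)`: keep one series per label set."""
    best = {}
    for labels, value in series:
        key = match_key(labels, by)
        best[key] = max(best.get(key, value), value)
    return [(dict(zip(by, key)), value) for key, value in best.items()]


# Left side: result of count_values by(service) (hypothetical hash and count).
LEFT = [({"service": "alertmanager-main", "config_hash": "hypothetical-hash"}, 3.0)]

# Right side after label_replace: two prometheus-operator pods export the
# same metric, so both series carry service="alertmanager-main".
RIGHT = [
    ({"service": "alertmanager-main", "pod": "prometheus-operator-7665c99b6f-njsgm"}, 3.0),
    ({"service": "alertmanager-main", "pod": "prometheus-operator-57d45fcf98-8jslj"}, 3.0),
]

if __name__ == "__main__":
    try:
        divide_group_left(LEFT, RIGHT, ["service"])
    except ManyToManyError as err:
        print("evaluation failed:", err)

    # Collapsing the right side to one series per service removes the duplicate.
    print(divide_group_left(LEFT, max_by(RIGHT, ["service"]), ["service"]))
```

In this model the duplicate appears whenever two prometheus-operator pods are scraped at the same time, which matches the two different pod names in the error message.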

Version-Release number of selected component (if applicable):
4.2.0-0.nightly-2019-08-12-153437

How reproducible:
Always

Steps to Reproduce:
1.
2.
3.

Actual results:


Expected results:


Additional info:

Comment 6 Junqi Zhao 2019-08-19 02:31:22 UTC
The issue is fixed in 4.2.0-0.nightly-2019-08-18-222019.
For the verification steps, see Comment 3.

Comment 7 errata-xmlrpc 2019-10-16 06:35:53 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2019:2922