Bug 1851685 - "many-to-many matching not allowed: matching labels must be unique on one side" warn info for "cluster:cpu_core_node_labels"
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: Monitoring
Version: 4.4
Hardware: Unspecified
OS: Unspecified
Priority: low
Severity: low
Target Milestone: ---
Target Release: 4.6.0
Assignee: Simon Pasquier
QA Contact: Junqi Zhao
URL:
Whiteboard:
Depends On:
Blocks:
 
Reported: 2020-06-28 08:00 UTC by Junqi Zhao
Modified: 2020-10-27 16:10 UTC (History)

Fixed In Version:
Doc Type: No Doc Update
Doc Text:
Clone Of:
Environment:
Last Closed: 2020-10-27 16:09:46 UTC
Target Upstream Version:
Embargoed:




Links
System ID Private Priority Status Summary Last Updated
Github openshift cluster-monitoring-operator pull 858 0 None closed jsonnet: avoid many-to-many errors for cluster:cpu_core_node_labels 2020-09-14 07:26:08 UTC
Red Hat Product Errata RHBA-2020:4196 0 None None None 2020-10-27 16:10:06 UTC

Description Junqi Zhao 2020-06-28 08:00:55 UTC
Description of problem:
After upgrading from 4.3.26 to 4.4.9, the Prometheus logs show the warning "many-to-many matching not allowed: matching labels must be unique on one side" for the rules "record: cluster:cpu_core_node_labels" and "record: node:node_num_cpu:sum". The issue with "record: node:node_num_cpu:sum" is tracked in bug 1834913; this BZ tracks only "record: cluster:cpu_core_node_labels".

# oc -n openshift-monitoring logs prometheus-k8s-1 -c prometheus | grep "many-to-many matching not allowed: matching labels must be unique on one side"
level=warn ts=2020-06-28T07:42:05.328Z caller=manager.go:525 component="rule manager" group=node.rules msg="Evaluating rule failed" rule="record: node:node_num_cpu:sum\nexpr: count by(cluster, node) (sum by(node, cpu) (node_cpu_seconds_total{job=\"node-exporter\"}\n  * on(namespace, pod) group_left(node) node_namespace_pod:kube_pod_info:))\n" err="found duplicate series for the match group {namespace=\"openshift-monitoring\", pod=\"alertmanager-main-0\"} on the right hand-side of the operation: [{__name__=\"node_namespace_pod:kube_pod_info:\", namespace=\"openshift-monitoring\", node=\"dyan-upg4326-4qvbt-worker-westus21-zf2ks\", pod=\"alertmanager-main-0\"}, {__name__=\"node_namespace_pod:kube_pod_info:\", namespace=\"openshift-monitoring\", node=\"dyan-upg4326-4qvbt-worker-westus21-mx55p\", pod=\"alertmanager-main-0\"}];many-to-many matching not allowed: matching labels must be unique on one side"
level=warn ts=2020-06-28T07:42:06.822Z caller=manager.go:525 component="rule manager" group=kubernetes.rules msg="Evaluating rule failed" rule="record: cluster:cpu_core_node_labels\nexpr: cluster:nodes_roles * on(node) group_right(label_beta_kubernetes_io_instance_type,\n  label_node_role_kubernetes_io, label_node_openshift_io_os_id, label_kubernetes_io_arch,\n  label_node_role_kubernetes_io_master, label_node_role_kubernetes_io_infra) label_replace(cluster:cpu_core_hyperthreading,\n  \"node\", \"$1\", \"instance\", \"(.*)\")\n" err="found duplicate series for the match group {node=\"dyan-upg4326-4qvbt-worker-westus21-mx55p\"} on the left hand-side of the operation: [{__name__=\"cluster:nodes_roles\", label_beta_kubernetes_io_arch=\"amd64\", label_beta_kubernetes_io_instance_type=\"Standard_D2s_v3\", label_beta_kubernetes_io_os=\"linux\", label_failure_domain_beta_kubernetes_io_region=\"westus2\", label_failure_domain_beta_kubernetes_io_zone=\"westus2-1\", label_kubernetes_io_arch=\"amd64\", label_kubernetes_io_hostname=\"dyan-upg4326-4qvbt-worker-westus21-mx55p\", label_kubernetes_io_os=\"linux\", label_node_openshift_io_os_id=\"rhcos\", namespace=\"openshift-monitoring\", node=\"dyan-upg4326-4qvbt-worker-westus21-mx55p\"}, {__name__=\"cluster:nodes_roles\", label_beta_kubernetes_io_arch=\"amd64\", label_beta_kubernetes_io_instance_type=\"Standard_D2s_v3\", label_beta_kubernetes_io_os=\"linux\", label_failure_domain_beta_kubernetes_io_region=\"westus2\", label_failure_domain_beta_kubernetes_io_zone=\"westus2-1\", label_kubernetes_io_arch=\"amd64\", label_kubernetes_io_hostname=\"dyan-upg4326-4qvbt-worker-westus21-mx55p\", label_kubernetes_io_os=\"linux\", label_node_kubernetes_io_instance_type=\"Standard_D2s_v3\", label_node_openshift_io_os_id=\"rhcos\", label_topology_kubernetes_io_region=\"westus2\", label_topology_kubernetes_io_zone=\"westus2-1\", namespace=\"openshift-monitoring\", node=\"dyan-upg4326-4qvbt-worker-westus21-mx55p\"}];many-to-many matching not allowed: matching labels must be unique on one side"
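
The second log entry reports two cluster:nodes_roles series that share the same node label on the "one" side of the on(node) match. As a rough illustration only (this is not Prometheus code), the Python sketch below groups label sets by the matching labels and reports the groups that contain more than one series. The function name and the shortened node names are hypothetical; the label sets are abbreviated from the log.

```python
from collections import defaultdict


def find_duplicate_match_groups(series, on):
    """Return the match groups (keyed by the `on` label values) that hold more than one series."""
    groups = defaultdict(list)
    for labels in series:
        # Series are matched only on the labels listed in `on`; all other labels are ignored.
        key = tuple(labels.get(name, "") for name in on)
        groups[key].append(labels)
    return {key: members for key, members in groups.items() if len(members) > 1}


# Hypothetical, abbreviated label sets modeled on the cluster:nodes_roles series in the log.
nodes_roles = [
    {"node": "worker-mx55p", "label_beta_kubernetes_io_instance_type": "Standard_D2s_v3"},
    {"node": "worker-mx55p", "label_beta_kubernetes_io_instance_type": "Standard_D2s_v3",
     "label_node_kubernetes_io_instance_type": "Standard_D2s_v3"},
    {"node": "worker-zf2ks", "label_beta_kubernetes_io_instance_type": "Standard_D2s_v3"},
]

duplicates = find_duplicate_match_groups(nodes_roles, on=["node"])
for key, members in duplicates.items():
    print(f"duplicate series for match group {key}: {len(members)} series")
```

Any group reported here corresponds to a match group for which a group_right match on node cannot pick a single left-hand series.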


Version-Release number of selected component (if applicable):
upgrade from 4.3.26 to 4.4.9

How reproducible:
sometimes

Steps to Reproduce:
1. See the description.

Actual results:


Expected results:


Additional info:

Comment 8 Junqi Zhao 2020-07-23 12:12:50 UTC
After upgrading from 4.5.3 to 4.6.0-0.nightly-2020-07-23-055513, the issue no longer occurs.
The expr has been changed to:
**************
  - expr: |
      topk by(node) (1, cluster:nodes_roles) * on (node)
        group_right( label_beta_kubernetes_io_instance_type, label_node_role_kubernetes_io, label_node_openshift_io_os_id, label_kubernetes_io_arch,
                     label_node_role_kubernetes_io_master, label_node_role_kubernetes_io_infra)
      label_replace( cluster:cpu_core_hyperthreading, "node", "$1", "instance", "(.*)" )
    record: cluster:cpu_core_node_labels
**************
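
The new expression wraps cluster:nodes_roles in topk by(node) (1, ...), which keeps at most one series per node while preserving that series' labels, so the left-hand side of the on(node) match becomes unique. The Python sketch below imitates that behavior on hypothetical data; the function name, the shortened node names, and the sample value 1.0 are assumptions, and on ties it simply keeps the first series seen, which may differ from what Prometheus selects.

```python
def topk_by(series, by, k=1):
    """Keep the k highest-valued (labels, value) pairs per group of `by` label values."""
    groups = {}
    for labels, value in series:
        key = tuple(labels.get(name, "") for name in by)
        groups.setdefault(key, []).append((labels, value))
    kept = []
    for members in groups.values():
        # Stable sort: series with equal values keep their original order.
        members.sort(key=lambda member: member[1], reverse=True)
        kept.extend(members[:k])
    return kept


# Hypothetical samples: two series for the same node, as in the reported error.
nodes_roles = [
    ({"node": "worker-mx55p", "label_node_openshift_io_os_id": "rhcos"}, 1.0),
    ({"node": "worker-mx55p", "label_node_openshift_io_os_id": "rhcos",
      "label_topology_kubernetes_io_zone": "westus2-1"}, 1.0),
    ({"node": "worker-zf2ks", "label_node_openshift_io_os_id": "rhcos"}, 1.0),
]

deduplicated = topk_by(nodes_roles, by=["node"], k=1)
for labels, value in deduplicated:
    print(labels["node"], value)
```

Because the kept series retains its full label set, the labels listed in group_right(...) can still be copied onto the result.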

Comment 10 errata-xmlrpc 2020-10-27 16:09:46 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (OpenShift Container Platform 4.6 GA Images), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2020:4196

