Created attachment 1623447 [details]
duplicate series for the match group

Description of problem:
See the attached file. Prometheus logs warnings like the one below, with err="found duplicate series for the match group", but this does not seem to affect functionality.

level=warn ts=2019-10-08T03:42:36.753Z caller=manager.go:513 component="rule manager" group=kubernetes.rules msg="Evaluating rule failed" rule="record: node_role_os_version_machine:cpu_capacity_cores:sum\nexpr: sum by(label_node_openshift_io_os_id, label_kubernetes_io_arch, label_node_role_kubernetes_io_master_infra,\n label_node_role_kubernetes_io_master, label_node_role_kubernetes_io_infra) ((cluster:master_infra_nodes\n * on(node) group_left() kube_node_status_capacity_cpu_cores) or on(node) (cluster:master_nodes\n * on(node) group_left() kube_node_status_capacity_cpu_cores) or on(node) (cluster:infra_nodes\n * on(node) group_left() kube_node_status_capacity_cpu_cores) or on(node) (kube_node_labels\n * on(node) group_left() kube_node_status_capacity_cpu_cores))\n" err="found duplicate series for the match group {node=\"qe-jiazha-42-7tks4-master-0\"} on the right hand-side of the operation: [{__name__=\"kube_node_status_capacity_cpu_cores\", endpoint=\"https-main\", instance=\"10.131.0.4:8443\", job=\"kube-state-metrics\", namespace=\"openshift-monitoring\", node=\"qe-jiazha-42-7tks4-master-0\", pod=\"kube-state-metrics-65d5886446-69dhf\", service=\"kube-state-metrics\"}, {__name__=\"kube_node_status_capacity_cpu_cores\", endpoint=\"https-main\", instance=\"10.131.0.23:8443\", job=\"kube-state-metrics\", namespace=\"openshift-monitoring\", node=\"qe-jiazha-42-7tks4-master-0\", pod=\"kube-state-metrics-65d5886446-csvxl\", service=\"kube-state-metrics\"}];many-to-many matching not allowed: matching labels must be unique on one side"

Version-Release number of selected component (if applicable):
4.2.0-0.nightly-2019-10-07-161806

How reproducible:
Sometimes

Steps to Reproduce:
1. oc -n openshift-monitoring logs -c prometheus prometheus-k8s-0

Actual results:

Expected results:

Additional info:
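To illustrate why the rule evaluation fails, here is a minimal Python sketch that groups the two right-hand-side series from the error message by the `on(node)` matching label. It is a simplified model of the uniqueness check, not Prometheus' actual implementation, and the helper names `duplicate_match_groups` and `differing_labels` are hypothetical. The label sets are copied from the log; nothing else about the cluster is assumed.

```python
from collections import defaultdict

# The two right-hand-side series, copied from the error message in the log.
RHS_SERIES = [
    {
        "__name__": "kube_node_status_capacity_cpu_cores",
        "endpoint": "https-main",
        "instance": "10.131.0.4:8443",
        "job": "kube-state-metrics",
        "namespace": "openshift-monitoring",
        "node": "qe-jiazha-42-7tks4-master-0",
        "pod": "kube-state-metrics-65d5886446-69dhf",
        "service": "kube-state-metrics",
    },
    {
        "__name__": "kube_node_status_capacity_cpu_cores",
        "endpoint": "https-main",
        "instance": "10.131.0.23:8443",
        "job": "kube-state-metrics",
        "namespace": "openshift-monitoring",
        "node": "qe-jiazha-42-7tks4-master-0",
        "pod": "kube-state-metrics-65d5886446-csvxl",
        "service": "kube-state-metrics",
    },
]


def duplicate_match_groups(series, on):
    """Group label sets by the `on` labels; return groups with more than one member."""
    groups = defaultdict(list)
    for labels in series:
        key = tuple((name, labels.get(name, "")) for name in on)
        groups[key].append(labels)
    return {key: members for key, members in groups.items() if len(members) > 1}


def differing_labels(members):
    """Return the sorted label names whose values differ within one group."""
    names = set().union(*members)
    return sorted(n for n in names if len({m.get(n) for m in members}) > 1)


if __name__ == "__main__":
    for key, members in duplicate_match_groups(RHS_SERIES, on=["node"]).items():
        print(dict(key), "->", len(members), "series; differ in", differing_labels(members))
```

In this model the two series collide on `node` and differ only in `instance` and `pod`, which is consistent with the same node being reported by two kube-state-metrics pods at the time the rule was evaluated. The log does not say why both pods were being scraped.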
Can you have a look at this? It seems there is a duplicate match for the rule you added. Thanks!
Can you provide the results of the kube_node_role prometheus query? I'd like to understand what the node_role labels were on the cluster in order to reproduce.
(In reply to Chris Hambridge from comment #2)
> Can you provide the results of the kube_node_role prometheus query? I'd like
> to understand what the node_role labels were on the cluster in order to
> reproduce.

It does not reproduce every time. We will provide the info the next time we hit it.
Created attachment 1624943 [details] monitoring dump
Created attachment 1626290 [details] 4.3 monitoring dump
Tested with 4.3.0-0.nightly-2019-11-25-153929; the issue did not occur.
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHBA-2020:0062