Bug 1759469
| Summary: | [4.2.z] sometimes find "found duplicate series for the match group" error in prometheus-k8s pod logs | | |
|---|---|---|---|
| Product: | OpenShift Container Platform | Reporter: | Junqi Zhao <juzhao> |
| Component: | Monitoring | Assignee: | Paul Gier <pgier> |
| Status: | CLOSED ERRATA | QA Contact: | Junqi Zhao <juzhao> |
| Severity: | low | Docs Contact: | |
| Priority: | low | | |
| Version: | 4.2.z | CC: | alegrand, anpicker, chambrid, erooth, kakkoyun, lcosic, mloibl, pkrupa, surbania |
| Target Milestone: | --- | | |
| Target Release: | 4.3.0 | | |
| Hardware: | Unspecified | | |
| OS: | Unspecified | | |
| Whiteboard: | | | |
| Fixed In Version: | | Doc Type: | No Doc Update |
| Doc Text: | | Story Points: | --- |
| Clone Of: | | Environment: | |
| Last Closed: | 2020-05-13 21:27:12 UTC | Type: | Bug |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | | Category: | --- |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | | | |

Can you have a look at this? It seems there is a duplicate match for the rule you added. Thanks!

Can you provide the results of the kube_node_role prometheus query? I'd like to understand what the node_role labels were on the cluster in order to reproduce.

(In reply to Chris Hambridge from comment #2)
> Can you provide the results of the kube_node_role prometheus query? I'd like
> to understand what the node_role labels were on the cluster in order to
> reproduce.

It is not reproduced every time; we will provide the info the next time we hit it.

Created attachment 1624943 [details]
monitoring dump

Created attachment 1626290 [details]
4.3 monitoring dump

Tested with 4.3.0-0.nightly-2019-11-25-153929; the issue no longer occurs.

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA.

For information on the advisory, and where to find the updated files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2020:0062

Created attachment 1623447 [details]
duplicate series for the match group

Description of problem:
See the attached file. The prometheus-k8s logs contain warnings like the one below, with err="found duplicate series for the match group", but it does not seem to affect functionality.

level=warn ts=2019-10-08T03:42:36.753Z caller=manager.go:513 component="rule manager" group=kubernetes.rules msg="Evaluating rule failed" rule="record: node_role_os_version_machine:cpu_capacity_cores:sum\nexpr: sum by(label_node_openshift_io_os_id, label_kubernetes_io_arch, label_node_role_kubernetes_io_master_infra,\n label_node_role_kubernetes_io_master, label_node_role_kubernetes_io_infra) ((cluster:master_infra_nodes\n * on(node) group_left() kube_node_status_capacity_cpu_cores) or on(node) (cluster:master_nodes\n * on(node) group_left() kube_node_status_capacity_cpu_cores) or on(node) (cluster:infra_nodes\n * on(node) group_left() kube_node_status_capacity_cpu_cores) or on(node) (kube_node_labels\n * on(node) group_left() kube_node_status_capacity_cpu_cores))\n" err="found duplicate series for the match group {node=\"qe-jiazha-42-7tks4-master-0\"} on the right hand-side of the operation: [{__name__=\"kube_node_status_capacity_cpu_cores\", endpoint=\"https-main\", instance=\"10.131.0.4:8443\", job=\"kube-state-metrics\", namespace=\"openshift-monitoring\", node=\"qe-jiazha-42-7tks4-master-0\", pod=\"kube-state-metrics-65d5886446-69dhf\", service=\"kube-state-metrics\"}, {__name__=\"kube_node_status_capacity_cpu_cores\", endpoint=\"https-main\", instance=\"10.131.0.23:8443\", job=\"kube-state-metrics\", namespace=\"openshift-monitoring\", node=\"qe-jiazha-42-7tks4-master-0\", pod=\"kube-state-metrics-65d5886446-csvxl\", service=\"kube-state-metrics\"}];many-to-many matching not allowed: matching labels must be unique on one side"

Version-Release number of selected component (if applicable):
4.2.0-0.nightly-2019-10-07-161806

How reproducible:
Sometimes

Steps to Reproduce:
1. oc -n openshift-monitoring logs -c prometheus prometheus-k8s-0

Actual results:

Expected results:

Additional info:
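
Editorial illustration of the error in the log: with `* on(node) group_left()`, the right-hand side may hold only one series per `node`, and here two kube-state-metrics pods each report `kube_node_status_capacity_cpu_cores` for the same node. The following is a minimal Python sketch of that matching rule, not Prometheus code. The function names `group_left_multiply` and `max_by` and the CPU value 4.0 are assumptions made for illustration; only the node and pod names come from the log. Deduplicating with `max by(node)` is shown as one possible way to make the right-hand side unique, not necessarily the fix shipped in the advisory.

```python
from collections import defaultdict


def group_left_multiply(left, right, on):
    """Toy model of `left * on(<labels>) group_left() right`.

    Samples are (labels_dict, value) pairs. The right-hand side must hold
    at most one series per match group; otherwise the join is rejected.
    """
    right_by_key = defaultdict(list)
    for labels, value in right:
        key = tuple(labels.get(name, "") for name in on)
        right_by_key[key].append(value)

    for key, values in right_by_key.items():
        if len(values) > 1:
            raise ValueError(
                f"found duplicate series for the match group {dict(zip(on, key))} "
                "on the right hand-side of the operation"
            )

    result = []
    for labels, value in left:
        key = tuple(labels.get(name, "") for name in on)
        if key in right_by_key:
            # group_left keeps the labels of the left-hand ("many") side.
            result.append((labels, value * right_by_key[key][0]))
    return result


def max_by(samples, by):
    """Toy model of `max by(<labels>) (...)`: one series per label group."""
    best = {}
    for labels, value in samples:
        key = tuple(labels.get(name, "") for name in by)
        if key not in best or value > best[key]:
            best[key] = value
    return [(dict(zip(by, key)), value) for key, value in best.items()]


NODE = "qe-jiazha-42-7tks4-master-0"

# Left side: a role series such as cluster:master_nodes (value assumed to be 1).
left = [({"node": NODE, "label_node_role_kubernetes_io_master": "true"}, 1.0)]

# Right side: the same node reported by two kube-state-metrics pods.
# The value 4.0 is hypothetical; the log does not show sample values.
right = [
    ({"node": NODE, "pod": "kube-state-metrics-65d5886446-69dhf"}, 4.0),
    ({"node": NODE, "pod": "kube-state-metrics-65d5886446-csvxl"}, 4.0),
]

if __name__ == "__main__":
    try:
        group_left_multiply(left, right, ["node"])
    except ValueError as err:
        print("rule evaluation failed:", err)

    # Collapsing the right side to one series per node lets the join succeed.
    print(group_left_multiply(left, max_by(right, ["node"]), ["node"]))
```

In this model the join fails only while both pods' series are present for the same node, which is consistent with the report that the warning appears only some of the time.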