Bug 1879520

Summary: "many-to-many matching not allowed: matching labels must be unique on one side" warn info for "record: cluster:kubelet_volume_stats_used_bytes:provisioner:sum"
Product: OpenShift Container Platform
Reporter: Junqi Zhao <juzhao>
Component: Monitoring
Assignee: Pawel Krupa <pkrupa>
Status: CLOSED DUPLICATE
QA Contact: Junqi Zhao <juzhao>
Severity: low
Priority: low
Docs Contact:
Version: 4.6
CC: akhaire, alegrand, anpicker, armin.kunaschik, asachan, carl-johan.schenstrom, deads, erooth, gparente, hgomes, jnaess, kakkoyun, lcosic, mbukatov, mloibl, naygupta, pkrupa, spasquie, surbania, wking
Target Milestone: ---
Keywords: Regression, UpcomingSprint
Target Release: 4.7.0
Hardware: Unspecified
OS: Unspecified
Whiteboard:
Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2020-12-04 13:17:14 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:
Attachments:
- monitoring dump file (flags: none)
- monitoring dump file (flags: none)

Description Junqi Zhao 2020-09-16 13:01:03 UTC
Created attachment 1715078 [details]
monitoring dump file

Description of problem:
847 "many-to-many matching not allowed: matching labels must be unique on one side" errors for "record: cluster:kubelet_volume_stats_used_bytes:provisioner:sum".
The errors occur only for persistentvolumeclaim="hive-warehouse-data", which is on node="ip-10-0-131-217.eu-west-1.compute.internal"; the other PVCs do not hit this error.
# oc -n openshift-monitoring logs -c prometheus prometheus-k8s-1 | grep "record: cluster:kubelet_volume_stats_used_bytes:provisioner:sum" | tail -n 3
level=warn ts=2020-09-16T12:47:06.762Z caller=manager.go:577 component="rule manager" group=kubernetes.rules msg="Evaluating rule failed" rule="record: cluster:kubelet_volume_stats_used_bytes:provisioner:sum\nexpr: sum by(provisioner) (kubelet_volume_stats_used_bytes * on(namespace, persistentvolumeclaim) group_right() (kube_persistentvolumeclaim_info * on(storageclass) group_left(provisioner) kube_storageclass_info))\n" err="found duplicate series for the match group {namespace=\"openshift-metering\", persistentvolumeclaim=\"hive-warehouse-data\"} on the left hand-side of the operation: [{__name__=\"kubelet_volume_stats_used_bytes\", endpoint=\"https-metrics\", instance=\"10.0.179.117:10250\", job=\"kubelet\", metrics_path=\"/metrics\", namespace=\"openshift-metering\", node=\"ip-10-0-179-117.eu-west-1.compute.internal\", persistentvolumeclaim=\"hive-warehouse-data\", service=\"kubelet\"}, {__name__=\"kubelet_volume_stats_used_bytes\", endpoint=\"https-metrics\", instance=\"10.0.131.217:10250\", job=\"kubelet\", metrics_path=\"/metrics\", namespace=\"openshift-metering\", node=\"ip-10-0-131-217.eu-west-1.compute.internal\", persistentvolumeclaim=\"hive-warehouse-data\", service=\"kubelet\"}];many-to-many matching not allowed: matching labels must be unique on one side"
level=warn ts=2020-09-16T12:47:36.763Z caller=manager.go:577 component="rule manager" group=kubernetes.rules msg="Evaluating rule failed" rule="record: cluster:kubelet_volume_stats_used_bytes:provisioner:sum\nexpr: sum by(provisioner) (kubelet_volume_stats_used_bytes * on(namespace, persistentvolumeclaim) group_right() (kube_persistentvolumeclaim_info * on(storageclass) group_left(provisioner) kube_storageclass_info))\n" err="found duplicate series for the match group {namespace=\"openshift-metering\", persistentvolumeclaim=\"hive-warehouse-data\"} on the left hand-side of the operation: [{__name__=\"kubelet_volume_stats_used_bytes\", endpoint=\"https-metrics\", instance=\"10.0.179.117:10250\", job=\"kubelet\", metrics_path=\"/metrics\", namespace=\"openshift-metering\", node=\"ip-10-0-179-117.eu-west-1.compute.internal\", persistentvolumeclaim=\"hive-warehouse-data\", service=\"kubelet\"}, {__name__=\"kubelet_volume_stats_used_bytes\", endpoint=\"https-metrics\", instance=\"10.0.131.217:10250\", job=\"kubelet\", metrics_path=\"/metrics\", namespace=\"openshift-metering\", node=\"ip-10-0-131-217.eu-west-1.compute.internal\", persistentvolumeclaim=\"hive-warehouse-data\", service=\"kubelet\"}];many-to-many matching not allowed: matching labels must be unique on one side"
level=warn ts=2020-09-16T12:48:06.767Z caller=manager.go:577 component="rule manager" group=kubernetes.rules msg="Evaluating rule failed" rule="record: cluster:kubelet_volume_stats_used_bytes:provisioner:sum\nexpr: sum by(provisioner) (kubelet_volume_stats_used_bytes * on(namespace, persistentvolumeclaim) group_right() (kube_persistentvolumeclaim_info * on(storageclass) group_left(provisioner) kube_storageclass_info))\n" err="found duplicate series for the match group {namespace=\"openshift-metering\", persistentvolumeclaim=\"hive-warehouse-data\"} on the left hand-side of the operation: [{__name__=\"kubelet_volume_stats_used_bytes\", endpoint=\"https-metrics\", instance=\"10.0.179.117:10250\", job=\"kubelet\", metrics_path=\"/metrics\", namespace=\"openshift-metering\", node=\"ip-10-0-179-117.eu-west-1.compute.internal\", persistentvolumeclaim=\"hive-warehouse-data\", service=\"kubelet\"}, {__name__=\"kubelet_volume_stats_used_bytes\", endpoint=\"https-metrics\", instance=\"10.0.131.217:10250\", job=\"kubelet\", metrics_path=\"/metrics\", namespace=\"openshift-metering\", node=\"ip-10-0-131-217.eu-west-1.compute.internal\", persistentvolumeclaim=\"hive-warehouse-data\", service=\"kubelet\"}];many-to-many matching not allowed: matching labels must be unique on one side"
# oc -n openshift-monitoring logs -c prometheus prometheus-k8s-1 | grep "record: cluster:kubelet_volume_stats_used_bytes:provisioner:sum" | wc -l
847

record: cluster:kubelet_volume_stats_used_bytes:provisioner:sum 
expr: sum by(provisioner) (kubelet_volume_stats_used_bytes * on(namespace, persistentvolumeclaim) group_right() (kube_persistentvolumeclaim_info * on(storageclass) group_left(provisioner) kube_storageclass_info)) 
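For reference, the duplicate left-hand-side series behind this warning can be surfaced directly with an ad-hoc query in the Prometheus UI (illustrative only, not one of the shipped recording rules):

```promql
# Any value > 1 means a (namespace, persistentvolumeclaim) match group
# contributes more than one kubelet_volume_stats_used_bytes series,
# which violates the uniqueness requirement on the "one" side of the join.
count by (namespace, persistentvolumeclaim) (kubelet_volume_stats_used_bytes) > 1
```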

# oc -n openshift-metering get pvc hive-warehouse-data -oyaml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  annotations:
    pv.kubernetes.io/bind-completed: "yes"
    pv.kubernetes.io/bound-by-controller: "yes"
  creationTimestamp: "2020-09-16T05:45:48Z"
  finalizers:
  - kubernetes.io/pvc-protection
  labels:
    metering.openshift.io/ns-prune: openshift-metering
    metering.openshift.io/prune: hive-shared-volume-pvc
  name: hive-warehouse-data
  namespace: openshift-metering
  ownerReferences:
  - apiVersion: metering.openshift.io/v1
    kind: MeteringConfig
    name: openshift-metering
    uid: 48181cdd-ca6a-48d0-80ac-20f6ca21a5f5
  resourceVersion: "356975"
  selfLink: /api/v1/namespaces/openshift-metering/persistentvolumeclaims/hive-warehouse-data
  uid: 4a3fcbbf-66f2-4113-b815-d8ab955f1fec
spec:
  accessModes:
  - ReadWriteMany
  resources:
    requests:
      storage: 5Gi
  storageClassName: sc-openshift-metering
  volumeMode: Filesystem
  volumeName: pv-openshift-metering
status:
  accessModes:
  - ReadWriteMany
  capacity:
    storage: 5Gi
  phase: Bound
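Note that the PVC above is ReadWriteMany, so it can be mounted by pods on more than one node at the same time; each node's kubelet then exports its own kubelet_volume_stats_used_bytes series for the same (namespace, persistentvolumeclaim) pair, which is exactly the duplicate match group reported in the warning. This can be confirmed with a query such as:

```promql
# Expect one series per node that mounts the RWX volume
# (the log above shows two: ip-10-0-179-117 and ip-10-0-131-217).
kubelet_volume_stats_used_bytes{namespace="openshift-metering", persistentvolumeclaim="hive-warehouse-data"}
```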

Version-Release number of selected component (if applicable):
4.6.0-0.nightly-2020-09-15-171211

How reproducible:
not sure

Steps to Reproduce:
1. oc -n openshift-monitoring logs -c prometheus prometheus-k8s-1 | grep "record: cluster:kubelet_volume_stats_used_bytes:provisioner:sum"

Actual results:
"many-to-many matching not allowed: matching labels must be unique on one side" warnings are logged for "record: cluster:kubelet_volume_stats_used_bytes:provisioner:sum"

Expected results:
No errors; the recording rule evaluates cleanly.

Additional info:
All the log files are in the attached gz file.

Comment 1 Pawel Krupa 2020-09-16 13:19:58 UTC
I cannot reproduce this. Could you share the output values for the following metrics:

- kubelet_volume_stats_used_bytes
- kube_persistentvolumeclaim_info
- kube_storageclass_info

Comment 2 Junqi Zhao 2020-09-16 13:32:51 UTC
Created attachment 1715088 [details]
monitoring dump file

Comment 6 Pawel Krupa 2020-10-09 11:17:38 UTC
*** Bug 1886177 has been marked as a duplicate of this bug. ***

Comment 11 hgomes 2020-11-18 17:25:17 UTC
Created solution https://access.redhat.com/solutions/5459581 to track this BZ.

Comment 12 Armin Kunaschik 2020-11-23 11:24:22 UTC
@hgomes The KCS is about 4.4. This BZ is about 4.6 and describes a different issue.
Killing the node-exporter pods does not fix the 4.6 issue.

Comment 13 German Parente 2020-11-23 13:54:56 UTC
@Armin.

The article (currently in progress) corresponding to this issue is:

https://access.redhat.com/solutions/5594541

I don't think there's a workaround for this, since it's related to several storage classes with the same provisioner, as stated earlier in this bug report.

The rule has to be reworked; the only interim option would be to silence the rule, imho. Engineering will give more details as soon as possible.
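For context, one possible rework of the rule (a sketch only, not necessarily the final fix) is to collapse the per-node duplicates before the join, for example by taking the max across nodes for each PVC:

```promql
# Aggregating away the node/instance labels first guarantees exactly one
# series per (namespace, persistentvolumeclaim) on the left-hand side,
# so the one-to-many requirement of the vector match is satisfied.
sum by (provisioner) (
  max by (namespace, persistentvolumeclaim) (kubelet_volume_stats_used_bytes)
  * on (namespace, persistentvolumeclaim) group_right()
  (kube_persistentvolumeclaim_info * on (storageclass) group_left(provisioner) kube_storageclass_info)
)
```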

Comment 15 Anmol Sachan 2020-11-30 14:21:02 UTC
*** Bug 1897674 has been marked as a duplicate of this bug. ***

Comment 20 Pawel Krupa 2020-12-04 13:17:14 UTC
Closing in favor of 1903464.

*** This bug has been marked as a duplicate of bug 1903464 ***

Comment 21 Red Hat Bugzilla 2023-09-15 00:48:14 UTC
The needinfo request[s] on this closed bug have been removed, as they have been unresolved for 500 days.