Bug 2211671

Summary: [KubeVirt] Additional metrics names failed promlint linter
Product: Container Native Virtualization (CNV) Reporter: Aviv Litman <alitman>
Component: Metrics Assignee: Shirly Radco <sradco>
Status: CLOSED MIGRATED QA Contact: Ahmad <ahafe>
Severity: medium Docs Contact:
Priority: unspecified    
Version: 4.13.0 CC: dbasunag, kmajcher, stirabos
Target Milestone: ---   
Target Release: 4.15.0   
Hardware: All   
OS: All   
Whiteboard:
Fixed In Version: hco-bundle-registry-container-v4.15.0.rhel9-1443 Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2023-12-14 16:18:11 UTC Type: Bug

Description Aviv Litman 2023-06-01 12:13:57 UTC
Description of problem:
We created a metric name linter in kubevirt based on promlint: https://github.com/kubevirt/kubevirt/pull/9709.
The following metric names failed:
kubevirt_allocatable_nodes_count: non-histogram and non-summary metrics should not have "_count" suffix
kubevirt_kvm_available_nodes_count: non-histogram and non-summary metrics should not have "_count" suffix
kubevirt_virt_api_up_total: non-counter metrics should not have "_total" suffix
kubevirt_virt_controller_ready_total: non-counter metrics should not have "_total" suffix
kubevirt_virt_controller_up_total: non-counter metrics should not have "_total" suffix
kubevirt_virt_handler_up_total: non-counter metrics should not have "_total" suffix
kubevirt_virt_operator_leading_total: non-counter metrics should not have "_total" suffix
kubevirt_virt_operator_ready_total: non-counter metrics should not have "_total" suffix
kubevirt_virt_operator_up_total: non-counter metrics should not have "_total" suffix
kubevirt_vmi_phase_count: non-histogram and non-summary metrics should not have "_count" suffix
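
The two rules reported above can be illustrated with a minimal sketch. This is not the promlint implementation or the KubeVirt linter; the function name lint_metric_name is hypothetical, and the metric type "gauge" is an assumption made for illustration (the messages only state that the metrics are non-counter or non-histogram/non-summary).

```python
from typing import List


def lint_metric_name(name: str, metric_type: str) -> List[str]:
    """Return promlint-style suffix problems for one metric (simplified sketch)."""
    problems = []
    # "_count" is reserved for the sample count series of histograms and summaries.
    if name.endswith("_count") and metric_type not in ("histogram", "summary"):
        problems.append(
            'non-histogram and non-summary metrics should not have "_count" suffix'
        )
    # "_total" is reserved for counters.
    if name.endswith("_total") and metric_type != "counter":
        problems.append('non-counter metrics should not have "_total" suffix')
    return problems


# Assumption: the metrics are gauges. Two failing names and one renamed name.
for metric in (
    "kubevirt_allocatable_nodes_count",
    "kubevirt_virt_api_up_total",
    "kubevirt_allocatable_nodes",
):
    print(metric, lint_metric_name(metric, "gauge"))
```

Under that assumption, the first two names each produce one problem and the renamed name produces none.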


Version-Release number of selected component (if applicable):
4.13

How reproducible:
100%

Steps to Reproduce:
1. cd kubevirt
2. make lint-metrics

Actual results:
Some metric names do not comply with the promlint rules.

Expected results:
Metric names are aligned with the promlint linter and Prometheus best practices.

Additional info:
For now, the metrics listed above are ignored by the linter.

Comment 1 Simone Tiraboschi 2023-10-03 12:52:30 UTC
Aviv,
do we need other code changes, or can we consider this ready for QE on v4.15?

Comment 3 Ahmad 2023-11-06 11:13:56 UTC
QA: Verified on cnv-4.15.
All metrics were renamed properly.

1) kubevirt_allocatable_nodes:
[cloud-user@ocp-psi-executor ~]$ oc exec -n openshift-monitoring prometheus-k8s-0 -c prometheus -- curl -s http://127.0.0.1:9090/api/v1/query?query=kubevirt_allocatable_nodes | jq .
{
  "status": "success",
  "data": {
    "resultType": "vector",
    "result": [
      {
        "metric": {
          "__name__": "kubevirt_allocatable_nodes"
        },
        "value": [
          1699267516.971,
          "6"
        ]
      }
    ]
  }
}
2) kubevirt_nodes_with_kvm
[cloud-user@ocp-psi-executor ~]$ oc exec -n openshift-monitoring prometheus-k8s-0 -c prometheus -- curl -s http://127.0.0.1:9090/api/v1/query?query=kubevirt_nodes_with_kvm | jq .
{
  "status": "success",
  "data": {
    "resultType": "vector",
    "result": [
      {
        "metric": {
          "__name__": "kubevirt_nodes_with_kvm"
        },
        "value": [
          1699267659.268,
          "3"
        ]
      }
    ]
  }
}
3) kubevirt_virt_api_up
[cloud-user@ocp-psi-executor ~]$ oc exec -n openshift-monitoring prometheus-k8s-0 -c prometheus -- curl -s http://127.0.0.1:9090/api/v1/query?query=kubevirt_virt_api_up | jq .
{
  "status": "success",
  "data": {
    "resultType": "vector",
    "result": [
      {
        "metric": {
          "__name__": "kubevirt_virt_api_up"
        },
        "value": [
          1699267840.703,
          "2"
        ]
      }
    ]
  }
}
4) kubevirt_virt_controller_ready
[cloud-user@ocp-psi-executor ~]$ oc exec -n openshift-monitoring prometheus-k8s-0 -c prometheus -- curl -s http://127.0.0.1:9090/api/v1/query?query=kubevirt_virt_controller_ready | jq .
{
  "status": "success",
  "data": {
    "resultType": "vector",
    "result": [
      {
        "metric": {
          "__name__": "kubevirt_virt_controller_ready"
        },
        "value": [
          1699268012.075,
          "2"
        ]
      }
    ]
  }
}

5) kubevirt_virt_controller_up
[cloud-user@ocp-psi-executor ~]$ oc exec -n openshift-monitoring prometheus-k8s-0 -c prometheus -- curl -s http://127.0.0.1:9090/api/v1/query?query=kubevirt_virt_controller_up | jq .
{
  "status": "success",
  "data": {
    "resultType": "vector",
    "result": [
      {
        "metric": {
          "__name__": "kubevirt_virt_controller_up"
        },
        "value": [
          1699268214.221,
          "2"
        ]
      }
    ]
  }
}
6) kubevirt_virt_handler_up
[cloud-user@ocp-psi-executor ~]$ oc exec -n openshift-monitoring prometheus-k8s-0 -c prometheus -- curl -s http://127.0.0.1:9090/api/v1/query?query=kubevirt_virt_handler_up | jq .
{
  "status": "success",
  "data": {
    "resultType": "vector",
    "result": [
      {
        "metric": {
          "__name__": "kubevirt_virt_handler_up"
        },
        "value": [
          1699268373.619,
          "3"
        ]
      }
    ]
  }
}
7) kubevirt_virt_operator_leading
[cloud-user@ocp-psi-executor ~]$ oc exec -n openshift-monitoring prometheus-k8s-0 -c prometheus -- curl -s http://127.0.0.1:9090/api/v1/query?query=kubevirt_virt_operator_leading | jq .
{
  "status": "success",
  "data": {
    "resultType": "vector",
    "result": [
      {
        "metric": {
          "__name__": "kubevirt_virt_operator_leading"
        },
        "value": [
          1699268907.950,
          "1"
        ]
      }
    ]
  }
}
8) kubevirt_virt_operator_ready
[cloud-user@ocp-psi-executor ~]$ oc exec -n openshift-monitoring prometheus-k8s-0 -c prometheus -- curl -s http://127.0.0.1:9090/api/v1/query?query=kubevirt_virt_operator_ready | jq .
{
  "status": "success",
  "data": {
    "resultType": "vector",
    "result": [
      {
        "metric": {
          "__name__": "kubevirt_virt_operator_ready"
        },
        "value": [
          1699269007.280,
          "2"
        ]
      }
    ]
  }
}
9) kubevirt_virt_operator_up
[cloud-user@ocp-psi-executor ~]$ oc exec -n openshift-monitoring prometheus-k8s-0 -c prometheus -- curl -s http://127.0.0.1:9090/api/v1/query?query=kubevirt_virt_operator_up | jq .
{
  "status": "success",
  "data": {
    "resultType": "vector",
    "result": [
      {
        "metric": {
          "__name__": "kubevirt_virt_operator_up"
        },
        "value": [
          1699269102.407,
          "2"
        ]
      }
    ]
  }
}
10) kubevirt_virt_operator_up
[cloud-user@ocp-psi-executor ~]$ oc exec -n openshift-monitoring prometheus-k8s-0 -c prometheus -- curl -s http://127.0.0.1:9090/api/v1/query?query=kubevirt_virt_operator_up | jq .
{
  "status": "success",
  "data": {
    "resultType": "vector",
    "result": [
      {
        "metric": {
          "__name__": "kubevirt_virt_operator_up"
        },
        "value": [
          1699269172.727,
          "2"
        ]
      }
    ]
  }
}
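
The manual checks above could also be scripted. The following is a minimal sketch, assuming the JSON printed by the curl command has been captured as text; the helper name instant_query_has_metric is hypothetical, and the sample response is copied from check 1.

```python
import json


def instant_query_has_metric(response_text: str, metric_name: str) -> bool:
    """Return True if a Prometheus instant-query response contains metric_name."""
    body = json.loads(response_text)
    if body.get("status") != "success":
        return False
    results = body.get("data", {}).get("result", [])
    # Each result carries its labels under "metric"; the name is "__name__".
    return any(r.get("metric", {}).get("__name__") == metric_name for r in results)


# Response copied from check 1 (kubevirt_allocatable_nodes).
SAMPLE = """
{
  "status": "success",
  "data": {
    "resultType": "vector",
    "result": [
      {
        "metric": {"__name__": "kubevirt_allocatable_nodes"},
        "value": [1699267516.971, "6"]
      }
    ]
  }
}
"""

print(instant_query_has_metric(SAMPLE, "kubevirt_allocatable_nodes"))
print(instant_query_has_metric(SAMPLE, "kubevirt_allocatable_nodes_count"))
```

A query for a metric name that no longer exists returns "status": "success" with an empty result list, so the helper checks the result entries rather than the status alone.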

Comment 4 Shirly Radco 2023-11-06 12:56:59 UTC
The kubevirt_vmi_phase_count metric is very old and heavily used.
We have decided to make it an exception and keep it as is.