Bug 2148383

Summary: Migration metric values are not summed across all VMIs
Product: Container Native Virtualization (CNV) Reporter: Akriti Gupta <akrgupta>
Component: Metrics Assignee: João Vilaça <jvilaca>
Status: CLOSED ERRATA QA Contact: Akriti Gupta <akrgupta>
Severity: unspecified Docs Contact:
Priority: unspecified    
Version: 4.12.0 CC: dbasunag, dshchedr, jvilaca, kmajcher, sradco, stirabos
Target Milestone: ---   
Target Release: 4.12.1   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version: hco-bundle-registry-v4.12.1-39 Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of:
: 2164807 Environment:
Last Closed: 2023-09-05 16:29:19 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Bug Depends On:    
Bug Blocks: 2164807    

Description Akriti Gupta 2022-11-25 09:31:27 UTC
Description of problem:
As mentioned in https://github.com/kubevirt/kubevirt/blob/main/docs/metrics.md, the migration metrics
kubevirt_migrate_vmi_scheduling_count,
kubevirt_migrate_vmi_running_count,
kubevirt_migrate_vmi_succeeded_total and
kubevirt_migrate_vmi_failed_total
should represent the total count, i.e. the sum of the values from all VMIs.
Instead, they report values separated per VMI.

Comment 2 João Vilaça 2022-11-25 10:34:29 UTC
We agreed on this for visibility purposes and some technical details.
But I agree it can be a bit confusing.

My suggestion would be:

1) Rename kubevirt_migrate_vmi_succeeded_total and kubevirt_migrate_vmi_failed_total to drop the _total suffix:

kubevirt_migrate_vmi_succeeded
kubevirt_migrate_vmi_failed

which would still maintain the labels.


2) Create two recording rules:

kubevirt_migrate_vmi_succeeded_total
kubevirt_migrate_vmi_failed_total

which would be the sums of kubevirt_migrate_vmi_succeeded and kubevirt_migrate_vmi_failed, respectively.

My question would be, should the total metrics still have the namespace label or not?

@sradco what do you think?

Comment 3 sgott 2022-11-28 13:05:34 UTC
Changing component to metrics as this BZ/discussion appears to be in that domain. Please feel free to revert this if that's not the correct choice.

Comment 4 Shirly Radco 2022-11-29 12:46:33 UTC
I agree with dropping the total from the names, since the metrics are granular, at the VM name level.
I don't see a need to create the total metrics/recording rules at this point.
The total can easily be obtained with a PromQL query, either across all namespaces or per namespace, depending on the need.
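
As a rough illustration of the aggregation described above, the sketch below models what a PromQL sum would compute over per-VMI series of the renamed kubevirt_migrate_vmi_succeeded metric. The label names ("namespace", "vmi"), the sample values, and the sum_by helper are assumptions made up for this example and are not taken from KubeVirt. The PromQL expressions in the comments use the standard sum aggregation.

```python
from collections import defaultdict

# Hypothetical per-VMI samples of kubevirt_migrate_vmi_succeeded.
# Label names ("namespace", "vmi") and values are assumptions for illustration.
samples = [
    ({"namespace": "ns-a", "vmi": "vm-1"}, 2),
    ({"namespace": "ns-a", "vmi": "vm-2"}, 1),
    ({"namespace": "ns-b", "vmi": "vm-3"}, 4),
]


def sum_by(samples, *labels):
    """Sum sample values grouped by the given labels, like PromQL `sum by (...)`."""
    totals = defaultdict(int)
    for label_set, value in samples:
        # Series that share the same values for the grouping labels are merged.
        key = tuple(label_set[name] for name in labels)
        totals[key] += value
    return dict(totals)


# Comparable to: sum(kubevirt_migrate_vmi_succeeded)
cluster_total = sum_by(samples)

# Comparable to: sum by (namespace) (kubevirt_migrate_vmi_succeeded)
per_namespace = sum_by(samples, "namespace")

print(cluster_total)
print(per_namespace)
```

Grouping by no labels collapses every series into one cluster-wide total, while grouping by namespace keeps one total per namespace. Under these assumptions, either view can be derived at query time without a dedicated recording rule.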

Comment 5 João Vilaça 2022-11-29 14:08:50 UTC
Waiting for https://github.com/kubevirt/kubevirt/pull/8875 to be merged so the changes can be cherry-picked into release-0.58.

Comment 13 errata-xmlrpc 2023-09-05 16:29:19 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Moderate: OpenShift Virtualization 4.12.6 Images), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2023:4982