Description of problem:
As mentioned in https://github.com/kubevirt/kubevirt/blob/main/docs/metrics.md, the migration metrics kubevirt_migrate_vmi_scheduling_count, kubevirt_migrate_vmi_running_count, kubevirt_migrate_vmi_succeeded_total and kubevirt_migrate_vmi_failed_total should represent the total count, that is, the sum of the values from all VMIs. Instead, they report separate values per VMI.

Version-Release number of selected component (if applicable):

How reproducible:

Steps to Reproduce:
1.
2.
3.

Actual results:

Expected results:

Additional info:
We agreed on this for visibility purposes and because of some technical details, but I agree it can be a bit confusing. My suggestion would be:
1) Rename kubevirt_migrate_vmi_succeeded_total and kubevirt_migrate_vmi_failed_total to drop the _total suffix, giving kubevirt_migrate_vmi_succeeded and kubevirt_migrate_vmi_failed, which would still keep the labels.
2) Create two recording rules, kubevirt_migrate_vmi_succeeded_total and kubevirt_migrate_vmi_failed_total, which would be the sums of kubevirt_migrate_vmi_succeeded and kubevirt_migrate_vmi_failed, respectively.
My question is: should the total metrics still have the namespace label or not? @sradco what do you think?
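To make the namespace-label question concrete, here is a minimal Python sketch of what a summing rule would compute in each case. It is only an illustration, not KubeVirt code: the sample values, the namespace and VMI names, the `vmi` label name, and the `sum_by` helper are all hypothetical.

```python
from collections import defaultdict

# Hypothetical per-VMI samples of kubevirt_migrate_vmi_succeeded:
# (labels, value). Names and values are made up for illustration.
samples = [
    ({"namespace": "ns-a", "vmi": "vm-1"}, 1),
    ({"namespace": "ns-a", "vmi": "vm-2"}, 2),
    ({"namespace": "ns-b", "vmi": "vm-3"}, 1),
]


def sum_by(samples, keep=()):
    """Sum sample values, grouping only by the label names in `keep`."""
    totals = defaultdict(int)
    for labels, value in samples:
        key = tuple(labels[name] for name in keep)
        totals[key] += value
    return dict(totals)


# Total without the namespace label: a single series for the whole cluster.
cluster_total = sum_by(samples)

# Total that keeps the namespace label: one series per namespace.
per_namespace = sum_by(samples, keep=("namespace",))

print(cluster_total)
print(per_namespace)
```

Dropping the namespace label yields one cluster-wide series, while keeping it yields one series per namespace that can still be summed again later.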
Changing component to metrics as this BZ/discussion appears to be in that domain. Please feel free to revert this if that's not the correct choice.
I agree with the name change to drop the _total suffix, since the metrics are granular and reported at the VM name level. I don't see a need to create the total metrics/recording rules at this point. The total can easily be obtained with a PromQL query, either across all namespaces or per namespace, depending on the need.
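As a rough sketch of the PromQL approach, the Python snippet below builds the two aggregation expressions and the matching URLs for the Prometheus HTTP API instant-query endpoint (`/api/v1/query`). The base URL is an assumption, the metric name is the proposed one without the _total suffix, and `instant_query_url` is a hypothetical helper.

```python
from urllib.parse import urlencode

# Assumption: a Prometheus server reachable at this address.
PROMETHEUS_URL = "http://localhost:9090"

# Proposed metric name after dropping the _total suffix.
METRIC = "kubevirt_migrate_vmi_succeeded"

# Total across all namespaces and VMIs.
total_expr = f"sum({METRIC})"

# Total per namespace (keeps only the namespace label).
per_ns_expr = f"sum by (namespace) ({METRIC})"


def instant_query_url(expr, base=PROMETHEUS_URL):
    """Build an instant-query URL for the Prometheus HTTP API."""
    return f"{base}/api/v1/query?{urlencode({'query': expr})}"


print(instant_query_url(total_expr))
print(instant_query_url(per_ns_expr))
```

The same expressions can be pasted directly into the Prometheus expression browser; the same pattern applies to kubevirt_migrate_vmi_failed.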
Waiting for https://github.com/kubevirt/kubevirt/pull/8875 to be merged so that the changes can be cherry-picked into release-0.58.
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory (Moderate: OpenShift Virtualization 4.12.6 Images), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHSA-2023:4982