Bug 2244623

Summary: [CDI] Metrics are missing default value
Product: Container Native Virtualization (CNV) Reporter: Aviv Litman <alitman>
Component: Metrics Assignee: Aviv Litman <alitman>
Status: CLOSED MIGRATED QA Contact: Ahmad <ahafe>
Severity: medium Docs Contact:
Priority: unspecified    
Version: 4.14.0 CC: dbasunag, kmajcher, sradco, stirabos
Target Milestone: ---   
Target Release: 4.14.2   
Hardware: All   
OS: All   
Whiteboard:
Fixed In Version: Doc Type: Bug Fix
Doc Text:
Cause: Some metrics have no default value. Consequence: Metrics that have no value are not available in the Prometheus UI. Fix: Add a default value to all metrics. Result: All metrics are available in the Prometheus UI.
Story Points: ---
Clone Of: Environment:
Last Closed: 2023-12-05 13:41:36 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:

Description Aviv Litman 2023-10-17 11:42:54 UTC
Description of problem:
Some CDI metrics are missing a default value. When such a metric has no value, it is not available in the Prometheus UI, so a user can miss that the metric exists.

The metrics missing default values are:
kubevirt_cdi_clone_pods_high_restart
kubevirt_cdi_dataimportcron_outdated
kubevirt_cdi_import_pods_high_restart
kubevirt_cdi_upload_pods_high_restart
*this list might be longer

Version-Release number of selected component (if applicable):
4.14, but the issue also exists in earlier versions.

How reproducible:
100%

Steps to Reproduce:
1. git clone git:kubevirt/containerized-data-importer.git
2. cd containerized-data-importer
3. docker login
4. export KUBEVIRT_DEPLOY_PROMETHEUS=true
5. make cluster-up
6. make cluster-sync

The following three steps might not be needed:
7. sudo sysctl -w net.ipv4.ip_forward=1
8. sudo sysctl -w net.ipv4.conf.all.route_localnet=1
9. sudo iptables -t nat -A PREROUTING -p tcp --dport 9090 -j DNAT --to-destination 127.0.0.1:9090

10. ./cluster-up/kubectl.sh port-forward service/prometheus-k8s -n monitoring 9090:9090
11. Open http://localhost:9090/ in a browser.
12. Observe that the metrics listed above have no value and are not shown in the metrics list.
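
The check in the last step can also be scripted instead of done by eye. The following is a minimal sketch, not part of the original report. It assumes the port-forward from step 10 is active on http://localhost:9090 and uses the Prometheus HTTP API endpoint /api/v1/label/__name__/values, which returns the metric names Prometheus has series for. The helper names `missing_metrics` and `fetch_metric_names` are hypothetical.

```python
import json
import urllib.request

# The four metric names listed in this report.
EXPECTED_METRICS = [
    "kubevirt_cdi_clone_pods_high_restart",
    "kubevirt_cdi_dataimportcron_outdated",
    "kubevirt_cdi_import_pods_high_restart",
    "kubevirt_cdi_upload_pods_high_restart",
]


def missing_metrics(expected, present):
    """Return the expected metric names that are absent from `present`."""
    known = set(present)
    return [name for name in expected if name not in known]


def fetch_metric_names(base_url="http://localhost:9090"):
    """Ask Prometheus for the metric names it has series for.

    Assumes the port-forward from step 10 is running on base_url.
    """
    url = base_url + "/api/v1/label/__name__/values"
    with urllib.request.urlopen(url, timeout=10) as resp:
        payload = json.load(resp)
    if payload.get("status") != "success":
        raise RuntimeError("unexpected Prometheus response: %r" % payload)
    return payload["data"]
```

Calling `missing_metrics(EXPECTED_METRICS, fetch_metric_names())` against the forwarded port should return the names that have no series yet. An empty list would indicate that every listed metric is exposed.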

Actual results:
Some CDI metrics are missing from the Prometheus UI if they have no value.

Expected results:
All CDI metrics are available and have values in the Prometheus UI.

Comment 1 Krzysztof Majcher 2023-10-17 12:45:35 UTC
Shirly, Debarati believes that this might be by design. Can you please take a look and advise?

Comment 2 Shirly Radco 2023-11-13 09:36:24 UTC
From what I understand this is indeed a bug and we should report zero as the default value, which is generally a good practice.

Reporting zero as a default accurately reflects the state of the system when it has not experienced any restarts.
This is a clear and unambiguous way to indicate that there have been no restarts up to that point.

Prometheus works best with continuous time series data. Having a consistent metric (like a restart count starting at zero and incrementing) makes it easier to write queries and create meaningful visualizations.
It helps in understanding trends over time and detecting anomalies.

When we have a consistent baseline (zero in this case), it becomes easier to set up alerts. For example, we might want to be alerted when the restart count exceeds a certain threshold. If the metric is always present, it's simpler to define these alerts.
If we don't report anything until a restart occurs, we might not be able to easily differentiate between a lack of data (due to issues like collection problems or the system being down) and a situation where there simply haven't been any restarts. Reporting zero eliminates this ambiguity.
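
The ambiguity described above can be illustrated with a toy model. This is only a sketch under stated assumptions: `GaugeModel` and `interpret` are hypothetical names, and the class is a simplified stand-in for a metric, not the Prometheus client library or the CDI implementation.

```python
class GaugeModel:
    """Toy stand-in for a metric: it exposes a sample only once it has a value."""

    def __init__(self, name, default=None):
        self.name = name
        self.value = default  # None models "never reported"

    def set(self, value):
        self.value = value

    def expose(self):
        """Return the scrape lines; a metric without a value yields nothing."""
        if self.value is None:
            return []
        return ["%s %s" % (self.name, self.value)]


def interpret(samples):
    """Describe what a reader of the scrape can conclude about restarts."""
    if not samples:
        # No sample at all: healthy and broken collection look the same.
        return "unknown: no restarts, or collection is broken"
    value = float(samples[0].split()[1])
    return "no restarts" if value == 0 else "restarts: %g" % value


without_default = GaugeModel("kubevirt_cdi_import_pods_high_restart")
with_default = GaugeModel("kubevirt_cdi_import_pods_high_restart", default=0)
```

In this model, `without_default.expose()` is empty, so a reader cannot tell "no restarts" apart from "no data", while `with_default.expose()` carries an explicit zero sample that an alert threshold can be compared against.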