Bug 2244623

Summary:	[CDI] Metrics are missing default value
Product:	Container Native Virtualization (CNV)	Reporter:	Aviv Litman <alitman>
Component:	Metrics	Assignee:	Aviv Litman <alitman>
Status:	CLOSED MIGRATED	QA Contact:	Ahmad <ahafe>
Severity:	medium	Docs Contact:
Priority:	unspecified
Version:	4.14.0	CC:	dbasunag, kmajcher, sradco, stirabos
Target Milestone:	---
Target Release:	4.14.2
Hardware:	All
OS:	All
Whiteboard:
Fixed In Version:		Doc Type:	Bug Fix
Doc Text:	Cause: Some metrics have no default value. Consequence: Some metrics are not available in the Prometheus UI if they have no value. Fix: Add a default value to all metrics. Result: All metrics are available in the Prometheus UI.	Story Points:	---
Clone Of:		Environment:
Last Closed:	2023-12-05 13:41:36 UTC	Type:	Bug
Regression:	---	Mount Type:	---
Documentation:	---	CRM:
Verified Versions:		Category:	---
oVirt Team:	---	RHEL 7.3 requirements from Atomic Host:
Cloudforms Team:	---	Target Upstream Version:
Embargoed:

Description Aviv Litman 2023-10-17 11:42:54 UTC

Description of problem:
Some CDI metrics are missing a default value. When such a metric has no value, it is not available in the Prometheus UI, so a user can miss that the metric exists.

The metrics missing default values are:
kubevirt_cdi_clone_pods_high_restart
kubevirt_cdi_dataimportcron_outdated
kubevirt_cdi_import_pods_high_restart
kubevirt_cdi_upload_pods_high_restart
* This list might be longer.

Version-Release number of selected component (if applicable):
4.14, but the issue also exists in earlier versions.

How reproducible:
100%

Steps to Reproduce:
1. git clone git@github.com:kubevirt/containerized-data-importer.git
2. cd containerized-data-importer
3. docker login
4. export KUBEVIRT_DEPLOY_PROMETHEUS=true
5. make cluster-up
6. make cluster-sync

The following steps might not be needed:
7. sudo sysctl -w net.ipv4.ip_forward=1
8. sudo sysctl -w net.ipv4.conf.all.route_localnet=1
9. sudo iptables -t nat -A PREROUTING -p tcp --dport 9090 -j DNAT --to-destination 127.0.0.1:9090

10. ./cluster-up/kubectl.sh port-forward service/prometheus-k8s -n monitoring 9090:9090
11. Open http://localhost:9090/ in a browser.
12. Observe that the metrics listed above have no value and are not shown in the metrics list.
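
As an alternative to checking the UI by hand in step 12, the check can be scripted. The following is a minimal sketch, assuming the port-forward from step 10 is active and that Prometheus serves its standard instant-query endpoint (/api/v1/query). The helper names (build_query_url, has_samples, find_missing) are hypothetical and are not part of CDI.

```python
import json
import urllib.parse
import urllib.request

# Metric names copied from the list in this report.
CDI_METRICS = [
    "kubevirt_cdi_clone_pods_high_restart",
    "kubevirt_cdi_dataimportcron_outdated",
    "kubevirt_cdi_import_pods_high_restart",
    "kubevirt_cdi_upload_pods_high_restart",
]


def build_query_url(base_url, metric):
    """Return the Prometheus instant-query URL for a single metric name."""
    query = urllib.parse.urlencode({"query": metric})
    return base_url.rstrip("/") + "/api/v1/query?" + query


def has_samples(payload):
    """Return True if an instant-query response contains at least one series."""
    result = payload.get("data", {}).get("result")
    return payload.get("status") == "success" and bool(result)


def find_missing(base_url, metrics, fetch=None):
    """Return the metrics for which Prometheus currently has no series.

    `fetch` is an injectable function (URL -> decoded JSON) so the logic
    can be exercised without a running cluster.
    """
    if fetch is None:
        def fetch(url):
            with urllib.request.urlopen(url, timeout=10) as resp:
                return json.load(resp)
    return [m for m in metrics if not has_samples(fetch(build_query_url(base_url, m)))]
```

Calling find_missing("http://localhost:9090", CDI_METRICS) against the cluster from the steps above should return the names of the metrics that have no series. After the fix, the expectation is an empty list.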

Actual results:
Some CDI metrics are missing from the Prometheus UI if they have no value.

Expected results:
All CDI metrics are available and have values in the Prometheus UI.

Comment 1 Krzysztof Majcher 2023-10-17 12:45:35 UTC

Shirly, Debarati believes that this might be by design. Can you please take a look and advise?

Comment 2 Shirly Radco 2023-11-13 09:36:24 UTC

From what I understand, this is indeed a bug, and we should report zero as the default value, which is generally good practice.

Reporting zero as a default accurately reflects the state of the system when it has not experienced any restarts.
This is a clear and unambiguous way to indicate that there have been no restarts up to that point.

Prometheus works best with continuous time series data. Having a consistent metric (like a restart count starting at zero and incrementing) makes it easier to write queries and create meaningful visualizations.
It helps in understanding trends over time and detecting anomalies.

When we have a consistent baseline (zero in this case), it becomes easier to set up alerts. For example, we might want to be alerted when the restart count exceeds a certain threshold. If the metric is always present, it's simpler to define these alerts.
If we don't report anything until a restart occurs, we might not be able to easily differentiate between a lack of data (due to issues like collection problems or the system being down) and a situation where there simply haven't been any restarts. Reporting zero eliminates this ambiguity.
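
The ambiguity described above can be illustrated with a small toy model. This is only a sketch: ToyGauge, expose, parse, and describe are hypothetical names invented for this example. It does not use the real Prometheus client library and is not how CDI implements its metrics.

```python
class ToyGauge:
    """Hypothetical stand-in for a metrics-library gauge (not the CDI code)."""

    def __init__(self, name, default=None):
        self.name = name
        self.value = default  # None means the gauge was never set

    def set(self, value):
        self.value = value


def expose(gauges):
    """Render gauges as 'name value' lines, skipping gauges that were never set."""
    return "\n".join(f"{g.name} {g.value}" for g in gauges if g.value is not None)


def parse(text):
    """Parse 'name value' lines back into a dict, ignoring blanks and comments."""
    samples = {}
    for line in text.splitlines():
        if line and not line.startswith("#"):
            name, value = line.rsplit(" ", 1)
            samples[name] = float(value)
    return samples


def describe(samples, name):
    """Classify what a consumer can conclude about one metric from a scrape."""
    if name not in samples:
        # Ambiguous: no restarts yet, or a collection problem?
        return "no data"
    return "no restarts" if samples[name] == 0 else "restarts reported"


NAME = "kubevirt_cdi_import_pods_high_restart"
without_default = expose([ToyGauge(NAME)])         # metric is absent
with_default = expose([ToyGauge(NAME, default=0)])  # metric is present as 0

print(describe(parse(without_default), NAME))
print(describe(parse(with_default), NAME))
```

Without a default, the consumer can only conclude "no data". With a zero default, the same scrape states unambiguously that there were no restarts.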