1868411 – Some of the metal3 metrics are missing

Summary: Some of the metal3 metrics are missing

Keywords:
Status:	CLOSED DUPLICATE of bug 1820204
Alias:	None
Product:	OpenShift Container Platform
Classification:	Red Hat
Component:	Bare Metal Hardware Provisioning
Sub Component:
Version:	4.4
Hardware:	Unspecified
OS:	Unspecified
Priority:	medium
Severity:	medium
Target Milestone:	---
Target Release:	---
Assignee:	Steven Hardy
QA Contact:	Amit Ugol
Docs Contact:
URL:
Whiteboard:
Depends On:
Blocks:

Reported:	2020-08-12 14:54 UTC by Daniel
Modified:	2020-09-22 16:34 UTC
CC List:	3 users
Fixed In Version:
Doc Type:	If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed:	2020-09-22 16:34:38 UTC
Target Upstream Version:
Embargoed:

Description Daniel 2020-08-12 14:54:45 UTC

Description of problem:
These metal3 metrics are missing:
* metal3_host_error_total
* metal3_host_config_data_error_total 
* metal3_operation_register_duration_seconds 
* metal3_operation_inspect_duration_seconds 
* metal3_operation_provision_duration_seconds 
* metal3_provisioning_state_change_total


Version-Release number of selected component (if applicable):
4.4.0-0.nightly-2020-08-11-175135

How reproducible:
100%

Steps to Reproduce:
1. On the provisioning host, find the metal3 pod in the openshift-machine-api project.
2. Get the log for the metal3-baremetal-operator container in that pod.
3. Look for the line {"level":"info","ts":1585686677.369114,"logger":"cmd","msg":"gather metrics at http://127.0.0.1:8085/metrics"}
4. Run curl http://127.0.0.1:8085/metrics to get the metrics.
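
The check in the last step can be scripted instead of read by eye. The following is a minimal sketch, assuming the endpoint returns the Prometheus text exposition format and that the curl output has been saved to a string. The function names are hypothetical; the metric names come from the description above.

```python
import re

# Metric names taken from the bug description.
EXPECTED_METRICS = [
    "metal3_host_error_total",
    "metal3_host_config_data_error_total",
    "metal3_operation_register_duration_seconds",
    "metal3_operation_inspect_duration_seconds",
    "metal3_operation_provision_duration_seconds",
    "metal3_provisioning_state_change_total",
]

# A sample line starts with the metric name, optionally followed by labels.
_SAMPLE_RE = re.compile(r"^([a-zA-Z_:][a-zA-Z0-9_:]*)")
# Suffixes carried by histogram and summary series.
_SUFFIXES = ("_bucket", "_sum", "_count")


def exposed_metric_names(metrics_text):
    """Return the set of metric names that have at least one sample line."""
    names = set()
    for line in metrics_text.splitlines():
        line = line.strip()
        # Skip blank lines and "# HELP" / "# TYPE" comment lines.
        if not line or line.startswith("#"):
            continue
        match = _SAMPLE_RE.match(line)
        if not match:
            continue
        name = match.group(1)
        names.add(name)
        # Also record the base name of histogram/summary series.
        for suffix in _SUFFIXES:
            if name.endswith(suffix):
                names.add(name[: -len(suffix)])
    return names


def missing_metrics(metrics_text, expected=EXPECTED_METRICS):
    """Return the expected metric names that have no sample in the text."""
    present = exposed_metric_names(metrics_text)
    return [name for name in expected if name not in present]
```

Passing the saved curl output to `missing_metrics` returns the subset of the six names with no sample line, which is the list this report is about.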

Actual results:
The six metrics listed above are missing from the output.

Expected results:
All the metal3 metrics should be present.

Additional info:

Comment 1 Daniel 2020-08-12 15:41:29 UTC

Note: This was working previously on 4.4.

Comment 2 Daniel 2020-08-12 15:41:47 UTC

Note: This was working previously on 4.4.

Comment 3 Doug Hellmann 2020-08-17 14:13:14 UTC

The URL reported in the logs is relative to the pod containing the operator. In this case the pod uses host networking, so you could log in to the host where the pod is running. The URL will not work from the provisioning host.
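
To illustrate why the logged URL only resolves on the node itself, here is a small sketch that pulls the URL out of the operator log line and checks whether it points at a loopback address. The helper names are hypothetical, and the log format is assumed to match the line quoted in the steps to reproduce.

```python
import ipaddress
import json
import re
from urllib.parse import urlsplit

_URL_RE = re.compile(r"https?://\S+")


def metrics_url_from_log_line(log_line):
    """Return the URL found in the "msg" field of a JSON log line, or None."""
    try:
        record = json.loads(log_line)
    except json.JSONDecodeError:
        return None
    if not isinstance(record, dict):
        return None
    match = _URL_RE.search(str(record.get("msg", "")))
    return match.group(0) if match else None


def is_loopback_url(url):
    """Return True if the URL's host is localhost or a loopback IP address."""
    host = urlsplit(url).hostname
    if host is None:
        return False
    if host == "localhost":
        return True
    try:
        return ipaddress.ip_address(host).is_loopback
    except ValueError:
        # Some other DNS name; this sketch does not resolve it.
        return False
```

A loopback result means the endpoint is only reachable from the network namespace that serves it, which here is the host running the pod.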

Another way to verify the metrics is to look in the prometheus console for the cluster. Doing that should reproduce the bug reported as https://bugzilla.redhat.com/show_bug.cgi?id=1820083
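
The console check is interactive. As an alternative that is not part of this report, the same lookup could be scripted against the Prometheus HTTP API instant-query endpoint (`/api/v1/query`). The sketch below only builds the request URL and interprets a response body; the base URL is a placeholder, and how to reach and authenticate against the cluster's Prometheus is left out as environment-specific.

```python
import json
from urllib.parse import urlencode


def instant_query_url(base_url, metric_name):
    """Build an instant-query URL for one metric name.

    base_url is a placeholder for wherever the cluster's Prometheus
    is reachable; it is not taken from this bug report.
    """
    return base_url.rstrip("/") + "/api/v1/query?" + urlencode({"query": metric_name})


def series_count(response_body):
    """Return how many series an instant-query response contains."""
    payload = json.loads(response_body)
    if payload.get("status") != "success":
        raise ValueError("query failed: %s" % payload.get("error", "unknown error"))
    return len(payload["data"]["result"])
```

A count of zero for a given metric name means Prometheus currently has no series for it.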

Comment 4 Steven Hardy 2020-08-18 16:05:10 UTC

I believe this is a duplicate of https://bugzilla.redhat.com/show_bug.cgi?id=1820083 - closing as such

*** This bug has been marked as a duplicate of bug 1820083 ***

Comment 5 Daniel 2020-08-19 07:50:36 UTC

(In reply to Steven Hardy from comment #4)
> I believe this is a duplicate of
> https://bugzilla.redhat.com/show_bug.cgi?id=1820083 - closing as such
> 
> *** This bug has been marked as a duplicate of bug 1820083 ***

Hi Steve, this is not a duplicate. The other bug is about data points not showing in the UI, while this one is about several metrics that should be present but are missing.

Comment 6 Zane Bitter 2020-09-22 16:25:19 UTC

Is it possible this is the same as bug 1820204?

Comment 7 Doug Hellmann 2020-09-22 16:34:38 UTC

(In reply to Zane Bitter from comment #6)
> Is it possible this is the same as bug 1820204?

Yes, it looks the same to me.

*** This bug has been marked as a duplicate of bug 1820204 ***
