Bug 1820083
Summary: | No datapoints found for metal3 metrics in UI | |
---|---|---|---
Product: | OpenShift Container Platform | Reporter: | Sasha Smolyak <ssmolyak>
Component: | Bare Metal Hardware Provisioning | Assignee: | Tomas Sedovic <tsedovic>
Bare Metal Hardware Provisioning sub component: | cluster-baremetal-operator | QA Contact: | Gaoyun Pei <gpei>
Status: | CLOSED CANTFIX | Docs Contact: |
Severity: | high | |
Priority: | high | CC: | achernet, aos-bugs, augol, beth.white, dhellmann, fsimonce, mifiedle, nstielau, ohochman, sdasu, vlaad, yjoseph, zbitter
Version: | 4.4 | Keywords: | Triaged
Target Milestone: | --- | |
Target Release: | 4.9.0 | |
Hardware: | Unspecified | |
OS: | Unspecified | |
Last Closed: | 2023-02-16 14:42:03 UTC | Type: | Bug
Description
Sasha Smolyak
2020-04-02 08:45:48 UTC
Can you please confirm that the URL you're getting the metrics from is Prometheus? It looks like the metrics are exposed but are not scraped by Prometheus, and therefore they do not appear in the UI. This is very likely not a UI bug but rather a missing Prometheus configuration.

I can confirm that from the CLI I'm getting those metrics: they are collected, just not found by the UI. So maybe it is the missing configuration; I'm not arguing about it. I do see them when working from the CLI, though.

This seems to be caused by a missing BMO ServiceMonitor in machine-api-operator.

*** Bug 1868411 has been marked as a duplicate of this bug. ***

Cluster version 4.7.0-0.nightly-2020-11-29-133728

Got null results for all of the metal3 metrics in the OpenShift cluster UI (Metrics tab), in the Prometheus UI (https://prometheus-k8s-openshift-monitoring.apps.ocp-edge-cluster-0.qe.lab.redhat.com/graph), and through the Prometheus API:

[kni@provisionhost-0-0 ~]$ oc -n openshift-monitoring exec -c prometheus prometheus-k8s-0 -- curl -s "http://localhost:9090/api/v1/query?query=%20avg%20by%20(instance)%20(irate(metal3_operation_power_change_total%7Binstance%3D%22master-0-0%22%2Cmode%3D%22idle%22%7D%5B5m%5D))" | jq -r .data.result[0].value[1]
null
[kni@provisionhost-0-0 ~]$ oc -n openshift-monitoring exec -c prometheus prometheus-k8s-0 -- curl -s "http://localhost:9090/api/v1/query?query=%20avg%20by%20(instance)%20(irate(metal3_operation_register_duration_seconds%7Binstance%3D%22master-0-0%22%2Cmode%3D%22idle%22%7D%5B5m%5D))" | jq -r .data.result[0].value[1]
null
[kni@provisionhost-0-0 ~]$ oc -n openshift-monitoring exec -c prometheus prometheus-k8s-0 -- curl -s "http://localhost:9090/api/v1/query?query=%20avg%20by%20(instance)%20(irate(metal3_host_registration_required_total%7Binstance%3D%22master-0-0%22%2Cmode%3D%22idle%22%7D%5B5m%5D))" | jq -r .data.result[0].value[1]
null
[kni@provisionhost-0-0 ~]$ oc -n openshift-monitoring exec -c prometheus prometheus-k8s-0 -- curl -s "http://localhost:9090/api/v1/query?query=%20avg%20by%20(instance)%20(irate(metal3_host_registration_required_total%7Binstance%3D%22master-0-1%22%2Cmode%3D%22idle%22%7D%5B5m%5D))" | jq -r .data.result[0].value[1]
null
[kni@provisionhost-0-0 ~]$ oc -n openshift-monitoring exec -c prometheus prometheus-k8s-0 -- curl -s "http://localhost:9090/api/v1/query?query=%20avg%20by%20(instance)%20(irate(metal3_host_registration_required_total%7Binstance%3D%22master-0-2%22%2Cmode%3D%22idle%22%7D%5B5m%5D))" | jq -r .data.result[0].value[1]

This was fixed in MAO during the 4.7 cycle, but the code was removed again prior to the 4.7 release when we switched to the cluster-baremetal-operator. The fix has never been implemented in the CBO, although there is an open (but outdated) PR for it - https://github.com/openshift/cluster-baremetal-operator/pull/99.

This bug was opened against a very old version, and the patch has been abandoned since then. I'm closing this bug. If you still experience the issue, please open a bug (in Jira) against a new version with updated information.

The needinfo request[s] on this closed bug have been removed as they have been unresolved for 120 days.