Bug 2077291
Summary: | Prometheus doesn't display acm_managed_cluster_info after upgrade from 2.4 to 2.5 | ||
---|---|---|---|
Product: | Red Hat Advanced Cluster Management for Kubernetes | Reporter: | dhuynh |
Component: | Cluster Lifecycle | Assignee: | Jian Qiu <jqiu> |
Status: | CLOSED ERRATA | QA Contact: | Hui Chen <huichen> |
Severity: | urgent | Docs Contact: | Christopher Dawson <cdawson> |
Priority: | urgent | ||
Version: | rhacm-2.5 | CC: | dhuynh, jagray, yuhe |
Target Milestone: | --- | Keywords: | Regression, TestBlocker |
Target Release: | rhacm-2.5 | Flags: | bot-tracker-sync: rhacm-2.5+ |
Hardware: | Unspecified | ||
OS: | Unspecified | ||
Last Closed: | 2022-06-09 02:10:54 UTC | Type: | Bug |
Description
dhuynh
2022-04-20 22:08:17 UTC
G2Bsync 1104649627 comment haoqing0110 Thu, 21 Apr 2022 02:45:11 UTC G2Bsync

In the env, I can curl the metrics successfully inside the clusterlifecycle-state-metrics pod.

```
✗ oc get pods -A | grep clusterlifecycle-state-metrics
multicluster-engine clusterlifecycle-state-metrics-v2-778b7bd6dd-bzjfm 1/1 Running 0 10h
✗ oc exec -n multicluster-engine clusterlifecycle-state-metrics-v2-778b7bd6dd-bzjfm -it sh
kubectl exec [POD] [COMMAND] is DEPRECATED and will be removed in a future version. Use kubectl exec [POD] -- [COMMAND] instead.
sh-4.4$ curl http://localhost:8080/metrics
# HELP acm_managed_cluster_info Managed cluster information
# TYPE acm_managed_cluster_info gauge
acm_managed_cluster_info{hub_cluster_id="41407702-8f44-47b8-a1b8-1088d1a85023",managed_cluster_id="clc-nap-aks-a243-import-before-upgrade-b",vendor="AKS",cloud="Azure",version="v1.21.9",available="Unknown",created_via="Other",core_worker="0",socket_worker="0"} 1
acm_managed_cluster_info{hub_cluster_id="41407702-8f44-47b8-a1b8-1088d1a85023",managed_cluster_id="5e9196b6-d2ac-4602-958a-4afb183e4941",vendor="OpenShift",cloud="Amazon",version="4.9.25",available="True",created_via="Hive",core_worker="12",socket_worker="3"} 1
acm_managed_cluster_info{hub_cluster_id="41407702-8f44-47b8-a1b8-1088d1a85023",managed_cluster_id="clc-nap-gke-a243-import-before-upgrade",vendor="GKE",cloud="Google",version="v1.21.9-gke.1002",available="True",created_via="Other",core_worker="0",socket_worker="0"} 1
acm_managed_cluster_info{hub_cluster_id="41407702-8f44-47b8-a1b8-1088d1a85023",managed_cluster_id="d27fb041-5a9c-41ff-bdde-dc0175a521c8",vendor="OpenShift",cloud="Amazon",version="4.10.6",available="True",created_via="Other",core_worker="16",socket_worker="4"} 1
acm_managed_cluster_info{hub_cluster_id="41407702-8f44-47b8-a1b8-1088d1a85023",managed_cluster_id="59b619ec-617b-4cc8-8ac3-00ee11082ea5",vendor="OpenShift",cloud="IBM",version="4.9.25",available="True",created_via="Other",core_worker="12",socket_worker="3"} 1
acm_managed_cluster_info{hub_cluster_id="41407702-8f44-47b8-a1b8-1088d1a85023",managed_cluster_id="dc8626c5-127d-45c7-882d-7c635cd1fee1",vendor="OpenShift",cloud="RHV",version="4.10.3",available="True",created_via="Other",core_worker="12",socket_worker="3"} 1
acm_managed_cluster_info{hub_cluster_id="41407702-8f44-47b8-a1b8-1088d1a85023",managed_cluster_id="clc-nap-eks-a243-import-before-upgrade",vendor="EKS",cloud="Amazon",version="v1.21.5-eks-bc4871b",available="True",created_via="Other",core_worker="0",socket_worker="0"} 1
acm_managed_cluster_info{hub_cluster_id="41407702-8f44-47b8-a1b8-1088d1a85023",managed_cluster_id="9619487d-8d54-43e3-9e5c-fa2bf35a63d0",vendor="OpenShift",cloud="Google",version="4.10.3",available="True",created_via="Hive",core_worker="12",socket_worker="3"} 1
acm_managed_cluster_info{hub_cluster_id="41407702-8f44-47b8-a1b8-1088d1a85023",managed_cluster_id="f0c5915d-4c31-4cc1-b999-c56ba6a1578c",vendor="OpenShift",cloud="Azure",version="4.10.3",available="True",created_via="Hive",core_worker="6",socket_worker="3"} 1
acm_managed_cluster_info{hub_cluster_id="41407702-8f44-47b8-a1b8-1088d1a85023",managed_cluster_id="clc-nap-ocp311-a243-import-before-upgrade",vendor="OpenShift",cloud="Amazon",version="3",available="True",created_via="Other",core_worker="0",socket_worker="0"} 1
acm_managed_cluster_info{hub_cluster_id="41407702-8f44-47b8-a1b8-1088d1a85023",managed_cluster_id="342598e6-c98f-482a-8b85-5b526a11218d",vendor="OpenShift",cloud="Amazon",version="4.7.18",available="True",created_via="Other",core_worker="24",socket_worker="6"} 1
acm_managed_cluster_info{hub_cluster_id="41407702-8f44-47b8-a1b8-1088d1a85023",managed_cluster_id="41407702-8f44-47b8-a1b8-1088d1a85023",vendor="OpenShift",cloud="Amazon",version="4.10.9",available="True",created_via="Other",core_worker="24",socket_worker="3"} 1
acm_managed_cluster_info{hub_cluster_id="41407702-8f44-47b8-a1b8-1088d1a85023",managed_cluster_id="22054b74-8fc2-42ca-a3e9-654a9d6121c8",vendor="OpenShift",cloud="Azure",version="4.10.3",available="True",created_via="Hive",core_worker="6",socket_worker="3"} 1
acm_managed_cluster_info{hub_cluster_id="41407702-8f44-47b8-a1b8-1088d1a85023",managed_cluster_id="dc25a4ad-9e68-4331-8a28-a154377c60da",vendor="OpenShift",cloud="vSphere",version="4.10.3",available="True",created_via="Hive",core_worker="12",socket_worker="6"} 1
```

However, the metric is not shown in Prometheus. The data upload depends on this ServiceMonitor:
https://github.com/stolostron/backplane-operator/blob/main/pkg/templates/charts/toggle/cluster-lifecycle/templates/metrics-servicemonitor.yaml

I checked for the ServiceMonitor resource and don't see it in the upgraded env. The first grep below returns nothing:

```
✗ oc get servicemonitors.monitoring.coreos.com -A | grep clusterlifecycle
✗ oc get servicemonitors.monitoring.coreos.com -A | grep openshift-monitoring
openshift-monitoring acm-insights 6d10h
openshift-monitoring alertmanager-main 7d3h
openshift-monitoring cluster-monitoring-operator 7d4h
openshift-monitoring etcd 7d4h
openshift-monitoring grafana 7d4h
openshift-monitoring kube-state-metrics 7d4h
openshift-monitoring kubelet 7d4h
openshift-monitoring node-exporter 7d4h
openshift-monitoring observability-observatorium-api 6d9h
openshift-monitoring observability-thanos-compact 6d9h
openshift-monitoring observability-thanos-query 6d9h
openshift-monitoring observability-thanos-query-frontend 6d9h
openshift-monitoring observability-thanos-query-frontend-memcached 6d9h
openshift-monitoring observability-thanos-receive 6d9h
openshift-monitoring observability-thanos-receive-controller 6d9h
openshift-monitoring observability-thanos-rule 6d9h
openshift-monitoring observability-thanos-store-memcached 6d9h
openshift-monitoring observability-thanos-store-shard 6d9h
openshift-monitoring ocm-grc-688c4-policy-propagator-metrics 6d10h
openshift-monitoring openshift-state-metrics 7d4h
openshift-monitoring prometheus-adapter 7d4h
openshift-monitoring prometheus-k8s 7d4h
openshift-monitoring prometheus-operator 7d4h
openshift-monitoring telemeter-client 7d4h
openshift-monitoring thanos-querier 7d4h
openshift-monitoring thanos-sidecar 7d4h
```

No longer seeing this after upgrade from 2.4.3-DOWNSTREAM-2022-04-13-07-05-00 → 2.5.0-DOWNSTREAM-2022-05-02-16-00-32

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory (Important: Red Hat Advanced Cluster Management 2.5 security updates, images, and bug fixes), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2022:4956
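For anyone re-verifying this after an upgrade, the two manual checks from the comment above (the pod serves `acm_managed_cluster_info`, and a `clusterlifecycle` ServiceMonitor exists) could be scripted roughly as follows. This is only a sketch: `count_metric_series` and `has_servicemonitor` are hypothetical helper names, not part of `oc` or ACM, and both only read text on stdin, so they assume the output formats shown in this report.

```shell
#!/usr/bin/env bash

# Hypothetical helper (not part of oc or ACM).
# Reads Prometheus text-format metrics on stdin and prints how many
# sample lines belong to the metric named in $1.
count_metric_series() {
  local metric="$1"
  # Sample lines start with the metric name followed by '{' or a space.
  # '# HELP' and '# TYPE' lines start with '#', so they are not counted.
  # grep -c exits non-zero when the count is 0; '|| true' hides that.
  grep -c -E "^${metric}(\{| )" || true
}

# Hypothetical helper (not part of oc or ACM).
# Reads "NAMESPACE NAME AGE" rows on stdin, as printed by
# 'oc get servicemonitors.monitoring.coreos.com -A', and succeeds only
# if some NAME column contains the literal substring given in $1.
has_servicemonitor() {
  local pattern="$1"
  awk -v p="$pattern" 'index($2, p) > 0 { found = 1 } END { exit found ? 0 : 1 }'
}
```

Against a live hub, usage might look like `oc exec -n multicluster-engine <pod> -- curl -s http://localhost:8080/metrics | count_metric_series acm_managed_cluster_info` followed by `oc get servicemonitors.monitoring.coreos.com -A | has_servicemonitor clusterlifecycle`, where `<pod>` is the clusterlifecycle-state-metrics pod name. A non-zero series count combined with a failing ServiceMonitor check would match the symptom reported here.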