Description of the problem:
While running automation, QE found that after a cluster hibernate/resume cycle, the cluster metrics queried from Prometheus sometimes report `Other` as the vendor, even though the `vendor` label on the ManagedCluster CR is `OpenShift`. The automation log looks like this:

```
Check the cluster metrics
cy:request ✔ GET https://api.acmqe-autotest-azure.az.dev06.red-chesterfield.com:6443/apis/cluster.open-cluster-management.io/v1/managedclusters/acmqe-clc-auto-aws307
Status: 200
Response body: {
  "apiVersion": "cluster.open-cluster-management.io/v1",
  "kind": "ManagedCluster",
  "metadata": {
    "annotations": { "open-cluster-management/created-via": "hive" },
    "creationTimestamp": "2022-02-27T09:59:19Z",
    "finalizers": [
      "managedcluster-import-controller.open-cluster-management.io/cleanup",
      "managedclusterinfo.finalizers.open-cluster-management.io",
      "agent.open-cluster-management.io/klusterletaddonconfig-cleanup",
      "open-cluster-management.io/managedclusterrole",
      "managedcluster-import-controller.open-cluster-management.io/manifestwork-cleanup",
      "cluster.open-cluster-management.io/api-resource-cleanup"
    ],
    "generation": 2,
    "labels": {
      "clc-qe": "automation",
      "cloud": "Amazon",
      "clusterID": "b6e2463e-f58a-4c78-bbfd-6fec6a17bab3",
      "feature.open-cluster-management.io/addon-application-manager": "available",
      "feature.open-cluster-management.io/addon-cert-policy-controller": "available",
      "feature.open-cluster-management.io/addon-iam-policy-controller": "available",
      "feature.open-cluster-management.io/addon-policy-controller": "available",
      "feature.open-cluster-management.io/addon-search-collector": "available",
      "feature.open-cluster-management.io/addon-work-manager": "available",
      "name": "acmqe-clc-auto-aws307",
      "openshiftVersion": "4.9.13",
      "owner": "acmqe-clc-auto",
      "region": "us-east-2",
      "vendor": "OpenShift"
    },
    "managedFields": [
      {
        "apiVersion": "cluster.open-cluster-management.io/v1",
        "fieldsType": "FieldsV1",
        "fieldsV1": { "f:metadata": { "f:finalizers": { "v:\"agent.open-cluster-management.io/klusterletaddonconfig-cleanup\"": {} } } },
        "manager": "klusterlet-addon-controller",
        "operation": "Update",
        "time": "2022-02-27T09:59:19Z"
      },
      {
        "apiVersion": "cluster.open-cluster-management.io/v1",
        "fieldsType": "FieldsV1",
        "fieldsV1": { "f:metadata": { "f:labels": { ".": {}, "f:clc-qe": {}, "f:cloud": {}, "f:owner": {}, "f:region": {}, "f:vendor": {} } }, "f:spec": { ".": {}, "f:hubAcceptsClient": {} } },
        "manager": "unknown",
        "operation": "Update",
        "time": "2022-02-27T14:40:36Z"
      },
      {
        "apiVersion": "cluster.open-cluster-management.io/v1",
        "fieldsType": "FieldsV1",
        "fieldsV1": { "f:metadata": { "f:annotations": { ".": {}, "f:open-cluster-management/created-via": {} }, "f:finalizers": { "v:\"managedcluster-import-controller.open-cluster-management.io/cleanup\"": {}, "v:\"managedcluster-import-controller.open-cluster-management.io/manifestwork-cleanup\"": {} }, "f:labels": { "f:name": {} } } },
        "manager": "managedcluster-import-controller",
        "operation": "Update",
        "time": "2022-02-27T14:42:12Z"
      },
      {
        "apiVersion": "cluster.open-cluster-management.io/v1",
        "fieldsType": "FieldsV1",
        "fieldsV1": { "f:status": { "f:capacity": { "f:core_worker": {}, "f:socket_worker": {} } } },
        "manager": "controller",
        "operation": "Update",
        "time": "2022-02-27T15:18:56Z"
      },
      {
        "apiVersion": "cluster.open-cluster-management.io/v1",
        "fieldsType": "FieldsV1",
        "fieldsV1": { "f:metadata": { "f:labels": { "f:feature.open-cluster-management.io/addon-application-manager": {}, "f:feature.open-cluster-management.io/addon-cert-policy-controller": {}, "f:feature.open-cluster-management.io/addon-iam-policy-controller": {}, "f:feature.open-cluster-management.io/addon-policy-controller": {}, "f:feature.open-cluster-management.io/addon-search-collector": {}, "f:feature.open-cluster-management.io/addon-work-manager": {} } }, "f:status": { "f:allocatable": { "f:memory": {} }, "f:capacity": { "f:memory": {} }, "f:clusterClaims": {} } },
        "manager": "registration",
        "operation": "Update",
        "time": "2022-02-27T15:19:12Z"
      },
      {
        "apiVersion": "cluster.open-cluster-management.io/v1",
        " ...
cy:request ✔ GET https://prometheus-k8s-openshift-monitoring.apps.acmqe-autotest-azure.az.dev06.red-chesterfield.com/api/v1/query?query=acm_managed_cluster_info{managed_cluster_id='b6e2463e-f58a-4c78-bbfd-6fec6a17bab3'}
Status: 200
Response body: {
  "status": "success",
  "data": {
    "resultType": "vector",
    "result": [
      {
        "metric": {
          "__name__": "acm_managed_cluster_info",
          "available": "True",
          "cloud": "Amazon",
          "core_worker": "12",
          "created_via": "Hive",
          "endpoint": "https",
          "hub_cluster_id": "ca62ccce-76bb-4d6a-bd5d-afc9ef3bf4e8",
          "instance": "10.129.2.40:8443",
          "job": "clusterlifecycle-state-metrics-v2",
          "managed_cluster_id": "b6e2463e-f58a-4c78-bbfd-6fec6a17bab3",
          "namespace": "ocm",
          "pod": "clusterlifecycle-state-metrics-v2-58c799f994-q2t6z",
          "service": "clusterlifecycle-state-metrics-v2",
          "socket_worker": "3",
          "vendor": "Other",
          "version": "v1.22.3+e790d7f"
        },
        "value": [ 1645975262.465, "1" ]
      }
    ]
  }
}
cy:command ✔ wrap {status: success, data: {resulttype: vector, result: [{metric: Object{16}, value: [1645975262.465, 1]}]}}
cy:command ✔ assert expected **success** to equal **success**
cy:command ✔ assert expected **Amazon** to equal **Amazon**
cy:command ✔ assert expected **hive** to equal **hive**
cy:command ✔ assert expected **b6e2463e-f58a-4c78-bbfd-6fec6a17bab3** to equal **b6e2463e-f58a-4c78-bbfd-6fec6a17bab3**
cy:command ✘ assert expected **Other** to equal **OpenShift**
```

Release version: 2.4.2
Operator snapshot version:
OCP version: 4.9.13
Browser Info:
Steps to reproduce:
1. Hibernate and then resume the cluster.
2. Query the cluster metrics through the Prometheus API, for example https://prometheus-k8s-openshift-monitoring.apps.acmqe-autotest-azure.az.dev06.red-chesterfield.com/api/v1/query?query=acm_managed_cluster_info{managed_cluster_id='<your cluster ID>'}
Actual results: See the log above; the metric reports `vendor` as `Other`.
Expected results: The `vendor` value in the metric should match the value in the ManagedCluster.
Additional info:
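The check that fails in the reproduction steps can be sketched outside Cypress as follows. This is a minimal illustration, not the QE automation itself: the helper names `build_query_url` and `vendor_mismatches` are hypothetical, and the sample data is trimmed from the responses in the log. It only compares already-fetched JSON, so authentication and the HTTP calls are left out.

```python
from urllib.parse import urlencode


def build_query_url(prometheus_base, cluster_id):
    """Build the instant-query URL for one managed cluster (hypothetical helper)."""
    query = f"acm_managed_cluster_info{{managed_cluster_id='{cluster_id}'}}"
    return f"{prometheus_base}/api/v1/query?" + urlencode({"query": query})


def vendor_mismatches(managed_cluster, prom_response):
    """Return (expected, reported) pairs where the metric disagrees with the label."""
    expected = managed_cluster["metadata"]["labels"].get("vendor")
    mismatches = []
    for sample in prom_response["data"]["result"]:
        reported = sample["metric"].get("vendor")
        if reported != expected:
            mismatches.append((expected, reported))
    return mismatches


# Sample data trimmed from the log in this report.
managed_cluster = {
    "metadata": {
        "labels": {
            "clusterID": "b6e2463e-f58a-4c78-bbfd-6fec6a17bab3",
            "vendor": "OpenShift",
        }
    }
}
prom_response = {
    "status": "success",
    "data": {
        "resultType": "vector",
        "result": [
            {
                "metric": {
                    "managed_cluster_id": "b6e2463e-f58a-4c78-bbfd-6fec6a17bab3",
                    "vendor": "Other",
                },
                "value": [1645975262.465, "1"],
            }
        ],
    },
}

print(vendor_mismatches(managed_cluster, prom_response))
```

An empty list means the metric agrees with the ManagedCluster label; with the data from this report the list contains one mismatch.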
G2Bsync 1056784268 comment haoqing0110 Wed, 02 Mar 2022 10:49:37 UTC G2Bsync In 2.4, the value of the metric's `vendor` label comes from the managedclusterinfos status, [`mci.Status.KubeVendor`](https://github.com/stolostron/clusterlifecycle-state-metrics/blob/release-2.4/pkg/collectors/managedclusterinfo.go#L152). In the latest code, it comes from the managedcluster label [`vendor`](https://github.com/stolostron/clusterlifecycle-state-metrics/blob/main/pkg/collectors/managedclusterinfo.go#L64). The managedclusterinfos status needs to be checked further.
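To make the difference between the two sources concrete, here is a rough sketch in Python rather than the collector's Go. The function names are hypothetical, the JSON key `kubeVendor` is assumed to be the serialized form of `mci.Status.KubeVendor`, and the sample values are illustrative only: the report does not confirm what the managedclusterinfos status actually held after resume.

```python
def vendor_from_info_status(managed_cluster_info):
    """release-2.4 style: read the vendor from the ManagedClusterInfo status.

    Assumption: the status field is serialized under the key "kubeVendor".
    """
    return managed_cluster_info.get("status", {}).get("kubeVendor", "")


def vendor_from_cluster_label(managed_cluster):
    """Latest-code style: read the vendor from the ManagedCluster label."""
    return managed_cluster.get("metadata", {}).get("labels", {}).get("vendor", "")


# Illustrative objects: the label value is from the report, the status value
# is a hypothetical state that would explain the observed metric.
managed_cluster = {"metadata": {"labels": {"vendor": "OpenShift"}}}
managed_cluster_info = {"status": {"kubeVendor": "Other"}}

print(vendor_from_info_status(managed_cluster_info))
print(vendor_from_cluster_label(managed_cluster))
```

If the two objects can disagree after a resume, a collector reading the status would report a different vendor than one reading the label, which is consistent with the symptom described above.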
G2Bsync 1061420819 comment zhiweiyin318 Tue, 08 Mar 2022 05:28:16 UTC G2Bsync I think this issue could be fixed in the PR https://github.com/stolostron/multicloud-operators-foundation/pull/451
Verified on v2.4.3RC2. Cluster claim shows correct vendor both before and after hibernate/resume.
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory (Moderate: Red Hat Advanced Cluster Management 2.4.3 security updates and bug fixes), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHSA-2022:1476