Bug 2000294 - report apiversion of esxi host and vcenter server
Summary: report apiversion of esxi host and vcenter server
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: Storage
Version: 4.9
Hardware: Unspecified
OS: Unspecified
Priority: medium
Severity: medium
Target Milestone: ---
Target Release: 4.9.0
Assignee: Hemant Kumar
QA Contact: Wei Duan
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2021-09-01 19:14 UTC by Hemant Kumar
Modified: 2021-10-18 17:51 UTC

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2021-10-18 17:50:51 UTC
Target Upstream Version:
Embargoed:


Links
System                  ID                                          Private  Priority  Status  Summary  Last Updated
Github                  openshift vsphere-problem-detector pull 48  0        None      None    None     2021-09-01 19:16:33 UTC
Red Hat Product Errata  RHSA-2021:3759                              0        None      None    None     2021-10-18 17:51:03 UTC

Description Hemant Kumar 2021-09-01 19:14:23 UTC
Report the API version of the ESXi host and vCenter server.

Comment 1 Hemant Kumar 2021-09-01 19:15:45 UTC
Reporting the API version lets us have accurate metrics on the patch release, etc., of the ESXi and vCenter versions.
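
To illustrate the point about patch-level accuracy, here is a minimal sketch in Python. It assumes the metric carries both a "version" and an "api_version" label, as in the query output later in this report. The host list is hypothetical, and "7.0.2.x" is a made-up placeholder rather than a real API version.

```python
from collections import Counter

# Hypothetical host inventory as (version, api_version) pairs. Only the pair
# ("7.0.2", "7.0.2.0") appears in this report; "7.0.2.x" is a placeholder.
hosts = [
    ("7.0.2", "7.0.2.0"),
    ("7.0.2", "7.0.2.0"),
    ("7.0.2", "7.0.2.x"),
]

# Grouping by version alone collapses all hosts into a single bucket.
by_version = Counter(version for version, _api in hosts)

# Grouping by both labels keeps hosts at different patch levels apart.
by_version_and_api = Counter(hosts)

print(by_version)
print(by_version_and_api)
```

With only the "version" label, all three hypothetical hosts look identical; adding "api_version" yields two separate series.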

Comment 4 Wei Duan 2021-09-06 03:45:18 UTC
Tested on 4.9.0-0.nightly-2021-09-05-204238:

There are 3 masters + 2 workers in the cluster:
$ oc get node
NAME                             STATUS   ROLES    AGE    VERSION
wduan-0906a-rlj9t-master-0       Ready    master   108m   v1.22.0-rc.0+75ee307
wduan-0906a-rlj9t-master-1       Ready    master   108m   v1.22.0-rc.0+75ee307
wduan-0906a-rlj9t-master-2       Ready    master   108m   v1.22.0-rc.0+75ee307
wduan-0906a-rlj9t-worker-cd2rr   Ready    worker   98m    v1.22.0-rc.0+75ee307
wduan-0906a-rlj9t-worker-p5ktm   Ready    worker   98m    v1.22.0-rc.0+75ee307


From the vsphere-problem-detector log, it looks like only the masters were checked:
$ oc -n openshift-cluster-storage-operator logs vsphere-problem-detector-operator-fbf45bff-dpjl6 | egrep "ESXi version"
I0906 02:00:31.821529       1 node_esxi_version.go:83] Node wduan-0906a-rlj9t-master-0 runs on host host-203583 (10.3.32.4) with ESXi version: 7.0.2
I0906 02:00:31.822167       1 node_esxi_version.go:83] Node wduan-0906a-rlj9t-master-2 runs on host host-259503 (10.3.32.7) with ESXi version: 7.0.2
I0906 02:00:31.822608       1 node_esxi_version.go:83] Node wduan-0906a-rlj9t-master-1 runs on host host-259509 (10.3.32.9) with ESXi version: 7.0.2
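
For reference, a small Python sketch that counts the distinct nodes per ESXi version from log lines like the ones above. The regular expression is modeled only on the lines quoted here; the exact log format is an assumption and may differ between releases. The helper names are illustrative.

```python
import re
from collections import Counter

# Pattern modeled on the quoted log lines (an assumption, not a stable format).
LINE_RE = re.compile(
    r"Node (?P<node>\S+) runs on host (?P<host>\S+) "
    r"\((?P<addr>[^)]+)\) with ESXi version: (?P<version>\S+)"
)

def parse_esxi_lines(log_text):
    """Return one dict (node, host, addr, version) per matching log line."""
    return [m.groupdict() for m in LINE_RE.finditer(log_text)]

def nodes_per_version(records):
    """Count distinct node names per ESXi version."""
    seen = {(r["node"], r["version"]) for r in records}
    return Counter(version for _node, version in seen)

sample = """\
I0906 02:00:31.821529       1 node_esxi_version.go:83] Node wduan-0906a-rlj9t-master-0 runs on host host-203583 (10.3.32.4) with ESXi version: 7.0.2
I0906 02:00:31.822167       1 node_esxi_version.go:83] Node wduan-0906a-rlj9t-master-2 runs on host host-259503 (10.3.32.7) with ESXi version: 7.0.2
I0906 02:00:31.822608       1 node_esxi_version.go:83] Node wduan-0906a-rlj9t-master-1 runs on host host-259509 (10.3.32.9) with ESXi version: 7.0.2
"""

records = parse_esxi_lines(sample)
print(nodes_per_version(records))
```

For the three lines above this reports three nodes on 7.0.2, which can then be compared with the `oc get node` output.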

Also, in the vsphere_esxi_version_total metric the value is "3":
$ oc -n openshift-monitoring exec -c prometheus prometheus-k8s-0 -- curl -k -H "Authorization: Bearer $token" 'https://prometheus-k8s.openshift-monitoring.svc:9091/api/v1/query?query=vsphere_esxi_version_total' | jq
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100   484    0   484    0     0  32266      0 --:--:-- --:--:-- --:--:-- 32266
{
  "status": "success",
  "data": {
    "resultType": "vector",
    "result": [
      {
        "metric": {
          "__name__": "vsphere_esxi_version_total",
          "api_version": "7.0.2.0",
          "container": "vsphere-problem-detector-operator",
          "endpoint": "vsphere-metrics",
          "instance": "10.129.0.4:8444",
          "job": "vsphere-problem-detector-metrics",
          "namespace": "openshift-cluster-storage-operator",
          "pod": "vsphere-problem-detector-operator-fbf45bff-dpjl6",
          "service": "vsphere-problem-detector-metrics",
          "version": "7.0.2"
        },
        "value": [
          1630898344.026,
          "3"
        ]
      }
    ]
  }
}
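
The response above can be reduced to a per-version host count with a few lines of Python. This is only a sketch: it assumes the instant-query response shape shown above (a "vector" result whose "value" is a [timestamp, string] pair), and the function name is illustrative.

```python
import json

def esxi_version_counts(response_text):
    """Map (version, api_version) to the reported count for each series."""
    body = json.loads(response_text)
    if body.get("status") != "success":
        raise ValueError("query did not succeed: %r" % body.get("status"))
    counts = {}
    for series in body["data"]["result"]:
        labels = series["metric"]
        key = (labels.get("version"), labels.get("api_version"))
        _timestamp, value = series["value"]
        # Sample values arrive as strings, so convert before comparing.
        counts[key] = int(float(value))
    return counts

# Trimmed copy of the response above (labels not needed here are omitted).
sample = """
{
  "status": "success",
  "data": {
    "resultType": "vector",
    "result": [
      {
        "metric": {
          "__name__": "vsphere_esxi_version_total",
          "api_version": "7.0.2.0",
          "version": "7.0.2"
        },
        "value": [1630898344.026, "3"]
      }
    ]
  }
}
"""

print(esxi_version_counts(sample))
```

The resulting count can be checked against the number of nodes returned by `oc get node`.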

I have temporarily changed the status to "POST"; please let me know if this is expected or if I missed something here.

Comment 5 Wei Duan 2021-09-06 11:26:01 UTC
The first check probably happens during installation, when there are only master nodes in the cluster, so I will wait some time for another check.

Comment 6 Wei Duan 2021-09-06 13:17:01 UTC
Checked in another cluster and waited for the second round of checks; all nodes are reported:

$ oc -n openshift-cluster-storage-operator logs vsphere-problem-detector-operator-fbf45bff-l9hkw | grep ESXi
I0906 05:04:52.426170       1 node_esxi_version.go:83] Node control-plane-0 runs on host host-259509 (10.3.32.9) with ESXi version: 7.0.2
I0906 05:04:52.426206       1 operator.go:305] CollectNodeESXiVersion:control-plane-0 passed
I0906 05:04:52.431054       1 node_esxi_version.go:83] Node control-plane-2 runs on host host-221014 (10.3.32.8) with ESXi version: 7.0.2
I0906 05:04:52.431087       1 operator.go:305] CollectNodeESXiVersion:control-plane-2 passed
I0906 05:04:52.443919       1 node_esxi_version.go:83] Node control-plane-1 runs on host host-172909 (10.3.32.5) with ESXi version: 7.0.2
I0906 05:04:52.443952       1 operator.go:305] CollectNodeESXiVersion:control-plane-1 passed
I0906 13:04:57.586066       1 node_esxi_version.go:83] Node compute-0 runs on host host-259503 (10.3.32.7) with ESXi version: 7.0.2
I0906 13:04:57.586097       1 operator.go:305] CollectNodeESXiVersion:compute-0 passed
I0906 13:04:57.588661       1 node_esxi_version.go:83] Node compute-1 runs on host host-203583 (10.3.32.4) with ESXi version: 7.0.2
I0906 13:04:57.588686       1 operator.go:305] CollectNodeESXiVersion:compute-1 passed
I0906 13:04:57.589594       1 node_esxi_version.go:83] Node control-plane-1 runs on host host-172909 (10.3.32.5) with ESXi version: 7.0.2
I0906 13:04:57.589828       1 operator.go:305] CollectNodeESXiVersion:control-plane-1 passed
I0906 13:04:57.593048       1 node_esxi_version.go:83] Node control-plane-0 runs on host host-259509 (10.3.32.9) with ESXi version: 7.0.2
I0906 13:04:57.593064       1 operator.go:305] CollectNodeESXiVersion:control-plane-0 passed
I0906 13:04:57.597787       1 node_esxi_version.go:83] Node control-plane-2 runs on host host-221014 (10.3.32.8) with ESXi version: 7.0.2
I0906 13:04:57.597806       1 operator.go:305] CollectNodeESXiVersion:control-plane-2 passed
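
To make the two check rounds in the log above easier to see, the following Python sketch splits the "ESXi version" lines into rounds and counts the distinct nodes in each. It assumes that a gap of more than one minute between matching lines starts a new round; that threshold, the line pattern, and the function name are illustrative assumptions, not the detector's behavior.

```python
import re
from datetime import datetime, timedelta

# Timestamp (mmdd hh:mm:ss.uuuuuu) and node name, modeled on the quoted lines.
LINE_RE = re.compile(
    r"^I(?P<stamp>\d{4} \d{2}:\d{2}:\d{2}\.\d{6}).*?"
    r"Node (?P<node>\S+) runs on host \S+ \([^)]+\) with ESXi version: \S+"
)

def split_into_rounds(log_text, gap=timedelta(minutes=1)):
    """Group node names into rounds separated by more than `gap` of silence."""
    rounds = []
    last = None
    for line in log_text.splitlines():
        m = LINE_RE.search(line)
        if not m:
            continue  # e.g. the "CollectNodeESXiVersion:... passed" lines
        # The log carries no year, so strptime falls back to its default.
        when = datetime.strptime(m.group("stamp"), "%m%d %H:%M:%S.%f")
        if last is None or when - last > gap:
            rounds.append(set())
        rounds[-1].add(m.group("node"))
        last = when
    return rounds

sample = """\
I0906 05:04:52.426170       1 node_esxi_version.go:83] Node control-plane-0 runs on host host-259509 (10.3.32.9) with ESXi version: 7.0.2
I0906 05:04:52.426206       1 operator.go:305] CollectNodeESXiVersion:control-plane-0 passed
I0906 05:04:52.431054       1 node_esxi_version.go:83] Node control-plane-2 runs on host host-221014 (10.3.32.8) with ESXi version: 7.0.2
I0906 05:04:52.443919       1 node_esxi_version.go:83] Node control-plane-1 runs on host host-172909 (10.3.32.5) with ESXi version: 7.0.2
I0906 13:04:57.586066       1 node_esxi_version.go:83] Node compute-0 runs on host host-259503 (10.3.32.7) with ESXi version: 7.0.2
I0906 13:04:57.588661       1 node_esxi_version.go:83] Node compute-1 runs on host host-203583 (10.3.32.4) with ESXi version: 7.0.2
I0906 13:04:57.589594       1 node_esxi_version.go:83] Node control-plane-1 runs on host host-172909 (10.3.32.5) with ESXi version: 7.0.2
I0906 13:04:57.593048       1 node_esxi_version.go:83] Node control-plane-0 runs on host host-259509 (10.3.32.9) with ESXi version: 7.0.2
I0906 13:04:57.597787       1 node_esxi_version.go:83] Node control-plane-2 runs on host host-221014 (10.3.32.8) with ESXi version: 7.0.2
"""

print([len(nodes) for nodes in split_into_rounds(sample)])
```

On this log the first round covers the three control-plane nodes and the second covers all five nodes.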


{
  "status": "success",
  "data": {
    "resultType": "vector",
    "result": [
      {
        "metric": {
          "__name__": "vsphere_esxi_version_total",
          "api_version": "7.0.2.0",
          "container": "vsphere-problem-detector-operator",
          "endpoint": "vsphere-metrics",
          "instance": "10.129.0.3:8444",
          "job": "vsphere-problem-detector-metrics",
          "namespace": "openshift-cluster-storage-operator",
          "pod": "vsphere-problem-detector-operator-fbf45bff-l9hkw",
          "service": "vsphere-problem-detector-metrics",
          "version": "7.0.2"
        },
        "value": [
          1630934173.332,
          "5"
        ]
      }
    ]
  }
}

So I am marking it as "VERIFIED".

Comment 9 errata-xmlrpc 2021-10-18 17:50:51 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Moderate: OpenShift Container Platform 4.9.0 bug fix and security update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2021:3759

