Configure metrics for vsphere driver to be reported
Adding driver and syncer metrics to Prometheus. The driver reports the regular Go metrics (process_open_fds, go_threads, ...); the only driver-specific metric is `vsphere_csi_info`:

# HELP vsphere_csi_info CSI Info
# TYPE vsphere_csi_info gauge
vsphere_csi_info{version="09175db5"} 1

The syncer is mostly the same; it has vsphere_syncer_info:

# HELP vsphere_syncer_info Syncer Info
# TYPE vsphere_syncer_info gauge
vsphere_syncer_info{version="09175db5"} 1

(To be honest, the metrics don't look very useful.)
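For anyone checking these endpoints by hand, here is a minimal sketch of how the version label could be pulled out of the info metrics. It only parses the sample text quoted in this comment; the helper names (`parse_samples`, `info_versions`) are made up for illustration, and the regular expression covers just the simple `name{label="value"} number` form, not the full exposition format.

```python
import re

# Sample exposition text copied from the comment above.
SAMPLE = """\
# HELP vsphere_csi_info CSI Info
# TYPE vsphere_csi_info gauge
vsphere_csi_info{version="09175db5"} 1
# HELP vsphere_syncer_info Syncer Info
# TYPE vsphere_syncer_info gauge
vsphere_syncer_info{version="09175db5"} 1
"""

# Simplified patterns (assumption: no escaped quotes or braces in label values).
SAMPLE_LINE = re.compile(
    r'^(?P<name>[a-zA-Z_:][a-zA-Z0-9_:]*)'
    r'(?:\{(?P<labels>[^}]*)\})?'
    r'\s+(?P<value>\S+)'
)
LABEL = re.compile(r'(\w+)="([^"]*)"')


def parse_samples(text):
    """Return a list of (metric_name, labels_dict, value) tuples."""
    samples = []
    for line in text.splitlines():
        line = line.strip()
        # Skip blank lines and the "# HELP" / "# TYPE" comment lines.
        if not line or line.startswith("#"):
            continue
        match = SAMPLE_LINE.match(line)
        if match is None:
            continue
        labels = dict(LABEL.findall(match.group("labels") or ""))
        samples.append((match.group("name"), labels, float(match.group("value"))))
    return samples


def info_versions(text):
    """Map each *_info metric name to its 'version' label."""
    return {
        name: labels.get("version")
        for name, labels, _ in parse_samples(text)
        if name.endswith("_info")
    }


if __name__ == "__main__":
    print(info_versions(SAMPLE))
```

The value of an info-style gauge is always 1, so the useful part is the label, which is why the sketch returns only the version string per metric.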
Verified: passed on 4.11.0-0.nightly-2022-04-23-153426.
The following metrics are reported by Prometheus:
"vsphere_csi_driver_error",
"vsphere_csi_info",
"vsphere_csi_volume_ops_histogram_bucket",
"vsphere_csi_volume_ops_histogram_count",
"vsphere_csi_volume_ops_histogram_sum",
"vsphere_full_sync_ops_histogram_bucket",
"vsphere_full_sync_ops_histogram_count",
"vsphere_full_sync_ops_histogram_sum",
"vsphere_sync_errors",
"vsphere_syncer_info",
Move status to "VERIFIED".
On an upgraded cluster (from 4.10.0-0.nightly-2022-04-23-095048 to 4.11.0-0.nightly-2022-04-23-153426), only vsphere_csi_driver_error and vsphere_sync_errors are present. Will double-check. Move status to "ON_QA".
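To make the gap on the upgraded cluster explicit, here is a small sketch that compares the metric names seen on the fresh install with the two seen after the upgrade. Both lists are copied from the comments in this bug; the function name `missing_metrics` is made up for illustration, and how the names are collected from Prometheus is left out.

```python
# Metric names reported on the fresh 4.11 install (from the verification comment).
EXPECTED = {
    "vsphere_csi_driver_error",
    "vsphere_csi_info",
    "vsphere_csi_volume_ops_histogram_bucket",
    "vsphere_csi_volume_ops_histogram_count",
    "vsphere_csi_volume_ops_histogram_sum",
    "vsphere_full_sync_ops_histogram_bucket",
    "vsphere_full_sync_ops_histogram_count",
    "vsphere_full_sync_ops_histogram_sum",
    "vsphere_sync_errors",
    "vsphere_syncer_info",
}

# Metric names present on the cluster upgraded from 4.10 to 4.11.
OBSERVED_AFTER_UPGRADE = {
    "vsphere_csi_driver_error",
    "vsphere_sync_errors",
}


def missing_metrics(expected, observed):
    """Return the expected metric names that were not observed, sorted."""
    return sorted(set(expected) - set(observed))


if __name__ == "__main__":
    for name in missing_metrics(EXPECTED, OBSERVED_AFTER_UPGRADE):
        print(name)
```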
This should be fixed by library-go bump in https://github.com/openshift/vmware-vsphere-csi-driver-operator/pull/85
https://github.com/openshift/vmware-vsphere-csi-driver-operator/pull/85 got merged.
Verified: passed on the upgrade case:
"vsphere_cns_volume_ops_histogram_bucket",
"vsphere_cns_volume_ops_histogram_count",
"vsphere_cns_volume_ops_histogram_sum",
"vsphere_csi_info",
"vsphere_full_sync_ops_histogram_bucket",
"vsphere_full_sync_ops_histogram_count",
"vsphere_full_sync_ops_histogram_sum",
"vsphere_sync_errors",
"vsphere_syncer_info",

$ oc -n openshift-cluster-csi-drivers get svc
NAME                                           TYPE        CLUSTER-IP      EXTERNAL-IP   PORT(S)                                           AGE
vmware-vsphere-csi-driver-controller-metrics   ClusterIP   172.30.38.193   <none>        442/TCP,443/TCP,444/TCP,445/TCP,446/TCP,447/TCP   7h48m
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory (Important: OpenShift Container Platform 4.11.0 bug fix and security update), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHSA-2022:5069