Description of problem:
Adding CodeReady Workspaces metrics to the cluster monitoring allowlist on OCP:
- cardinality approval -> https://docs.google.com/document/d/1WzJrXXCFSFVFLHIKeg0b-lo3wA4S1qPK4Qa7fKdjMKo/edit#heading=h.3im6egcwyu2
- PR to cluster-monitoring-operator - https://github.com/openshift/cluster-monitoring-operator/pull/925

Ideally, we would like to have it enabled for 4.6, otherwise for 4.7.
lowering severity as this is not a release blocker for 4.6, per comment.
Tested with 4.6.0-0.nightly-2020-09-22-200146. The fix is in cluster-monitoring-operator; we also need one PR for telemeter-server.

# oc -n openshift-monitoring get cm telemetry-config -o jsonpath="{.data.metrics\.yaml}"
...
# (codeready workspaces, @ibuziuk) The number of workspaces with a given status STARTING|STOPPED|RUNNING|STOPPING. Type 'gauge'.
- '{__name__="che_workspace_status"}'
# (codeready workspaces, @ibuziuk) The number of started workspaces. Type 'counter'.
- '{__name__="che_workspace_started_total"}'
# (codeready workspaces, @ibuziuk) The number of failed workspaces.
# Can be used with the 'while' label e.g. {while="STARTING"}, {while="RUNNING"}, {while="STOPPING"}.Type 'counter'.
- '{__name__="che_workspace_failure_total"}'
# (codeready workspaces, @ibuziuk) The time in seconds required for the startup of all the workspaces.
- '{__name__="che_workspace_start_time_seconds_sum"}'
# (codeready workspaces, @ibuziuk) The overall number of attempts for starting all the workspaces
...

Please correct me if I am wrong.
Created https://gitlab.cee.redhat.com/observatorium/configuration/-/merge_requests/165
Please let me know if anything else is needed to enable the CRW metrics in 4.6.
@junqi: would you mind reassessing this Bugzilla?
(In reply to Sergiusz Urbaniak from comment #5)
> @junqi: would you mind reassessing this Bugzilla?

The fix is fine, but I don't have permission to check https://infogw-proxy.api.openshift.com/. If the following metrics can be found there, that is fine:
"{__name__=\"che_workspace_failure_total\"}",
"{__name__=\"che_workspace_start_time_seconds_count\"}",
"{__name__=\"che_workspace_start_time_seconds_sum\"}",
"{__name__=\"che_workspace_started_total\"}",
"{__name__=\"che_workspace_status\"}"
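For anyone who wants to repeat the allowlist check locally, here is a minimal sketch that scans a dump of metrics.yaml for the five CRW selectors listed above. The function names (`allowlisted_names`, `missing_metrics`) and the regex-based parsing are my own assumptions, not part of cluster-monitoring-operator or telemeter; it only looks at `__name__` selectors and ignores comment lines.

```python
import re
from typing import List, Set

# Metric names listed in this bug; all five are expected in the allowlist.
EXPECTED = [
    "che_workspace_failure_total",
    "che_workspace_start_time_seconds_count",
    "che_workspace_start_time_seconds_sum",
    "che_workspace_started_total",
    "che_workspace_status",
]

# Matches {__name__="metric"...}, with or without backslash-escaped quotes.
NAME_RE = re.compile(r'\{__name__=\\?"([A-Za-z_:][A-Za-z0-9_:]*)\\?"')


def allowlisted_names(metrics_yaml: str) -> Set[str]:
    """Collect metric names selected via __name__, skipping '#' comment lines."""
    names: Set[str] = set()
    for line in metrics_yaml.splitlines():
        if line.strip().startswith("#"):
            continue  # comments may mention selectors without enabling them
        names.update(NAME_RE.findall(line))
    return names


def missing_metrics(metrics_yaml: str, expected: List[str] = EXPECTED) -> List[str]:
    """Return the expected metric names that have no selector in the text."""
    present = allowlisted_names(metrics_yaml)
    return [name for name in expected if name not in present]
```

To use it, redirect the output of the `oc -n openshift-monitoring get cm telemetry-config -o jsonpath="{.data.metrics\.yaml}"` command to a file and pass the file contents to `missing_metrics`; an empty list means all five selectors are present. This only confirms the allowlist on the cluster side, not that the metrics actually arrive in telemeter.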
Closing as WONTFIX: the original author is not available, and the metrics are not within the monitoring team's domain of responsibility.
Not sure why the issue was closed as WONTFIX: all the CRW metrics from the cardinality document are available on telemeter-lts-dashboards. What is expected to be done at this point?
Reopening for clarification. Currently, it is expected that the CRW metrics from the cardinality document [1] are available on telemeter for OCP 4.6+.

[1] https://docs.google.com/document/d/1WzJrXXCFSFVFLHIKeg0b-lo3wA4S1qPK4Qa7fKdjMKo/edit#heading=h.3im6egcwyu2