Bug 1877735 - Adding CodeReady Workspaces metrics to the cluster monitoring allowlist on OCP
Summary: Adding CodeReady Workspaces metrics to the cluster monitoring allowlist on OCP
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: Monitoring
Version: 4.6
Hardware: All
OS: All
Priority: low
Severity: low
Target Milestone: ---
Target Release: ---
Assignee: Ilya Buziuk
QA Contact: Junqi Zhao
URL:
Whiteboard: telemeter-metric
Depends On:
Blocks:
Reported: 2020-09-10 10:21 UTC by Ilya Buziuk
Modified: 2022-08-25 21:31 UTC (History)

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2022-08-25 21:31:47 UTC
Target Upstream Version:
Embargoed:


Links
System: GitHub
ID: openshift/cluster-monitoring-operator pull 925
Private: 0
Priority: None
Status: closed
Summary: Bug 1877735: Adding CodeReady Workspaces metrics to the allowlist / config.yaml
Last Updated: 2021-02-21 14:19:14 UTC

Description Ilya Buziuk 2020-09-10 10:21:11 UTC
Description of problem:


Adding CodeReady Workspaces metrics to the cluster monitoring allowlist on OCP:

- Cardinality approval: https://docs.google.com/document/d/1WzJrXXCFSFVFLHIKeg0b-lo3wA4S1qPK4Qa7fKdjMKo/edit#heading=h.3im6egcwyu2

- PR to cluster-monitoring-operator: https://github.com/openshift/cluster-monitoring-operator/pull/925

Ideally, we would like this enabled for 4.6; otherwise, for 4.7.

Comment 1 Sergiusz Urbaniak 2020-09-10 11:22:11 UTC
Lowering severity, as this is not a release blocker for 4.6, per the comment above.

Comment 3 Junqi Zhao 2020-09-23 02:41:35 UTC
Tested with 4.6.0-0.nightly-2020-09-22-200146. The fix is in cluster-monitoring-operator, but we also need a PR for telemeter-server.

# oc -n openshift-monitoring get cm telemetry-config -o jsonpath="{.data.metrics\.yaml}"
...
# (codeready workspaces, @ibuziuk) The number of workspaces with a given status STARTING|STOPPED|RUNNING|STOPPING. Type 'gauge'.
- '{__name__="che_workspace_status"}'
# (codeready workspaces, @ibuziuk) The number of started workspaces. Type 'counter'.
- '{__name__="che_workspace_started_total"}'
# (codeready workspaces, @ibuziuk) The number of failed workspaces.
# Can be used with the 'while' label e.g. {while="STARTING"}, {while="RUNNING"}, {while="STOPPING"}.Type 'counter'.
- '{__name__="che_workspace_failure_total"}'
# (codeready workspaces, @ibuziuk) The time in seconds required for the startup of all the workspaces.
- '{__name__="che_workspace_start_time_seconds_sum"}'
# (codeready workspaces, @ibuziuk) The overall number of attempts for starting all the workspaces
...

Please correct me if I am wrong.
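
A minimal sketch, not part of the original report, of how the allowlist output above could be checked by script rather than by eye. It assumes the `metrics.yaml` text returned by the `oc` command has already been captured into a string, and that each entry is a simple `__name__` matcher as in the listing. The function name `allowlisted_names` and the `SAMPLE` text are hypothetical.

```python
import re

# Matches allowlist entries of the form: - '{__name__="metric_name"}'
# Comment lines (starting with '#') and any other matcher shapes are skipped.
MATCHER_RE = re.compile(r"""^\s*-\s*'\{__name__="([^"]+)"\}'\s*$""")


def allowlisted_names(metrics_yaml):
    """Return the metric names of simple __name__ matchers, in file order."""
    names = []
    for line in metrics_yaml.splitlines():
        match = MATCHER_RE.match(line)
        if match:
            names.append(match.group(1))
    return names


# Hypothetical excerpt in the same shape as the output quoted above.
SAMPLE = '''\
# (codeready workspaces, @ibuziuk) The number of started workspaces. Type 'counter'.
- '{__name__="che_workspace_started_total"}'
- '{__name__="che_workspace_status"}'
'''

if __name__ == "__main__":
    for name in allowlisted_names(SAMPLE):
        print(name)
```

A line-based regular expression is used here so that the sketch needs no YAML library; it would not recognise matchers that carry additional labels.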

Comment 4 Ilya Buziuk 2020-09-29 08:18:15 UTC
Created https://gitlab.cee.redhat.com/observatorium/configuration/-/merge_requests/165

Please let me know if anything else is needed to enable the CRW metrics in 4.6.

Comment 5 Sergiusz Urbaniak 2020-10-07 07:01:08 UTC
@junqi: would you mind reassessing this Bugzilla?

Comment 6 Junqi Zhao 2020-10-09 08:25:00 UTC
(In reply to Sergiusz Urbaniak from comment #5)
> @junqi: would you mind reassessing this Bugzilla?

The fix is fine, but I don't have permission to check https://infogw-proxy.api.openshift.com/.
If the following metrics can be found there, that is fine:
"{__name__=\"che_workspace_failure_total\"}",
"{__name__=\"che_workspace_start_time_seconds_count\"}",
"{__name__=\"che_workspace_start_time_seconds_sum\"}",
"{__name__=\"che_workspace_started_total\"}",
"{__name__=\"che_workspace_status\"}"
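
The check described above could be sketched as follows. This is an illustration only: it assumes the matchers are available as a JSON array of strings (the escaped quotes in the list suggest JSON, but the thread does not say where the list comes from), and the names `EXPECTED` and `missing_metrics` are hypothetical.

```python
import json
import re

# The five CodeReady Workspaces metrics listed in this comment.
EXPECTED = {
    "che_workspace_failure_total",
    "che_workspace_start_time_seconds_count",
    "che_workspace_start_time_seconds_sum",
    "che_workspace_started_total",
    "che_workspace_status",
}

# A matcher string looks like {__name__="metric_name"} once JSON-decoded.
NAME_RE = re.compile(r'^\{__name__="([^"]+)"\}$')


def missing_metrics(matchers_json, expected=EXPECTED):
    """Return the expected metric names that do not appear in the matcher list."""
    found = set()
    for matcher in json.loads(matchers_json):
        match = NAME_RE.match(matcher)
        if match:
            found.add(match.group(1))
    return set(expected) - found


# Hypothetical input built from the expected names, for demonstration.
SAMPLE_JSON = json.dumps(['{__name__="%s"}' % name for name in sorted(EXPECTED)])

if __name__ == "__main__":
    missing = missing_metrics(SAMPLE_JSON)
    print("all expected metrics present" if not missing else sorted(missing))
```

An empty result means every expected matcher is present; any names returned are the ones still missing.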

Comment 10 Sergiusz Urbaniak 2021-03-31 05:22:26 UTC
Closing as WONTFIX: the original author is not available, and the metrics are not in the monitoring team's domain of responsibility.

Comment 11 Ilya Buziuk 2021-03-31 07:43:28 UTC
Not sure why the issue was closed as WONTFIX: all the CRW metrics from the cardinality document are available on telemeter-lts-dashboards.
What is expected to be done at this point?

Comment 12 Ilya Buziuk 2021-03-31 08:07:15 UTC
Reopening for clarification.
Currently, it is expected that the CRW metrics from the cardinality document [1] are available on Telemeter for OCP 4.6+.

[1] https://docs.google.com/document/d/1WzJrXXCFSFVFLHIKeg0b-lo3wA4S1qPK4Qa7fKdjMKo/edit#heading=h.3im6egcwyu2

