Bug 1971312 - [master] Telemetry-client won't report metrics in case the cluster was installed using the assisted operator
Summary: [master] Telemetry-client won't report metrics in case the cluster was instal...
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: assisted-installer
Version: 4.8
Hardware: Unspecified
OS: Unspecified
Priority: urgent
Severity: high
Target Milestone: ---
Target Release: ---
Assignee: Yoni Bettan
QA Contact: Omri Hochman
URL:
Whiteboard: AI-Team-Core
Depends On:
Blocks: 1971589
 
Reported: 2021-06-13 14:45 UTC by Eran Cohen
Modified: 2022-08-28 08:45 UTC (History)

Fixed In Version: OCP-Metal-v1.0.22.1
Doc Type: No Doc Update
Doc Text:
Clone Of:
Clones: 1971589
Environment:
Last Closed: 2022-08-28 08:45:59 UTC
Target Upstream Version:
Embargoed:



Description Eran Cohen 2021-06-13 14:45:59 UTC
Description of problem:

The telemeter-client isn't forwarding metrics.
Version-Release number of selected component (if applicable):


How reproducible:
100%

Steps to Reproduce:
1. Install a cluster using the assisted-service-operator

Actual results:

level=info caller=main.go:259 ts=2021-06-12T02:01:12.403052205Z msg="starting telemeter-client" from=https://prometheus-k8s.openshift-monitoring.svc:9091 to=https://dummy.com listen=localhost:8080
level=error caller=forwarder.go:268 ts=2021-06-13T11:03:07.595840144Z component=forwarder/worker msg="unable to forward results" err="unable to authorize to server: unable to parse the authentication response: invalid character '<' looking for beginning of value"
level=error caller=forwarder.go:268 ts=2021-06-13T11:04:07.998038114Z component=forwarder/worker msg="unable to forward results" err="unable to authorize to server: unable to parse the authentication response: invalid character '<' looking for beginning of value"
level=error caller=forwarder.go:268 ts=2021-06-13T11:05:08.389192787Z component=forwarder/worker msg="unable to forward results" err="unable to authorize to server: unable to parse the authentication response: invalid character '<' looking for beginning of value"
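
The "invalid character '<' looking for beginning of value" text is the message Go's encoding/json package produces when the input starts with '<', which suggests the server at https://dummy.com answered the authorization request with something other than JSON, presumably an HTML page. A minimal sketch of that failure mode follows; tokenResponse and parseAuthResponse are hypothetical names for illustration, not the telemeter-client's actual code:

```go
package main

import (
	"encoding/json"
	"fmt"
)

// tokenResponse is a hypothetical stand-in for the JSON body an
// authorization endpoint would be expected to return.
type tokenResponse struct {
	Token string `json:"token"`
}

// parseAuthResponse decodes an authorization response body as JSON and
// wraps any decoding error with context.
func parseAuthResponse(body []byte) (*tokenResponse, error) {
	var resp tokenResponse
	if err := json.Unmarshal(body, &resp); err != nil {
		return nil, fmt.Errorf("unable to parse the authentication response: %w", err)
	}
	return &resp, nil
}

func main() {
	// An HTML page, as a generic web server might return, is not valid JSON.
	_, err := parseAuthResponse([]byte("<html><body>Not Found</body></html>"))
	fmt.Println(err)
	// prints: unable to parse the authentication response: invalid character '<' looking for beginning of value
}
```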

Expected results:

level=info caller=main.go:259 ts=2021-06-13T11:33:06.758863733Z msg="starting telemeter-client" from=https://prometheus-k8s.openshift-monitoring.svc:9091 to=https://infogw.api.openshift.com/ listen=localhost:8080

Additional info:

The root cause of this issue seems to be this PR https://github.com/openshift/assisted-service/pull/1773
The aim of the PR was to redirect the cluster metrics to the correct Telemeter server.
When the cluster is installed using the assisted-service operator, the service URL is neither "https://api.openshift.com" nor "https://api.stage.openshift.com", resulting in a Telemeter URL of "https://dummy.com".
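
The behavior described above can be sketched roughly as follows. This is an illustration under assumptions, not the code from the PR: telemeterByService, telemeterURLFor, and the operator service URL are hypothetical, and the stage entry is a placeholder because the stage Telemeter URL does not appear in this report.

```go
package main

import "fmt"

// telemeterByService is a hypothetical lookup table from the assisted-service
// base URL to a Telemeter URL. Only the production Telemeter URL appears in
// this report; the stage value is a placeholder, not the real stage URL.
var telemeterByService = map[string]string{
	"https://api.openshift.com":       "https://infogw.api.openshift.com",
	"https://api.stage.openshift.com": "https://telemeter.stage.example",
}

// fallbackTelemeterURL is the value seen in the failing cluster's logs.
const fallbackTelemeterURL = "https://dummy.com"

// telemeterURLFor returns the Telemeter URL for a known service URL and
// falls back to the dummy URL for anything else.
func telemeterURLFor(serviceBaseURL string) string {
	if u, ok := telemeterByService[serviceBaseURL]; ok {
		return u
	}
	return fallbackTelemeterURL
}

func main() {
	fmt.Println(telemeterURLFor("https://api.openshift.com"))
	// Hypothetical in-cluster URL of an operator-deployed service.
	fmt.Println(telemeterURLFor("https://assisted-service.example.svc:8090"))
}
```

With an operator-deployed service, the base URL matches neither known entry, so the lookup falls through to the dummy value.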

Comment 1 Yoni Bettan 2021-06-13 15:26:49 UTC
What would be the correct URLs for the operator in the different environments?

Comment 2 Eran Cohen 2021-06-13 16:22:34 UTC
I think the desired behavior when installing with the operator is to set the Telemeter URL to: https://infogw.api.openshift.com
The operator/service should allow setting another URL, but by default it should point to prod.
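
A minimal sketch of that default-with-override behavior, assuming the override arrives as a plain string (for example from configuration); resolveTelemeterURL and defaultTelemeterURL are hypothetical names, not the actual fix:

```go
package main

import (
	"fmt"
	"strings"
)

// defaultTelemeterURL is the production Telemeter URL named in this comment.
const defaultTelemeterURL = "https://infogw.api.openshift.com"

// resolveTelemeterURL returns the override when one is set and otherwise
// falls back to the production default.
func resolveTelemeterURL(override string) string {
	if trimmed := strings.TrimSpace(override); trimmed != "" {
		return trimmed
	}
	return defaultTelemeterURL
}

func main() {
	fmt.Println(resolveTelemeterURL(""))                              // no override: prod default
	fmt.Println(resolveTelemeterURL("https://telemeter.example.com")) // hypothetical override
}
```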

Comment 3 nshidlin 2021-06-15 10:03:08 UTC
Verified with:
assisted-service image: quay.io/ocpmetal/assisted-service@sha256:03027f7421882b8a8ac3c145193aa1e16912b6eb1cfe9bde996feec08f68a384

oc -n openshift-monitoring logs telemeter-client-6648bf7cc6-rzgmd -c telemeter-client
level=info caller=main.go:85 ts=2021-06-15T09:16:19.645381943Z msg="telemeter client initialized"
level=warn caller=forwarder.go:130 ts=2021-06-15T09:16:19.645721789Z component=forwarder msg="not anonymizing any labels"
level=info caller=main.go:259 ts=2021-06-15T09:16:19.740197735Z msg="starting telemeter-client" from=https://prometheus-k8s.openshift-monitoring.svc:9091 to=https://infogw.api.openshift.com/ listen=localhost:8080

Comment 4 Yoni Bettan 2021-06-21 13:28:13 UTC
To make sure that we can monitor metrics for clusters installed using the assisted-installer-operator, we needed to add some additional label replacements to the recording rules:
https://github.com/openshift/telemeter/pull/381

