Bug 1971589

Summary: [4.8.0] Telemetry-client won't report metrics in case the cluster was installed using the assisted operator
Product: OpenShift Container Platform
Reporter: Ronnie Lazar <alazar>
Component: assisted-installer
Assignee: Yoni Bettan <ybettan>
assisted-installer sub component: Deployment Operator
QA Contact: bjacot
Status: CLOSED ERRATA
Docs Contact:
Severity: high
Priority: urgent
CC: alazar, aos-bugs, ercohen, nshidlin, ybettan
Version: 4.8
Keywords: Triaged
Target Milestone: ---
Target Release: 4.8.0
Hardware: Unspecified
OS: Unspecified
Whiteboard: AI-Team-Core
Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of: 1971312
Environment:
Last Closed: 2021-07-27 23:12:53 UTC
Type: ---
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:
Bug Depends On: 1971312
Bug Blocks:

Description Ronnie Lazar 2021-06-14 11:29:13 UTC
+++ This bug was initially created as a clone of Bug #1971312 +++

Description of problem:

The telemeter-client isn't forwarding metrics.
Version-Release number of selected component (if applicable):


How reproducible:
100%

Steps to Reproduce:
1. Install a cluster using the assisted-service-operator

Actual results:

level=info caller=main.go:259 ts=2021-06-12T02:01:12.403052205Z msg="starting telemeter-client" from=https://prometheus-k8s.openshift-monitoring.svc:9091 to=https://dummy.com listen=localhost:8080
level=error caller=forwarder.go:268 ts=2021-06-13T11:03:07.595840144Z component=forwarder/worker msg="unable to forward results" err="unable to authorize to server: unable to parse the authentication response: invalid character '<' looking for beginning of value"
level=error caller=forwarder.go:268 ts=2021-06-13T11:04:07.998038114Z component=forwarder/worker msg="unable to forward results" err="unable to authorize to server: unable to parse the authentication response: invalid character '<' looking for beginning of value"
level=error caller=forwarder.go:268 ts=2021-06-13T11:05:08.389192787Z component=forwarder/worker msg="unable to forward results" err="unable to authorize to server: unable to parse the authentication response: invalid character '<' looking for beginning of value"

Expected results:

level=info caller=main.go:259 ts=2021-06-13T11:33:06.758863733Z msg="starting telemeter-client" from=https://prometheus-k8s.openshift-monitoring.svc:9091 to=https://infogw.api.openshift.com/ listen=localhost:8080

Additional info:

The root cause of this issue appears to be this PR: https://github.com/openshift/assisted-service/pull/1773
The aim of the PR was to redirect the cluster metrics to the correct Telemeter server.
When the cluster is installed using the assisted-service operator, the service URL is neither "https://api.openshift.com" nor "https://api.stage.openshift.com", so the resulting Telemeter URL is "https://dummy.com".
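
The error text in the log above ("invalid character '<' looking for beginning of value") is what Go's encoding/json reports when it is handed an HTML page instead of JSON, which is consistent with the client trying to authorize against "https://dummy.com". A minimal sketch of that failure mode follows; the authResponse type and parseAuthResponse function are hypothetical stand-ins for illustration, not the actual telemeter-client code.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// authResponse is a hypothetical shape for a JSON authorization reply.
type authResponse struct {
	Token string `json:"token"`
}

// parseAuthResponse is a hypothetical helper that decodes the reply body.
// It wraps the decoder error the same way the log line in this bug reads.
func parseAuthResponse(body []byte) (authResponse, error) {
	var resp authResponse
	if err := json.Unmarshal(body, &resp); err != nil {
		return authResponse{}, fmt.Errorf("unable to parse the authentication response: %w", err)
	}
	return resp, nil
}

func main() {
	// A placeholder host is likely to answer with an HTML page, not JSON.
	_, err := parseAuthResponse([]byte("<!doctype html><html></html>"))
	fmt.Println(err)
}
```

So the parse error is a symptom of the wrong destination URL rather than a problem in the forwarder itself.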

--- Additional comment from ybettan on 20210613T15:26:49

What would be the correct URLs for the operator in the different environments?

--- Additional comment from ercohen on 20210613T16:22:34

I think the desired behavior when installing with the operator is to set the Telemeter URL to: https://infogw.api.openshift.com 
The operator/service should allow setting another URL but by default, it should get to prod.
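
The behavior proposed here (default to prod, allow an override) could be sketched roughly as below. This is only an illustration: resolveTelemeterURL and defaultTelemeterURL are hypothetical names, not the assisted-service implementation, and the requirement that an override be an absolute https URL is an added assumption.

```go
package main

import (
	"fmt"
	"net/url"
)

// defaultTelemeterURL is the prod endpoint suggested in this comment.
const defaultTelemeterURL = "https://infogw.api.openshift.com"

// resolveTelemeterURL is a hypothetical helper: it returns the override when
// one is set, and falls back to the prod endpoint otherwise.
func resolveTelemeterURL(override string) (string, error) {
	if override == "" {
		return defaultTelemeterURL, nil
	}
	u, err := url.Parse(override)
	if err != nil {
		return "", fmt.Errorf("invalid telemeter URL %q: %w", override, err)
	}
	// Assumption for this sketch: only absolute https URLs are accepted.
	if u.Scheme != "https" || u.Host == "" {
		return "", fmt.Errorf("invalid telemeter URL %q: want an absolute https URL", override)
	}
	return override, nil
}

func main() {
	got, err := resolveTelemeterURL("")
	fmt.Println(got, err)
}
```

With no override set, the operator-installed cluster would then report to prod instead of a placeholder host.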

Comment 2 Yoni Bettan 2021-06-21 13:25:34 UTC
Nir, to test this, in addition to checking the logs, please make sure that the metrics arrived at the Telemeter server and have the correct labels.

You can do it as follows:
1. Go to https://telemeter-lts-dashboards.datahub.redhat.com/d/rFFxyJ5Mz/assisted-installer?panelId=16&fullscreen&orgId=1&from=now-7d&to=now and to https://telemeter-lts-dashboards.datahub.redhat.com/d/rFFxyJ5Mz/assisted-installer?panelId=41&fullscreen&orgId=1&from=now-7d&to=now
2. Make sure both panels contain at least one cluster with the label "infrastructure-operator".

Comment 6 errata-xmlrpc 2021-07-27 23:12:53 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Moderate: OpenShift Container Platform 4.8.2 bug fix and security update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2021:2438