Bug 2112381

Summary: Setting a telemeter proxy in the cluster-monitoring-config config map does not work as expected
Product: OpenShift Container Platform
Reporter: nigsmith
Component: Monitoring
Assignee: Juan Rodriguez <jrodrig>
Status: CLOSED ERRATA
QA Contact: Junqi Zhao <juzhao>
Severity: low
Docs Contact:
Priority: medium
Version: 4.8
CC: anpicker
Target Milestone: ---
Target Release: 4.8.z
Hardware: Unspecified
OS: Unspecified
Whiteboard: wip
Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of:
: 2116382 (view as bug list)
Environment:
Last Closed: 2022-10-27 05:44:55 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:
Bug Depends On: 2116382
Bug Blocks:

Description nigsmith 2022-07-29 14:43:42 UTC
Description of problem:

Setting a telemeter proxy in the cluster-monitoring-config config map does not work as expected

Version-Release number of selected component (if applicable):


How reproducible:

Yes. 


Steps to Reproduce:

The following KCS article details the steps to add a proxy.
The steps have been verified on 4.7 but do not work on 4.8, 4.9 or 4.10.

https://access.redhat.com/solutions/6172402

When testing on 4.8, 4.9 and 4.10, the proxy settings were also nested under `telemeterClient`, which triggered a telemeter restart, but the proxy settings do not get set in the deployment as they do in 4.7.

Actual results:

On 4.8, 4.9 and 4.10, the configuration without the nested `telemeterClient` key
does not trigger a restart of the telemeter pod.

Expected results:

I think the proxy settings should be nested under telemeterClient,
but they should set the environment variables in the deployment.

Additional info:

Comment 1 Juan Rodriguez 2022-08-03 15:10:57 UTC
Thanks for reaching out to us with this issue. I'd need some additional information to be able to reproduce the issue. Specifically, for the reported version 4.8, I'd ask you to provide the following information about how you applied the procedure described in https://access.redhat.com/solutions/6172402:

- Output for `oc get infrastructures.config.openshift.io cluster -o jsonpath='{.status.apiServerInternalURI}'`
- Output for `oc get networks.config.openshift.io cluster -o jsonpath='{.status.serviceNetwork[*]}'`
- Output for `oc get infrastructures.config.openshift.io cluster -o jsonpath='{.status.etcdDiscoveryDomain}'`
- Output for `oc get networks.config.openshift.io cluster -o jsonpath='{.status.clusterNetwork[*].cidr}'`
- Output for `oc -n openshift-monitoring get configmap cluster-monitoring-config -o yaml` and which values you added to `data.config.yaml` as part of the procedure

I don't understand what you mean by the following statement:

> When testing on 4.8, 4.9 and 4.10, the proxy settings were also nested under `telemeterClient`,
> which triggered a telemeter restart, but the proxy settings do not get set in the deployment as they do in 4.7.

Do you mean that for versions 4.8, 4.9 and 4.10 you have to specify the fields httpProxy, httpsProxy, noProxy in `data.config.yaml.telemeterClient` instead of `data.config.yaml.http`? Does `data.config.yaml.http` as specified in https://access.redhat.com/solutions/6172402 work for version 4.7?

Also, please clarify whether or not you were able to make the telemeter proxy work with your configuration. If that is the case, then this would be a documentation problem and not a code problem.

Comment 2 nigsmith 2022-08-04 07:45:13 UTC
Hello Juan, 

> I don't understand what you mean with the following statement:

>> When testing on 4.8, 4.9 and 4.10, the proxy settings were also nested under `telemeterClient`,
>> which triggered a telemeter restart, but the proxy settings do not get set in the deployment as they do in 4.7.

This is the example config I tested on 4.8, 4.9 and 4.10, which for me triggers a pod restart.
In my understanding, the telemeterClient: key was introduced with version 4.8.
I thought maybe this is why the KCS steps no longer work, so I moved the proxy keys under telemeterClient as an additional test.

~~~

apiVersion: v1
data:
  config.yaml: |
    telemeterClient:
      http:
        httpProxy: http://<myproxyhost>:<myproxyport>/
        httpsProxy: https://<myproxyhost>:<myproxyport>
        noProxy: 127.0.0.1,localhost,.svc,.cluster.local,10.0.0.0/16,172.30.0.0/16,10.128.0.0/14,api-int.mycluster.mydomain,169.254.169.254
    prometheusK8s:
      retention: 8d
kind: ConfigMap

~~~

But the values are not set in the deployment, as verified with the following command:

~~~

oc get deploy -n openshift-monitoring telemeter-client -o json|jq -r '.spec.template.spec.containers[0].env[] | select (.name|test ("_PROXY"))' 

~~~
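
For reference, the same check can be sketched in Python. This is only an illustration: the helper name `proxy_env_vars` is hypothetical, and it assumes the JSON printed by `oc get deploy -n openshift-monitoring telemeter-client -o json` is passed in as a string. It returns an empty list when the first container defines no `env` at all.

```python
import json


def proxy_env_vars(deployment_json: str) -> list:
    """Return env entries of the first container whose name contains "_PROXY"."""
    deployment = json.loads(deployment_json)
    container = deployment["spec"]["template"]["spec"]["containers"][0]
    # "env" may be absent when no variables are set on the container.
    env = container.get("env") or []
    return [entry for entry in env if "_PROXY" in entry.get("name", "")]
```

An empty result corresponds to the situation reported here, where no proxy variables reach the telemeter-client deployment.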

Comment 3 nigsmith 2022-08-04 07:55:19 UTC
Hello Juan 

The procedure I am following is exactly as described in the KCS, as follows:

~~~

oc get infrastructures.config.openshift.io cluster -o jsonpath='{.status.apiServerInternalURI}' 
https://api-int.nigsmith001.lab.pnq2.cee.redhat.com:6443
[nigsmith@nigsmith Downloads]$ 
[nigsmith@nigsmith Downloads]$ oc get networks.config.openshift.io cluster -o jsonpath='{.status.serviceNetwork[*]}' 
172.30.0.0/16
[nigsmith@nigsmith Downloads]$ 
[nigsmith@nigsmith Downloads]$ oc get infrastructures.config.openshift.io cluster -o jsonpath='{.status.etcdDiscoveryDomain}' 
[nigsmith@nigsmith Downloads]$ oc get networks.config.openshift.io cluster -o jsonpath='{.status.clusterNetwork[*].cidr}' 
10.128.0.0/14
[nigsmith@nigsmith Downloads]$ 

$ cat <<EOF| oc apply -f -
> apiVersion: v1
> data:
>   config.yaml: |
>     http:
>       httpProxy: http://somemadeupproxy:8080/
>       httpsProxy: https://somemadeuporxy:8090/
>       noProxy: 127.0.0.1,localhost,.svc,.cluster.local,10.0.0.0/16,172.30.0.0/16,10.128.0.0/14,api-int.nigsmith001.lab.pnq2.cee.redhat.com,169.254.169.254
> kind: ConfigMap
> metadata:
>   name: cluster-monitoring-config
>   namespace: openshift-monitoring
> EOF

~~~ 
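
The `noProxy` value in the config above appears to combine the outputs of the preceding commands with a few fixed entries. A minimal Python sketch of that assembly follows. The function name `build_no_proxy`, its parameters, and the ordering of entries are assumptions made for illustration and are not taken from the KCS procedure. The `10.0.0.0/16` range and `169.254.169.254` do not come from any of the commands shown, so they have to be passed in explicitly.

```python
from urllib.parse import urlsplit

# Fixed entries copied from the example config; treat them as an assumption.
DEFAULT_ENTRIES = ["127.0.0.1", "localhost", ".svc", ".cluster.local"]


def build_no_proxy(api_internal_uri, service_networks, cluster_networks,
                   extra_networks=(), extra_hosts=()):
    """Join the pieces of a noProxy value into one comma-separated string."""
    # Only the host name of the internal API URI is needed, not scheme or port.
    api_host = urlsplit(api_internal_uri).hostname
    entries = [
        *DEFAULT_ENTRIES,
        *extra_networks,
        *service_networks,
        *cluster_networks,
        api_host,
        *extra_hosts,
    ]
    return ",".join(entries)
```

Called with the internal API URI, the service network `172.30.0.0/16`, and the cluster network `10.128.0.0/14` from the output above, plus `10.0.0.0/16` as an extra network and `169.254.169.254` as an extra host, it reproduces the `noProxy` string used in the ConfigMap.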

And to answer your last question:

> Also, please clarify whether or not you were able to make the telemeter proxy work with your configuration. If that is the case, then this would be a documentation problem and not a code problem.

I have not set the operator to unmanaged and attempted to set the value manually.

Comment 4 Juan Rodriguez 2022-08-08 12:41:44 UTC
Hi again, 

The config keys telemeterClient.http.* are ignored if they don't match the structures defined in https://github.com/openshift/cluster-monitoring-operator/blob/release-4.8/pkg/manifests/config.go#L41.
I've tried setting up the cluster-wide proxy on a 4.8 cluster as in https://docs.openshift.com/container-platform/4.10/networking/enable-cluster-wide-proxy.html, and that sets the env vars HTTP_PROXY etc., but from the case it looks like the customer wants to set up the proxy just for the telemeter client and not for the whole cluster. CMO should respect the configmap keys `http.*` and only fall back to the cluster-wide proxy config when that config is empty, but the current code https://github.com/openshift/cluster-monitoring-operator/blob/release-4.8/pkg/operator/operator.go#L435 is doing the opposite. I'll work on a code fix for that.
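
The intended precedence can be sketched as follows. This is an illustrative Python sketch, not the operator's Go code: the function name `effective_proxy`, the dictionary inputs, and the choice to fall back field by field are all assumptions. It reads only the top-level `http` key, so settings nested under `telemeterClient` have no effect, which matches the behavior described above.

```python
PROXY_FIELDS = ("httpProxy", "httpsProxy", "noProxy")


def effective_proxy(monitoring_config: dict, cluster_proxy: dict) -> dict:
    """Prefer `http.*` from the monitoring config; fall back to the cluster-wide proxy."""
    # Only the top-level "http" key is consulted; "telemeterClient.http" is ignored.
    http = monitoring_config.get("http") or {}
    return {
        # An empty or missing value in the config map falls back to the cluster value.
        field: http.get(field) or cluster_proxy.get(field, "")
        for field in PROXY_FIELDS
    }
```

With this ordering, a proxy set only in `cluster-monitoring-config` applies to the telemeter client even when no cluster-wide proxy is configured.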

Comment 7 Juan Rodriguez 2022-08-22 17:05:46 UTC
see https://issues.redhat.com/browse/OCPBUGS-407 for backport to 4.11.z

Comment 8 Juan Rodriguez 2022-09-06 09:30:43 UTC
Backport to 4.10.z https://issues.redhat.com/browse/OCPBUGS-579 verified

Comment 9 Juan Rodriguez 2022-09-06 09:42:23 UTC
Automated cherry pick for 4.9 failed in https://github.com/openshift/cluster-monitoring-operator/pull/1751
Working in https://issues.redhat.com/browse/OCPBUGS-937 on 4.9.z backport

Comment 10 Juan Rodriguez 2022-09-09 15:40:27 UTC
Automated cherry pick for 4.8 failed in https://github.com/openshift/cluster-monitoring-operator/pull/1762. 
I'll backport manually

Comment 11 Juan Rodriguez 2022-09-09 16:10:32 UTC
Fix for 4.8 in https://issues.redhat.com/browse/OCPBUGS-1098 due to bugzilla migration

Comment 12 Junqi Zhao 2022-09-14 01:47:56 UTC
change to POST status as per OCPBUGS-1098

Comment 13 Junqi Zhao 2022-10-11 01:24:40 UTC
the jira bug OCPBUGS-1098 is closed, set the bug to verified

Comment 16 errata-xmlrpc 2022-10-27 05:44:55 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (OpenShift Container Platform 4.8.52 bug fix update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2022:7034