Bug 2116382 - Setting a telemeter proxy in the cluster-monitoring-config config map does not work as expected
Summary: Setting a telemeter proxy in the cluster-monitoring-config config map does not work as expected
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: Monitoring
Version: 4.12
Hardware: Unspecified
OS: Unspecified
Priority: medium
Severity: low
Target Milestone: ---
Target Release: 4.12.0
Assignee: Simon Pasquier
QA Contact: Junqi Zhao
Docs Contact: Brian Burt
URL:
Whiteboard: wip
Depends On:
Blocks: 2112381
 
Reported: 2022-08-08 12:48 UTC by Juan Rodriguez
Modified: 2023-01-17 19:54 UTC
CC: 4 users

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of: 2112381
Environment:
Last Closed: 2023-01-17 19:54:29 UTC
Target Upstream Version:
Embargoed:




Links:
- GitHub openshift/cluster-monitoring-operator pull 1737 (Draft): Bug 2116382: Give precedence to CMO config map proxy config (last updated 2022-08-08 14:53:13 UTC)
- Red Hat Product Errata RHSA-2022:7399 (last updated 2023-01-17 19:54:42 UTC)

Description Juan Rodriguez 2022-08-08 12:48:03 UTC
+++ This bug was initially created as a clone of Bug #2112381 +++

Description of problem:

Setting a telemeter proxy in the cluster-monitoring-config config map does not work as expected

Version-Release number of selected component (if applicable):


How reproducible:

Yes. 


Steps to Reproduce:

The following KCS details the steps to add a proxy.
The steps have been verified on 4.7 but do not work on 4.8, 4.9, or 4.10:

https://access.redhat.com/solutions/6172402

When testing on 4.8, 4.9, and 4.10, the proxy settings were also nested under `telemeterClient`,

which triggered a telemeter restart, but the proxy settings did not get set in the deployment as they do on 4.7.

Actual results:

On 4.8, 4.9, and 4.10, applying the configuration without the nested `telemeterClient` key
does not trigger a restart of the telemeter pod.

Expected results:

I think the proxy settings should be nested under `telemeterClient`
and should set the environment variables in the deployment.

Additional info:

--- Additional comment from Juan Rodriguez on 2022-08-03 15:10:57 UTC ---

Thanks for reaching out to us with this issue. I'd need some additional information to be able to reproduce the issue. Specifically, for the reported version 4.8, I'd ask you to provide the following information about how you applied the procedure described in https://access.redhat.com/solutions/6172402:

- Output for `oc get infrastructures.config.openshift.io cluster -o jsonpath='{.status.apiServerInternalURI}'`
- Output for `oc get networks.config.openshift.io cluster -o jsonpath='{.status.serviceNetwork[*]}'`
- Output for `oc get infrastructures.config.openshift.io cluster -o jsonpath='{.status.etcdDiscoveryDomain}'`
- Output for `oc get networks.config.openshift.io cluster -o jsonpath='{.status.clusterNetwork[*].cidr}'`
- Output for `oc -n openshift-monitoring get configmap cluster-monitoring-config -o yaml` and which values you added to `data.config.yaml` as part of the procedure

I don't understand what you mean with the following statement:

> When testing on 4.8, 4.9, and 4.10, the proxy settings were also nested under `telemeterClient`,
> which triggered a telemeter restart, but the proxy settings did not get set in the deployment as they do on 4.7.

Do you mean that for versions 4.8, 4.9 and 4.10 you have to specify the fields httpProxy, httpsProxy, noProxy in `data.config.yaml.telemeterClient` instead of `data.config.yaml.http`? Does `data.config.yaml.http` as specified in https://access.redhat.com/solutions/6172402 work for version 4.7?

Also, please clarify whether or not you were able to make the telemeter proxy work with your configuration; if that is the case, then this would be a documentation problem and not a code problem.

--- Additional comment from  on 2022-08-04 07:45:13 UTC ---

Hello Juan, 

> I don't understand what you mean with the following statement:

>> When testing on 4.8, 4.9, and 4.10, the proxy settings were also nested under `telemeterClient`,
>> which triggered a telemeter restart, but the proxy settings did not get set in the deployment as they do on 4.7.

This is the example config I tested on 4.8, 4.9, and 4.10, which, for me, triggers a pod restart.
In my understanding, the `telemeterClient:` key was introduced with version 4.8.
I thought maybe this is why the KCS steps no longer work, so I moved the proxy keys under `telemeterClient` as an additional test.

~~~

apiVersion: v1
data:
  config.yaml: |
    telemeterClient:
      http:
        httpProxy: http://<myproxyhost>:<myproxyport>/
        httpsProxy: https://<myproxyhost>:<myproxyport>
        noProxy: 127.0.0.1,localhost,.svc,.cluster.local,10.0.0.0/16,172.30.0.0/16,10.128.0.0/14,api-int.mycluster.mydomain,169.254.169.254
    prometheusK8s:
      retention: 8d
kind: ConfigMap
metadata:
  name: cluster-monitoring-config
  namespace: openshift-monitoring

~~~

but the values are not set in the deployment, verified with the following command:

~~~

oc get deploy -n openshift-monitoring telemeter-client -o json | jq -r '.spec.template.spec.containers[0].env[] | select(.name | test("_PROXY"))'

~~~

--- Additional comment from  on 2022-08-04 07:55:19 UTC ---

Hello Juan 

The procedure I am following is exactly as described in the KCS, as follows:

~~~

oc get infrastructures.config.openshift.io cluster -o jsonpath='{.status.apiServerInternalURI}' 
https://api-int.nigsmith001.lab.pnq2.cee.redhat.com:6443
[nigsmith@nigsmith Downloads]$ 
[nigsmith@nigsmith Downloads]$ oc get networks.config.openshift.io cluster -o jsonpath='{.status.serviceNetwork[*]}' 
172.30.0.0/16
[nigsmith@nigsmith Downloads]$ 
[nigsmith@nigsmith Downloads]$ oc get infrastructures.config.openshift.io cluster -o jsonpath='{.status.etcdDiscoveryDomain}' 
[nigsmith@nigsmith Downloads]$ oc get networks.config.openshift.io cluster -o jsonpath='{.status.clusterNetwork[*].cidr}' 
10.128.0.0/14
[nigsmith@nigsmith Downloads]$ 

$ cat <<EOF| oc apply -f -
> apiVersion: v1
> data:
>   config.yaml: |
>     http:
>       httpProxy: http://somemadeupproxy:8080/
>       httpsProxy: https://somemadeuporxy:8090/
>       noProxy: 127.0.0.1,localhost,.svc,.cluster.local,10.0.0.0/16,172.30.0.0/16,10.128.0.0/14,api-int.nigsmith001.lab.pnq2.cee.redhat.com,169.254.169.254
> kind: ConfigMap
> metadata:
>   name: cluster-monitoring-config
>   namespace: openshift-monitoring
> EOF

~~~ 

And to answer your last question:

> Also, please clarify whether or not you were able to make the telemeter proxy work with your configuration; if that is the case, then this would be a documentation problem and not a code problem.

I have not set the operator to unmanaged and attempted to set the value manually.

--- Additional comment from Juan Rodriguez on 2022-08-08 12:41:44 UTC ---

Hi again, 

The config keys `telemeterClient.http.*` are ignored if they don't match the structures defined in https://github.com/openshift/cluster-monitoring-operator/blob/release-4.8/pkg/manifests/config.go#L41.
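
To illustrate why those keys are silently dropped rather than rejected, here is a minimal Go sketch using stand-in types (`clusterMonitoringConfiguration` and `telemeterClientConfig` below are illustrative, not the actual structs from pkg/manifests/config.go): `yaml.Unmarshal` simply discards keys that have no matching field in the target struct, so `telemeterClient.http` disappears without any error.

~~~

package main

import (
	"fmt"

	"gopkg.in/yaml.v2"
)

// Hypothetical stand-in for the telemeter client config struct; note
// that it has no "http" field, so that key is dropped during decoding.
type telemeterClientConfig struct {
	Enabled *bool `yaml:"enabled"`
}

type clusterMonitoringConfiguration struct {
	TelemeterClient *telemeterClientConfig `yaml:"telemeterClient"`
}

func main() {
	data := `
telemeterClient:
  http:
    httpProxy: http://example-proxy:8080/
`
	var cfg clusterMonitoringConfiguration
	// Unlike yaml.UnmarshalStrict, plain Unmarshal ignores unknown keys.
	if err := yaml.Unmarshal([]byte(data), &cfg); err != nil {
		panic(err)
	}
	// The proxy settings are gone: the struct has nowhere to put them.
	fmt.Printf("%+v\n", *cfg.TelemeterClient) // {Enabled:<nil>}
}

~~~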
I've tried setting up the cluster-wide proxy on a 4.8 cluster as described in https://docs.openshift.com/container-platform/4.10/networking/enable-cluster-wide-proxy.html, and that sets the env vars HTTP_PROXY etc., but from the case it looks like the customer wants to set up the proxy just for the telemeter client, not for the whole cluster. The CMO should respect the config map keys `http.*` and only fall back to the cluster-wide proxy config when that config is empty, but the current code at https://github.com/openshift/cluster-monitoring-operator/blob/release-4.8/pkg/operator/operator.go#L435 does the opposite. I'll work on a code fix for that.
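
A minimal sketch of the intended precedence, using illustrative names (`proxyConfig` and `effectiveProxy` are not the operator's actual identifiers): values from the cluster-monitoring-config config map win per field, and the cluster-wide proxy fills in only the fields left empty. The pre-fix code inverted this, so a non-empty cluster-wide proxy always overrode the config map.

~~~

package main

import "fmt"

type proxyConfig struct {
	HTTPProxy, HTTPSProxy, NoProxy string
}

// effectiveProxy prefers the config map values (cm) and falls back to the
// cluster-wide proxy (cluster) field by field.
func effectiveProxy(cm, cluster proxyConfig) proxyConfig {
	out := cm
	if out.HTTPProxy == "" {
		out.HTTPProxy = cluster.HTTPProxy
	}
	if out.HTTPSProxy == "" {
		out.HTTPSProxy = cluster.HTTPSProxy
	}
	if out.NoProxy == "" {
		out.NoProxy = cluster.NoProxy
	}
	return out
}

func main() {
	cm := proxyConfig{HTTPProxy: "http://telemeter-proxy:8080/"}
	cluster := proxyConfig{HTTPProxy: "http://cluster-proxy:3128/", NoProxy: ".cluster.local"}
	// The config map's httpProxy wins; noProxy falls back to the cluster value.
	fmt.Printf("%+v\n", effectiveProxy(cm, cluster))
}

~~~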

Comment 1 Juan Rodriguez 2022-08-08 12:53:26 UTC
The same issue as bug 2112381 is present on master, as seen in https://github.com/openshift/cluster-monitoring-operator/blob/master/pkg/operator/operator.go#L595. This bug tracks the fix for 4.12.0 and starts the backporting flow down to 4.8, the version for which bug 2112381 was reported.

Comment 11 Simon Pasquier 2022-12-14 09:13:11 UTC
The bug fix has already shipped in 4.11.z, 4.10.z, 4.9.z and 4.8.z.

Comment 13 errata-xmlrpc 2023-01-17 19:54:29 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Moderate: OpenShift Container Platform 4.12.0 bug fix and security update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2022:7399

