Bug 1734704

Summary: proxy environment, alertmanager-proxy container in alertmanager-main pods can not be started
Product: OpenShift Container Platform
Component: Monitoring
Version: 4.2.0
Target Release: 4.2.0
Status: CLOSED ERRATA
Severity: high
Priority: high
Reporter: Junqi Zhao <juzhao>
Assignee: Pawel Krupa <pkrupa>
QA Contact: Junqi Zhao <juzhao>
CC: alegrand, anpicker, dhansen, erooth, mloibl, pkrupa, surbania
Hardware: Unspecified
OS: Unspecified
Type: Bug
Last Closed: 2019-10-16 06:34:11 UTC

Comment 4 Daneyon Hansen 2019-08-18 00:35:26 UTC
Can you try using 4.2.0-0.nightly-2019-08-15-205330? This version has worked for other proxy-related bugs. I see that the container environment's NO_PROXY does not include the service CIDR 172.30.0.0/16 or the other cluster default no-proxy values. I believe this is why you see the following error:

2019/07/31 05:50:16 main.go:138: Invalid configuration:
  unable to load OpenShift configuration: unable to retrieve authentication information for tokens: Post https://172.30.0.1:443/apis/authentication.k8s.io/v1beta1/tokenreviews: Service Unavailable

Alertmanager should be populating its container proxy env vars from the status of the cluster proxy object. Please verify that the cluster proxy has a status containing the no-proxy values. For example:

$ oc get proxy/cluster -o yaml
apiVersion: config.openshift.io/v1
kind: Proxy
metadata:
  creationTimestamp: 2019-08-15T05:03:11Z
  generation: 42
  name: cluster
  resourceVersion: "331967"
  selfLink: /apis/config.openshift.io/v1/proxies/cluster
  uid: fac20627-bf19-11e9-91ed-021045d73216
spec:
  httpProxy: http://proxy-user1:JYgU8qRZV4DY4PXJbxJK@139.178.76.57:3128
  httpsProxy: http://proxy-user1:JYgU8qRZV4DY4PXJbxJK@139.178.76.57:3128
status:
  httpProxy: http://proxy-user1:JYgU8qRZV4DY4PXJbxJK@139.178.76.57:3128
  httpsProxy: http://proxy-user1:JYgU8qRZV4DY4PXJbxJK@139.178.76.57:3128
  noProxy: 10.0.0.0/16,10.128.0.0/14,127.0.0.1,169.254.169.254,172.30.0.0/16,api-int.dhansen.devcluster.openshift.com,api.dhansen.devcluster.openshift.com,etcd-0.dhansen.devcluster.openshift.com,etcd-1.dhansen.devcluster.openshift.com,etcd-2.dhansen.devcluster.openshift.com,localhost,test.no-proxy.com
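
To illustrate why a missing service CIDR matters, here is a minimal Python sketch that checks whether a host such as 172.30.0.1 would bypass the proxy for a given NO_PROXY string. The function name bypasses_proxy is hypothetical, and the matching rules are an assumption: only exact entries and CIDR ranges are handled, domain-suffix matching is omitted, and real HTTP clients differ in how they interpret NO_PROXY.

```python
import ipaddress


def bypasses_proxy(host, no_proxy):
    """Return True if host matches an entry of a comma-separated NO_PROXY string.

    Assumption: only exact matches and CIDR ranges are considered.
    """
    try:
        addr = ipaddress.ip_address(host)
    except ValueError:
        addr = None  # host is a name, not an IP literal

    for entry in (e.strip() for e in no_proxy.split(",")):
        if not entry:
            continue
        if addr is not None and "/" in entry:
            try:
                if addr in ipaddress.ip_network(entry, strict=False):
                    return True
            except ValueError:
                pass  # entry contains "/" but is not a valid CIDR
        elif entry == host:
            return True
    return False
```

With the status.noProxy value from the example above, bypasses_proxy("172.30.0.1", ...) is true because 172.30.0.0/16 is listed. With a NO_PROXY that lacks the service CIDR it is false, so the tokenreviews request would be sent to the proxy.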

Comment 5 Daneyon Hansen 2019-08-18 20:09:21 UTC
If pod alertmanager-main-0 communicates only within the cluster and/or with the hosting cloud provider's APIs, then it does not need to consume proxy env vars.

Comment 6 Pawel Krupa 2019-08-19 08:48:25 UTC
As of now, alertmanager does not consume the *_PROXY variables. This is a direct result of the PR https://github.com/openshift/cluster-monitoring-operator/pull/428, which should fix this ticket.

Comment 10 Pawel Krupa 2019-08-19 16:24:17 UTC
Using Status instead of Spec is done in https://github.com/openshift/cluster-monitoring-operator/pull/446
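
As a rough sketch of what reading from Status rather than Spec means, the Python below derives container env vars from a proxy object that has already been parsed into a dict (for example from the YAML printed by "oc get proxy/cluster -o yaml"). The function name proxy_env_from_status and the output shape are hypothetical; this is not the operator's actual code, which is written in Go.

```python
def proxy_env_from_status(proxy):
    """Build container env var entries from the proxy object's status.

    Assumption: proxy is the parsed Proxy resource as a dict. Fields that
    are missing or empty in status are skipped.
    """
    status = proxy.get("status") or {}
    mapping = {
        "HTTP_PROXY": "httpProxy",
        "HTTPS_PROXY": "httpsProxy",
        "NO_PROXY": "noProxy",
    }
    return [
        {"name": env, "value": status[field]}
        for env, field in mapping.items()
        if status.get(field)
    ]
```

Because only status is read, the resulting NO_PROXY includes the cluster default values even when spec.noProxy is unset.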

Comment 11 Daneyon Hansen 2019-08-19 16:48:59 UTC
@Junqi Zhao take a look at comment https://bugzilla.redhat.com/show_bug.cgi?id=1734704#c4. Can you please verify that your proxy object status looks similar to the example I provided? Status.noProxy should be a combination of spec.noProxy and cluster default no proxy values (i.e. 172.30.0.0/16).
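
The combination described here can be sketched as a simple set union. The function name merge_no_proxy is hypothetical, and the de-duplication and alphabetical ordering are assumptions based only on how the example value in comment 4 looks. The list of defaults is supplied by the caller, since this report does not spell out the full set.

```python
def merge_no_proxy(spec_no_proxy, defaults):
    """Combine spec.noProxy with cluster default entries.

    Assumption: the result is de-duplicated and sorted; the real
    implementation may order or filter entries differently.
    """
    entries = {e.strip() for e in spec_no_proxy.split(",") if e.strip()}
    entries.update(defaults)
    return ",".join(sorted(entries))
```

For example, merge_no_proxy("test.no-proxy.com", ["172.30.0.0/16", "localhost"]) yields a string that contains both the user-supplied host and the service CIDR.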

Comment 12 Junqi Zhao 2019-08-20 01:51:25 UTC
(In reply to Daneyon Hansen from comment #11)
> @Junqi Zhao take a look at comment
> https://bugzilla.redhat.com/show_bug.cgi?id=1734704#c4. Can you please
> verify that your proxy object status looks similar to the example I
> provided? Status.noProxy should be a combination of spec.noProxy and cluster
> default no proxy values (i.e. 172.30.0.0/16).

According to https://bugzilla.redhat.com/show_bug.cgi?id=1734704#c7, Status.noProxy is a combination of spec.noProxy and cluster default no proxy values.

The problem for the telemeter-client pod is that its NO_PROXY does not use the value of status.noProxy shown by "oc get proxy/cluster -o yaml".
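
One way to verify this kind of mismatch is to compare the container's NO_PROXY value with status.noProxy entry by entry. The sketch below assumes both values have already been collected as strings; the function name missing_no_proxy is hypothetical.

```python
def missing_no_proxy(container_no_proxy, status_no_proxy):
    """Return the status.noProxy entries absent from the container's NO_PROXY."""
    def split(value):
        return {e.strip() for e in value.split(",") if e.strip()}

    return sorted(split(status_no_proxy) - split(container_no_proxy))
```

An empty result means the container carries every entry from status.noProxy. Any entries returned are the ones the pod would still try to reach through the proxy.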

Comment 15 errata-xmlrpc 2019-10-16 06:34:11 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2019:2922