Note: This bug is displayed in read-only format because the product is no longer active in Red Hat Bugzilla.

Bug 1752725

Summary: Log into kibana console get `504 Gateway Time-out The server didn't respond in time. ` when http_proxy enabled
Product: OpenShift Container Platform
Component: Logging
Version: 4.2.0
Reporter: Qiaoling Tang <qitang>
Assignee: ewolinet
QA Contact: Anping Li <anli>
Status: CLOSED ERRATA
Severity: high
Priority: unspecified
Target Milestone: ---
Target Release: 4.4.0
Hardware: Unspecified
OS: Unspecified
Doc Type: If docs needed, set a value
Last Closed: 2020-05-04 11:13:57 UTC
Type: Bug
Cloned by: 1780989, 1781521
Bug Blocks: 1780989, 1781521
CC: alegrand, anpicker, aos-bugs, dhansen, erooth, igreen, jcantril, juzhao, kakkoyun, lcosic, mfojtik, mloibl, nhosoi, pkrupa, pweil, rmeggins, slaznick, surbania
Attachments: Resources and logs

Comment 1 Junqi Zhao 2019-09-17 06:05:02 UTC
Created attachment 1615706 [details]
500 error in alertmanager-proxy container

The same error occurs for the monitoring routes, e.g. there is a 500 error in the alertmanager-proxy container.

Comment 3 Pawel Krupa 2019-09-17 07:56:25 UTC
The alertmanager-proxy container doesn't handle the http_proxy variables, and it is not expected to handle them.

Reassigning to kibana owners.

Comment 4 Standa Laznicka 2019-09-17 10:47:29 UTC
Does the oauth-proxy container in this deployment have the proxy env vars set?

```
2019/09/17 03:19:39 oauthproxy.go:645: error redeeming code (client:10.128.2.6:34506): Post https://oauth-openshift.apps.qitang.qe.devcluster.openshift.com/oauth/token: dial tcp 18.189.68.182:443: connect: connection timed out
```
The line above suggests that no proxied connection is happening: the dial goes straight to the route's public IP and times out.

Comment 5 Jeff Cantrill 2019-09-17 12:48:09 UTC
Not yet supported:  https://jira.coreos.com/browse/LOG-464.  Is this really a regression?

Comment 6 Anping Li 2019-09-18 03:17:34 UTC
The proxy envs weren't set. After setting them manually, Kibana works.
@jeff, this bug is about accessing Kibana on a cluster where the proxy is enabled. It is different from https://jira.coreos.com/browse/LOG-464, which is about the log collectors.
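For reference, the manual workaround amounts to adding env vars like these to the Kibana oauth-proxy container spec (the proxy URL, container name, and NO_PROXY list below are assumptions, not values from this cluster):

```yaml
# Sketch of the env vars set manually on the kibana proxy container.
# All values are hypothetical.
containers:
- name: kibana-proxy
  env:
  - name: HTTP_PROXY
    value: http://proxy.example.com:3128
  - name: HTTPS_PROXY
    value: http://proxy.example.com:3128
  - name: NO_PROXY
    value: .cluster.local,.svc,localhost,127.0.0.1
```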

Comment 8 Jeff Cantrill 2019-09-18 15:31:14 UTC
Per PM, this does not need to be addressed until 4.3

Comment 9 Daneyon Hansen 2019-09-18 18:41:16 UTC
If oauth is on-cluster, then the oauth-proxy can bypass the cluster proxy by adding the ingress domain to noProxy. For example:


$ INGRESS_DOMAIN="$(oc get ingress.config/cluster -o 'jsonpath={.spec.domain}')"
$ oc get proxy/cluster -o yaml
apiVersion: config.openshift.io/v1
kind: Proxy
<SNIP>
spec:
  httpProxy: http://$MY_HTTP_PROXY
  httpsProxy: https://$MY_HTTPS_PROXY
  noProxy: $INGRESS_DOMAIN
<SNIP>

Comment 10 Daneyon Hansen 2019-09-18 18:50:03 UTC
https://github.com/openshift/cluster-network-operator/pull/296 is a PR to automatically add the ingress or cluster domain to noProxy. From the PR comments it sounds like this is not desired.

Comment 11 Qiaoling Tang 2019-09-19 06:21:18 UTC
This is not a regression issue.

Comment 13 Standa Laznicka 2019-09-19 08:10:48 UTC
This bug is more or less the same as https://bugzilla.redhat.com/show_bug.cgi?id=1753091, only for a different component. You may want to check https://github.com/openshift/enhancements/pull/22 as well.

Comment 15 Anping Li 2019-09-19 15:27:31 UTC
@Standa Yes, it is the same issue in a different component. If it will be fixed in the apiserver, we can mark this one as a duplicate. Will we fix it in 4.2?

Comment 17 Daneyon Hansen 2019-09-26 17:52:08 UTC
https://github.com/openshift/cluster-network-operator/pull/296 is a PR that automatically adds the ingress or cluster domain to noProxy. From the PR comments it sounds like this is not desired. I am working on getting clarification from @deads2k's PR comments. API calls to cluster resources (i.e. the oauth route) should not be proxied. However, routes may be hosted by ingress controllers (i.e. routers) that reside outside the cluster, so there is some grey area that needs to be addressed.

Comment 18 Daneyon Hansen 2019-11-07 17:38:34 UTC
@deads2k confirmed that the cluster or ingress domain should not automatically be added to noProxy. Review the https://github.com/openshift/cluster-network-operator/pull/296 comments for details. You must manually add the route hostname to noProxy to fix this issue.
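As a sketch of that manual fix (the proxy URL and domain below are hypothetical), the route hostname, or a wildcard domain covering it, goes into the user-facing noProxy of the cluster Proxy config:

```yaml
# Hypothetical values; add the route hostname (or the ingress wildcard
# domain that covers it) to spec.noProxy of the cluster Proxy object.
apiVersion: config.openshift.io/v1
kind: Proxy
metadata:
  name: cluster
spec:
  httpProxy: http://proxy.example.com:3128
  httpsProxy: http://proxy.example.com:3128
  noProxy: .apps.example.com
```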

Comment 19 Daneyon Hansen 2019-11-07 17:54:09 UTC
I just noticed PR https://github.com/openshift/cluster-logging-operator/pull/255, which fixes this issue. Please take a look at https://github.com/openshift/cluster-logging-operator/pull/255#issuecomment-551190938 for the recommended implementation.

Comment 21 Anping Li 2019-11-24 12:19:51 UTC
I didn't see the HTTP_PROXY, HTTPS_PROXY, and NO_PROXY env vars in the kibana and fluentd pods. The fluentd-trusted-ca-bundle isn't mounted into fluentd, and the kibana-trusted-ca-bundle isn't mounted into kibana. Shall I set some variables to enable them?
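For reference, the *-trusted-ca-bundle ConfigMaps mentioned above are populated by label-based injection: the network operator copies the cluster's trusted CA bundle into any ConfigMap carrying the inject label. A minimal sketch, assuming the openshift-logging namespace:

```yaml
# Sketch of the ConfigMap the operator is expected to create; the trusted
# CA bundle is injected into it because of the label below. Namespace is
# an assumption based on where the logging stack runs.
apiVersion: v1
kind: ConfigMap
metadata:
  name: kibana-trusted-ca-bundle
  namespace: openshift-logging
  labels:
    config.openshift.io/inject-trusted-cabundle: "true"
```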

Comment 22 Anping Li 2019-11-25 13:44:41 UTC
Created attachment 1639494 [details]
Resources and logs

Comment 23 Daneyon Hansen 2019-12-03 17:25:52 UTC
@Anping regarding https://bugzilla.redhat.com/show_bug.cgi?id=1752725#c12, you can either a) have the installer create only internal load-balancers, or b) have the router use an internal instead of an external load-balancer.

For a:

1. $ openshift-install create install-config --dir <install_dir>
2. Change publish: External to publish: Internal in <install_dir>/install-config.yaml
3. $ openshift-install create cluster --dir <install_dir>

Keep in mind that you need to be connected to the VPC's private network for the installer to complete successfully.
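For option (a), step 2 amounts to flipping one field in the generated install-config; an abridged sketch (the domain and cluster name below are hypothetical):

```yaml
# <install_dir>/install-config.yaml (abridged; values are hypothetical)
apiVersion: v1
baseDomain: example.com
metadata:
  name: mycluster
publish: Internal   # was: External
```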

For b:

1. $ openshift-install create install-config --dir <install_dir>
2. $ openshift-install create manifests --dir <install_dir>
3. Create the following manifest <install_dir>/manifests/cluster-ingress-default-ingresscontroller.yaml:

apiVersion: operator.openshift.io/v1
kind: IngressController
metadata:
  name: default
  namespace: openshift-ingress-operator
spec:
  endpointPublishingStrategy:
    loadBalancer:
      scope: Internal
    type: LoadBalancerService

4. $ openshift-install create cluster --dir <install_dir>

Comment 26 Qiaoling Tang 2019-12-09 03:02:49 UTC
Verified with ose-cluster-logging-operator-v4.4.0-201912080435

Comment 29 errata-xmlrpc 2020-05-04 11:13:57 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2020:0581