Bug 1752725
| Summary: | Log into kibana console get `504 Gateway Time-out The server didn't respond in time.` when http_proxy enabled | | |
|---|---|---|---|
| Product: | OpenShift Container Platform | Reporter: | Qiaoling Tang <qitang> |
| Component: | Logging | Assignee: | ewolinet |
| Status: | CLOSED ERRATA | QA Contact: | Anping Li <anli> |
| Severity: | high | Docs Contact: | |
| Priority: | unspecified | | |
| Version: | 4.2.0 | CC: | alegrand, anpicker, aos-bugs, dhansen, erooth, igreen, jcantril, juzhao, kakkoyun, lcosic, mfojtik, mloibl, nhosoi, pkrupa, pweil, rmeggins, slaznick, surbania |
| Target Milestone: | --- | | |
| Target Release: | 4.4.0 | | |
| Hardware: | Unspecified | | |
| OS: | Unspecified | | |
| Whiteboard: | | | |
| Fixed In Version: | | Doc Type: | If docs needed, set a value |
| Doc Text: | | Story Points: | --- |
| Clone Of: | | | |
| : | 1780989 1781521 (view as bug list) | Environment: | |
| Last Closed: | 2020-05-04 11:13:57 UTC | Type: | Bug |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | | Category: | --- |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | | | |
| Bug Depends On: | | | |
| Bug Blocks: | 1780989, 1781521 | | |
| Attachments: | | | |
The alertmanager-proxy container doesn't handle http_proxy variables, and there is no expectation that it should. Reassigning to kibana owners.

Does the oauth-proxy container in this deployment have the proxy env vars set?

```
2019/09/17 03:19:39 oauthproxy.go:645: error redeeming code (client:10.128.2.6:34506): Post https://oauth-openshift.apps.qitang.qe.devcluster.openshift.com/oauth/token: dial tcp 18.189.68.182:443: connect: connection timed out
```

The line above suggests that no proxy connection is happening.

Not yet supported: https://jira.coreos.com/browse/LOG-464.

Is this really a regression? The proxy env vars weren't set; after setting them manually, kibana works.

@jeff, this bug is about accessing kibana on a cluster where the proxy is enabled. It is different from https://jira.coreos.com/browse/LOG-464, which is about log collectors.

Per PM, this does not need to be addressed until 4.3.

If oauth is on-cluster, then the oauth-proxy can bypass the cluster proxy by adding the ingress domain to noProxy. For example:
```
$ INGRESS_DOMAIN="$(oc get ingress.config/cluster -o 'jsonpath={.spec.domain}')"
$ oc get proxy/cluster -o yaml
apiVersion: config.openshift.io/v1
kind: Proxy
<SNIP>
spec:
  httpProxy: http://$MY_HTTP_PROXY
  httpsProxy: https://$MY_HTTPS_PROXY
  noProxy: $INGRESS_DOMAIN
<SNIP>
```
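The noProxy edit above boils down to appending the ingress domain to the existing comma-separated list. A minimal sketch of that string handling, using hypothetical stand-in values (on a live cluster, the existing list would come from `oc get proxy/cluster` and the domain from `oc get ingress.config/cluster` as shown above):

```shell
# Hypothetical stand-in values; a real script would read these from the cluster.
EXISTING_NO_PROXY="10.0.0.0/16,.cluster.local"
INGRESS_DOMAIN="apps.example.openshift.com"

# Append the ingress domain unless it is already present in the list.
case ",$EXISTING_NO_PROXY," in
  *",$INGRESS_DOMAIN,"*) NEW_NO_PROXY="$EXISTING_NO_PROXY" ;;
  *)                     NEW_NO_PROXY="$EXISTING_NO_PROXY,$INGRESS_DOMAIN" ;;
esac

echo "$NEW_NO_PROXY"
```

The guard matters because `oc patch`-style updates are not idempotent on plain string fields: appending blindly on every run would duplicate the domain.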
https://github.com/openshift/cluster-network-operator/pull/296 is a PR to automatically add the ingress or cluster domain to noProxy. From the PR comments it sounds like this is not desired.

This is not a regression issue.

This bug is more or less the same as https://bugzilla.redhat.com/show_bug.cgi?id=1753091, only for a different component. You may want to check https://github.com/openshift/enhancements/pull/22 as well.

@Standa Yes, it is the same issue in a different component. If we want to fix it in the apiserver, we can clone this one. Will we fix it in 4.2?

https://github.com/openshift/cluster-network-operator/pull/296 is a PR to automatically add the ingress or cluster domain to noProxy. From the PR comments it sounds like this is not desired. I am working on getting clarification from @deads2k in the PR comments.

API calls to cluster resources (i.e. the oauth route) should not be proxied. However, routes may be hosted by ingress controllers (i.e. routers) that reside outside the cluster, so we have some grey area that needs to be addressed.

@deads2k confirmed that the cluster or ingress domain should not automatically be added to noProxy. Review the https://github.com/openshift/cluster-network-operator/pull/296 comments for details. You must manually add the route hostname to fix this issue.

I just noticed PR https://github.com/openshift/cluster-logging-operator/pull/255 to fix this issue. PTAL at https://github.com/openshift/cluster-logging-operator/pull/255#issuecomment-551190938 to follow the recommended implementation.

I didn't see the HTTP_PROXY, HTTPS_PROXY, and NO_PROXY env vars in the kibana and fluentd pods. The fluentd-trusted-ca-bundle isn't mounted into fluentd, and the kibana-trusted-ca-bundle isn't mounted into kibana. Shall I specify some variables to enable them?

Created attachment 1639494 [details]
Resources and logs
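For reference, when proxy support is wired through to a workload, it typically surfaces as container environment variables in the pod spec. A minimal sketch of what such a fragment could look like on the kibana container (all names and values here are hypothetical placeholders, not taken from the actual fix; a real deployment would source them from the cluster-wide `proxy/cluster` resource):

```
# Hypothetical container-spec fragment; values are placeholders.
env:
  - name: HTTP_PROXY
    value: http://proxy.example.com:3128
  - name: HTTPS_PROXY
    value: http://proxy.example.com:3128
  - name: NO_PROXY
    value: .cluster.local,.svc,apps.example.openshift.com
```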
@Anping regarding https://bugzilla.redhat.com/show_bug.cgi?id=1752725#c12 you can have the installer a) create only internal load-balancers, or b) have the router use an internal instead of an external load-balancer.

For a:

1. `$ openshift-install create install-config --dir <install_dir>`
2. Change `publish: External` to `publish: Internal` in `<install_dir>/install-config.yaml`
3. `$ openshift-install create cluster --dir <install_dir>`

Keep in mind that you need to be connected to the VPC's private network for the installer to complete successfully.

For b:

1. `$ openshift-install create install-config --dir <install_dir>`
2. `$ openshift-install create manifests --dir <install_dir>`
3. Create the following manifest `<install_dir>/manifests/cluster-ingress-default-ingresscontroller.yaml`:

```
apiVersion: operator.openshift.io/v1
kind: IngressController
metadata:
  name: default
  namespace: openshift-ingress-operator
spec:
  endpointPublishingStrategy:
    loadBalancer:
      scope: Internal
    type: LoadBalancerService
```

4. `$ openshift-install create cluster --dir <install_dir>`

Verified with ose-cluster-logging-operator-v4.4.0-201912080435.

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2020:0581
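Step 2 of option (a) above is a one-line edit of the generated install-config. A minimal sketch with `sed`, run against a hypothetical stand-in file (a real install-config is generated by `openshift-install create install-config` and has many more fields):

```shell
# Stand-in install-config fragment; a real one comes from openshift-install.
cat > install-config.yaml <<'EOF'
apiVersion: v1
baseDomain: example.com
publish: External
EOF

# Flip the publish strategy from External to Internal in place.
sed -i 's/^publish: External$/publish: Internal/' install-config.yaml

grep '^publish:' install-config.yaml
```

The anchored pattern (`^...$`) avoids touching any other occurrence of the word "External" elsewhere in the file.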
Created attachment 1615706 [details]

500 error in alertmanager-proxy container

Same error for the monitoring routes, e.g. there is a 500 error in the alertmanager-proxy container.