Bug 1766181
Summary: | Authentication "500 Internal Error" when accessing monitoring components | |||
---|---|---|---|---|
Product: | OpenShift Container Platform | Reporter: | Gabriel Virga <gfelixvirga> | |
Component: | Monitoring | Assignee: | Christian Heidenreich <cvogel> | |
Status: | CLOSED ERRATA | QA Contact: | Junqi Zhao <juzhao> | |
Severity: | high | Docs Contact: | ||
Priority: | high | |||
Version: | 4.3.0 | CC: | aabhishe, adeshpan, ajohn, alegrand, anpicker, atripath, clasohm, cvogel, dahernan, dyocum, erooth, gparente, hcisneir, jeff.li, jkaur, jnordell, kakkoyun, lcosic, lstanton, malonso, mharri, mloibl, nchavan, openshift-bugs-escalate, palonsor, pamoedom, pkrupa, rdiazgav, rhowe, rsandu, sgarciam, sreber, surbania | |
Target Milestone: | --- | |||
Target Release: | 4.3.0 | |||
Hardware: | x86_64 | |||
OS: | Linux | |||
Whiteboard: | ||||
Fixed In Version: | Doc Type: | If docs needed, set a value | ||
Doc Text: |
Previously, the monitoring components (Grafana, Alertmanager, Prometheus) were not accessible through their external routes when the user configured a custom trusted CA bundle. This is now fixed, and these components are accessible with a custom trusted CA bundle configured.
|
Story Points: | --- | |
Clone Of: | ||||
: | 1803957 1807963 | Environment: | |
Last Closed: | 2020-01-23 11:09:38 UTC | Type: | Bug | |
Regression: | --- | Mount Type: | --- | |
Documentation: | --- | CRM: | ||
Verified Versions: | Category: | --- | ||
oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | ||
Cloudforms Team: | --- | Target Upstream Version: | ||
Embargoed: | ||||
Bug Depends On: | 1776085, 1776213 | |||
Bug Blocks: | 1803957, 1807963 |
Description
Gabriel Virga
2019-10-28 14:12:51 UTC
Thanks for the bugzilla. Do you mind running `oc version`, so I know which 4.2 cluster version it was? Thank you!

oc version
Client Version: version.Info{Major:"", Minor:"", GitVersion:"v4.2.0-alpha.0-2-g8fdb79e5", GitCommit:"8fdb79e549651c0f3c91d54349715309b5d149d3", GitTreeState:"clean", BuildDate:"2019-08-07T17:48:56Z", GoVersion:"go1.12.6", Compiler:"gc", Platform:"linux/amd64"}
Server Version: version.Info{Major:"1", Minor:"14+", GitVersion:"v1.14.6+2e5ed54", GitCommit:"2e5ed54", GitTreeState:"clean", BuildDate:"2019-10-10T22:04:13Z", GoVersion:"go1.12.8", Compiler:"gc", Platform:"linux/amd64"}
OpenShift Version: 4.2.0

Let's track the oauth-proxy problems here and the alertmanager CA bundle in https://bugzilla.redhat.com/show_bug.cgi?id=1766984

*** Bug 1768977 has been marked as a duplicate of this bug. ***

Hello Team,

Per the documentation [1], the customer replaced the default ingress certificate. After the change, the customer is unable to open the GUI of Grafana, Alertmanager, Prometheus, etc.; the browser shows "500 Internal Error".

The Grafana pod logs show:

# oc logs -c grafana-proxy grafana-74bdcddbcb-wl947
[...]
[...]
2019/11/13 04:33:39 oauthproxy.go:645: error redeeming code (client:10.247.4.1:50910): Post https://oauth-openshift.apps.hashed-out.example.com/oauth/token: x509: certificate signed by unknown authority
2019/11/13 04:33:39 oauthproxy.go:438: ErrorPage 500 Internal Error Internal Error
2019/11/13 04:33:39 provider.go:373: authorizer reason:

[1] https://docs.openshift.com/container-platform/4.2/authentication/certificates/replacing-default-ingress-certificate.html

The customer is heavily affected by this issue, as it is impacting their business.

-Niket

We investigated the issue and we have a potential fix ready. However, we are blocked by an apiserver bug regarding the validation of CRDs (kubernetes/kubernetes#84880).

(In reply to Pawel Krupa from comment #7)
> We investigated the issue and we have a potential fix ready. However, we are
> blocked by apiserver bug regarding the validation of CRDs
> (kubernetes/kubernetes#84880).

Hello,

Can we have a tentative timeline for when this can be fixed? This needs to be discussed further with the customer. As mentioned in #6, the customer is heavily affected by this issue.

-Niket

Hello,

Can I please have a response and a further update on this? I need to update the customer accordingly.

-Niket

hi

(In reply to Gabriel Virga from comment #0)
> Description of problem:
> I installed the latest Openshift 4.2 version. And I used the variable
> "additionalTrustBundle:" to add our internal intermediate and root chains.
> The proxy sidecar from all metrics are not receiving the
> additionalTrustBundle
>
> How reproducible:
> Every install using additionalTrustBundle
>
> Steps to Reproduce:
> 1. Install Openshift 4.2 with additionalTrustBundle for self signed
> certificate
> 2. Try to authenticate to
> - https://grafana-openshift-monitoring.apps.osesbx.mtb.com/
> - https://console-openshift-console.apps.osesbx.mtb.com/
> - https://prometheus-k8s-openshift-monitoring.apps.osesbx.mtb.com/
> - https://alertmanager-main-openshift-monitoring.apps.osesbx.mtb.com/
>
> Actual results:
> Browser error "500 Internal Error"
>
> # Alertmanager-proxy container
> $ oc logs -c alertmanager-proxy alertmanager-main-2 | grep x509
> 2019/10/28 12:38:13 oauthproxy.go:645: error redeeming code
> (client:10.128.0.1:39918): Post
> https://oauth-openshift.apps.ose.company.com/oauth/token: x509: certificate
> signed by unknown authority
>
> $ oc logs -c prometheus-proxy prometheus-k8s-1 | grep x509
> 2019/10/28 13:51:10 oauthproxy.go:645: error redeeming code
> (client:10.128.0.1:48886): Post
> https://oauth-openshift.apps.ose.company.com/oauth/token: x509: certificate
> signed by unknown authority
>
> Expected results:
> Login
>
> Additional info:
> Conversations I started
> https://github.com/openshift/cluster-monitoring-operator/pull/448
> https://github.com/openshift/cluster-monitoring-operator/issues/526
> CASE 02497459
>
> ########
> # To fix Grafana I set the operator to Unmanaged then
> ########
> Under grafana-proxy container I added:
> - name: trusted-ca-bundle
>   readOnly: true
>   mountPath: /etc/pki/ca-trust/extracted/pem
>
> Under Volumes I added:
> - name: trusted-ca-bundle
>   configMap:
>     name: trusted-ca-bundle
>     items:
>       - key: ca-bundle.crt
>         path: tls-ca-bundle.pem
>     defaultMode: 420

According to the OCP 4.2 release notes:
https://docs.openshift.com/container-platform/4.2/release_notes/ocp-4-2-release-notes.html
and this bug:
https://bugzilla.redhat.com/show_bug.cgi?id=1719188
OCP 4.2 ignores "Unmanaged" for "managementState", which means I can't apply the workaround.

https://jira.coreos.com/browse/MON-884 is tracking all efforts regarding this issue.

@Christian please evaluate and prioritize possible backporting of this fix.

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA.

For information on the advisory, and where to find the updated files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2020:0062
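
For readers who want to see where the two fragments from the Grafana workaround in comment #0 sit inside a Deployment manifest, here is a minimal Python sketch. It only edits an in-memory dictionary; it does not talk to a cluster. The helper name `add_trusted_ca_bundle` and the stand-in manifest are hypothetical, while the volume name, ConfigMap name, key, path, mount path, and mode are taken from the workaround as quoted. As noted in the thread, the operator in 4.2 may not honor "Unmanaged", so a manual change like this may not persist on that release.

```python
import copy

# Values copied from the workaround quoted in this bug report.
CA_VOLUME_NAME = "trusted-ca-bundle"
CA_MOUNT_PATH = "/etc/pki/ca-trust/extracted/pem"


def add_trusted_ca_bundle(deployment, container_name):
    """Return a copy of a Deployment manifest with the CA bundle mounted.

    Hypothetical helper for illustration; not part of any OpenShift tooling.
    """
    patched = copy.deepcopy(deployment)
    pod_spec = patched["spec"]["template"]["spec"]

    # Volume backed by the trusted-ca-bundle ConfigMap; the ca-bundle.crt
    # key is projected as tls-ca-bundle.pem.
    volumes = pod_spec.setdefault("volumes", [])
    if not any(v.get("name") == CA_VOLUME_NAME for v in volumes):
        volumes.append({
            "name": CA_VOLUME_NAME,
            "configMap": {
                "name": "trusted-ca-bundle",
                "items": [{"key": "ca-bundle.crt", "path": "tls-ca-bundle.pem"}],
                "defaultMode": 420,
            },
        })

    # Read-only mount of that volume in the named proxy sidecar.
    for container in pod_spec["containers"]:
        if container["name"] != container_name:
            continue
        mounts = container.setdefault("volumeMounts", [])
        if not any(m.get("name") == CA_VOLUME_NAME for m in mounts):
            mounts.append({
                "name": CA_VOLUME_NAME,
                "readOnly": True,
                "mountPath": CA_MOUNT_PATH,
            })
        return patched

    raise ValueError(f"container {container_name!r} not found")


# Minimal stand-in manifest (an assumption, not the real Grafana Deployment).
grafana = {
    "spec": {"template": {"spec": {
        "containers": [{"name": "grafana"}, {"name": "grafana-proxy"}],
    }}}
}
patched = add_trusted_ca_bundle(grafana, "grafana-proxy")
```

The helper returns a modified copy and skips entries that already exist, so applying it twice yields the same manifest. The same two fragments would presumably apply to the other proxy sidecars named in the logs (`alertmanager-proxy`, `prometheus-proxy`), but the thread only confirms the change for Grafana.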