Bug 1794885 - Prometheus and Alertmanager services returning 403 errors, breaking console metrics
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: apiserver-auth
Version: 4.4
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: urgent
Target Milestone: ---
Target Release: 4.4.0
Assignee: Standa Laznicka
QA Contact: Junqi Zhao
URL:
Whiteboard:
Duplicates: 1796912 1797497
Depends On:
Blocks: 1796538 1796993 1797027
Reported: 2020-01-25 02:56 UTC by Samuel Padgett
Modified: 2023-09-14 05:50 UTC (History)
CC List: 19 users

Fixed In Version:
Doc Type: No Doc Update
Doc Text:
Clone Of:
Environment:
Version: 4.4.0-0.nightly-2020-01-24-235025 Cluster ID: 142f338f-b24a-4f79-82d3-23acc3f2671c Browser: Mozilla/5.0 (Macintosh; Intel Mac OS X 10.15; rv:72.0) Gecko/20100101 Firefox/72.0
Last Closed: 2020-05-04 11:26:45 UTC
Target Upstream Version:
Embargoed:


Attachments
403 errors (259.91 KB, image/png)
2020-01-25 02:56 UTC, Samuel Padgett
Metrics and alerts fail to load in console (1011.61 KB, image/png)
2020-01-25 02:57 UTC, Samuel Padgett
prometheus-proxy container logs (99.15 KB, application/octet-stream)
2020-01-25 03:03 UTC, Samuel Padgett
still 403/Forbidden error (271.10 KB, image/png)
2020-02-05 06:37 UTC, Junqi Zhao


Links
System ID Private Priority Status Summary Last Updated
Github openshift origin pull 24461 0 None closed UPSTREAM: <carry>: oauth-authn: add implicit audience support 2020-12-17 05:21:29 UTC
Github openshift origin pull 24503 0 None closed Bug 1794885: UPSTREAM: <carry>: bootstrap user - make tokens have implicit audiences 2020-12-17 05:21:27 UTC
Red Hat Product Errata RHBA-2020:0581 0 None None None 2020-05-04 11:27:10 UTC

Internal Links: 1798026

Description Samuel Padgett 2020-01-25 02:56:42 UTC
Created attachment 1655177 [details]
403 errors

The console backend proxies requests to the Prometheus and Alertmanager services for metrics. We're starting to see 403 responses (see screenshot), which breaks metrics in console. This appears to have surfaced with the changes in

https://openshift-release.svc.ci.openshift.org/releasestream/4.4.0-0.nightly/release/4.4.0-0.nightly-2020-01-24-172700

aws-console tests were passing in 4.4.0-0.nightly-2020-01-24-141203 and earlier.

Comment 1 Samuel Padgett 2020-01-25 02:57:32 UTC
Created attachment 1655178 [details]
Metrics and alerts fail to load in console

Comment 2 Samuel Padgett 2020-01-25 03:03:41 UTC
Created attachment 1655179 [details]
prometheus-proxy container logs

There are some suspicious auth errors in the prometheus-proxy container logs.

Comment 13 Samuel Padgett 2020-02-03 13:59:51 UTC
*** Bug 1796912 has been marked as a duplicate of this bug. ***

Comment 14 Samuel Padgett 2020-02-03 14:10:10 UTC
*** Bug 1797497 has been marked as a duplicate of this bug. ***

Comment 19 Junqi Zhao 2020-02-05 06:37:23 UTC
Created attachment 1657750 [details]
still 403/Forbidden error

Comment 20 Standa Laznicka 2020-02-05 10:43:31 UTC
Reproduced. Logging in to the Prometheus dashboard works fine, but in the Monitoring/Alerts section the console apparently communicates with Prometheus using a token, and this triggers the bug.
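
For anyone who wants to check the token path separately from the browser login, a minimal sketch follows. It assumes the Prometheus endpoint accepts a bearer token and exposes the standard Prometheus HTTP API path `/api/v1/query`. The base URL, the token value, and the helper names (`build_query_request`, `describe_status`, `probe`) are illustrative and not taken from this bug.

```python
import urllib.error
import urllib.parse
import urllib.request


def build_query_request(base_url: str, token: str, promql: str) -> urllib.request.Request:
    """Build a GET request for the Prometheus instant-query endpoint.

    base_url and token are placeholders supplied by the caller (assumptions).
    """
    query_string = urllib.parse.urlencode({"query": promql})
    url = base_url.rstrip("/") + "/api/v1/query?" + query_string
    return urllib.request.Request(url, headers={"Authorization": f"Bearer {token}"})


def describe_status(status: int) -> str:
    """Map an HTTP status code to a short diagnostic label."""
    if status == 200:
        return "ok"
    if status == 401:
        return "unauthenticated: token missing or not recognized"
    if status == 403:
        return "forbidden: token was presented but access was denied"
    return f"unexpected status {status}"


def probe(base_url: str, token: str, promql: str = "up") -> str:
    """Send the query and report how the service answered.

    Connection or TLS failures are not handled here and surface as
    urllib.error.URLError.
    """
    request = build_query_request(base_url, token, promql)
    try:
        with urllib.request.urlopen(request, timeout=10) as response:
            return describe_status(response.status)
    except urllib.error.HTTPError as err:
        # urlopen raises HTTPError for 4xx/5xx responses; the code is on the exception.
        return describe_status(err.code)
```

Calling `probe(base_url, token)` with the cluster's Prometheus URL and a user token would be expected to report the forbidden case on an affected build. That is an inference from the 403 responses described in this bug, not something verified with this script.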

Comment 22 Junqi Zhao 2020-02-07 03:12:55 UTC
Tested with 4.4.0-0.nightly-2020-02-06-170203 and checked the Prometheus API in the admin console:
"Home -> Overview"
"Workloads -> Pods"
"Monitoring -> Alerting", "Monitoring -> Metrics"
"Compute -> Nodes"

and in the developer console.

Metrics data and diagrams are displayed now.

Comment 24 errata-xmlrpc 2020-05-04 11:26:45 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2020:0581

Comment 25 Red Hat Bugzilla 2023-09-14 05:50:43 UTC
The needinfo request[s] on this closed bug have been removed as they have been unresolved for 1000 days

