Bug 1794885

Summary: Prometheus and Alertmanager services returning 403 errors, breaking console metrics
Product: OpenShift Container Platform
Reporter: Samuel Padgett <spadgett>
Component: apiserver-auth
Assignee: Standa Laznicka <slaznick>
Status: CLOSED ERRATA
QA Contact: Junqi Zhao <juzhao>
Severity: urgent
Docs Contact:
Priority: unspecified
Version: 4.4
CC: alegrand, anpicker, aos-bugs, ccoleman, ebondare, erooth, hasha, jhou, juzhao, kakkoyun, lcosic, mfojtik, mloibl, pkrupa, slaznick, spasquie, sttts, surbania, yapei
Target Milestone: ---   
Target Release: 4.4.0   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version:
Doc Type: No Doc Update
Doc Text:
Story Points: ---
Clone Of:
Environment:
  Version: 4.4.0-0.nightly-2020-01-24-235025
  Cluster ID: 142f338f-b24a-4f79-82d3-23acc3f2671c
  Browser: Mozilla/5.0 (Macintosh; Intel Mac OS X 10.15; rv:72.0) Gecko/20100101 Firefox/72.0
Last Closed: 2020-05-04 11:26:45 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:
Bug Depends On:
Bug Blocks: 1796538, 1796993, 1797027
Attachments:
Description                                  Flags
403 errors                                   none
Metrics and alerts fail to load in console   none
prometheus-proxy container logs              none
still 403/Forbidden error                    none

Description Samuel Padgett 2020-01-25 02:56:42 UTC
Created attachment 1655177
403 errors

The console backend proxies requests to the Prometheus and Alertmanager services for metrics. We're starting to see 403 responses (see screenshot), which breaks metrics in console. This appears to have surfaced with the changes in

https://openshift-release.svc.ci.openshift.org/releasestream/4.4.0-0.nightly/release/4.4.0-0.nightly-2020-01-24-172700

aws-console tests were passing in 4.4.0-0.nightly-2020-01-24-141203 and earlier.

Comment 1 Samuel Padgett 2020-01-25 02:57:32 UTC
Created attachment 1655178
Metrics and alerts fail to load in console

Comment 2 Samuel Padgett 2020-01-25 03:03:41 UTC
Created attachment 1655179
prometheus-proxy container logs

There are some suspicious auth errors in the prometheus-proxy container logs.

Comment 13 Samuel Padgett 2020-02-03 13:59:51 UTC
*** Bug 1796912 has been marked as a duplicate of this bug. ***

Comment 14 Samuel Padgett 2020-02-03 14:10:10 UTC
*** Bug 1797497 has been marked as a duplicate of this bug. ***

Comment 19 Junqi Zhao 2020-02-05 06:37:23 UTC
Created attachment 1657750
still 403/Forbidden error

Comment 20 Standa Laznicka 2020-02-05 10:43:31 UTC
Reproduced. Logging in to the Prometheus dashboard works fine, but in the monitoring/alerts section the console apparently communicates with it using a token, and this triggers the bug.
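
For readers who want to check the token-based path separately from the browser login, the following is a minimal sketch and not the console's actual code. It sends a bearer-token request to the standard Prometheus HTTP API endpoint /api/v1/query and reports whether the answer is a 403. The base URL, the token value, and the helper names build_query_request, describe_status, and probe are all hypothetical and chosen only for illustration.

```python
import urllib.error
import urllib.request
from urllib.parse import urlencode


def build_query_request(base_url: str, token: str, query: str) -> urllib.request.Request:
    """Build a GET request for /api/v1/query that carries a bearer token.

    base_url and token are assumptions supplied by the caller; nothing here
    is taken from the console or the proxy implementation.
    """
    url = f"{base_url.rstrip('/')}/api/v1/query?{urlencode({'query': query})}"
    return urllib.request.Request(url, headers={"Authorization": f"Bearer {token}"})


def describe_status(status: int) -> str:
    """Map an HTTP status code to a short diagnostic string."""
    if status == 200:
        return "ok"
    if status == 401:
        return "unauthenticated: token missing or not accepted"
    if status == 403:
        return "forbidden: token accepted but the request was denied"
    return f"unexpected status {status}"


def probe(base_url: str, token: str, query: str = "up", context=None) -> str:
    """Send the request and describe the outcome.

    context is an optional ssl.SSLContext for clusters with a private CA.
    """
    request = build_query_request(base_url, token, query)
    try:
        with urllib.request.urlopen(request, timeout=10, context=context) as response:
            return describe_status(response.status)
    except urllib.error.HTTPError as err:
        # urlopen raises HTTPError for 4xx/5xx responses; the code is on the exception.
        return describe_status(err.code)
```

Calling probe with the route of the Prometheus service and a token for the account in question would then distinguish a rejected token (401) from a denied request (403), which is the symptom reported here.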

Comment 22 Junqi Zhao 2020-02-07 03:12:55 UTC
Tested with 4.4.0-0.nightly-2020-02-06-170203 and checked the Prometheus API in the admin console:
"Home -> Overview"
"Workloads -> Pods"
"Monitoring -> Alerting", "Monitoring -> Metrics"
"Compute -> Nodes"

and in the developer console.

Metrics data and diagrams are displayed now.

Comment 24 errata-xmlrpc 2020-05-04 11:26:45 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2020:0581

Comment 25 Red Hat Bugzilla 2023-09-14 05:50:43 UTC
The needinfo request[s] on this closed bug have been removed as they have been unresolved for 1000 days