1734390 – oauth-proxy is rejecting a valid service account

Bug 1734390 - oauth-proxy is rejecting a valid service account

Keywords:
Status:	CLOSED NOTABUG
Alias:	None
Product:	OpenShift Container Platform
Classification:	Red Hat
Component:	apiserver-auth
Sub Component:
Version:	4.2.0
Hardware:	Unspecified
OS:	Unspecified
Priority:	medium
Severity:	medium
Target Milestone:	---
Target Release:	4.2.0
Assignee:	Matt Rogers
QA Contact:	Chuan Yu
Docs Contact:
URL:
Whiteboard:
Depends On:
Blocks:

Reported:	2019-07-30 12:28 UTC by Pawel Krupa
Modified:	2019-09-02 15:07 UTC
CC List:	12 users
Fixed In Version:
Doc Type:	If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed:	2019-09-02 15:07:22 UTC
Target Upstream Version:
Embargoed:

Links
System	ID	Private	Priority	Status	Summary	Last Updated
Github	openshift oauth-proxy pull 129	0	None	closed	Bug 1734390: don't log Authorization header when Basic Auth fails	2020-02-18 15:57:16 UTC

Description Pawel Krupa 2019-07-30 12:28:45 UTC

Description of problem:
oauth-proxy is rejecting a valid service account, probably due to problems with certificate rotation.


Version-Release number of selected component (if applicable):
Tested on a cluster at version 4.2.0-0.ci-2019-07-30-062021.
It was also noticed on previous ones.

This was noticed in the oauth-proxy deployed in the prometheus pod. Logs are available at https://storage.googleapis.com/origin-ci-test/pr-logs/pull/openshift_console/2210/pull-ci-openshift-console-master-e2e-aws/6271/artifacts/e2e-aws/pods/openshift-monitoring_prometheus-k8s-0_prometheus-proxy.log

It is happening consistently with new clusters.

Comment 1 Standa Laznicka 2019-08-01 12:13:54 UTC

Pawel, this is indeed weird behavior. Is this happening in all clusters today? Can you get me the config for that proxy? I was not able to get it by modifying the link.

Comment 2 Pawel Krupa 2019-08-07 14:30:56 UTC

This is happening on all clusters and in every e2e CI job. It can be observed, for example, in the logs gathered from the prometheus-proxy container in the prometheus-k8s pod. The configuration for that container is available at https://github.com/openshift/cluster-monitoring-operator/blob/master/assets/prometheus-k8s/prometheus.yaml#L38-L70

Comment 3 Standa Laznicka 2019-08-09 13:45:22 UTC

Debugging progress: with request logging turned on, the request that triggers the behavior turns out to be:

prometheus-k8s.openshift-monitoring.svc:9091 GET localhost:9090 '/federate?match[]={__name__="up"}&match[]={__name__="cluster_version"}&match[]={__name__="cluster_version_available_updates"}&match[]={__name__="cluster_operator_up"}&match[]={__name__="cluster_operator_conditions"}&match[]={__name__="cluster_version_payload"}&match[]={__name__="cluster_installer"}&match[]={__name__="instance:etcd_object_counts:sum"}&match[]={__name__="ALERTS",alertstate="firing"}&match[]={__name__="code:apiserver_request_count:rate:sum"}&match[]={__name__="cluster:capacity_cpu_cores:sum"}&match[]={__name__="cluster:capacity_memory_bytes:sum"}&match[]={__name__="cluster:cpu_usage_cores:sum"}&match[]={__name__="cluster:memory_usage_bytes:sum"}&match[]={__name__="openshift:cpu_usage_cores:sum"}&match[]={__name__="openshift:memory_usage_bytes:sum"}&match[]={__name__="cluster:node_instance_type_count:sum"}&match[]={__name__="cnv:vmi_status_running:count"}&match[]={__name__="subscription_sync_total"}' HTTP/1.1 "Go-http-client/1.1" 200 5278 0.009

Suspicion falls on telemeter-client.
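
To make the logged request easier to read, here is a minimal Go sketch that splits a `/federate` query string into its individual `match[]` selectors. The `selectors` helper is hypothetical and is not part of oauth-proxy or telemeter-client, and the query in `main` is a shortened copy of the one in the log line above.

```go
package main

import (
	"fmt"
	"net/url"
)

// selectors returns every value of the repeated "match[]" parameter
// in a raw query string, in the order they appear.
// Hypothetical helper for illustration only.
func selectors(rawQuery string) ([]string, error) {
	values, err := url.ParseQuery(rawQuery)
	if err != nil {
		return nil, err
	}
	return values["match[]"], nil
}

func main() {
	// Shortened copy of the query from the logged /federate request.
	raw := `match[]={__name__="up"}&match[]={__name__="cluster_version"}&match[]={__name__="ALERTS",alertstate="firing"}`

	sel, err := selectors(raw)
	if err != nil {
		fmt.Println("cannot parse query:", err)
		return
	}
	for i, s := range sel {
		fmt.Printf("%d: %s\n", i+1, s)
	}
}
```

Each selector names one series being pulled through the federation endpoint.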

Comment 4 Standa Laznicka 2019-08-09 13:55:50 UTC

I tried adding `- -skip-auth-regex=^/federate`, which seems to have fixed the problem for me.

Comment 5 Standa Laznicka 2019-08-12 10:37:35 UTC

The next endpoint was `/api`, which we don't want to reveal. It turns out that even `/federate` should not be visible, so there has to be another way around this.
