Bug 1808240 - Always return metrics value for pods under the user's namespace
Summary: Always return metrics value for pods under the user's namespace
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: Monitoring
Version: 4.4
Hardware: Unspecified
OS: Unspecified
Priority: low
Severity: low
Target Milestone: ---
Target Release: 4.10.0
Assignee: Jan Fajerski
QA Contact: Junqi Zhao
URL:
Whiteboard:
Depends On:
Blocks:
 
Reported: 2020-02-28 05:56 UTC by Junqi Zhao
Modified: 2022-03-12 04:34 UTC
CC: 6 users

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Cause: Tenancy is enforced by checking, and potentially injecting, a label matcher for the namespace label. If a user creates a query with a different value in the namespace label matcher, this value is silently replaced.
Consequence: A user will get a query result for a namespace that differs from the namespace specified in the query.
Fix: Return an error instead.
Result: The user will now be presented with an HTTP 400 error if the namespace value differs from the one enforced based on tenancy.
Clone Of:
Environment:
Last Closed: 2022-03-12 04:34:40 UTC
Target Upstream Version:
Embargoed:


Attachments
topk(25, sort_desc(sum(avg_over_time(container_memory_working_set_bytes{container="",pod!="",namespace='openshift-monitoring'}[5m])) BY (pod, namespace))) (68.44 KB, image/png)
2020-02-28 05:56 UTC, Junqi Zhao


Links
System ID Private Priority Status Summary Last Updated
Github openshift cluster-monitoring-operator pull 1400 0 None open Bug 1808240: prom-label-proxy: set --error-on-replace 2021-10-13 09:27:09 UTC
Github openshift prom-label-proxy pull 340 0 None Merged Bug 1808240: Bump to v0.4.0 2021-10-13 09:43:47 UTC
Github prometheus-community prom-label-proxy pull 67 0 None Merged return error if existing label matcher in query would change 2021-10-06 13:52:05 UTC
Red Hat Product Errata RHSA-2022:0056 0 None None None 2022-03-12 04:34:54 UTC
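
The pull requests above wire the fix through prom-label-proxy's --error-on-replace flag. As an illustrative sketch only (listen address and upstream URL are placeholders, not the actual cluster-monitoring-operator configuration), the proxy would be invoked along these lines:

# prom-label-proxy enforces the "namespace" label on every query; with
# --error-on-replace it rejects queries whose existing namespace matcher
# differs, instead of silently rewriting it.
prom-label-proxy \
  --insecure-listen-address=127.0.0.1:9092 \
  --upstream=http://127.0.0.1:9090 \
  --label=namespace \
  --error-on-replace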

Description Junqi Zhao 2020-02-28 05:56:41 UTC
Created attachment 1666316
topk(25, sort_desc(sum(avg_over_time(container_memory_working_set_bytes{container="",pod!="",namespace='openshift-monitoring'}[5m])) BY (pod, namespace)))

Description of problem:
As a common (non-admin) user, create a project and deploy pods under the namespace, for example:
# oc -n test get pod
NAME                      READY   STATUS    RESTARTS   AGE
example-75778c488-4b2x6   1/1     Running   0          13m
example-75778c488-kv492   1/1     Running   0          13m
example-75778c488-wnng5   1/1     Running   0          13m


Then log in to the developer console, click "Monitoring", and select the "Metrics" tab.
Enter a custom query in the textarea and change the namespace value to openshift-monitoring, for which the user does not have view permission, for example:
topk(25, sort_desc(sum(avg_over_time(container_memory_working_set_bytes{container="",pod!="",namespace='openshift-monitoring'}[5m])) BY (pod, namespace)))
The result looks like the following: it shows data for pods under the user's own namespace. It should not return any data.

namespace   pod                       value
test        example-75778c488-kv492   13799424
test        example-75778c488-wnng5   13316096
test        example-75778c488-4b2x6   13271040
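
For reference, a hypothetical shell equivalent of the setup above (user name, image, and replica count are illustrative, not taken from the original report):

# log in as a non-admin user and create the test workload
oc login -u testuser
oc new-project test
oc create deployment example --image=quay.io/openshift/origin-hello-openshift --replicas=3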


Version-Release number of selected component (if applicable):
4.4.0-0.nightly-2020-02-27-020932

How reproducible:
Always

Steps to Reproduce:
1. See the description

Actual results:
The query returns metrics for pods under the user's own namespace ("test"), even though the query specifies namespace='openshift-monitoring'.

Expected results:
No data should be returned, since the user has no view permission on the openshift-monitoring namespace.

Additional info:

Comment 2 Samuel Padgett 2020-02-28 17:18:56 UTC
surbania - Is this expected? The console is making a request through the prometheus-tenancy service with the namespace query parameter set. For example:

/api/v1/query?namespace=sgp&query=topk%2825%2C+sort_desc%28sum%28avg_over_time%28container_memory_working_set_bytes%7Bcontainer%3D%22%22%2Cpod%21%3D%22%22%2Cnamespace%3D%27openshift-monitoring%27%7D%5B5m%5D%29%29+BY+%28pod%2C+namespace%29%29%29
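
Decoded, the query parameter is the same expression as in the description:

/api/v1/query?namespace=sgp&query=topk(25, sort_desc(sum(avg_over_time(container_memory_working_set_bytes{container="",pod!="",namespace='openshift-monitoring'}[5m])) BY (pod, namespace)))

The same kind of request can also be issued directly; a minimal sketch, assuming in-cluster access to the thanos-querier tenancy port of the OpenShift monitoring stack (service name and port may differ on other setups):

# send the user's token; the proxy derives the enforced namespace from ?namespace=
TOKEN=$(oc whoami -t)
curl -skG -H "Authorization: Bearer $TOKEN" \
  --data-urlencode 'namespace=sgp' \
  --data-urlencode 'query=container_memory_working_set_bytes{namespace="openshift-monitoring"}' \
  https://thanos-querier.openshift-monitoring.svc:9092/api/v1/query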

Comment 17 Sergiusz Urbaniak 2020-09-11 13:16:32 UTC
Work is ongoing in upstream prom-label-proxy, hence slipping into the next release.

Comment 18 Sergiusz Urbaniak 2020-10-02 12:39:50 UTC
This is planned for one of the next sprints.

Comment 20 Sergiusz Urbaniak 2020-11-13 09:04:05 UTC
UpcomingSprint: We don't have enough capacity to tackle this one in the next sprint (193).

Comment 26 Jan Fajerski 2021-05-25 15:04:42 UTC
Waiting on upstream review.

Comment 33 Jan Fajerski 2021-09-29 14:50:45 UTC
Waiting on upstream prom-label-proxy release https://github.com/prometheus-community/prom-label-proxy/pull/88

Comment 36 Junqi Zhao 2021-10-21 03:50:33 UTC
Tested with 4.10.0-0.nightly-2021-10-21-014208, following the steps in Comment 0.
Select the "test" project and run
topk(25, sort_desc(sum(avg_over_time(container_memory_working_set_bytes{container="",pod!="",namespace='openshift-monitoring'}[5m])) BY (pod, namespace)))
which now returns a 400 Bad Request error.

Select the "test" project and run the following, which returns the correct result:
topk(25, sort_desc(sum(avg_over_time(container_memory_working_set_bytes{container="",pod!="",namespace='test'}[5m])) BY (pod, namespace)))
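
The same verification can be scripted against the API; a sketch under the same assumptions as the curl example above (thanos-querier tenancy port, illustrative namespaces):

# expect HTTP 400 when the query's namespace matcher conflicts with the
# tenancy namespace, and HTTP 200 when it matches
TOKEN=$(oc whoami -t)
for ns in openshift-monitoring test; do
  curl -skG -o /dev/null -w "namespace=$ns -> %{http_code}\n" \
    -H "Authorization: Bearer $TOKEN" \
    --data-urlencode 'namespace=test' \
    --data-urlencode "query=container_memory_working_set_bytes{namespace=\"$ns\"}" \
    https://thanos-querier.openshift-monitoring.svc:9092/api/v1/query
done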

Comment 45 errata-xmlrpc 2022-03-12 04:34:40 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Moderate: OpenShift Container Platform 4.10.3 security update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2022:0056

