1439391 – [RFE] allow dedicated admins to monitor resource usage

Bug 1439391 - [RFE] allow dedicated admins to monitor resource usage [NEEDINFO]

Keywords:
Status:	VERIFIED
Alias:	None
Product:	OpenShift Online
Classification:	Red Hat
Component:	RFE
Sub Component:
Version:	3.x
Hardware:	Unspecified
OS:	Unspecified
Priority:	medium
Severity:	high
Target Milestone:	---
Target Release:	---
Assignee:	Abhishek Gupta
QA Contact:	wangyu
Docs Contact:
URL:
Whiteboard:
Depends On:
Blocks:

Reported:	2017-04-05 21:49 UTC by Brennan Vincello
Modified:	2024-02-03 05:44 UTC
CC List:	13 users
Fixed In Version:
Doc Type:	If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed:
Target Upstream Version:
Embargoed:
Flags:	fshaikh: needinfo? (abhgupta) fshaikh: needinfo? (abhgupta)

Description Brennan Vincello 2017-04-05 21:49:36 UTC

Description of problem:

As a dedicated admin I need to be able to monitor resource usage as described here:

https://docs.openshift.com/container-platform/3.4/admin_guide/allocating_node_resources.html#system-resources-reported-by-node 

I need to see the capacity available to nodes, and the resources currently allocated per node.  

How can I get access to this information on our two OpenShift Dedicated clusters?

Can this permission be granted to the dedicated-admins group, so that my fellow teammates can access this information?

Version-Release number of selected component (if applicable): Dedicated OCP 3.4

How reproducible: Very

Steps to Reproduce:
1. Request a node endpoint resembling:
https://console.example.openshift.com/api/v1/nodes/cluster.ip-xxx-xx-xx-x.ca-central-1.compute.internal/proxy/stats/summary

Actual results:

I am denied with this message:

{
  "kind": "Status",
  "apiVersion": "v1",
  "metadata": {},
  "status": "Failure",
  "message": "User \"system:anonymous\" cannot get nodes/unsafeproxy at the cluster scope",
  "reason": "Forbidden",
  "details": {
    "name": "cluster.ip-xxx-xx-xx-x.ca-central-1.compute.internal",
    "kind": "nodes/unsafeproxy"
  },
  "code": 403
}
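A minimal sketch of how a client could tell this kind of 403 apart from a genuine permissions problem. It assumes only the `Status` fields shown in the response above; the function name `diagnose_forbidden` and its return labels are hypothetical, chosen for illustration.

```python
import json


def diagnose_forbidden(body: str) -> str:
    """Classify an API Status body (hypothetical helper, not part of any client library)."""
    status = json.loads(body)
    if status.get("kind") != "Status" or status.get("code") != 403:
        return "not-forbidden"
    # "system:anonymous" means the request carried no credentials at all.
    if 'User "system:anonymous"' in status.get("message", ""):
        return "missing-credentials"
    # Any other user name means the token was accepted but the role is too narrow.
    return "insufficient-permissions"
```

A "missing-credentials" result suggests retrying with a bearer token before asking for additional roles.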

Expected results:

Receive statistics.

Additional info: n/a

Comment 1 Matt Wringe 2017-04-05 22:45:48 UTC

You are trying to access a protected endpoint without passing any credentials to it, which is why you are getting an access denied error for "system:anonymous".

You need to try with something like this:

curl --insecure -H "Authorization: Bearer ${TOKEN}" -X GET https://${MASTER_HOST}/api/v1/nodes/${NODENAME}/proxy/stats/summary
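The same request can be sketched in Python with only the standard library. This is an illustration, not an official client: the function names `build_stats_request` and `fetch_stats` are hypothetical, and the host, node name, and token are placeholders you would supply. TLS verification stays on unless you pass your own `ssl` context.

```python
import json
import urllib.request
from urllib.parse import quote


def build_stats_request(master_host: str, node_name: str, token: str) -> urllib.request.Request:
    """Build a GET request for a node's stats summary through the API server proxy."""
    url = f"https://{master_host}/api/v1/nodes/{quote(node_name, safe='')}/proxy/stats/summary"
    # The bearer token is what keeps the request from being treated as system:anonymous.
    return urllib.request.Request(url, headers={"Authorization": f"Bearer {token}"}, method="GET")


def fetch_stats(request: urllib.request.Request, context=None) -> dict:
    """Send the request and decode the JSON body; pass an ssl.SSLContext to customise TLS."""
    with urllib.request.urlopen(request, context=context) as resp:
        return json.load(resp)
```

The token can typically be obtained with `oc whoami -t` after logging in.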

I am not sure of the exact role or permission you would need to grant your user for access to this (other than something like cluster-reader).

If you have OpenShift Metrics installed, then Hawkular will already have this information stored. However, I believe accessing it currently also requires the cluster-reader role.

You can access this information via something like:

curl -H "Authorization: Bearer ${TOKEN}" -H "Hawkular-tenant: _system" -X GET "https://hawkular-metrics.example.com/hawkular/metrics/metrics?tags=nodename:${NODENAME},type:node" | python -m json.tool
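A hedged Python sketch of the same Hawkular query, using only the standard library. The function name `build_hawkular_request` is hypothetical, and the host, node name, and token are placeholders; the `_system` tenant and the `tags` filter are taken from the curl command above.

```python
import urllib.request
from urllib.parse import urlencode


def build_hawkular_request(hawkular_host: str, node_name: str, token: str,
                           tenant: str = "_system") -> urllib.request.Request:
    """Build a GET request listing the node-level metrics stored for one node."""
    # Keep ':' and ',' literal so the query matches the tag syntax in the curl example.
    query = urlencode({"tags": f"nodename:{node_name},type:node"}, safe=":,")
    url = f"https://{hawkular_host}/hawkular/metrics/metrics?{query}"
    headers = {
        "Authorization": f"Bearer {token}",
        "Hawkular-tenant": tenant,
    }
    return urllib.request.Request(url, headers=headers, method="GET")
```

Passing the result to `urllib.request.urlopen` and decoding the body with `json.load` would stand in for the `python -m json.tool` step.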

Comment 2 Matt Wringe 2017-04-18 18:39:41 UTC

Is there anything else we can do for you here? Or was your issue resolved when you used tokens to access the restricted endpoints?

Comment 19 Steve Speicher 2017-08-15 21:41:18 UTC

It is being beta tested now. We plan to roll out the new app in the coming weeks to provide a dashboard view of cluster utilization.

Comment 25 Renato Puccini 2017-11-06 18:16:50 UTC

Team,

Was the BZ completed/resolved?

Comment 27 Steve Speicher 2017-11-16 14:36:13 UTC

We are working on 2 key initiatives around resource utilization (actual and scheduled):
1. Increasing the permissions of the dedicated-admin role (basically this RFE)
2. Rolling out dedicated.openshift.com to all customers (and updating it to include scheduler information).

For #1, this is in the top 5 RFEs that we are working with engineering to determine a delivery date on. That work is ongoing; we are trying to get it rolled out before the holiday shutdown/freeze.
We have rolled out #2 to some customers. I'm working on a plan to roll it out to all dedicated customers and on getting commitment from engineering.

Comment 31 Will Gordon 2019-03-15 16:27:58 UTC

OpenShift Dedicated customers now have access to the Grafana Dashboards in >= 3.11 clusters. Instructions for reaching the Grafana dashboard are included on each cluster dashboard in https://dedicated.openshift.com. Typically, the Grafana dashboard should be available by visiting https://admin-console.<cluster-id>.openshift.com and clicking Monitoring -> Dashboards. Only Dedicated-Admins will have access to this dashboard. Dedicated-Admins can also view the Grafana URL directly by running "oc get routes --all-namespaces | grep grafana".
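As a rough sketch, the route lookup can also be done on structured output instead of grep. It assumes `oc` is on the PATH and logged in, and that `oc get routes --all-namespaces -o json` returns a list object with `items`; the function names `find_routes` and `load_routes` are hypothetical.

```python
import json
import subprocess


def find_routes(route_list: dict, needle: str = "grafana") -> list:
    """Return (namespace, name, host) for routes whose name or host contains needle."""
    matches = []
    for item in route_list.get("items", []):
        meta = item.get("metadata", {})
        name = meta.get("name", "")
        host = item.get("spec", {}).get("host", "")
        if needle in name or needle in host:
            matches.append((meta.get("namespace", ""), name, host))
    return matches


def load_routes() -> dict:
    """Run the oc CLI and parse its JSON output (requires a logged-in session)."""
    result = subprocess.run(
        ["oc", "get", "routes", "--all-namespaces", "-o", "json"],
        check=True, capture_output=True, text=True,
    )
    return json.loads(result.stdout)
```

Calling `find_routes(load_routes())` would then list the candidate Grafana hosts.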

Comment 32 wangyu 2019-04-01 09:47:48 UTC

@wgordon Will, could you grant me the Dedicated-Admins permission so that I can verify this bug?
My account for testing is "yuwan".

Comment 33 wangyu 2019-04-02 01:58:35 UTC

@wgordon Will, the 'ded-stage-aws' environment is fine for me to verify this bug; please grant the Dedicated-Admins permission there. Thanks.
My account for testing is "yuwan".

Comment 34 Will Gordon 2019-04-04 19:57:26 UTC

@wangyu, I've provided Dedicated-Admins permissions to your account.

Comment 35 wangyu 2019-04-09 08:01:44 UTC

@Will Thanks, I can get the Grafana route now. The pod "grafana-667c9d6f6f-rc4xn" is running, but the Grafana app is still unavailable. I tested on the ded-stage-aws environment. Could you help investigate this issue?

Comment 36 Will Gordon 2019-04-10 02:35:39 UTC

SRE has addressed the issue, please try again.

Comment 37 wangyu 2019-04-10 07:00:57 UTC

I verified this bug on ded-stage-aws. We can now monitor resource usage through the Grafana Dashboards.
