Bug 1889488 - The metrics endpoint for the Scheduler is not protected by RBAC
Keywords:
Status: POST
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: kube-scheduler
Version: 4.5
Hardware: Unspecified
OS: Unspecified
Priority: high
Severity: high
Target Milestone: ---
Target Release: 4.7.0
Assignee: Jan Chaloupka
QA Contact: RamaKasturi
URL:
Whiteboard:
Depends On:
Blocks:
 
Reported: 2020-10-19 18:45 UTC by Kirsten Newcomer
Modified: 2020-11-30 16:16 UTC (History)

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed:
Target Upstream Version:


Attachments
controller-unauthorized-output (175 bytes, text/plain)
2020-10-19 19:08 UTC, Kirsten Newcomer


Links
System | ID | Priority | Status | Summary | Last Updated
Github | kubernetes kubernetes pull 96345 | None | open | refactor: disable insecure serving in kube-scheduler | 2020-11-30 16:16:45 UTC

Internal Links: 1897630

Description Kirsten Newcomer 2020-10-19 18:45:38 UTC
Description of problem:
The metrics endpoint for the Scheduler is not protected by RBAC

Version-Release number of selected component (if applicable):
OCP 4.5

How reproducible:
Consistently

Steps to Reproduce:
oc project openshift-kube-scheduler
POD=$(oc get pods -l app=openshift-kube-scheduler -o jsonpath='{.items[0].metadata.name}')
PORT=$(oc get pod $POD -o jsonpath='{.spec.containers[0].livenessProbe.httpGet.port}')
# Should return 403 Forbidden
oc rsh ${POD} curl https://localhost:${PORT}/metrics -k
 
# Create a service account to test RBAC
oc create sa permission-test-sa
 
# Should return 403 Forbidden
SA_TOKEN=$(oc sa get-token permission-test-sa)
oc rsh ${POD} curl https://localhost:${PORT}/metrics -H "Authorization: Bearer $SA_TOKEN" -k
 
# As cluster admin, should succeed
CLUSTER_ADMIN_TOKEN=$(oc whoami -t)
oc rsh ${POD} curl https://localhost:${PORT}/metrics -H "Authorization: Bearer $CLUSTER_ADMIN_TOKEN" -k
 
# Cleanup
oc delete sa permission-test-sa


Actual results:
Metrics are returned. See private attachment.


Expected results:
403 Forbidden


Additional info:
This test maps to CIS Kubernetes Benchmark item 1.4.1. OCP 4.5 fails this test.
Credit to Khaled Janania for finding this.
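
For anyone rerunning the steps above, here is a minimal sketch of a helper that records only the HTTP status code of each request and compares it with the expected one, so the three cases (anonymous, service account, cluster admin) can be read as pass/fail lines. The function names `verdict` and `metrics_status` are made up for illustration and are not part of `oc`. `metrics_status` assumes a logged-in `oc` session and that `curl` is available inside the pod, as in the original steps.

```shell
# Compare an observed HTTP status code with the expected one.
# Usage: verdict <case-name> <expected-code> <actual-code>
verdict() {
  if [ "$2" = "$3" ]; then
    printf 'PASS %s (got %s)\n' "$1" "$3"
  else
    printf 'FAIL %s (expected %s, got %s)\n' "$1" "$2" "$3"
    return 1
  fi
}

# Print only the status code returned by /metrics, queried from inside the pod.
# Usage: metrics_status <pod> <port> [bearer-token]
# Hypothetical helper: needs a logged-in `oc` session and curl in the pod image.
metrics_status() {
  if [ -n "${3:-}" ]; then
    oc rsh "$1" curl -sk -o /dev/null -w '%{http_code}' \
      -H "Authorization: Bearer $3" "https://localhost:$2/metrics"
  else
    oc rsh "$1" curl -sk -o /dev/null -w '%{http_code}' \
      "https://localhost:$2/metrics"
  fi
}
```

With `POD`, `PORT`, and `SA_TOKEN` set as in the steps above, the first two cases would be checked with `verdict anonymous 403 "$(metrics_status "$POD" "$PORT")"` and `verdict service-account 403 "$(metrics_status "$POD" "$PORT" "$SA_TOKEN")"`.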

Comment 2 Kirsten Newcomer 2020-10-19 18:55:21 UTC
By contrast, attempting the same thing for the Controller Manager returns 403 Forbidden.

oc project openshift-kube-controller-manager
POD=$(oc get pods -n openshift-kube-controller-manager -l app=kube-controller-manager -o jsonpath='{.items[0].metadata.name}')
PORT=$(oc get pods -n openshift-kube-controller-manager -l app=kube-controller-manager -o jsonpath='{.items[0].spec.containers[0].ports[0].hostPort}')

# Should return 403 Forbidden
oc rsh -n openshift-kube-controller-manager ${POD} curl https://localhost:${PORT}/metrics -k
 
# Create a service account to test RBAC
oc create -n openshift-kube-controller-manager sa permission-test-sa
 
# Should return 403 Forbidden
SA_TOKEN=$(oc sa -n openshift-kube-controller-manager get-token permission-test-sa)
oc rsh -n openshift-kube-controller-manager ${POD} curl https://localhost:${PORT}/metrics -H "Authorization: Bearer $SA_TOKEN" -k
 
# As cluster admin, should succeed
CLUSTER_ADMIN_TOKEN=$(oc whoami -t)
oc rsh -n openshift-kube-controller-manager ${POD} curl https://localhost:${PORT}/metrics -H "Authorization: Bearer $CLUSTER_ADMIN_TOKEN" -k
 
# Cleanup
oc delete -n openshift-kube-controller-manager sa permission-test-sa

Comment 3 Kirsten Newcomer 2020-10-19 19:05:08 UTC
Created attachment 1722676
controller-sucess-output

Comment 4 Kirsten Newcomer 2020-10-19 19:08:04 UTC
Created attachment 1722678
controller-unauthorized-output

Comment 10 Jan Chaloupka 2020-10-23 11:07:26 UTC
More time is needed to properly evaluate a solution for this issue.

Comment 12 Kirsten Newcomer 2020-10-23 20:44:41 UTC
I see from the following that the scheme is set to HTTP for livenessProbe and readinessProbe for the scheduler. 

oc -n openshift-kube-scheduler get cm kube-scheduler-pod -o json | jq -r '.data."pod.yaml"' | jq '.spec.containers'


    "livenessProbe": {
      "httpGet": {
        "path": "healthz",
        "port": 10251,
        "scheme": "HTTP"
      },
      "initialDelaySeconds": 45
    },
    "readinessProbe": {
      "httpGet": {
        "path": "healthz",
        "port": 10251,
        "scheme": "HTTP"

For the Controller Manager, the scheme for both probes is set to HTTPS.

oc -n openshift-kube-controller-manager get cm kube-controller-manager-pod -o json | jq -r '.data."pod.yaml"' | jq '.spec.containers'

    "livenessProbe": {
      "httpGet": {
        "path": "healthz",
        "port": 10357,
        "scheme": "HTTPS"
      },
      "initialDelaySeconds": 45,
      "timeoutSeconds": 10
    },
    "readinessProbe": {
      "httpGet": {
        "path": "healthz",
        "port": 10357,
        "scheme": "HTTPS"
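
To make the comparison above quicker to repeat, a rough sketch that counts how many `"scheme": "HTTP"` entries appear in a pod spec read from stdin. The function name `count_plain_http_schemes` is hypothetical. It is a plain text match rather than a JSON parse, so it counts every `httpGet` block that uses plain HTTP, not only the liveness and readiness probes.

```shell
# Count occurrences of "scheme": "HTTP" in JSON read from stdin.
# The closing quote in the pattern keeps "HTTPS" from matching.
# tr strips the padding some wc implementations add around the number.
count_plain_http_schemes() {
  grep -o '"scheme": *"HTTP"' | wc -l | tr -d ' '
}
```

It can be fed from the same command used above: `oc -n openshift-kube-scheduler get cm kube-scheduler-pod -o json | jq -r '.data."pod.yaml"' | count_plain_http_schemes`. A non-zero count for the scheduler and zero for the Controller Manager would match what is quoted in this comment.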

Comment 16 Jan Chaloupka 2020-11-05 13:31:43 UTC
The /metrics endpoint is registered for both http and https.

Checking https endpoint:

```
POD=$(oc get pods -n openshift-kube-scheduler -l app=openshift-kube-scheduler -o jsonpath='{.items[0].metadata.name}')
PORT=$(oc get pods -n openshift-kube-scheduler -l app=openshift-kube-scheduler -o jsonpath='{.items[0].spec.containers[0].ports[0].hostPort}')
CLUSTER_ADMIN_TOKEN=$(oc whoami -t)
oc rsh -n openshift-kube-scheduler ${POD} curl https://localhost:${PORT}/metrics -H "Authorization: Bearer $CLUSTER_ADMIN_TOKEN" -k
```

I am able to get the metrics.

Prometheus collects the metrics over HTTPS, so it is safe to disable HTTP unless Prometheus needs anonymous access. By contrast, kube-controller-manager does not serve metrics over HTTP.

So the insecure bits need to be disabled. However, their initialization is hardcoded: https://github.com/openshift/kubernetes/blob/36083e429212a2e46c7243942748a258eb714b61/cmd/kube-scheduler/app/options/options.go#L92-L101. I did not find a way to disable registering the insecure bits through options or component config. Code changes are required.
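
A small sketch for checking whether the plain HTTP endpoint still answers, which may help when verifying a fix. The function names are hypothetical, and the insecure port is assumed to be the one used by the probes in the pod spec quoted earlier (10251). curl reports `000` for `%{http_code}` when it received no HTTP response at all, for example when the connection is refused.

```shell
# Classify the plain HTTP port from the status code curl reported.
insecure_port_state() {
  case "$1" in
    000) echo "no-response" ;;          # nothing answered over plain HTTP
    200) echo "serving-anonymously" ;;  # metrics returned without a token
    *)   echo "responding-$1" ;;        # some other HTTP answer
  esac
}

# Print the status code of an unauthenticated plain HTTP request to /metrics.
# Usage: probe_insecure_port <pod> <port>
# Hypothetical helper: needs a logged-in `oc` session and curl in the pod image.
probe_insecure_port() {
  oc rsh -n openshift-kube-scheduler "$1" \
    curl -s -o /dev/null -w '%{http_code}' "http://localhost:$2/metrics" || true
}
```

Usage would look like `insecure_port_state "$(probe_insecure_port "$POD" 10251)"`. Once insecure serving is disabled, the expected result is `no-response`.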

Comment 17 Jan Chaloupka 2020-11-13 12:07:22 UTC
More time is needed to discuss the right way to fix this upstream first.

Comment 18 Jan Chaloupka 2020-11-18 14:19:59 UTC
https://github.com/kubernetes/kubernetes/pull/96345 will not make it into 1.20. However, we can still pick it as a partial upstream PR and make sure the changes are properly tested in our environment. The PR passes all upstream tests except the clusterload2 test, which still collects the metrics through the insecure port.

