Bug 1885356 - add p&f configuration to protect openshift traffic
Summary: add p&f configuration to protect openshift traffic
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: kube-apiserver
Version: 4.5
Hardware: Unspecified
OS: Unspecified
Priority: urgent
Severity: urgent
Target Milestone: ---
Target Release: 4.6.0
Assignee: Abu Kashem
QA Contact: Xingxing Xia
URL:
Whiteboard:
Depends On: 1885358
Blocks: 1883589 1885353
Reported: 2020-10-05 17:51 UTC by Abu Kashem
Modified: 2020-10-27 16:48 UTC (History)

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of: 1885353
: 1885358 (view as bug list)
Environment:
Last Closed: 2020-10-27 16:47:41 UTC
Target Upstream Version:
Embargoed:


Links
System ID Private Priority Status Summary Last Updated
Github openshift cluster-kube-apiserver-operator pull 967 0 None closed BUG 1885356: protect openshift traffic by using dedicated flowschema 2021-02-04 13:01:22 UTC
Red Hat Product Errata RHBA-2020:4196 0 None None None 2020-10-27 16:48:11 UTC

Description Abu Kashem 2020-10-05 17:51:34 UTC
+++ This bug was initially created as a clone of Bug #1885353 +++

Add p&f (API Priority and Fairness) configuration to protect openshift traffic. Define dedicated flowschema and priority configuration that will protect openshift-specific traffic.

- subjectaccessreviews (SAR) and tokenreviews from oas or the oauth server are very important.
- openshift controller manager requests, other `oas` requests, and '/metrics' requests from openshift-monitoring are as important as kcm traffic.
- control plane operators are important (kas-o, auth operator, etcd operator)
- The default `workload-low` priority level ranks below the traffic defined above.
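
As a rough illustration of what one such rule could look like, the sketch below builds a FlowSchema object as a Python dict that routes '/metrics' requests from a monitoring service account to the `workload-high` priority level. This is not the content of the PR: the object name, the matchingPrecedence value, and the apiVersion are assumptions chosen for the example, and the service account shown is the one that appears in the verification logs later in this bug.

```python
import json

# Illustrative only. The metadata name and matchingPrecedence are hypothetical
# values, not copied from the PR. The apiVersion is assumed to be the alpha
# flowcontrol API; check what your cluster actually serves.
flow_schema = {
    "apiVersion": "flowcontrol.apiserver.k8s.io/v1alpha1",
    "kind": "FlowSchema",
    "metadata": {"name": "example-monitoring-metrics"},
    "spec": {
        # Requests matching this schema are queued at this priority level.
        "priorityLevelConfiguration": {"name": "workload-high"},
        "matchingPrecedence": 1000,
        "distinguisherMethod": {"type": "ByUser"},
        "rules": [
            {
                "subjects": [
                    {
                        "kind": "ServiceAccount",
                        "serviceAccount": {
                            "namespace": "openshift-monitoring",
                            "name": "prometheus-k8s",
                        },
                    }
                ],
                # '/metrics' is a non-resource URL, so it needs a nonResourceRules entry.
                "nonResourceRules": [
                    {"verbs": ["get"], "nonResourceURLs": ["/metrics"]}
                ],
            }
        ],
    },
}

print(json.dumps(flow_schema, indent=2))
```

The dict is printed as JSON only so it can be inspected; the real definitions live in the linked pull request.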

Comment 4 Abu Kashem 2020-10-13 13:21:44 UTC
Hi xxia,
This also relates to a customer case we are working on - https://bugzilla.redhat.com/show_bug.cgi?id=1883589.

I have outlined the steps to verify this in the 4.7 BZ https://bugzilla.redhat.com/show_bug.cgi?id=1885358. Can you please copy and paste the logs here for each flow schema and service account? We want to make sure all service accounts match the right priority level.

Comment 6 Xingxing Xia 2020-10-14 08:08:22 UTC
For your comment 4, checked in 4.6.0-rc.3 env:
$ oc patch kubeapiserver cluster --type=merge -p='
spec:
  logLevel: Trace'
Wait for kube-apiserver to roll out and return to Running, then check all kube-apiserver pods; all of them show logs like the below:
$ oc logs kube-apiserver-xxia1013-cmccz-master-0.c.openshift-qe.internal -c kube-apiserver -n openshift-kube-apiserver
The logs show:
For /metrics requests from SA prometheus-k8s, the priority level is QS(workload-high):

2020-10-14T07:44:49.535969233Z I1014 07:44:49.535818      17 queueset.go:601] QS(workload-high) at r=2020-10-14 07:44:49.535667167 v=11.162318673s: dispatching request &request.RequestInfo{IsResourceRequest:false, Path:"/metrics", Verb:"get", APIPrefix:"", APIGroup:"", APIVersion:"", Namespace:"", Resource:"", Subresource:"", Name:"", Parts:[]string(nil)} &user.DefaultInfo{Name:"system:serviceaccount:openshift-monitoring:prometheus-k8s", UID:"c618e123-5c1f-4c84-8dbe-3b2acfb60fa5", Groups:[]string{"system:serviceaccounts", "system:serviceaccounts:openshift-monitoring", "system:authenticated"}, Extra:map[string][]string(nil)} from queue 30 with virtual start time 11.162318673s, queue will have 0 waiting & 1 executing

This is as expected, matching the PR:
...
  priorityLevelConfiguration:
    name: workload-high
...

For requests from SA kube-apiserver-operator, the priority level is QS(openshift-control-plane-operators):

2020-10-14T07:46:43.162219012Z I1014 07:46:43.162133      17 queueset.go:601] QS(openshift-control-plane-operators) at r=2020-10-14 07:46:43.162103081 v=41.563321829s: dispatching request &request.RequestInfo{IsResourceRequest:true, Path:"/api/v1/namespaces/openshift-config-managed/secrets", Verb:"list", APIPrefix:"api", APIGroup:"", APIVersion:"v1", Namespace:"openshift-config-managed", Resource:"secrets", Subresource:"", Name:"", Parts:[]string{"secrets"}} &user.DefaultInfo{Name:"system:serviceaccount:openshift-kube-apiserver-operator:kube-apiserver-operator", UID:"e340db04-f8fe-4b7c-8892-c6a56ad178a7", Groups:[]string{"system:serviceaccounts", "system:serviceaccounts:openshift-kube-apiserver-operator", "system:authenticated"}, Extra:map[string][]string(nil)} from queue 58 with virtual start time 41.563321829s, queue will have 0 waiting & 1 executing

This is as expected, matching the PR:
...
  priorityLevelConfiguration:
    name: openshift-control-plane-operators
...
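
To avoid reading these long lines by eye, the fields of interest can be pulled out with a regular expression. The following is a minimal sketch, assuming the log lines keep the shape shown in this bug; the names `DISPATCH_RE` and `parse_dispatch` are made up for the example, and the sample is a prefix of the prometheus-k8s line above.

```python
import re

# Extracts the priority level, request path, verb, resource and user name from a
# queueset "dispatching request" line. "[Dd]" tolerates both capitalizations of
# "dispatching" that occur in the logs quoted in this bug.
DISPATCH_RE = re.compile(
    r'QS\((?P<level>[^)]+)\).*?[Dd]ispatching request '
    r'&request\.RequestInfo\{.*?Path:"(?P<path>[^"]*)", Verb:"(?P<verb>[^"]*)"'
    r'.*?Resource:"(?P<resource>[^"]*)"'
    r'.*?&user\.DefaultInfo\{Name:"(?P<user>[^"]*)"'
)


def parse_dispatch(line):
    """Return a dict of the captured fields, or None if the line does not match."""
    match = DISPATCH_RE.search(line)
    return match.groupdict() if match else None


# Prefix of the log line quoted above (timestamp and trailing fields dropped).
SAMPLE = (
    'queueset.go:601] QS(workload-high) at r=2020-10-14 07:44:49.535667167 '
    'v=11.162318673s: dispatching request &request.RequestInfo{'
    'IsResourceRequest:false, Path:"/metrics", Verb:"get", APIPrefix:"", '
    'APIGroup:"", APIVersion:"", Namespace:"", Resource:"", Subresource:"", '
    'Name:"", Parts:[]string(nil)} &user.DefaultInfo{'
    'Name:"system:serviceaccount:openshift-monitoring:prometheus-k8s"'
)

fields = parse_dispatch(SAMPLE)
print(fields["user"], "->", fields["level"])
```

Feeding each line of a kube-apiserver log through `parse_dispatch` and skipping the `None` results gives one (user, level) pair per dispatched request.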

Comment 7 Xingxing Xia 2020-10-14 08:35:16 UTC
Hmm, I only checked the first 2 SAs; there are other flowschemas for other SAs, which I am still checking ...

Comment 8 Xingxing Xia 2020-10-15 03:34:21 UTC
Continuing with the other flowschemas & SAs (this bug's PR adds many of them, so the verification below is a bit verbose, and some checks need manually created test data).

Get the kube-apiserver log files and store their paths in KUBE_APISERVER_LOG_FILES for searching, e.g.:
KUBE_APISERVER_LOG_FILES='logs/kube-apiserver-xxia1014-2p4jv-master-0.log logs/kube-apiserver-xxia1014-2p4jv-master-1.log logs/kube-apiserver-xxia1014-2p4jv-master-2.log'

Then check the other flowschemas & SAs one by one:
For subjectaccessreview requests related to flowschema/openshift-apiserver-sar, QS(openshift-aggregated-api-delegated-auth) is set as expected:
2020-10-15T01:24:09.850646219Z I1015 01:24:09.850477      20 queueset.go:601] QS(openshift-aggregated-api-delegated-auth) at r=2020-10-15 01:24:09.850155411 v=20.870428617s: dispatching request &request.RequestInfo{IsResourceRequest:true, Path:"/apis/authorization.k8s.io/v1/subjectaccessreviews", Verb:"create", APIPrefix:"apis", APIGroup:"authorization.k8s.io", APIVersion:"v1", Namespace:"", Resource:"subjectaccessreviews", Subresource:"", Name:"", Parts:[]string{"subjectaccessreviews"}} &user.DefaultInfo{Name:"system:serviceaccount:openshift-apiserver:openshift-apiserver-sa", UID:"9d373b03-bf0c-4e55-aac8-da284b4996b0", Groups:[]string{"system:serviceaccounts", "system:serviceaccounts:openshift-apiserver", "system:authenticated"}, Extra:map[string][]string(nil)} from queue 7 with virtual start time 20.870428617s, queue will have 0 waiting & 1 executing

For tokenreview requests related to flowschema/openshift-apiserver-sar, QS(openshift-aggregated-api-delegated-auth) is set as expected:
2020-10-15T01:24:13.078346341Z I1015 01:24:13.078247      20 queueset.go:601] QS(openshift-aggregated-api-delegated-auth) at r=2020-10-15 01:24:13.078005390 v=20.893524334s: dispatching request &request.RequestInfo{IsResourceRequest:true, Path:"/apis/authentication.k8s.io/v1/tokenreviews", Verb:"create", APIPrefix:"apis", APIGroup:"authentication.k8s.io", APIVersion:"v1", Namespace:"", Resource:"tokenreviews", Subresource:"", Name:"", Parts:[]string{"tokenreviews"}} &user.DefaultInfo{Name:"system:serviceaccount:openshift-apiserver:openshift-apiserver-sa", UID:"9d373b03-bf0c-4e55-aac8-da284b4996b0", Groups:[]string{"system:serviceaccounts", "system:serviceaccounts:openshift-apiserver", "system:authenticated"}, Extra:map[string][]string(nil)} from queue 7 with virtual start time 20.893524334s, queue will have 0 waiting & 1 executing 

For other requests related to flowschema/openshift-apiserver, the search returns nothing, apparently because no such requests have happened:
$ grep -in 'dispatching request.*Name:"system:serviceaccount:openshift-apiserver:openshift-apiserver-sa"' $KUBE_APISERVER_LOG_FILES | grep -v -E '(subjectaccessreview|tokenreview)'
Then manually create test data:
$ oc project openshift-apiserver
$ SA_TOKEN=`oc serviceaccounts get-token openshift-apiserver-sa`
$ oc login --token $SA_TOKEN
Then get the kube-apiserver logs and check again; QS(workload-high) is set as expected:
2020-10-15T03:04:48.182473425Z I1015 03:04:48.182316      18 queueset.go:601] QS(workload-high) at r=2020-10-15 03:04:48.175288062 v=32.323561850s: dispatching request &request.RequestInfo{IsResourceRequest:true, Path:"/apis/user.openshift.io/v1/users/~", Verb:"get", APIPrefix:"apis", APIGroup:"user.openshift.io", APIVersion:"v1", Namespace:"", Resource:"users", Subresource:"", Name:"~", Parts:[]string{"users", "~"}} &user.DefaultInfo{Name:"system:serviceaccount:openshift-apiserver:openshift-apiserver-sa", UID:"9d373b03-bf0c-4e55-aac8-da284b4996b0", Groups:[]string{"system:serviceaccounts", "system:serviceaccounts:openshift-apiserver", "system:authenticated"}, Extra:map[string][]string(nil)} from queue 104 with virtual start time 32.323561850s, queue will have 0 waiting & 1 executing
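
The two-stage grep used above for flowschema/openshift-apiserver (match dispatch lines for one service account, then drop subjectaccessreview and tokenreview lines) can also be written in Python, which is handy when the logs are already loaded for other checks. This is a sketch only; `other_requests` is a hypothetical helper name and the demo lines are shortened stand-ins, not real log output.

```python
import re


def other_requests(lines, user):
    """Keep dispatch lines for `user` that are neither SAR nor tokenreview requests.

    Mirrors: grep -i 'dispatching request.*Name:"<user>"' | grep -v -E '(subjectaccessreview|tokenreview)'
    """
    wanted = re.compile(
        r'dispatching request.*Name:"' + re.escape(user) + '"', re.IGNORECASE
    )
    excluded = re.compile(r'(subjectaccessreview|tokenreview)')
    return [
        line for line in lines
        if wanted.search(line) and not excluded.search(line)
    ]


SA = "system:serviceaccount:openshift-apiserver:openshift-apiserver-sa"

# Shortened stand-in lines for demonstration only.
demo_lines = [
    'QS(a) dispatching request Resource:"subjectaccessreviews" Name:"%s"' % SA,
    'QS(b) dispatching request Resource:"users" Name:"%s"' % SA,
    'QS(c) dispatching request Resource:"users" Name:"someone-else"',
]

for line in other_requests(demo_lines, SA):
    print(line)
```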

For requests related to flowschema/openshift-apiserver-operator, QS(openshift-control-plane-operators) is set as expected:
2020-10-15T01:24:09.597569308Z I1015 01:24:09.593720      20 queueset.go:601] QS(openshift-control-plane-operators) at r=2020-10-15 01:24:09.593590328 v=158.687585524s: dispatching request &request.RequestInfo{IsResourceRequest:true, Path:"/api/v1/namespaces/openshift-apiserver/services/api", Verb:"get", APIPrefix:"api", APIGroup:"", APIVersion:"v1", Namespace:"openshift-apiserver", Resource:"services", Subresource:"", Name:"api", Parts:[]string{"services", "api"}} &user.DefaultInfo{Name:"system:serviceaccount:openshift-apiserver-operator:openshift-apiserver-operator", UID:"acd6f5e4-bcbb-4312-9798-3eea7099a68c", Groups:[]string{"system:serviceaccounts", "system:serviceaccounts:openshift-apiserver-operator", "system:authenticated"}, Extra:map[string][]string(nil)} from queue 56 with virtual start time 158.687585524s, queue will have 0 waiting & 1 executing

For subjectaccessreview requests related to flowschema/openshift-oauth-apiserver-sar, QS(openshift-aggregated-api-delegated-auth) is set as expected:
2020-10-15T01:24:09.480613388Z I1015 01:24:09.426771      20 queueset.go:601] QS(openshift-aggregated-api-delegated-auth) at r=2020-10-15 01:24:09.426729818 v=20.868775444s: dispatching request &request.RequestInfo{IsResourceRequest:true, Path:"/apis/authorization.k8s.io/v1/subjectaccessreviews", Verb:"create", APIPrefix:"apis", APIGroup:"authorization.k8s.io", APIVersion:"v1", Namespace:"", Resource:"subjectaccessreviews", Subresource:"", Name:"", Parts:[]string{"subjectaccessreviews"}} &user.DefaultInfo{Name:"system:serviceaccount:openshift-oauth-apiserver:oauth-apiserver-sa", UID:"4d8eeaa2-f58c-4fab-a879-a132e6bb2b9c", Groups:[]string{"system:serviceaccounts", "system:serviceaccounts:openshift-oauth-apiserver", "system:authenticated"}, Extra:map[string][]string(nil)} from queue 12 with virtual start time 20.868775444s, queue will have 0 waiting & 1 executing

For tokenreview requests related to flowschema/openshift-oauth-apiserver-sar, the search returns nothing, apparently because no such requests have happened:
$ grep -in 'dispatching request.*tokenreview.*Name:"system:serviceaccount:openshift-oauth-apiserver:oauth-apiserver-sa"' $KUBE_APISERVER_LOG_FILES
Then manually create test data:
$ oc project openshift-oauth-apiserver
$ SA_TOKEN=`oc serviceaccounts get-token oauth-apiserver-sa`
$ oc login --token $SA_TOKEN
$ oc get tokenreview
Error from server (MethodNotAllowed): the server does not allow this method on the requested resource
Then get the kube-apiserver logs and check again; QS(openshift-aggregated-api-delegated-auth) is set as expected:
2020-10-15T03:23:29.092339825Z I1015 03:23:29.092249      20 queueset.go:601] QS(openshift-aggregated-api-delegated-auth) at r=2020-10-15 03:23:29.092115949 v=62.130627802s: dispatching request &request.RequestInfo{IsResourceRequest:true, Path:"/apis/authentication.k8s.io/v1/tokenreviews", Verb:"list", APIPrefix:"apis", APIGroup:"authentication.k8s.io", APIVersion:"v1", Namespace:"", Resource:"tokenreviews", Subresource:"", Name:"", Parts:[]string{"tokenreviews"}} &user.DefaultInfo{Name:"system:serviceaccount:openshift-oauth-apiserver:oauth-apiserver-sa", UID:"4d8eeaa2-f58c-4fab-a879-a132e6bb2b9c", Groups:[]string{"system:serviceaccounts", "system:serviceaccounts:openshift-oauth-apiserver", "system:authenticated"}, Extra:map[string][]string(nil)} from queue 12 with virtual start time 62.130627802s, queue will have 0 waiting & 1 executing

For other requests related to flowschema/openshift-oauth-apiserver, QS(workload-high) is set as expected:
2020-10-15T03:17:30.703936069Z I1015 03:17:30.703721      18 queueset.go:601] QS(workload-high) at r=2020-10-15 03:17:30.701345060 v=35.080778911s: dispatching request &request.RequestInfo{IsResourceRequest:true, Path:"/apis/user.openshift.io/v1/users/~", Verb:"get", APIPrefix:"apis", APIGroup:"user.openshift.io", APIVersion:"v1", Namespace:"", Resource:"users", Subresource:"", Name:"~", Parts:[]string{"users", "~"}} &user.DefaultInfo{Name:"system:serviceaccount:openshift-oauth-apiserver:oauth-apiserver-sa", UID:"4d8eeaa2-f58c-4fab-a879-a132e6bb2b9c", Groups:[]string{"system:serviceaccounts", "system:serviceaccounts:openshift-oauth-apiserver", "system:authenticated"}, Extra:map[string][]string(nil)} from queue 97 with virtual start time 35.080778911s, queue will have 0 waiting & 1 executing

For requests related to flowschema/openshift-authentication-operator, QS(openshift-control-plane-operators) is set as expected:
2020-10-15T01:23:59.580673680Z I1015 01:23:59.580593      18 queueset.go:601] QS(openshift-control-plane-operators) at r=2020-10-15 01:23:59.580521955 v=277.483461501s: dispatching request &request.RequestInfo{IsResourceRequest:true, Path:"/api/v1/namespaces/openshift-oauth-apiserver", Verb:"get", APIPrefix:"api", APIGroup:"", APIVersion:"v1", Namespace:"openshift-oauth-apiserver", Resource:"namespaces", Subresource:"", Name:"openshift-oauth-apiserver", Parts:[]string{"namespaces", "openshift-oauth-apiserver"}} &user.DefaultInfo{Name:"system:serviceaccount:openshift-authentication-operator:authentication-operator", UID:"f7a9dd05-380b-45c1-9067-5edf0e432c20", Groups:[]string{"system:serviceaccounts", "system:serviceaccounts:openshift-authentication-operator", "system:authenticated"}, Extra:map[string][]string(nil)} from queue 57 with virtual start time 277.483461501s, queue will have 0 waiting & 1 executing

For requests related to flowschema/openshift-controller-manager, QS(workload-high) is set as expected:
2020-10-15T01:24:52.252021480Z I1015 01:24:52.251740      18 queueset.go:354] QS(workload-high): Dispatching request &request.RequestInfo{IsResourceRequest:true, Path:"/api/v1/namespaces/openshift-controller-manager/configmaps/openshift-master-controllers", Verb:"get", APIPrefix:"api", APIGroup:"", APIVersion:"v1", Namespace:"openshift-controller-manager", Resource:"configmaps", Subresource:"", Name:"openshift-master-controllers", Parts:[]string{"configmaps", "openshift-master-controllers"}} &user.DefaultInfo{Name:"system:serviceaccount:openshift-controller-manager:openshift-controller-manager-sa", UID:"28789c9b-bcd3-4505-a187-c6b7ada3d823", Groups:[]string{"system:serviceaccounts", "system:serviceaccounts:openshift-controller-manager", "system:authenticated"}, Extra:map[string][]string(nil)} from its queue

For requests related to flowschema/openshift-etcd-operator, QS(openshift-control-plane-operators) is set as expected:
2020-10-15T01:24:49.414447447Z I1015 01:24:49.414413      18 queueset.go:354] QS(openshift-control-plane-operators): Dispatching request &request.RequestInfo{IsResourceRequest:true, Path:"/api/v1/namespaces/openshift-etcd-operator/configmaps/openshift-cluster-etcd-operator-lock", Verb:"update", APIPrefix:"api", APIGroup:"", APIVersion:"v1", Namespace:"openshift-etcd-operator", Resource:"configmaps", Subresource:"", Name:"openshift-cluster-etcd-operator-lock", Parts:[]string{"configmaps", "openshift-cluster-etcd-operator-lock"}} &user.DefaultInfo{Name:"system:serviceaccount:openshift-etcd-operator:etcd-operator", UID:"16957387-ef34-48c4-8b58-6fddca086f61", Groups:[]string{"system:serviceaccounts", "system:serviceaccounts:openshift-etcd-operator", "system:authenticated"}, Extra:map[string][]string(nil)} from its queue
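
The results in Comments 6 and 8 can be collected into a small lookup that returns the priority level observed for each service account. The sketch below only encodes what the logs in this bug show; it is not derived from the flowschema definitions in the PR, and `expected_level` is a hypothetical helper name. Cases that were not verified here return None.

```python
DELEGATED_AUTH = "openshift-aggregated-api-delegated-auth"
OPERATORS = "openshift-control-plane-operators"
HIGH = "workload-high"

SA_PREFIX = "system:serviceaccount:"

# Service accounts of the aggregated API servers: SAR and tokenreview requests
# went to the delegated-auth level, everything else to workload-high.
API_SERVER_SAS = {
    SA_PREFIX + "openshift-apiserver:openshift-apiserver-sa",
    SA_PREFIX + "openshift-oauth-apiserver:oauth-apiserver-sa",
}

# Control plane operators observed at the operators level.
OPERATOR_SAS = {
    SA_PREFIX + "openshift-kube-apiserver-operator:kube-apiserver-operator",
    SA_PREFIX + "openshift-apiserver-operator:openshift-apiserver-operator",
    SA_PREFIX + "openshift-authentication-operator:authentication-operator",
    SA_PREFIX + "openshift-etcd-operator:etcd-operator",
}

OCM_SA = SA_PREFIX + "openshift-controller-manager:openshift-controller-manager-sa"
PROMETHEUS_SA = SA_PREFIX + "openshift-monitoring:prometheus-k8s"

AUTH_RESOURCES = {"subjectaccessreviews", "tokenreviews"}


def expected_level(user, resource, path):
    """Return the priority level observed in this bug, or None if not verified."""
    if user in API_SERVER_SAS:
        return DELEGATED_AUTH if resource in AUTH_RESOURCES else HIGH
    if user in OPERATOR_SAS:
        return OPERATORS
    if user == OCM_SA:
        return HIGH
    if user == PROMETHEUS_SA and path == "/metrics":
        return HIGH
    return None


print(expected_level(PROMETHEUS_SA, "", "/metrics"))
```

Comparing this value with the `QS(...)` name on each dispatch line flags any request that landed at a different priority level than the one recorded here.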

Comment 10 errata-xmlrpc 2020-10-27 16:47:41 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (OpenShift Container Platform 4.6 GA Images), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2020:4196

