Bug 1885356

Summary: add p&f configuration to protect openshift traffic
Product: OpenShift Container Platform
Component: kube-apiserver
Version: 4.5
Target Release: 4.6.0
Status: CLOSED ERRATA
Severity: urgent
Priority: urgent
Reporter: Abu Kashem <akashem>
Assignee: Abu Kashem <akashem>
QA Contact: Xingxing Xia <xxia>
CC: aos-bugs, kewang, mfojtik, sreber, wking, xxia
Hardware: Unspecified
OS: Unspecified
Doc Type: If docs needed, set a value
Clone Of: 1885353
: 1885358
Last Closed: 2020-10-27 16:47:41 UTC
Bug Depends On: 1885358
Bug Blocks: 1883589, 1885353

Description Abu Kashem 2020-10-05 17:51:34 UTC
+++ This bug was initially created as a clone of Bug #1885353 +++

Add p&f (priority and fairness) configuration to protect OpenShift traffic. Define dedicated flowschema and priority level configurations that protect OpenShift-specific traffic.

- subjectaccessreviews (SAR) and tokenreviews from oas or the oauth server are very important.
- openshift controller manager, other `oas` requests, and '/metrics' requests from openshift-monitoring are as important as kcm traffic.
- control plane operators are important (kas-o, auth operator, etcd operator).
- The default `workload-low` ranks below the traffic defined above.

Comment 4 Abu Kashem 2020-10-13 13:21:44 UTC
Hi xxia,
This also relates to a customer case we are working on: https://bugzilla.redhat.com/show_bug.cgi?id=1883589

I have outlined the steps to verify this in the 4.7 BZ https://bugzilla.redhat.com/show_bug.cgi?id=1885358. Can you please copy and paste the logs here for each flow schema and service account? We want to make sure every service account matches the right priority level.

Comment 6 Xingxing Xia 2020-10-14 08:08:22 UTC
For comment 4, I checked in a 4.6.0-rc.3 environment:
$ oc patch kubeapiserver cluster --type=merge -p='
spec:
  logLevel: Trace'
Wait for kube-apiserver to roll out and return to Running, then check all kube-apiserver pods; they all show output like the following:
$ oc logs kube-apiserver-xxia1013-cmccz-master-0.c.openshift-qe.internal -c kube-apiserver -n openshift-kube-apiserver
The logs show:
For /metrics requests from SA prometheus-k8s, the priority level is QS(workload-high):

2020-10-14T07:44:49.535969233Z I1014 07:44:49.535818      17 queueset.go:601] QS(workload-high) at r=2020-10-14 07:44:49.535667167 v=11.162318673s: dispatching request &request.RequestInfo{IsResourceRequest:false, Path:"/metrics", Verb:"get", APIPrefix:"", APIGroup:"", APIVersion:"", Namespace:"", Resource:"", Subresource:"", Name:"", Parts:[]string(nil)} &user.DefaultInfo{Name:"system:serviceaccount:openshift-monitoring:prometheus-k8s", UID:"c618e123-5c1f-4c84-8dbe-3b2acfb60fa5", Groups:[]string{"system:serviceaccounts", "system:serviceaccounts:openshift-monitoring", "system:authenticated"}, Extra:map[string][]string(nil)} from queue 30 with virtual start time 11.162318673s, queue will have 0 waiting & 1 executing

This is as expected per the PR:
...
  priorityLevelConfiguration:
    name: workload-high
...

For requests from SA kube-apiserver-operator, the priority level is QS(openshift-control-plane-operators):

2020-10-14T07:46:43.162219012Z I1014 07:46:43.162133      17 queueset.go:601] QS(openshift-control-plane-operators) at r=2020-10-14 07:46:43.162103081 v=41.563321829s: dispatching request &request.RequestInfo{IsResourceRequest:true, Path:"/api/v1/namespaces/openshift-config-managed/secrets", Verb:"list", APIPrefix:"api", APIGroup:"", APIVersion:"v1", Namespace:"openshift-config-managed", Resource:"secrets", Subresource:"", Name:"", Parts:[]string{"secrets"}} &user.DefaultInfo{Name:"system:serviceaccount:openshift-kube-apiserver-operator:kube-apiserver-operator", UID:"e340db04-f8fe-4b7c-8892-c6a56ad178a7", Groups:[]string{"system:serviceaccounts", "system:serviceaccounts:openshift-kube-apiserver-operator", "system:authenticated"}, Extra:map[string][]string(nil)} from queue 58 with virtual start time 41.563321829s, queue will have 0 waiting & 1 executing

This is as expected per the PR:
...
  priorityLevelConfiguration:
    name: openshift-control-plane-operators
...
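
The two checks above read the priority level and the requesting user out of a "dispatching request" line by eye. Below is a minimal sketch for pulling out just those two fields. The function name `qs_and_user` is hypothetical, and the pattern assumes only the two `queueset.go` message shapes quoted in this bug; other kube-apiserver versions may log differently.

```shell
# qs_and_user: hypothetical helper, not part of oc or kube-apiserver.
# Reads kube-apiserver log lines on stdin and prints "<priority level> <user>"
# for every queueset "dispatching request" line; all other lines are ignored.
qs_and_user() {
  sed -nE 's/.*QS\(([^)]+)\).*[Dd]ispatching request.*user\.DefaultInfo\{Name:"([^"]+)".*/\1 \2/p'
}

# Usage (the pod name is a placeholder):
#   oc logs <kube-apiserver-pod> -c kube-apiserver -n openshift-kube-apiserver | qs_and_user | sort -u
```

Piping the result through `sort -u` gives one line per (priority level, user) pair, which is easier to compare against the flowschemas than the raw Trace output.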

Comment 7 Xingxing Xia 2020-10-14 08:35:16 UTC
Hmm, I only checked the first 2 SAs. There are other flowschemas for other SAs; still checking ...

Comment 8 Xingxing Xia 2020-10-15 03:34:21 UTC
Continuing with the other flowschemas & SAs (this bug's PR adds many of them, so the verification below is a bit verbose, and some cases need manually created test data).

Get the kube-apiserver log files and store their paths in KUBE_APISERVER_LOG_FILES for searching, e.g.:
KUBE_APISERVER_LOG_FILES='logs/kube-apiserver-xxia1014-2p4jv-master-0.log logs/kube-apiserver-xxia1014-2p4jv-master-1.log logs/kube-apiserver-xxia1014-2p4jv-master-2.log'
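
Before going through the flowschemas one at a time, the saved files can be condensed into one line per (priority level, user, resource) combination. This is only a sketch: `summarize_dispatches` is a hypothetical name, and the pattern assumes the `queueset.go` "dispatching request" message shapes seen in these logs.

```shell
# summarize_dispatches: hypothetical helper, not part of oc.
# Prints "<priority level> <user> <resource> <count>" for every distinct
# combination found in the given log files. Non-resource requests such as
# /metrics have an empty Resource field and are shown as "-".
summarize_dispatches() {
  sed -nE 's/.*QS\(([^)]+)\).*[Dd]ispatching request.*Resource:"([^"]*)".*user\.DefaultInfo\{Name:"([^"]+)".*/\1 \3 \2/p' "$@" |
    sed 's/ $/ -/' |
    LC_ALL=C sort | uniq -c |
    awk '{print $2, $3, $4, $1}'
}

# Usage (unquoted on purpose, so the space-separated list splits into file names):
#   summarize_dispatches $KUBE_APISERVER_LOG_FILES
```

A service account that never shows up in the summary has sent no matching requests yet, which is the situation where test traffic has to be created by hand.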

Then check the other flowschemas & SAs one by one:
For subjectaccessreview requests related to flowschema/openshift-apiserver-sar, QS(openshift-aggregated-api-delegated-auth) is set as expected:
2020-10-15T01:24:09.850646219Z I1015 01:24:09.850477      20 queueset.go:601] QS(openshift-aggregated-api-delegated-auth) at r=2020-10-15 01:24:09.850155411 v=20.870428617s: dispatching request &request.RequestInfo{IsResourceRequest:true, Path:"/apis/authorization.k8s.io/v1/subjectaccessreviews", Verb:"create", APIPrefix:"apis", APIGroup:"authorization.k8s.io", APIVersion:"v1", Namespace:"", Resource:"subjectaccessreviews", Subresource:"", Name:"", Parts:[]string{"subjectaccessreviews"}} &user.DefaultInfo{Name:"system:serviceaccount:openshift-apiserver:openshift-apiserver-sa", UID:"9d373b03-bf0c-4e55-aac8-da284b4996b0", Groups:[]string{"system:serviceaccounts", "system:serviceaccounts:openshift-apiserver", "system:authenticated"}, Extra:map[string][]string(nil)} from queue 7 with virtual start time 20.870428617s, queue will have 0 waiting & 1 executing

For tokenreview requests related to flowschema/openshift-apiserver-sar, QS(openshift-aggregated-api-delegated-auth) is set as expected:
2020-10-15T01:24:13.078346341Z I1015 01:24:13.078247      20 queueset.go:601] QS(openshift-aggregated-api-delegated-auth) at r=2020-10-15 01:24:13.078005390 v=20.893524334s: dispatching request &request.RequestInfo{IsResourceRequest:true, Path:"/apis/authentication.k8s.io/v1/tokenreviews", Verb:"create", APIPrefix:"apis", APIGroup:"authentication.k8s.io", APIVersion:"v1", Namespace:"", Resource:"tokenreviews", Subresource:"", Name:"", Parts:[]string{"tokenreviews"}} &user.DefaultInfo{Name:"system:serviceaccount:openshift-apiserver:openshift-apiserver-sa", UID:"9d373b03-bf0c-4e55-aac8-da284b4996b0", Groups:[]string{"system:serviceaccounts", "system:serviceaccounts:openshift-apiserver", "system:authenticated"}, Extra:map[string][]string(nil)} from queue 7 with virtual start time 20.893524334s, queue will have 0 waiting & 1 executing 

For other requests related to flowschema/openshift-apiserver, the search returns nothing, apparently because no such requests have occurred:
$ grep -in 'dispatching request.*Name:"system:serviceaccount:openshift-apiserver:openshift-apiserver-sa"' $KUBE_APISERVER_LOG_FILES | grep -v -E '(subjectaccessreview|tokenreview)'
Then manually create test data:
$ oc project openshift-apiserver
$ SA_TOKEN=`oc serviceaccounts get-token openshift-apiserver-sa`
$ oc login --token $SA_TOKEN
Then get the kube-apiserver logs and check again; QS(workload-high) is set as expected:
2020-10-15T03:04:48.182473425Z I1015 03:04:48.182316      18 queueset.go:601] QS(workload-high) at r=2020-10-15 03:04:48.175288062 v=32.323561850s: dispatching request &request.RequestInfo{IsResourceRequest:true, Path:"/apis/user.openshift.io/v1/users/~", Verb:"get", APIPrefix:"apis", APIGroup:"user.openshift.io", APIVersion:"v1", Namespace:"", Resource:"users", Subresource:"", Name:"~", Parts:[]string{"users", "~"}} &user.DefaultInfo{Name:"system:serviceaccount:openshift-apiserver:openshift-apiserver-sa", UID:"9d373b03-bf0c-4e55-aac8-da284b4996b0", Groups:[]string{"system:serviceaccounts", "system:serviceaccounts:openshift-apiserver", "system:authenticated"}, Extra:map[string][]string(nil)} from queue 104 with virtual start time 32.323561850s, queue will have 0 waiting & 1 executing

For requests related to flowschema/openshift-apiserver-operator, QS(openshift-control-plane-operators) is set as expected:
2020-10-15T01:24:09.597569308Z I1015 01:24:09.593720      20 queueset.go:601] QS(openshift-control-plane-operators) at r=2020-10-15 01:24:09.593590328 v=158.687585524s: dispatching request &request.RequestInfo{IsResourceRequest:true, Path:"/api/v1/namespaces/openshift-apiserver/services/api", Verb:"get", APIPrefix:"api", APIGroup:"", APIVersion:"v1", Namespace:"openshift-apiserver", Resource:"services", Subresource:"", Name:"api", Parts:[]string{"services", "api"}} &user.DefaultInfo{Name:"system:serviceaccount:openshift-apiserver-operator:openshift-apiserver-operator", UID:"acd6f5e4-bcbb-4312-9798-3eea7099a68c", Groups:[]string{"system:serviceaccounts", "system:serviceaccounts:openshift-apiserver-operator", "system:authenticated"}, Extra:map[string][]string(nil)} from queue 56 with virtual start time 158.687585524s, queue will have 0 waiting & 1 executing

For subjectaccessreview requests related to flowschema/openshift-oauth-apiserver-sar, QS(openshift-aggregated-api-delegated-auth) is set as expected:
2020-10-15T01:24:09.480613388Z I1015 01:24:09.426771      20 queueset.go:601] QS(openshift-aggregated-api-delegated-auth) at r=2020-10-15 01:24:09.426729818 v=20.868775444s: dispatching request &request.RequestInfo{IsResourceRequest:true, Path:"/apis/authorization.k8s.io/v1/subjectaccessreviews", Verb:"create", APIPrefix:"apis", APIGroup:"authorization.k8s.io", APIVersion:"v1", Namespace:"", Resource:"subjectaccessreviews", Subresource:"", Name:"", Parts:[]string{"subjectaccessreviews"}} &user.DefaultInfo{Name:"system:serviceaccount:openshift-oauth-apiserver:oauth-apiserver-sa", UID:"4d8eeaa2-f58c-4fab-a879-a132e6bb2b9c", Groups:[]string{"system:serviceaccounts", "system:serviceaccounts:openshift-oauth-apiserver", "system:authenticated"}, Extra:map[string][]string(nil)} from queue 12 with virtual start time 20.868775444s, queue will have 0 waiting & 1 executing

For tokenreview requests related to flowschema/openshift-oauth-apiserver-sar, the search returns nothing, apparently because no such requests have occurred:
$ grep -in 'dispatching request.*tokenreview.*Name:"system:serviceaccount:openshift-oauth-apiserver:oauth-apiserver-sa"' $KUBE_APISERVER_LOG_FILES
Then manually create test data:
$ oc project openshift-oauth-apiserver
$ SA_TOKEN=`oc serviceaccounts get-token oauth-apiserver-sa`
$ oc login --token $SA_TOKEN
$ oc get tokenreview
Error from server (MethodNotAllowed): the server does not allow this method on the requested resource
Then get the kube-apiserver logs and check again; QS(openshift-aggregated-api-delegated-auth) is set as expected:
2020-10-15T03:23:29.092339825Z I1015 03:23:29.092249      20 queueset.go:601] QS(openshift-aggregated-api-delegated-auth) at r=2020-10-15 03:23:29.092115949 v=62.130627802s: dispatching request &request.RequestInfo{IsResourceRequest:true, Path:"/apis/authentication.k8s.io/v1/tokenreviews", Verb:"list", APIPrefix:"apis", APIGroup:"authentication.k8s.io", APIVersion:"v1", Namespace:"", Resource:"tokenreviews", Subresource:"", Name:"", Parts:[]string{"tokenreviews"}} &user.DefaultInfo{Name:"system:serviceaccount:openshift-oauth-apiserver:oauth-apiserver-sa", UID:"4d8eeaa2-f58c-4fab-a879-a132e6bb2b9c", Groups:[]string{"system:serviceaccounts", "system:serviceaccounts:openshift-oauth-apiserver", "system:authenticated"}, Extra:map[string][]string(nil)} from queue 12 with virtual start time 62.130627802s, queue will have 0 waiting & 1 executing

For other requests related to flowschema/openshift-oauth-apiserver, QS(workload-high) is set:
2020-10-15T03:17:30.703936069Z I1015 03:17:30.703721      18 queueset.go:601] QS(workload-high) at r=2020-10-15 03:17:30.701345060 v=35.080778911s: dispatching request &request.RequestInfo{IsResourceRequest:true, Path:"/apis/user.openshift.io/v1/users/~", Verb:"get", APIPrefix:"apis", APIGroup:"user.openshift.io", APIVersion:"v1", Namespace:"", Resource:"users", Subresource:"", Name:"~", Parts:[]string{"users", "~"}} &user.DefaultInfo{Name:"system:serviceaccount:openshift-oauth-apiserver:oauth-apiserver-sa", UID:"4d8eeaa2-f58c-4fab-a879-a132e6bb2b9c", Groups:[]string{"system:serviceaccounts", "system:serviceaccounts:openshift-oauth-apiserver", "system:authenticated"}, Extra:map[string][]string(nil)} from queue 97 with virtual start time 35.080778911s, queue will have 0 waiting & 1 executing

For requests related to flowschema/openshift-authentication-operator, QS(openshift-control-plane-operators) is set as expected:
2020-10-15T01:23:59.580673680Z I1015 01:23:59.580593      18 queueset.go:601] QS(openshift-control-plane-operators) at r=2020-10-15 01:23:59.580521955 v=277.483461501s: dispatching request &request.RequestInfo{IsResourceRequest:true, Path:"/api/v1/namespaces/openshift-oauth-apiserver", Verb:"get", APIPrefix:"api", APIGroup:"", APIVersion:"v1", Namespace:"openshift-oauth-apiserver", Resource:"namespaces", Subresource:"", Name:"openshift-oauth-apiserver", Parts:[]string{"namespaces", "openshift-oauth-apiserver"}} &user.DefaultInfo{Name:"system:serviceaccount:openshift-authentication-operator:authentication-operator", UID:"f7a9dd05-380b-45c1-9067-5edf0e432c20", Groups:[]string{"system:serviceaccounts", "system:serviceaccounts:openshift-authentication-operator", "system:authenticated"}, Extra:map[string][]string(nil)} from queue 57 with virtual start time 277.483461501s, queue will have 0 waiting & 1 executing

For requests related to flowschema/openshift-controller-manager, QS(workload-high) is set as expected:
2020-10-15T01:24:52.252021480Z I1015 01:24:52.251740      18 queueset.go:354] QS(workload-high): Dispatching request &request.RequestInfo{IsResourceRequest:true, Path:"/api/v1/namespaces/openshift-controller-manager/configmaps/openshift-master-controllers", Verb:"get", APIPrefix:"api", APIGroup:"", APIVersion:"v1", Namespace:"openshift-controller-manager", Resource:"configmaps", Subresource:"", Name:"openshift-master-controllers", Parts:[]string{"configmaps", "openshift-master-controllers"}} &user.DefaultInfo{Name:"system:serviceaccount:openshift-controller-manager:openshift-controller-manager-sa", UID:"28789c9b-bcd3-4505-a187-c6b7ada3d823", Groups:[]string{"system:serviceaccounts", "system:serviceaccounts:openshift-controller-manager", "system:authenticated"}, Extra:map[string][]string(nil)} from its queue

For requests related to flowschema/openshift-etcd-operator, QS(openshift-control-plane-operators) is set as expected:
2020-10-15T01:24:49.414447447Z I1015 01:24:49.414413      18 queueset.go:354] QS(openshift-control-plane-operators): Dispatching request &request.RequestInfo{IsResourceRequest:true, Path:"/api/v1/namespaces/openshift-etcd-operator/configmaps/openshift-cluster-etcd-operator-lock", Verb:"update", APIPrefix:"api", APIGroup:"", APIVersion:"v1", Namespace:"openshift-etcd-operator", Resource:"configmaps", Subresource:"", Name:"openshift-cluster-etcd-operator-lock", Parts:[]string{"configmaps", "openshift-cluster-etcd-operator-lock"}} &user.DefaultInfo{Name:"system:serviceaccount:openshift-etcd-operator:etcd-operator", UID:"16957387-ef34-48c4-8b58-6fddca086f61", Groups:[]string{"system:serviceaccounts", "system:serviceaccounts:openshift-etcd-operator", "system:authenticated"}, Extra:map[string][]string(nil)} from its queue
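
The results above can be turned into a repeatable check. The sketch below encodes only the mappings observed in this bug (service account plus request kind to priority level) and flags any dispatch that lands elsewhere. The function names and the "<priority level> <user> <resource>" input format are assumptions made for illustration. prometheus-k8s is left out because only its /metrics requests were verified here.

```shell
# expected_level: hypothetical helper encoding the mappings observed in this bug.
# $1 = user name, $2 = resource. Prints the expected priority level, or
# returns 1 for users this bug did not verify.
expected_level() {
  sa=${1#system:serviceaccount:}   # leaves "<namespace>:<name>"
  case "$sa" in
    openshift-apiserver:openshift-apiserver-sa|openshift-oauth-apiserver:oauth-apiserver-sa)
      case "$2" in
        subjectaccessreviews|tokenreviews) echo openshift-aggregated-api-delegated-auth ;;
        *) echo workload-high ;;
      esac ;;
    openshift-controller-manager:openshift-controller-manager-sa) echo workload-high ;;
    openshift-kube-apiserver-operator:kube-apiserver-operator) echo openshift-control-plane-operators ;;
    openshift-apiserver-operator:openshift-apiserver-operator) echo openshift-control-plane-operators ;;
    openshift-authentication-operator:authentication-operator) echo openshift-control-plane-operators ;;
    openshift-etcd-operator:etcd-operator) echo openshift-control-plane-operators ;;
    *) return 1 ;;
  esac
}

# check_dispatches: reads "<priority level> <user> <resource> [extra fields]"
# lines on stdin, prints one MISMATCH line per unexpected priority level, and
# returns 1 if any mismatch was found. Users not covered above are skipped.
check_dispatches() {
  status=0
  while read -r level user resource rest; do
    want=$(expected_level "$user" "$resource") || continue
    if [ "$level" != "$want" ]; then
      echo "MISMATCH $user $resource: got $level, want $want"
      status=1
    fi
  done
  return $status
}
```

No output and a zero exit status mean every covered service account was dispatched at the priority level recorded in this bug.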

Comment 10 errata-xmlrpc 2020-10-27 16:47:41 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (OpenShift Container Platform 4.6 GA Images), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2020:4196