Bug 1906916 - Teach CVO about flowcontrol.apiserver.k8s.io/v1beta1
Summary: Teach CVO about flowcontrol.apiserver.k8s.io/v1beta1
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: Cluster Version Operator
Version: 4.7
Hardware: Unspecified
OS: Unspecified
Priority: urgent
Severity: urgent
Target Milestone: ---
Target Release: 4.7.0
Assignee: Jack Ottofaro
QA Contact: Johnny Liu
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2020-12-11 18:45 UTC by W. Trevor King
Modified: 2021-02-24 15:43 UTC (History)

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Cause: Pivot to Kubernetes 1.20.
Consequence: The CVO was unable to apply manifests that use flowcontrol.apiserver.k8s.io/v1beta1.
Fix: Pick up flowcontrol.apiserver.k8s.io/v1beta1 to support the API server's Kubernetes bump to 1.20.
Clone Of:
Environment:
Last Closed: 2021-02-24 15:43:00 UTC
Target Upstream Version:
Embargoed:


Links
System | ID | Private | Priority | Status | Summary | Last Updated
Github | openshift cluster-version-operator pull 489 | 0 | None | closed | Bug 1906916: bump k8s.io from v0.19.0 to v0.20.0 | 2021-01-07 22:20:31 UTC
Red Hat Product Errata | RHSA-2020:5633 | 0 | None | None | None | 2021-02-24 15:43:26 UTC

Description W. Trevor King 2020-12-11 18:45:06 UTC
CVO already understands v1alpha1:

$ oc adm release extract --to manifests quay.io/openshift-release-dev/ocp-release:4.6.8-x86_64
Extracted release payload from digest sha256:6ddbf56b7f9776c0498f23a54b65a06b3b846c1012200c5609c4bb716b6bdcdf created at 2020-12-09T11:35:37Z
$ grep -ir flowcontrol manifests 
manifests/0000_20_kube-apiserver-operator_08_flowschema.yaml:apiVersion: flowcontrol.apiserver.k8s.io/v1alpha1
manifests/0000_20_kube-apiserver-operator_08_flowschema.yaml:apiVersion: flowcontrol.apiserver.k8s.io/v1alpha1
manifests/0000_20_kube-apiserver-operator_08_flowschema.yaml:apiVersion: flowcontrol.apiserver.k8s.io/v1alpha1
manifests/0000_20_kube-apiserver-operator_08_flowschema.yaml:apiVersion: flowcontrol.apiserver.k8s.io/v1alpha1
manifests/0000_20_kube-apiserver-operator_08_flowschema.yaml:apiVersion: flowcontrol.apiserver.k8s.io/v1alpha1
manifests/0000_20_kube-apiserver-operator_08_flowschema.yaml:apiVersion: flowcontrol.apiserver.k8s.io/v1alpha1
manifests/0000_20_kube-apiserver-operator_08_flowschema.yaml:apiVersion: flowcontrol.apiserver.k8s.io/v1alpha1
manifests/0000_20_kube-apiserver-operator_08_flowschema.yaml:apiVersion: flowcontrol.apiserver.k8s.io/v1alpha1
manifests/0000_20_kube-apiserver-operator_08_flowschema.yaml:apiVersion: flowcontrol.apiserver.k8s.io/v1alpha1
manifests/0000_20_kube-apiserver-operator_08_flowschema.yaml:apiVersion: flowcontrol.apiserver.k8s.io/v1alpha1
manifests/0000_20_kube-apiserver-operator_08_flowschema.yaml:apiVersion: flowcontrol.apiserver.k8s.io/v1alpha1
manifests/0000_20_kube-apiserver-operator_08_flowschema.yaml:apiVersion: flowcontrol.apiserver.k8s.io/v1alpha1
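
The grep output above repeats one line per match, which gets hard to scan on larger payloads. A minimal Go sketch for tallying the matches per manifest and API version follows. The function name tallyAPIVersions is hypothetical, and it assumes the plain "path:apiVersion: group/version" shape shown above; lines that do not fit that shape (for example indented matches or dashboard expressions) are skipped.

```go
package main

import (
	"bufio"
	"fmt"
	"sort"
	"strings"
)

// tallyAPIVersions counts top-level "apiVersion:" matches per manifest file
// and group/version, given grep-style "path:apiVersion: group/version" lines.
// The name and output shape are illustrative assumptions, not part of oc or the CVO.
func tallyAPIVersions(grepOutput string) map[string]int {
	counts := map[string]int{}
	scanner := bufio.NewScanner(strings.NewReader(grepOutput))
	for scanner.Scan() {
		path, version, ok := strings.Cut(scanner.Text(), ":apiVersion: ")
		if !ok {
			continue // not a top-level apiVersion match
		}
		counts[path+" "+strings.TrimSpace(version)]++
	}
	return counts
}

func main() {
	// Sample input copied from the grep output in this report.
	sample := strings.Join([]string{
		"manifests/0000_20_kube-apiserver-operator_08_flowschema.yaml:apiVersion: flowcontrol.apiserver.k8s.io/v1alpha1",
		"manifests/0000_20_kube-apiserver-operator_08_flowschema.yaml:apiVersion: flowcontrol.apiserver.k8s.io/v1alpha1",
	}, "\n")

	counts := tallyAPIVersions(sample)

	// Sort keys so the summary is stable across runs.
	keys := make([]string, 0, len(counts))
	for key := range counts {
		keys = append(keys, key)
	}
	sort.Strings(keys)
	for _, key := range keys {
		fmt.Printf("%d %s\n", counts[key], key)
	}
}
```

In practice the sample string would be replaced by the real grep output, for example read from standard input.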

With the pivot to Kubernetes 1.20 [1], components need to push (or handle, or something) v1beta1 forms.  Currently the CVO chokes on that with:

2020-12-11T11:55:54.565747078Z E1211 11:55:54.565682       1 task.go:81] error running apply for flowschema "openshift-etcd-operator" (72 of 670): no kind "FlowSchema" is registered for version "flowcontrol.apiserver.k8s.io/v1beta1" in scheme "k8s.io/kubernetes/pkg/api/legacyscheme/scheme.go:30"

This bug is about avoiding that error, by bumping our vendored client-go and registering the v1beta1 handler.

We have no flowcontrol-specific handling today, so no promises that this will actually help with things like:

* CVO appropriately merges diverging in-cluster flowcontrol specs with manifest specs.
* CVO notices when the controller fails to reconcile and sets the Dangling=False .status.condition.
* CVO notices when the controller is progressing, and blocks update-graph reconciliation until the in-cluster object is level.

But API-server folks are currently blocked on rebasing by this, and want to see if a naive bump to accept v1beta1 is sufficient to unblock them.

[1]: https://github.com/openshift/kubernetes/pull/471

Comment 3 Johnny Liu 2020-12-22 04:24:55 UTC
Verified this bug with 4.7.0-0.nightly-2020-12-20-055006, passed.

[root@preserve-jialiu-ansible ~]# oc get node
NAME                                                            STATUS   ROLES    AGE   VERSION
qe-metering-1221-9gn5p-master-0.c.openshift-qe.internal         Ready    master   15h   v1.20.0+87544c5
qe-metering-1221-9gn5p-master-1.c.openshift-qe.internal         Ready    master   15h   v1.20.0+87544c5
qe-metering-1221-9gn5p-master-2.c.openshift-qe.internal         Ready    master   15h   v1.20.0+87544c5
qe-metering-1221-9gn5p-worker-a-nq4pr.c.openshift-qe.internal   Ready    worker   14h   v1.20.0+87544c5
qe-metering-1221-9gn5p-worker-b-6sbtn.c.openshift-qe.internal   Ready    worker   14h   v1.20.0+87544c5
qe-metering-1221-9gn5p-worker-c-rfmzp.c.openshift-qe.internal   Ready    worker   14h   v1.20.0+87544c5

Kubernetes is now at version 1.20.


[root@preserve-jialiu-ansible demo5]# oc adm release extract --to manifests registry.svc.ci.openshift.org/ocp/release@sha256:ea2d954b1ac4b2818c419055afdb9ff87c5b95fa03c3258a26dc542f4ecab5d8
Extracted release payload created at 2020-12-20T06:04:57Z

[root@preserve-jialiu-ansible demo5]# grep -ir flowcontrol manifests 
manifests/0000_50_cluster-authentication-operator_09_flowschema.yaml:apiVersion: flowcontrol.apiserver.k8s.io/v1alpha1
manifests/0000_50_cluster-authentication-operator_09_flowschema.yaml:apiVersion: flowcontrol.apiserver.k8s.io/v1alpha1
manifests/0000_50_cluster-authentication-operator_09_flowschema.yaml:apiVersion: flowcontrol.apiserver.k8s.io/v1alpha1
manifests/0000_50_cluster-authentication-operator_09_flowschema.yaml:apiVersion: flowcontrol.apiserver.k8s.io/v1alpha1
manifests/0000_12_etcd-operator_10_flowschema.yaml:apiVersion: flowcontrol.apiserver.k8s.io/v1alpha1
manifests/0000_20_kube-apiserver-operator_08_flowschema.yaml:apiVersion: flowcontrol.apiserver.k8s.io/v1alpha1
manifests/0000_20_kube-apiserver-operator_08_flowschema.yaml:apiVersion: flowcontrol.apiserver.k8s.io/v1alpha1
manifests/0000_20_kube-apiserver-operator_08_flowschema.yaml:apiVersion: flowcontrol.apiserver.k8s.io/v1alpha1
manifests/0000_90_kube-apiserver-operator_05_api_performance_dashboard.yaml:              "expr": "histogram_quantile(0.99, sum(rate(apiserver_flowcontrol_request_wait_duration_seconds_bucket{apiserver=\"$apiserver\",execute=\"true\"}[$period])) by(flowSchema, priorityLevel, le))",
manifests/0000_90_kube-apiserver-operator_05_api_performance_dashboard.yaml:              "expr": "sum(rate(apiserver_flowcontrol_rejected_requests_total{apiserver=\"$apiserver\"}[$period])) by (flowSchema,priorityLevel,reason)",
manifests/0000_90_kube-apiserver-operator_05_api_performance_dashboard.yaml:              "expr": "sum(rate(apiserver_flowcontrol_dispatched_requests_total{apiserver=\"$apiserver\"}[$period])) by(flowSchema,priorityLevel)",
manifests/0000_90_kube-apiserver-operator_05_api_performance_dashboard.yaml:              "expr": "histogram_quantile(0.99, sum(rate(apiserver_flowcontrol_request_queue_length_after_enqueue_bucket{apiserver=\"$apiserver\"}[$period])) by(flowSchema, priorityLevel, le))",
manifests/0000_90_kube-apiserver-operator_05_api_performance_dashboard.yaml:              "expr": "sum(apiserver_flowcontrol_current_executing_requests{apiserver=\"$apiserver\"}) by (priorityLevel,flowSchema)",
manifests/0000_90_kube-apiserver-operator_05_api_performance_dashboard.yaml:              "expr": "histogram_quantile(0.99, sum(rate(apiserver_flowcontrol_request_execution_seconds_bucket{apiserver=\"$apiserver\"}[$period])) by(flowSchema, priorityLevel, le) ) ",
manifests/0000_90_kube-apiserver-operator_05_api_performance_dashboard.yaml:              "expr": "sum(apiserver_flowcontrol_current_inqueue_requests{apiserver=\"$apiserver\"}) by (flowSchema,priorityLevel)",
manifests/0000_90_kube-apiserver-operator_05_api_performance_dashboard.yaml:              "expr": "sum(apiserver_flowcontrol_request_concurrency_limit{apiserver=\"$apiserver\"}) by (priorityLevel)",
manifests/0000_70_cluster-network-operator_04_kubeapiserver_flowschema.yaml:apiVersion: flowcontrol.apiserver.k8s.io/v1alpha1
manifests/0000_30_openshift-apiserver-operator_09_flowschema.yaml:apiVersion: flowcontrol.apiserver.k8s.io/v1alpha1
manifests/0000_30_openshift-apiserver-operator_09_flowschema.yaml:apiVersion: flowcontrol.apiserver.k8s.io/v1alpha1
manifests/0000_30_openshift-apiserver-operator_09_flowschema.yaml:apiVersion: flowcontrol.apiserver.k8s.io/v1alpha1
manifests/0000_50_cluster-openshift-controller-manager-operator_10_flowschema.yaml:apiVersion: flowcontrol.apiserver.k8s.io/v1alpha1


[root@preserve-jialiu-ansible ~]#  oc get FlowSchema openshift-etcd-operator
NAME                      PRIORITYLEVEL                       MATCHINGPRECEDENCE   DISTINGUISHERMETHOD   AGE   MISSINGPL
openshift-etcd-operator   openshift-control-plane-operators   2000                 ByUser                15h   False


[root@preserve-jialiu-ansible demo5]# oc get FlowSchema openshift-etcd-operator -o yaml|grep apiVersion
apiVersion: flowcontrol.apiserver.k8s.io/v1beta1
  - apiVersion: flowcontrol.apiserver.k8s.io/v1beta1
  - apiVersion: flowcontrol.apiserver.k8s.io/v1alpha1

[root@preserve-jialiu-ansible ~]# oc get FlowSchema openshift-kube-apiserver-operator
NAME                                PRIORITYLEVEL                       MATCHINGPRECEDENCE   DISTINGUISHERMETHOD   AGE   MISSINGPL
openshift-kube-apiserver-operator   openshift-control-plane-operators   2000                 ByUser                14h   False

[root@preserve-jialiu-ansible demo5]# oc get FlowSchema openshift-kube-apiserver-operator -o yaml|grep apiVersion
apiVersion: flowcontrol.apiserver.k8s.io/v1beta1
  - apiVersion: flowcontrol.apiserver.k8s.io/v1beta1
  - apiVersion: flowcontrol.apiserver.k8s.io/v1alpha1

[root@preserve-jialiu-ansible ~]# oc  -n openshift-cluster-version logs cluster-version-operator-dcbc59f47-dsk48|grep flowschema|grep openshift-etcd-operator
I1222 00:30:48.582871       1 sync_worker.go:729] Running sync for flowschema "openshift-etcd-operator" (73 of 663)
I1222 00:30:48.675622       1 request.go:591] Throttling request took 92.457046ms, request: GET:https://api-int.qe-metering-1221.qe.gcp.devcluster.openshift.com:6443/apis/flowcontrol.apiserver.k8s.io/v1alpha1/flowschemas/openshift-etcd-operator
I1222 00:30:48.775636       1 request.go:591] Throttling request took 95.755092ms, request: PUT:https://api-int.qe-metering-1221.qe.gcp.devcluster.openshift.com:6443/apis/flowcontrol.apiserver.k8s.io/v1alpha1/flowschemas/openshift-etcd-operator
I1222 00:30:48.784113       1 sync_worker.go:741] Done syncing for flowschema "openshift-etcd-operator" (73 of 663)
I1222 00:34:20.505797       1 sync_worker.go:729] Running sync for flowschema "openshift-etcd-operator" (73 of 663)
I1222 00:34:20.599696       1 request.go:591] Throttling request took 93.711034ms, request: GET:https://api-int.qe-metering-1221.qe.gcp.devcluster.openshift.com:6443/apis/flowcontrol.apiserver.k8s.io/v1alpha1/flowschemas/openshift-etcd-operator
I1222 00:34:20.699669       1 request.go:591] Throttling request took 92.355015ms, request: PUT:https://api-int.qe-metering-1221.qe.gcp.devcluster.openshift.com:6443/apis/flowcontrol.apiserver.k8s.io/v1alpha1/flowschemas/openshift-etcd-operator
I1222 00:34:20.706856       1 sync_worker.go:741] Done syncing for flowschema "openshift-etcd-operator" (73 of 663)

[root@preserve-jialiu-ansible ~]# oc get clusterversion
NAME      VERSION                             AVAILABLE   PROGRESSING   SINCE   STATUS
version   4.7.0-0.nightly-2020-12-20-055006   True        False         14h     Cluster version is 4.7.0-0.nightly-2020-12-20-055006

If the above verification is not enough, please let me know.

Comment 5 errata-xmlrpc 2021-02-24 15:43:00 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Moderate: OpenShift Container Platform 4.7.0 security, bug fix, and enhancement update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2020:5633

