must-gather: shorturl.at/azI49

The customer is facing intermittent issues where application deployments to OpenShift are failing. Additionally, some of the oc get commands randomly produce output indicating the same issue. They are observing the messages below on deployment failures; some of the OpenShift commands also sometimes print similar messages.

Msg1:
[Action] Executing Apply Operation...
command: oc apply --server=https://api.awsnp1.ocpnp.travp.net:6443 --namespace=ics-pipeline-dev --token=**** --filename=pythonhello-redis-deployment.yaml
[Error] Apply operation failed.
I1026 14:26:50.575471 30421 request.go:621] Throttling request took 1.090510627s, request: GET:https://api.awsnp1.ocpnp.travp.net:6443/apis/rbac.istio.io/v1alpha1?timeout=32s

Msg2:
command: oc apply --server=https://api.awsnp1.ocpnp.travp.net:6443 --namespace=bi-scbp-trvpasdatamapper-dev '--token=**** ' --insecure-skip-tls-verify --filename=appdeploy.yaml
[Error] Apply operation failed.
I1026 12:55:59.410679 7068 request.go:621] Throttling request took 1.084130682s, request: GET:https://api.awsnp1.ocpnp.travp.net:6443/apis/machine.openshift.io/v1beta1?timeout=32s

Msg3:
[Action] Executing Apply Operation...
command: oc apply --server=https://api.awsnp1.ocpnp.travp.net:6443 --namespace=ics-pipeline-dev --token=**** --filename=pythonhello-redis-deployment.yaml
[Error] Apply operation failed.
I1026 13:39:36.705902 15598 request.go:621] Throttling request took 1.094693524s, request: GET:https://api.awsnp1.ocpnp.travp.net:6443/apis/cloudcredential.openshift.io/v1?timeout=32s

Msg4:
command: oc apply --server=https://api.awsnp1.ocpnp.travp.net:6443 --namespace=ics-pipeline-dev --token=**** --filename=pythonhello-redis-deployment.yaml
[Error] Apply operation failed.
I1026 14:15:49.845254 24133 request.go:621] Throttling request took 1.101490539s, request: GET:https://api.awsnp1.ocpnp.travp.net:6443/apis/console.openshift.io/v1?timeout=32s

=================================

They upgraded their development cluster to 4.5.15 and are still seeing deployment failures. The apparent root cause of these failures is the API request throttling message that is printed while executing oc CLI commands. Is this a change in the new version? See the example below:

Example:
[Action] Executing Apply Operation...
command: oc apply --server=https://api.awsnp1.ocpnp.travp.net:6443 --namespace=ics-pipeline-dev --token=**** --filename=pythonhello-redis-deployment.yaml
[Error] Apply operation failed.
I1031 12:28:16.114611 13520 request.go:621] Throttling request took 1.076172663s, request: GET:https://api.awsnp1.ocpnp.travp.net:6443/apis/networking.internal.knative.dev/v1alpha1?timeout=32s

Command line execution:
Using project "default" on server "https://api.awsnp1.ocpnp.travp.net:6443".
n4ed4f@k0005403 ~ $ oc get co
I1031 12:17:06.814332 5851 request.go:621] Throttling request took 1.017160755s, request: GET:https://api.awsnp1.ocpnp.travp.net:6443/apis/admissionregistration.k8s.io/v1beta1?timeout=32s
NAME             VERSION   AVAILABLE   PROGRESSING   DEGRADED   SINCE
authentication   4.5.15    True        False         False      242d

After checking the must-gather, I found "Throttling request" log entries for the kube-apiserver-operator and the openshift-apiserver-operator:
-------------------------
2020-10-31T17:01:20.735370107Z I1031 17:01:20.735324 1 request.go:621] Throttling request took 1.195878707s, request: GET:https://172.30.0.1:443/api/v1/namespaces/openshift-kube-apiserver/pods?labelSelector=apiserver%3Dtrue
2020-10-31T17:01:21.735380902Z I1031 17:01:21.735336 1 request.go:621] Throttling request took 1.197386863s, request: GET:https://172.30.0.1:443/api/v1/namespaces/openshift-config-managed/secrets?labelSelector=encryption.apiserver.operator.openshift.io%2Fcomponent%3Dopenshift-kube-apiserver
2020-10-31T17:01:22.935376312Z I1031 17:01:22.935332 1 request.go:621] Throttling request took 1.193540774s, request: GET:https://172.30.0.1:443/api/v1/namespaces/openshift-kube-apiserver/pods?labelSelector=apiserver%3Dtrue
2020-10-31T17:01:24.135340486Z I1031 17:01:24.135284 1 request.go:621] Throttling request took 1.195530242s, request: GET:https://172.30.0.1:443/api/v1/namespaces/openshift-kube-apiserver/secrets/encryption-config-78
2020-10-31T17:01:25.135360723Z I1031 17:01:25.135311 1 request.go:621] Throttling request took 1.19475135s, request: GET:https://172.30.0.1:443/api/v1/namespaces/openshift-kube-apiserver/secrets/encryption-config-78
2020-10-31T17:01:26.135364273Z I1031 17:01:26.135322 1 request.go:621] Throttling request took 1.188613098s, request: GET:https://172.30.0.1:443/api/v1/namespaces/openshift-kube-apiserver/configmaps/config
2020-10-31T17:01:27.335372106Z I1031 17:01:27.335333 1 request.go:621] Throttling request took 1.394951857s, request: GET:https://172.30.0.1:443/api/v1/namespaces/openshift-kube-apiserver/secrets/encryption-config-78
2020-10-31T17:01:28.5353632Z I1031 17:01:28.535322 1 request.go:621] Throttling request took 1.395701677s, request: GET:https://172.30.0.1:443/api/v1/namespaces/openshift-config-managed/secrets?

I have recommended that the customer apply workaround 1 and then workaround 2 from the KCS: https://access.redhat.com/solutions/5448851

Is it really safe to upgrade to 4.5.16, and would this fix both the original issue and the "Throttling request" messages? The customer is using oc CLI 4.5.15 to perform all deployment activities. Can they continue to use CLI 4.5.15, or do you recommend using oc CLI 4.5.16 after the cluster upgrade to 4.5.16?
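For reference, the same "Throttling request" entries can also be checked on the live cluster. This is only a sketch; the deployment names below are the standard API server operator deployments and are my assumption, not taken from the must-gather:

  # Grep the API server operators' logs for client-side throttling entries
  oc -n openshift-kube-apiserver-operator logs deployment/kube-apiserver-operator | grep 'Throttling request'
  oc -n openshift-apiserver-operator logs deployment/openshift-apiserver-operator | grep 'Throttling request'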
Throttling requests is normal and expected behavior, and it is meant to be handled by the client. The operators do that. If oc doesn't, this is a bug. Changing component.
As Stefan mentioned in the previous comment, the throttling is perfectly normal behavior, and oc also respects it since it is built on the same primitives as the operators, namely the client-go library. For the failing oc apply command, I'd need logs from the execution with -v=8 to be able to diagnose what's preventing oc apply from succeeding. It would also be helpful to have the YAML definitions of those deployments.
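To capture what I need, something like the following would work (a sketch only; the namespace and file name are taken from the failure messages above and may need adjusting for your environment):

  # Re-run the failing apply with client-side debug logging and keep a copy of
  # everything written to stdout and stderr
  oc apply --namespace=ics-pipeline-dev --filename=pythonhello-redis-deployment.yaml -v=8 2>&1 | tee oc-apply-debug.log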
shorturl.at/azI49 ==> requested information. I have attached all the requested information at the link above.

It looks like the throttling messages appearing on screen were a bug: https://bugzilla.redhat.com/show_bug.cgi?id=1894574. Could you confirm?

The customer has upgraded the Dev cluster to 4.5.16 and is still seeing throttling messages on stdout. They do not see those messages on other clusters which are also on 4.5.16; the Dev cluster has more load than any other cluster at this point. I have attached the must-gather for the upgraded Dev cluster, the YAML file used for the apply, and the apply command output that fails the IBM UCD plugin, because the plugin does not expect any messages on stdout/stderr. Is there a way to suppress these messages?

The customer also wanted to share an observation in case it helps the investigation. It looks like the oc CLI itself may play a role in displaying the throttling messages: they have a cluster on 4.4.16 and used two versions of the oc CLI to execute commands against it. They deliberately requested an "invalid resource" because that consistently generates the throttling message. See below:

1. oc cli version 4.4
$ ./oc version
Client Version: openshift-clients-4.4.0-202006211643.p0-2-gd89e458c3
Server Version: 4.4.16
Kubernetes Version: v1.17.1+b83bc57
$ ./oc get ppv    <=== invalid resource
error: the server doesn't have a resource type "ppv"

2. oc cli version 4.5.16
$ oc version
Client Version: 4.5.16
Server Version: 4.4.16
Kubernetes Version: v1.17.1+b83bc57
$ oc get ppv    <=== invalid resource
I1117 10:55:20.750698 9336 request.go:621] Throttling request took 1.023522471s, request: GET:https://api.aws2.ocplb.travp.net:6443/apis/kibana.k8s.elastic.co/v1?timeout=32s
error: the server doesn't have a resource type "ppv"

With oc CLI 4.5.16 the throttling message appears consistently. Another observation: the message shows up the first time a particular API call is made, but for subsequent API calls it does not appear for some time if the previous call was successful.
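In case it is useful while this is investigated, one possible external workaround on the invocation side would be to filter the throttling lines out of the output before the UCD plugin sees them. This is only a sketch, not an officially supported fix, and it assumes the messages are emitted by klog on stderr (its default destination); the namespace and file name below are placeholders:

  # Drop the client-side "Throttling request" lines but pass everything else
  # through unchanged; bash process substitution keeps oc's exit code intact
  oc apply --namespace=ics-pipeline-dev --filename=pythonhello-redis-deployment.yaml \
    2> >(grep -v 'Throttling request' >&2)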
Hello, could you please check the requested information at shorturl.at/azI49 and the description in my previous comment?
The throttling is caused by the large number of CRDs that have to be fetched during client-side discovery; compare https://bugzilla.redhat.com/show_bug.cgi?id=1899575. Currently there is no way to bypass these messages; in 4.7 we are increasing the limit so that they should not happen as frequently.

*** This bug has been marked as a duplicate of bug 1899575 ***
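For context (a rough sketch, not part of the official diagnosis), the amount of discovery traffic on an affected cluster can be gauged by counting CRDs and served resource types; each API group/version adds discovery requests, which is what gets rate-limited on the client side:

  # Count the CustomResourceDefinitions installed on the cluster
  oc get crds --no-headers | wc -l
  # Count every resource type the API server serves
  oc api-resources --no-headers | wc -l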