Description of problem:
KAS/OAS/OauthAS logs are spammed with:
  loopyWriter.run returning. connection error: desc = "transport is closing"
These messages do not seem to indicate any problem we care about, so they only confuse users.

Version-Release number of selected component (if applicable):
4.7.0-0.nightly-2021-01-31-031653

How reproducible:
Always

The spam can be seen in any environment at hand, so no must-gather is attached.

Steps to Reproduce:
1. Collect the apiserver logs:
$ oc logs --timestamps -n openshift-kube-apiserver -c kube-apiserver kube-apiserver-ip-10-143-1-199.ec2.internal > logs/kube-apiserver-ip-10-143-1-199.ec2.internal.log
$ oc logs --timestamps -n openshift-apiserver -c openshift-apiserver apiserver-6cc8bf5c8f-7gr7n > logs/apiserver-6cc8bf5c8f-7gr7n.log
$ oc logs --timestamps -n openshift-oauth-apiserver oauth-apiserver-6bd4fbd859-l2f8b > logs/oauth-apiserver-6bd4fbd859-l2f8b.log
2. Count the spam messages:
$ grep 'loopyWriter.run returning. connection error: desc = "transport is closing"' logs/kube-apiserver-ip-10-143-1-199.ec2.internal.log | wc -l
253
$ grep 'loopyWriter.run returning. connection error: desc = "transport is closing"' logs/apiserver-6cc8bf5c8f-7gr7n.log | wc -l
827
$ grep 'loopyWriter.run returning. connection error: desc = "transport is closing"' logs/oauth-apiserver-6bd4fbd859-l2f8b.log | wc -l
804

Take OAS for example:
$ grep 'loopyWriter.run returning. connection error: desc = "transport is closing"' logs/apiserver-6cc8bf5c8f-7gr7n.log | grep "^2021-02-02T09:50"
2021-02-02T09:50:02.522899544Z I0202 09:50:02.522863 1 controlbuf.go:508] transport: loopyWriter.run returning. connection error: desc = "transport is closing"
2021-02-02T09:50:05.128339529Z I0202 09:50:05.128313 1 controlbuf.go:508] transport: loopyWriter.run returning. connection error: desc = "transport is closing"
2021-02-02T09:50:13.314228703Z I0202 09:50:13.314204 1 controlbuf.go:508] transport: loopyWriter.run returning. connection error: desc = "transport is closing"
2021-02-02T09:50:35.511715474Z I0202 09:50:35.511680 1 controlbuf.go:508] transport: loopyWriter.run returning. connection error: desc = "transport is closing"
2021-02-02T09:50:45.454702141Z I0202 09:50:45.454672 1 controlbuf.go:508] transport: loopyWriter.run returning. connection error: desc = "transport is closing"
2021-02-02T09:50:47.907317981Z I0202 09:50:47.907288 1 controlbuf.go:508] transport: loopyWriter.run returning. connection error: desc = "transport is closing"

Actual results:
2. KAS/OAS/OauthAS spam such logs, which confuse users.

Expected results:
2. No such confusing spam.

Additional info:
Bug 1844448#c0 once showed such logs too, but they are no longer seen:
$ oc logs --timestamps -n openshift-etcd-operator etcd-operator-fb8cbf695-hlzf2 > logs/etcd-operator-fb8cbf695-hlzf2.log
$ grep 'loopyWriter.run.*connection error' logs/etcd-operator-fb8cbf695-hlzf2.log
# none
They are also not seen in the etcd container logs.
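As a sketch of the counting step above, the grep can be verified against a small inline sample instead of a full must-gather; the sample log lines below are taken from the output above, and the count is of lines containing the fixed spam string (note grep -F, since the message contains dots and quotes):

```shell
# Count occurrences of the spam message in a three-line sample
# (two spam lines, one unrelated line). grep -F matches a fixed
# string, -c prints the number of matching lines.
MSG='loopyWriter.run returning. connection error: desc = "transport is closing"'
count=$(printf '%s\n' \
  'I0202 09:50:02.522863 1 controlbuf.go:508] transport: loopyWriter.run returning. connection error: desc = "transport is closing"' \
  'I0202 09:50:03.000000 1 healthz.go:244] some unrelated log line' \
  'I0202 09:50:05.128313 1 controlbuf.go:508] transport: loopyWriter.run returning. connection error: desc = "transport is closing"' \
  | grep -cF "$MSG")
echo "$count"
```

Against real logs collected per step 1, the same pipeline (grep -cF "$MSG" logs/<pod>.log) produces the counts shown above.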
This is https://github.com/grpc/grpc-go#the-rpc-failed-with-error-code--unavailable-desc--transport-is-closing, a behaviour of the grpc library used in the etcd client. Moving to etcd.
This bug hasn't had any activity in the last 30 days. Maybe the problem got resolved, was a duplicate of something else, or became less pressing for some reason - or maybe it's still relevant but just hasn't been looked at yet. As such, we're marking this bug as "LifecycleStale" and decreasing the severity/priority. If you have further information on the current state of the bug, please update it, otherwise this bug can be closed in about 7 days. The information can be, for example, that the problem still occurs, that you still want the feature, that more information is needed, or that the bug is (for whatever reason) no longer relevant. Additionally, you can add LifecycleFrozen into Keywords if you think this bug should never be marked as stale. Please consult with bug assignee before you do that.
We have a customer facing this for kube-apiserver pods. One of the pods shows:
~~~
$ grep 'loopyWriter.run returning. connection error: desc = "transport is closing"' kas1.logs | wc -l
13583
~~~
where kas1.logs was collected with "oc logs <podname> -c kube-apiserver > kas1.logs" for one of the masters. This master is showing more than 60% CPU consumption, and this pod in particular is consuming more CPU. If any information is required, please let me know and I'll update the bug.
Dear team, Any updates here? Thanks.
This is fixed in k8s 1.22 https://github.com/kubernetes/kubernetes/pull/102896
We will keep the BZ open until it lands.
Steps followed:
1. Install cluster version 4.9.0-0.nightly-2021-07-06-025559
2. Check the apiserver logs; the message 'loopyWriter.run returning. connection error: desc = "transport is closing"' is still present.
The same issue is also seen in Server Version: 4.7.0-0.nightly-2021-07-06-050841
The 1.22 rebase has not landed in OpenShift yet, so we can retest when that happens.
We have a customer facing the same issue: the kube-apiserver is consuming the maximum of its CPU resources, and the kube-apiserver pod logs show this output:
~~~
ccResolverWrapper: sending update to cc: {[{https://localhost:2379 <nil> 0 <nil>}] <nil> <nil>}
2021-07-09T06:36:50.297990740Z I0709 06:36:50.297885 17 clientconn.go:948] ClientConn switching balancer to "pick_first"
2021-07-09T06:36:50.300944216Z I0709 06:36:50.298272 17 balancer_conn_wrappers.go:78] pickfirstBalancer: HandleSubConnStateChange: 0xc02705cd60, {CONNECTING <nil>}
2021-07-09T06:36:50.318796017Z I0709 06:36:50.313817 17 balancer_conn_wrappers.go:78] pickfirstBalancer: HandleSubConnStateChange: 0xc02705cd60, {READY <nil>}
2021-07-09T06:36:50.318796017Z I0709 06:36:50.315980 17 controlbuf.go:508] transport: loopyWriter.run returning. connection error: desc = "transport is closing"
2021-07-09T06:36:52.532112913Z W0709 06:36:52.531817 17 admission.go:98] ClusterQuotaAdmission received non-meta object *unstructured.Unstructured
2021-07-09T06:36:52.582434039Z W0709 06:36:52.582254 17 admission.go:98] ClusterQuotaAdmission received non-meta object *unstructured.Unstructured
2021-07-09T06:36:52.731282776Z W0709 06:36:52.731187 17 admission.go:98] ClusterQuotaAdmission received non-meta object *unstructured.Unstructured
2021-07-09T06:36:52.931948178Z W0709 06:36:52.931846 17 admission.go:98] ClusterQuotaAdmission received non-meta object *unstructured.Unstructured
2021-07-09T06:36:53.131199569Z W0709 06:36:53.131091 17 admission.go:98] ClusterQuotaAdmission received non-meta object *unstructured.Unstructured
2021-07-09T06:36:53.330927116Z W0709 06:36:53.330835 17 admission.go:98] ClusterQuotaAdmission received non-meta object *unstructured.Unstructured
2021-07-09T06:36:53.531801993Z W0709 06:36:53.531689 17 admission.go:98] ClusterQuotaAdmission received non-meta object *unstructured.Unstructured
~~~
Can you suggest a workaround for this? If more information is required, please let me know and I'll update the bug.
Hello Team, Any update on this? Thanks
There is no workaround for the etcd client balancer gRPC logs in the apiserver; we are going to improve this in 4.9.

>2021-07-09T06:36:50.297990740Z I0709 06:36:50.297885 17 clientconn.go:948] ClientConn switching balancer to "pick_first"
>2021-07-09T06:36:50.300944216Z I0709 06:36:50.298272 17 balancer_conn_wrappers.go:78] pickfirstBalancer: HandleSubConnStateChange: 0xc02705cd60, {CONNECTING <nil>}
>2021-07-09T06:36:50.318796017Z I0709 06:36:50.313817 17 balancer_conn_wrappers.go:78] pickfirstBalancer: HandleSubConnStateChange: 0xc02705cd60, {READY <nil>}
>2021-07-09T06:36:50.318796017Z I0709 06:36:50.315980 17 controlbuf.go:508] transport: loopyWriter.run returning. connection error: desc = "transport is closing"

These lines are not the root cause of the CPU consumption. If you have issues around that, please open a new bug against apiserver.
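Until the fix lands, the noise can at least be dropped when reading saved logs. A minimal sketch, shown against a two-line inline sample (grep -vF inverts the match on a fixed string, keeping every line that does not contain it):

```shell
# Filter the loopyWriter spam out of a sample log excerpt,
# keeping only the substantive lines.
filtered=$(printf '%s\n' \
  'I0709 06:36:50.315980 17 controlbuf.go:508] transport: loopyWriter.run returning. connection error: desc = "transport is closing"' \
  'W0709 06:36:52.531817 17 admission.go:98] ClusterQuotaAdmission received non-meta object *unstructured.Unstructured' \
  | grep -vF 'loopyWriter.run returning')
echo "$filtered"
```

Against real logs this would be e.g. "oc logs <podname> -c kube-apiserver | grep -vF 'loopyWriter.run returning'"; this only hides the messages from view, it does not stop the apiserver from emitting them.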
The needinfo request[s] on this closed bug have been removed as they have been unresolved for 365 days