Created attachment 1702073 [details]
verbose log output

Description of problem:
On ARO/Azure we see a huge number of panics in the logs coming from the etcd-proxy container. These are stale log entries that are printed again and again into the log output. This is confusing and should be fixed, as it creates false positives when debugging other issues.

Version-Release number of selected component (if applicable):
4.3++

In 1 day we have around 4000 copies of the same panic message printed in the logs:

[root@aro-master-0 ~]# cat 1_day.log | grep " http: panic serving 10.11.40.6:4143" | wc -l
3991

Log from July 20 contains logs from 2020/06/30:

Jul 22 12:35:42 aro-master-0 hyperkube[1743]: I0722 12:35:42.197045 1743 kubelet_getters.go:172] status for pod etcd-member-aro-master-0 updated to {Running [{Initialized True 0001-01-01 00:00:00 +0000 UTC 2020-06-30 21:38:29 +0000 UTC } {Ready True 0001-01-01 00:00:00 +0000 UTC 2020-06-30 21:38:33 +0000 UTC } {ContainersReady True 0001-01-01 00:00:00 +0000 UTC 2020-06-30 21:38:33 +0000 UTC } {PodScheduled True 0001-01-01 00:00:00 +0000 UTC 2020-04-29 16:38:45 +0000 UTC }] 10.43.147.9 10.43.147.9 [{10.43.147.9}] 2020-04-29 16:38:45 +0000 UTC [{discovery {nil nil &ContainerStateTerminated{ExitCode:0,Signal:0,Reason:Completed,Message:,StartedAt:2020-06-30 21:38:25 +0000 UTC,FinishedAt:2020-06-30 21:38:26 +0000 UTC,ContainerID:cri-o://8117166a23a65203301b7d56d99cec6d5c50ea570acddf472c5a854f504c2cd6,}} {nil nil nil} true 1 quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:ba1dd635fdceaa22b4a26fae0af6abc73e0ce669420d21689e219b70c5886c0b quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:ba1dd635fdceaa22b4a26fae0af6abc73e0ce669420d21689e219b70c5886c0b cri-o://8117166a23a65203301b7d56d99cec6d5c50ea570acddf472c5a854f504c2cd6 <nil>} {certs {nil nil &ContainerStateTerminated{ExitCode:0,Signal:0,Reason:Completed,Message:,StartedAt:2020-06-30 21:38:28 +0000 UTC,FinishedAt:2020-06-30 21:38:28 +0000 UTC,ContainerID:cri-o://01e459a51a9ecb153d72fc98f76f9bcd298566440ed692a72a2067e6e70195a8,}} {nil nil nil} true 0 quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:01fa6f2e9ae6d9265b8623ef86640ada996bb461fb6e2472c9cf08d0804d6474 quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:01fa6f2e9ae6d9265b8623ef86640ada996bb461fb6e2472c9cf08d0804d6474 cri-o://01e459a51a9ecb153d72fc98f76f9bcd298566440ed692a72a2067e6e70195a8 <nil>}] [{etcd-member {nil &ContainerStateRunning{StartedAt:2020-06-30 21:38:30 +0000 UTC,} nil} {nil nil &ContainerStateTerminated{ExitCode:0,Signal:0,Reason:Completed,Message:,StartedAt:2020-06-30 21:36:02 +0000 UTC,FinishedAt:2020-06-30 21:36:25 +0000 UTC,ContainerID:cri-o://b979f9c61b4c10b93dc1a0354cd96c709e3ff6ee9f847390a2207d46bbce9470,}} true 1 quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:bbd0ba73cca161d37f3d4beb1a4bf0761e499cdf20c6e3a435e579e8982105bf quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:bbd0ba73cca161d37f3d4beb1a4bf0761e499cdf20c6e3a435e579e8982105bf cri-o://7c2fa4b014624e7208da101ae3057913120d153ca3f3ff00ca8bd0a05cc74629 0xc000f2491a} {etcd-metrics {nil &ContainerStateRunning{StartedAt:2020-06-30 21:38:32 +0000 UTC,} nil} {nil nil &ContainerStateTerminated{ExitCode:2,Signal:0,Reason:Error,Message:ile dialing dial tcp 10.43.147.9:9978: connect: connection refused". Reconnecting...
Jul 22 12:35:42 aro-master-0 hyperkube[1743]: WARNING: 2020/06/30 21:36:04 grpc: addrConn.createTransport failed to connect to {https://etcd-0.3b15jxwb.eastus.aroapp.io:9978 0 <nil>}. Err :connection error: desc = "transport: Error while dialing dial tcp 10.43.147.9:9978: connect: connection refused". Reconnecting...
Jul 22 12:35:42 aro-master-0 hyperkube[1743]: WARNING: 2020/06/30 21:36:05 grpc: addrConn.createTransport failed to connect to {https://etcd-0.3b15jxwb.eastus.aroapp.io:9978 0 <nil>}. Err :connection error: desc = "transport: Error while dialing dial tcp 10.43.147.9:9978: connect: connection refused". Reconnecting..
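To gauge how much of a journal dump consists of such stale entries, something like the following sketch may help. It compares the journald date at the start of each line with the date embedded in the message (the `2020/06/30 21:36:04` form seen in the grpc warnings above). The function name `stale_lines`, the `year` argument (journald's short timestamp carries no year), and the one-day threshold are assumptions made for illustration, not part of any existing tool.

```python
import re
from datetime import date

# journald short prefix, e.g. "Jul 22 12:35:42 aro-master-0 hyperkube[1743]: <message>"
JOURNAL_RE = re.compile(
    r"^(?P<mon>[A-Z][a-z]{2})\s+(?P<day>\d{1,2}) \d{2}:\d{2}:\d{2} \S+ \S+: (?P<msg>.*)$"
)
# timestamp written by the container itself, e.g. "2020/06/30 21:36:04"
EMBEDDED_RE = re.compile(r"(?P<y>\d{4})/(?P<m>\d{2})/(?P<d>\d{2}) \d{2}:\d{2}:\d{2}")

MONTHS = {
    "Jan": 1, "Feb": 2, "Mar": 3, "Apr": 4, "May": 5, "Jun": 6,
    "Jul": 7, "Aug": 8, "Sep": 9, "Oct": 10, "Nov": 11, "Dec": 12,
}


def stale_lines(lines, year, max_age_days=1):
    """Yield (age_in_days, line) for lines whose embedded date lags the journal date.

    `year` must be supplied by the caller because the journald short
    format does not include it (assumption: all lines share one year).
    """
    for line in lines:
        journal = JOURNAL_RE.match(line)
        if journal is None:
            continue  # not a journald-style line
        embedded = EMBEDDED_RE.search(journal["msg"])
        if embedded is None:
            continue  # message carries no YYYY/MM/DD timestamp
        journal_date = date(year, MONTHS[journal["mon"]], int(journal["day"]))
        embedded_date = date(int(embedded["y"]), int(embedded["m"]), int(embedded["d"]))
        age = (journal_date - embedded_date).days
        if age > max_age_days:
            yield age, line
```

For example, `sum(1 for _ in stale_lines(open("1_day.log"), year=2020))` would count the stale entries in the file used in the grep above. Lines such as the kubelet status message, which use a different embedded date format, are skipped by this sketch.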
> 4.3++

This is fixed in 4.4, and a new etcd bump for 4.3 is on its way.
*** This bug has been marked as a duplicate of bug 1815634 ***
For context, the panic is a result of failing to connect to the etcd container, probably because it is not running. The result is a retry.