Bug 1859558 - Etcd proxy panic logs
Summary: Etcd proxy panic logs
Keywords:
Status: CLOSED DUPLICATE of bug 1815634
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: Etcd
Version: 4.3.z
Hardware: x86_64
OS: Linux
Priority: unspecified
Severity: medium
Target Milestone: ---
Target Release: ---
Assignee: Sam Batschelet
QA Contact: ge liu
URL:
Whiteboard:
Depends On:
Blocks:
 
Reported: 2020-07-22 13:07 UTC by Mangirdas Judeikis
Modified: 2020-07-22 13:53 UTC (History)

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2020-07-22 13:51:54 UTC
Target Upstream Version:
Embargoed:


Attachments
verbose log output (14.20 KB, text/plain)
2020-07-22 13:07 UTC, Mangirdas Judeikis

Description Mangirdas Judeikis 2020-07-22 13:07:33 UTC
Created attachment 1702073
verbose log output

Description of problem:

On ARO/Azure we see a huge number of panics in the logs coming from the etcd-proxy container. These are stale log entries that are printed into the log output again and again.
This is confusing and should be fixed, as it creates false positives when debugging other issues.


Version-Release number of selected component (if applicable):

4.3++

In 1 day we have around 4000 copies of the same panic message printed in the logs:

[root@aro-master-0 ~]# cat 1_day.log | grep " http: panic serving 10.11.40.6:4143" | wc -l
3991
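
The same count can be broken down per client address, which may help show whether a single peer accounts for all the repeats. This is a minimal sketch in Python, assuming the log is a plain-text file whose lines contain the "http: panic serving <ip>:<port>" prefix used in the grep above. The function name count_panics is hypothetical and not part of any tool mentioned in this bug.

```python
import re
from collections import Counter

# Matches the "http: panic serving <ip>:<port>" prefix used in the grep above.
PANIC_RE = re.compile(r"http: panic serving (\d{1,3}(?:\.\d{1,3}){3}):(\d+)")


def count_panics(lines):
    """Return a Counter mapping "ip:port" to the number of matching lines.

    Like `grep | wc -l`, each line is counted at most once.
    """
    counts = Counter()
    for line in lines:
        match = PANIC_RE.search(line)
        if match:
            counts[f"{match.group(1)}:{match.group(2)}"] += 1
    return counts
```

For example, opening 1_day.log and passing the file object to count_panics(...) yields a Counter; its most_common() method lists the addresses with the most panic lines first.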


The log from July 20 contains entries from 2020/06/30:

Jul 22 12:35:42 aro-master-0 hyperkube[1743]: I0722 12:35:42.197045    1743 kubelet_getters.go:172] status for pod etcd-member-aro-master-0 updated to {Running [{Initialized True 0001-01-01 00:00:00 +0000 UTC 2020-06-30 21:38:29 +0000 UTC  } {Ready True 0001-01-01 00:00:00 +0000 UTC 2020-06-30 21:38:33 +0000 UTC  } {
ContainersReady True 0001-01-01 00:00:00 +0000 UTC 2020-06-30 21:38:33 +0000 UTC  } {PodScheduled True 0001-01-01 00:00:00 +0000 UTC 2020-04-29 16:38:45 +0000 UTC  }]    10.43.147.9 10.43.147.9 [{10.43.147.9}] 2020-04-29 16:38:45 +0000 UTC [{discovery {nil nil &ContainerStateTerminated{ExitCode:0,Signal:0,Reason:Comp
leted,Message:,StartedAt:2020-06-30 21:38:25 +0000 UTC,FinishedAt:2020-06-30 21:38:26 +0000 UTC,ContainerID:cri-o://8117166a23a65203301b7d56d99cec6d5c50ea570acddf472c5a854f504c2cd6,}} {nil nil nil} true 1 quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:ba1dd635fdceaa22b4a26fae0af6abc73e0ce669420d21689e219b70c58
86c0b quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:ba1dd635fdceaa22b4a26fae0af6abc73e0ce669420d21689e219b70c5886c0b cri-o://8117166a23a65203301b7d56d99cec6d5c50ea570acddf472c5a854f504c2cd6 <nil>} {certs {nil nil &ContainerStateTerminated{ExitCode:0,Signal:0,Reason:Completed,Message:,StartedAt:2020-06-30 21:3
8:28 +0000 UTC,FinishedAt:2020-06-30 21:38:28 +0000 UTC,ContainerID:cri-o://01e459a51a9ecb153d72fc98f76f9bcd298566440ed692a72a2067e6e70195a8,}} {nil nil nil} true 0 quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:01fa6f2e9ae6d9265b8623ef86640ada996bb461fb6e2472c9cf08d0804d6474 quay.io/openshift-release-dev/ocp-
v4.0-art-dev@sha256:01fa6f2e9ae6d9265b8623ef86640ada996bb461fb6e2472c9cf08d0804d6474 cri-o://01e459a51a9ecb153d72fc98f76f9bcd298566440ed692a72a2067e6e70195a8 <nil>}] [{etcd-member {nil &ContainerStateRunning{StartedAt:2020-06-30 21:38:30 +0000 UTC,} nil} {nil nil &ContainerStateTerminated{ExitCode:0,Signal:0,Reason:C
ompleted,Message:,StartedAt:2020-06-30 21:36:02 +0000 UTC,FinishedAt:2020-06-30 21:36:25 +0000 UTC,ContainerID:cri-o://b979f9c61b4c10b93dc1a0354cd96c709e3ff6ee9f847390a2207d46bbce9470,}} true 1 quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:bbd0ba73cca161d37f3d4beb1a4bf0761e499cdf20c6e3a435e579e8982105bf quay.
io/openshift-release-dev/ocp-v4.0-art-dev@sha256:bbd0ba73cca161d37f3d4beb1a4bf0761e499cdf20c6e3a435e579e8982105bf cri-o://7c2fa4b014624e7208da101ae3057913120d153ca3f3ff00ca8bd0a05cc74629 0xc000f2491a} {etcd-metrics {nil &ContainerStateRunning{StartedAt:2020-06-30 21:38:32 +0000 UTC,} nil} {nil nil &ContainerStateTerm
inated{ExitCode:2,Signal:0,Reason:Error,Message:ile dialing dial tcp 10.43.147.9:9978: connect: connection refused". Reconnecting...                                                                                                                                                                                          
Jul 22 12:35:42 aro-master-0 hyperkube[1743]: WARNING: 2020/06/30 21:36:04 grpc: addrConn.createTransport failed to connect to {https://etcd-0.3b15jxwb.eastus.aroapp.io:9978 0  <nil>}. Err :connection error: desc = "transport: Error while dialing dial tcp 10.43.147.9:9978: connect: connection refused". Reconnecting..
.                                                                                                                                                                                                                                                                                                                             
Jul 22 12:35:42 aro-master-0 hyperkube[1743]: WARNING: 2020/06/30 21:36:05 grpc: addrConn.createTransport failed to connect to {https://etcd-0.3b15jxwb.eastus.aroapp.io:9978 0  <nil>}. Err :connection error: desc = "transport: Error while dialing dial tcp 10.43.147.9:9978: connect: connection refused". Reconnecting..
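
The stale entries in the excerpt above can be spotted mechanically by comparing the journal prefix (for example "Jul 22 12:35:42") with the timestamp embedded in the message (for example "2020/06/30 21:36:04"). This is a rough sketch, assuming that line layout, an English locale for month names, and a caller-supplied year, since the journal prefix carries none. The name find_stale and the one-hour default threshold are illustrative assumptions.

```python
import re
from datetime import datetime, timedelta

# Journal prefix such as "Jul 22 12:35:42" (no year) and an embedded
# timestamp such as "2020/06/30 21:36:04", as seen in the excerpt above.
JOURNAL_RE = re.compile(r"^([A-Z][a-z]{2} +\d{1,2} \d{2}:\d{2}:\d{2}) ")
EMBEDDED_RE = re.compile(r"\b(\d{4}/\d{2}/\d{2} \d{2}:\d{2}:\d{2})\b")


def find_stale(lines, year, max_age=timedelta(hours=1)):
    """Yield (journal_time, embedded_time, line) for each line whose embedded
    timestamp is older than its journal timestamp by more than max_age.

    Lines missing either timestamp are skipped.
    """
    for line in lines:
        journal = JOURNAL_RE.match(line)
        embedded = EMBEDDED_RE.search(line)
        if not (journal and embedded):
            continue
        # %b is locale-dependent; this assumes English month abbreviations.
        journal_time = datetime.strptime(
            f"{year} {journal.group(1)}", "%Y %b %d %H:%M:%S"
        )
        embedded_time = datetime.strptime(embedded.group(1), "%Y/%m/%d %H:%M:%S")
        if journal_time - embedded_time > max_age:
            yield journal_time, embedded_time, line
```

Applied to the WARNING lines above with year=2020, this would flag each entry, since the embedded time is about three weeks older than the journal time.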

Comment 1 Sam Batschelet 2020-07-22 13:48:32 UTC
> 4.3++

This is fixed in 4.4, and a new etcd bump to 4.3 is on its way.

Comment 2 Sam Batschelet 2020-07-22 13:51:54 UTC

*** This bug has been marked as a duplicate of bug 1815634 ***

Comment 3 Sam Batschelet 2020-07-22 13:53:02 UTC
For context, the panic is the result of failing to connect to the etcd container, probably because it is not running. The result is a retry.
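
To illustrate the connect-and-retry pattern described here, the following is a generic sketch only; it is not the etcd proxy's actual code. It probes a TCP endpoint and retries on failure. The function name wait_for_port, its parameters, and the fixed delay are assumptions made for illustration.

```python
import socket
import time


def wait_for_port(host, port, attempts=5, delay=1.0,
                  connect=socket.create_connection, sleep=time.sleep):
    """Try to open a TCP connection to (host, port), retrying on failure.

    Returns the 1-based number of the attempt that succeeded, or None if
    every attempt failed. `connect` and `sleep` are injectable for testing.
    """
    for attempt in range(1, attempts + 1):
        try:
            conn = connect((host, port), timeout=2.0)
        except OSError:
            # ConnectionRefusedError is an OSError subclass; it is what a
            # stopped listener produces ("connection refused" in the log).
            if attempt < attempts:
                sleep(delay)
            continue
        conn.close()
        return attempt
    return None
```

Called with the address from the log excerpt (10.43.147.9, port 9978), a None result would suggest nothing is listening there, which is consistent with the container not running.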

