Bug 1843498

Summary:	network failure during e2e: connection reset by peer
Product:	OpenShift Container Platform	Reporter:	Ben Parees <bparees>
Component:	Networking	Assignee:	Ben Bennett <bbennett>
Networking sub component:	openshift-sdn	QA Contact:	zhaozhanqi <zzhao>
Status:	CLOSED DUPLICATE	Docs Contact:
Severity:	unspecified
Priority:	unspecified	CC:	aos-bugs, mfojtik, sttts
Version:	4.5
Target Milestone:	---
Target Release:	---
Hardware:	Unspecified
OS:	Unspecified
Whiteboard:
Fixed In Version:		Doc Type:	If docs needed, set a value
Doc Text:		Story Points:	---
Clone Of:		Environment:
Last Closed:	2020-06-05 09:59:01 UTC	Type:	Bug
Regression:	---	Mount Type:	---
Documentation:	---	CRM:
Verified Versions:		Category:	---
oVirt Team:	---	RHEL 7.3 requirements from Atomic Host:
Cloudforms Team:	---	Target Upstream Version:
Embargoed:

Description Ben Parees 2020-06-03 12:32:30 UTC

Seeing what appear to be network dropouts in many jobs across various releases:

https://search.apps.build01.ci.devcluster.openshift.com/?search=read%3A+connection+reset+by+peer&maxAge=168h&context=1&type=bug%2Bjunit&name=&maxMatches=5&maxBytes=20971520&groupBy=job

recent example seen in:
https://deck-ci.apps.ci.l2s4.p1.openshiftapps.com/view/gcs/origin-ci-test/logs/release-openshift-ocp-installer-e2e-openstack-4.5/707

networking appears to drop out:
fail [github.com/openshift/origin/test/extended/util/client.go:693]: Jun  1 08:29:49.561: Put https://api.ks2nzjv9-3c054.shiftstack.devcluster.openshift.com:6443/api/v1/namespaces/e2e-provisioning-9986: read tcp 172.16.14.183:35048->128.31.24.232:6443: read: connection reset by peer

Comment 1 Juan Luis de Sousa-Valadas 2020-06-04 13:46:47 UTC

Connection reset by peer means that this peer (in this case the client) has received a TCP RST. This ultimately means the other end (in this case the server) has decided to terminate the connection somewhat abruptly.
This can happen for several reasons: a non-recoverable error in the communication, such as receiving a malformed message; a problem in the implementation of the server; the server shutting down and closing its connections ungracefully; and so on.

Finally, this may or may not be related to an underlying network problem; usually it is not.
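
To illustrate the point that a reset does not require a network fault, here is a minimal sketch that provokes one over loopback. It is not taken from the failing job or from kube-apiserver; the function name provoke_reset is hypothetical, and it assumes a Linux/POSIX host where struct linger is two C ints. The server side sets SO_LINGER with a zero timeout so that close() aborts the connection with an RST instead of a normal FIN, and the client then sees the same class of error as in the test log.

```python
import socket
import struct
import threading


def provoke_reset() -> str:
    """Return the name of the exception the client sees after an abortive close."""
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))  # port 0: let the OS pick a free port
    listener.listen(1)
    port = listener.getsockname()[1]

    def serve() -> None:
        conn, _ = listener.accept()
        # l_onoff=1, l_linger=0: close() discards the connection and sends RST.
        # Assumption: POSIX layout of struct linger (two ints).
        conn.setsockopt(socket.SOL_SOCKET, socket.SO_LINGER, struct.pack("ii", 1, 0))
        conn.close()

    server = threading.Thread(target=serve)
    server.start()
    client = socket.create_connection(("127.0.0.1", port))
    server.join()  # the server has already aborted the connection at this point
    try:
        client.recv(1)  # the read fails with ECONNRESET
        return "no error"
    except ConnectionResetError as exc:
        return type(exc).__name__
    finally:
        client.close()
        listener.close()


if __name__ == "__main__":
    print(provoke_reset())
```

The network path here is a healthy loopback interface, so the reset comes entirely from how the server chose to close the socket. This is why server-side logs are usually the first place to look.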

I'm moving it to the kube-apiserver team so that they can investigate why they are resetting the connection in this particular case, but it's important to say that most of the matches in that search won't be related to the kube-apiserver at all.

Comment 3 Stefan Schimanski 2020-06-05 07:52:12 UTC

> I'm moving it to the kube-apiserver team so that they can investigate why they are resetting the connection in this particular case, but it's important to say that most of the matches in that search won't be related to the kube-apiserver at all.

Please do your homework and, at the very least, look into the logs before sending BZs around. It's not very effective to have another team do the same work again. If there is a hint that it's really the server side, sure, move it to the owning team.

Comment 4 Stefan Schimanski 2020-06-05 09:59:01 UTC


*** This bug has been marked as a duplicate of bug 1844384 ***