1843498 – network failure during e2e: connection reset by peer

Keywords:
Status:	CLOSED DUPLICATE of bug 1844384
Alias:	None
Product:	OpenShift Container Platform
Classification:	Red Hat
Component:	Networking
Sub Component:
Version:	4.5
Hardware:	Unspecified
OS:	Unspecified
Priority:	unspecified
Severity:	unspecified
Target Milestone:	---
Target Release:	---
Assignee:	Ben Bennett
QA Contact:	zhaozhanqi
Docs Contact:
URL:
Whiteboard:
Depends On:
Blocks:

Reported:	2020-06-03 12:32 UTC by Ben Parees
Modified:	2020-06-05 09:59 UTC
CC List:	3 users
Fixed In Version:
Doc Type:	If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed:	2020-06-05 09:59:01 UTC
Target Upstream Version:
Embargoed:

Description Ben Parees 2020-06-03 12:32:30 UTC

Seeing what appear to be network dropouts in many jobs across various releases:

https://search.apps.build01.ci.devcluster.openshift.com/?search=read%3A+connection+reset+by+peer&maxAge=168h&context=1&type=bug%2Bjunit&name=&maxMatches=5&maxBytes=20971520&groupBy=job

recent example seen in:
https://deck-ci.apps.ci.l2s4.p1.openshiftapps.com/view/gcs/origin-ci-test/logs/release-openshift-ocp-installer-e2e-openstack-4.5/707

networking appears to drop out:
fail [github.com/openshift/origin/test/extended/util/client.go:693]: Jun  1 08:29:49.561: Put https://api.ks2nzjv9-3c054.shiftstack.devcluster.openshift.com:6443/api/v1/namespaces/e2e-provisioning-9986: read tcp 172.16.14.183:35048->128.31.24.232:6443: read: connection reset by peer

Comment 1 Juan Luis de Sousa-Valadas 2020-06-04 13:46:47 UTC

Connection reset by peer means that this peer (in this case the client) has received a TCP RST. This ultimately means the other end (in this case the server) has decided to terminate the connection somewhat abruptly.
This can happen for several reasons: a non-recoverable error in the communication, such as receiving a malformed message; a problem in the server's implementation; the server shutting down and closing its connections ungracefully; and so on.

Finally, this may or may not be related to an underlying network problem; usually it is not.

I'm moving it to the kube-apiserver team so that they can investigate why they are resetting the connection in this particular case, but it's important to say that most of the results matching that search won't be related to the kube-apiserver at all.

Comment 3 Stefan Schimanski 2020-06-05 07:52:12 UTC

> I'm moving it to the kube-apiserver team so that they can investigate why they are resetting the connection in this particular case, but it's important to say that most of the results matching that search won't be related to the kube-apiserver at all.

Please do your homework and at the very least look into the logs before sending BZs around. It's not very effective to have another team do the same work again. If there is a hint that it's really the server side, sure, move it to the owning team.

Comment 4 Stefan Schimanski 2020-06-05 09:59:01 UTC


*** This bug has been marked as a duplicate of bug 1844384 ***
