Bug 1762298

Summary: kube-proxy fix for spurious connection resets
Product: OpenShift Container Platform Reporter: Dan Winship <danw>
Component: Networking Assignee: Dan Winship <danw>
Networking sub component: openshift-sdn QA Contact: zhaozhanqi <zzhao>
Status: CLOSED ERRATA Docs Contact:
Severity: unspecified    
Priority: unspecified CC: vjaypurk
Version: 3.11.0   
Target Milestone: ---   
Target Release: 4.3.0   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version: Doc Type: Bug Fix
Doc Text:
Cause: In clusters with high network traffic where some packets are being dropped, a previously working connection to a service might suddenly fail with a "Connection reset by peer" error.
Consequence: Clients would need to reconnect and retry; transferring large amounts of data might be difficult.
Fix: The iptables rules were updated to handle TCP retransmits correctly.
Result: Connections that have been successfully established remain established until they are closed.
Story Points: ---
Clone Of:
: 1762300 1762301 1768436 Environment:
Last Closed: 2020-01-23 11:07:48 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Bug Depends On:    
Bug Blocks: 1762300, 1762301, 1768436    

Description Dan Winship 2019-10-16 13:06:07 UTC
https://github.com/kubernetes/kubernetes/pull/74840 fixed a problem in kube-proxy where, if certain TCP ACKs got lost (e.g., because heavy traffic was causing some packets to be dropped), the connection might be spuriously closed when kube-proxy received an unexpected retransmit.

There's no easy way to test this, but it's a simple patch that appears to have fixed the problem for the original filers. The fix went into kube 1.15 and hasn't caused problems for anyone else, and we now have a customer who would like it backported.

Comment 2 Dan Winship 2019-10-16 18:51:29 UTC
Weird. I'd swear I set this to Version: 3.11.0, Target Release 4.3.0, but somehow it ended up with Version: 3.9.0, Target Release 4.2.z.

Comment 3 Dan Winship 2019-11-04 12:46:06 UTC
This is already fixed in 4.3, and there is no useful QE that can be done (other than verifying that it didn't break anything else, which has implicitly already happened since the fix has always been in 4.3).

Comment 5 errata-xmlrpc 2020-01-23 11:07:48 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2020:0062