1679260 – Conntrack rule for UDP traffic is not removed when using NodePort and externalIPs


Keywords:
Status:	CLOSED ERRATA
Alias:	None
Product:	OpenShift Container Platform
Classification:	Red Hat
Component:	Networking
Sub Component:
Version:	3.11.0
Hardware:	Unspecified
OS:	Unspecified
Priority:	high
Severity:	unspecified
Target Milestone:	---
Target Release:	4.1.0
Assignee:	Casey Callendrello
QA Contact:	zhaozhanqi
Docs Contact:
URL:
Whiteboard:
Depends On:
Blocks:

Reported:	2019-02-20 18:17 UTC by Juan Luis de Sousa-Valadas
Modified:	2023-09-07 19:45 UTC
CC List:	11 users
Fixed In Version:
Doc Type:	If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed:	2019-06-04 10:44:14 UTC
Target Upstream Version:
Embargoed:

Links
System	ID	Priority	Status	Summary	Last Updated
Github	kubernetes/kubernetes pull 75265	None	None	None	2020-08-05 08:49:44 UTC
Origin (Github)	22345	None	None	None	2019-03-28 15:05:21 UTC
Origin (Github)	22366	None	None	None	2019-05-10 15:24:23 UTC
Red Hat Product Errata	RHBA-2019:0758	None	None	None	2019-06-04 10:44:20 UTC

Description Juan Luis de Sousa-Valadas 2019-02-20 18:17:32 UTC

Description of problem:

The pod is restarted and gets a new pod IP. There is a long-lived stream of UDP packets to the NodePort created for this pod. The old conntrack table entry, which points to the pod's old IP, is never cleaned up.


Version-Release number of selected component (if applicable):
v3.11.69

How reproducible:
Always

Steps to Reproduce:
1. Create a service that balances UDP traffic, with nodePort and externalIPs configured
2. Delete the pod that is receiving the traffic
3. conntrack -L | grep podIP

Actual results:
udp      17 28 src=<ip outside ocp> dst=<service externalIP> sport=55212 dport=<serviceport> [UNREPLIED] src=<podIP/endpoint> dst=<tun0 interface of the node with the externalIP> sport=<serviceport> dport=55212 mark=0 secctx=system_u:object_r:unlabeled_t:s0 use=7
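
To check such an entry programmatically rather than with grep, the line can be split into its original and reply tuples. The following is a minimal sketch, not part of kube-proxy: the helper names parseTuples and pointsAt are hypothetical, and all addresses and ports in main are made-up illustrative values standing in for the placeholders above.

```go
package main

import (
	"fmt"
	"strings"
)

// tuple is one direction (original or reply) of a conntrack entry.
type tuple struct {
	Src, Dst, Sport, Dport string
}

// parseTuples is a hypothetical helper: it splits one line of `conntrack -L`
// output into its original and reply tuples. ok is true only when exactly
// two src= fields were found.
func parseTuples(line string) (orig, reply tuple, ok bool) {
	tuples := []*tuple{&orig, &reply}
	idx := -1
	for _, field := range strings.Fields(line) {
		kv := strings.SplitN(field, "=", 2)
		if len(kv) != 2 {
			continue // protocol name, timeout, or a flag such as [UNREPLIED]
		}
		if kv[0] == "src" {
			idx++ // every src= starts a new tuple
		}
		if idx < 0 || idx >= len(tuples) {
			continue // key=value before the first tuple, or an unexpected third tuple
		}
		switch kv[0] {
		case "src":
			tuples[idx].Src = kv[1]
		case "dst":
			tuples[idx].Dst = kv[1]
		case "sport":
			tuples[idx].Sport = kv[1]
		case "dport":
			tuples[idx].Dport = kv[1]
		}
	}
	return orig, reply, idx == 1
}

// pointsAt reports whether the reply tuple of the entry has podIP as source,
// i.e. whether the entry still DNATs traffic to that pod.
func pointsAt(line, podIP string) bool {
	_, reply, ok := parseTuples(line)
	return ok && reply.Src == podIP
}

func main() {
	// Illustrative values only; none of these addresses come from the report.
	line := "udp      17 28 src=198.51.100.7 dst=192.0.2.10 sport=55212 dport=5000 " +
		"[UNREPLIED] src=10.128.0.15 dst=10.128.0.1 sport=5000 dport=55212 mark=0 use=7"
	oldPodIP := "10.128.0.15"
	fmt.Println("entry still points at old pod IP:", pointsAt(line, oldPodIP))
}
```

An entry for which pointsAt returns true after the pod with that IP has been deleted is the stale entry described in this report.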

Expected results:

The conntrack rule is cleared.

Additional info:
Related bug https://bugzilla.redhat.com/show_bug.cgi?id=1659204

I haven't tested it, but this is the relevant function:
func ClearEntriesForNAT(execer exec.Interface, origin, dest string, protocol v1.Protocol) error {
        parameters := parametersWithFamily(utilnet.IsIPv6String(origin), "-D", "--orig-dst", origin, "--dst-nat", dest,
                "-p", protoStr(protocol))
        err := Exec(execer, parameters...)
        if err != nil && !strings.Contains(err.Error(), NoConnectionToDelete) {
                // TODO: Better handling for deletion failure. When failure occur, stale udp connection may not get flushed.
                // These stale udp connection will keep black hole traffic. Making this a best effort operation for now, since it
                // is expensive to baby sit all udp connections to kubernetes services.
                return fmt.Errorf("error deleting conntrack entries for UDP peer {%s, %s}, error: %v", origin, dest, err)
        }
        return nil
}

I understand we could fix it by simply replacing:

        parameters := parametersWithFamily(utilnet.IsIPv6String(origin), "-D", "--orig-dst", origin, "--dst-nat", dest,

with:
        parameters := parametersWithFamily(utilnet.IsIPv6String(origin), "-D", "--orig-dst", origin, "--reply-src", dest,
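
To make the difference between the two variants concrete, here is a small standalone sketch that only builds the conntrack argument lists and prints them; it does not execute conntrack. The helper deleteArgs is hypothetical and merely imitates the shape of the call above (including an assumed "-f ipv6" suffix for IPv6 origins), and the addresses are illustrative. Like the suggestion itself, it is untested against a real cluster.

```go
package main

import (
	"fmt"
	"net"
	"strings"
)

// deleteArgs is a hypothetical stand-in for the argument construction in
// ClearEntriesForNAT. With useReplySrc=false it matches on --dst-nat (the
// current behaviour quoted above); with useReplySrc=true it matches on
// --reply-src (the change proposed in this report).
func deleteArgs(origin, dest, proto string, useReplySrc bool) []string {
	matchFlag := "--dst-nat"
	if useReplySrc {
		matchFlag = "--reply-src"
	}
	args := []string{"-D", "--orig-dst", origin, matchFlag, dest, "-p", proto}
	// Assumption: IPv6 origins need the address family spelled out.
	if ip := net.ParseIP(origin); ip != nil && ip.To4() == nil {
		args = append(args, "-f", "ipv6")
	}
	return args
}

func main() {
	// Illustrative addresses: a service IP and the old pod IP.
	origin, dest := "192.0.2.10", "10.128.0.15"
	fmt.Println("current:  conntrack " + strings.Join(deleteArgs(origin, dest, "udp", false), " "))
	fmt.Println("proposed: conntrack " + strings.Join(deleteArgs(origin, dest, "udp", true), " "))
}
```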

Comment 13 Weibin Liang 2019-04-04 17:56:07 UTC

Verified in v3.11.100; testing passed.


Setup: one service with three endpoints, configured with a NodePort and an externalIP.

Tested with the endpoint count going from 3 -> 2 -> 3 and from 1 -> 0 -> 1.
The old conntrack table entry was deleted and the new conntrack table entry
pointing to the new IP of the pod was created.
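
The transitions exercised in this test can be modelled as set differences between consecutive endpoint lists: any IP that disappears between two steps is one whose conntrack entries should be gone afterwards. The sketch below assumes that model; removedEndpoints is a hypothetical helper, the pod IPs are invented, and it is not the code that was verified.

```go
package main

import (
	"fmt"
	"sort"
)

// removedEndpoints returns, in sorted order, the IPs that are present in
// before but absent from after.
func removedEndpoints(before, after []string) []string {
	current := make(map[string]bool, len(after))
	for _, ip := range after {
		current[ip] = true
	}
	var gone []string
	for _, ip := range before {
		if !current[ip] {
			gone = append(gone, ip)
		}
	}
	sort.Strings(gone)
	return gone
}

func main() {
	// Illustrative 3 -> 2 -> 3 sequence; the replacement pod gets a new IP.
	steps := [][]string{
		{"10.128.0.15", "10.128.0.16", "10.129.0.8"},
		{"10.128.0.16", "10.129.0.8"},
		{"10.128.0.16", "10.129.0.8", "10.128.0.22"},
	}
	for i := 1; i < len(steps); i++ {
		fmt.Printf("step %d: entries expected to be removed for %v\n",
			i, removedEndpoints(steps[i-1], steps[i]))
	}
}
```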

Comment 16 errata-xmlrpc 2019-06-04 10:44:14 UTC

Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2019:0758
