Bug 1487438

Summary: Conntrack table entry is not removed when UDP service is added after single pod was removed and added back
Product: OpenShift Container Platform
Component: Networking
Version: 3.6.0
Target Release: 3.7.0
Hardware: Unspecified
OS: Unspecified
Severity: urgent
Priority: unspecified
Status: CLOSED ERRATA
Type: Bug
Reporter: Ryan Howe <rhowe>
Assignee: Dan Winship <danw>
QA Contact: Meng Bo <bmeng>
CC: aos-bugs, bbennett, byount, eparis, mmariyan, nbhatt, xtian
Keywords: Reopened
Doc Type: Bug Fix
Doc Text:
Cause: Conntrack entries for UDP traffic were not cleared when an endpoint was added for a service that previously had no endpoints.
Consequence: The system could end up incorrectly caching a rule that would cause traffic to that service to be dropped rather than being sent to the new endpoint.
Fix: The relevant conntrack entries are now deleted at the right time.
Result: UDP services work correctly when endpoints are added and removed.
Clones: 1497767, 1497768
Bug Blocks: 1497767, 1497768
Last Closed: 2018-06-25 15:27:44 UTC

Description Ryan Howe 2017-08-31 23:06:15 UTC
Description of problem:

When the only pod backing a service is removed, UDP packets sent to the service IP create a conntrack table entry that is not cleaned up when the pod is added back to the cluster.


Version-Release number of selected component (if applicable):
3.5

How reproducible:
100% 

Steps to Reproduce:


Service IP = 172.25.5.107 

UDPSEND is on IP 10.0.2.198
UDPSINK is on IP 10.0.2.197
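
For context, the sender just fires periodic UDP datagrams at the service IP over a single socket, which is why a single conntrack entry with a fixed source port shows up below. A minimal Go sketch of what UDPSEND is assumed to do; the one-second interval and payload framing are guesses, only the address and port come from the transcript:

package main

import (
	"log"
	"net"
	"time"
)

func main() {
	// "Connect" the UDP socket to the service IP; this pins one local
	// source port for the lifetime of the process.
	conn, err := net.Dial("udp", "172.25.5.107:8000")
	if err != nil {
		log.Fatal(err)
	}
	defer conn.Close()

	// Each datagram traverses the kube-proxy iptables rules and
	// creates or refreshes a conntrack entry for this 5-tuple.
	for {
		if _, err := conn.Write([]byte("hello world\n")); err != nil {
			log.Println("send failed:", err)
		}
		time.Sleep(time.Second)
	}
}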



UDPSINK is seeing hello world messages:

$ oc logs udpsink-1-q507v -f
hello world
hello world
hello world

# conntrack -L -d 172.25.5.107
udp      17 29 src=10.0.2.198 dst=172.25.5.107 sport=35238 dport=8000 [UNREPLIED] src=10.0.2.197 dst=10.0.2.1 sport=8000 dport=35238 mark=0 secctx=system_u:object_r:unlabeled_t:s0 use=1


Delete the pod 

# oc delete po/udpsink-1-q507v
pod "udpsink-1-q507v" deleted

Continue making UDP connections to the (now endpoint-less) service IP.

The following is seen in the conntrack table: 

# conntrack -L -d 172.25.5.107
udp      17 29 src=10.0.2.198 dst=172.25.5.107 sport=35238 dport=8000 [UNREPLIED] src=172.25.5.107 dst=10.0.2.1 sport=8000 dport=35238 mark=0 secctx=system_u:object_r:unlabeled_t:s0 use=7

Bring the pod up again and the UDP packets that are sent still fail to arrive. Note that in the stale entry above the reply source is the service IP itself (172.25.5.107) rather than a pod IP: the entry was recorded while the service had no endpoint, so subsequent packets keep matching this dead flow instead of being DNAT'd to the new pod.

To correct this, the stale conntrack entry needs to be deleted:


# conntrack -D -d 172.25.5.107
udp      17 29 src=10.0.2.198 dst=172.25.5.107 sport=35238 dport=8000 [UNREPLIED] src=172.25.5.107 dst=10.0.2.1 sport=8000 dport=35238 mark=0 secctx=system_u:object_r:unlabeled_t:s0 use=9


After it is deleted, everything works again; the next packet creates a fresh entry with the new pod IP (10.0.3.187) as the reply source:

# conntrack -L -d 172.25.5.107
udp      17 29 src=10.0.2.198 dst=172.25.5.107 sport=35238 dport=8000 [UNREPLIED] src=10.0.3.187 dst=10.0.2.1 sport=8000 dport=35238 mark=0 secctx=system_u:object_r:unlabeled_t:s0 use=1
conntrack v1.4.4 (conntrack-tools): 1 flow entries have been shown.


Relevant kube-proxy code: https://github.com/kubernetes/kubernetes/blob/release-1.5/pkg/proxy/iptables/proxier.go

Comment 2 Dan Winship 2017-09-13 15:31:07 UTC
Upstream fix: https://github.com/kubernetes/kubernetes/pull/48524
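
In outline, the fix makes kube-proxy delete stale UDP conntrack entries when a service transitions from zero endpoints to at least one, by shelling out to the conntrack tool. A simplified Go sketch of the mechanism; this is not the PR's actual code, the helper name is made up, and the real implementation lives in kube-proxy's conntrack utility package with more error handling:

package main

import (
	"fmt"
	"os/exec"
	"strings"
)

// clearUDPConntrackForIP deletes conntrack flows whose original destination
// is the given service IP, so the next UDP datagram is re-evaluated against
// the freshly installed iptables rules instead of matching the stale flow.
func clearUDPConntrackForIP(serviceIP string) error {
	out, err := exec.Command("conntrack", "-D", "--orig-dst", serviceIP, "-p", "udp").CombinedOutput()
	if err != nil {
		// conntrack exits non-zero when nothing matched; that is fine here.
		if strings.Contains(string(out), "0 flow entries") {
			return nil
		}
		return fmt.Errorf("conntrack -D failed: %v: %s", err, out)
	}
	return nil
}

func main() {
	// Example: the service IP from this report.
	if err := clearUDPConntrackForIP("172.25.5.107"); err != nil {
		fmt.Println(err)
	}
}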

Comment 4 Dan Winship 2017-09-13 15:45:20 UTC
Fix: https://github.com/openshift/origin/pull/16328

Comment 7 Meng Bo 2017-09-27 11:06:31 UTC
Checked on OCP v3.7.0-0.127.0; the conntrack entry is now deleted immediately once the service endpoint is deleted.
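
A quick way to watch this happening is conntrack's event mode, which prints NEW/DESTROY events as they occur (reusing the service IP from the report):

# conntrack -E -p udp -d 172.25.5.107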

Marking the bug as verified.

Comment 8 Ben Bennett 2017-10-05 18:12:08 UTC
*** Bug 1486956 has been marked as a duplicate of this bug. ***

Comment 11 errata-xmlrpc 2017-11-28 22:09:17 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2017:3188