Bug 1454928
| Summary: | NodePort with UDP does not distribute traffic | | |
|---|---|---|---|
| Product: | OpenShift Container Platform | Reporter: | Steven Walter <stwalter> |
| Component: | Networking | Assignee: | Ben Bennett <bbennett> |
| Status: | CLOSED NOTABUG | QA Contact: | Meng Bo <bmeng> |
| Severity: | unspecified | Docs Contact: | |
| Priority: | unspecified | | |
| Version: | 3.5.0 | CC: | aos-bugs |
| Target Milestone: | --- | | |
| Target Release: | --- | | |
| Hardware: | Unspecified | | |
| OS: | Unspecified | | |
| Whiteboard: | | | |
| Fixed In Version: | | Doc Type: | If docs needed, set a value |
| Doc Text: | | Story Points: | --- |
| Clone Of: | | Environment: | |
| Last Closed: | 2017-05-25 13:46:20 UTC | Type: | Bug |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | | Category: | --- |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | | | |
|
Description
Steven Walter
2017-05-23 19:02:31 UTC
From customer: To reproduce, create a service of type NodePort with protocol UDP and no session affinity. Then start two pods on different application nodes that listen on a UDP port. I happen to be using sflowtool from InMon, but you could just as easily run a copy of netcat or even a simple Perl script that listens on a network port.

At this point you can send a UDP stream into the cluster to any node, and the NodePort magic kicks in, doing the port translation and directing the traffic to a pod. For my test I happen to send traffic into the cluster via one of the infra nodes, but in reality I could send it to any app node or even the masters. You will then notice that only one pod is receiving traffic, even though there are two pods running and traffic should be randomly sent to both.

Then do something that makes the pod that is taking traffic stop (kill the pod, or scale the DC down to fewer replicas so that the 'working' pod is removed). As soon as the pod that was taking all the traffic stops, even though there are other working pods in the service, the node you are directing traffic to (one of the infra nodes in my test) starts sending ICMP unreachable messages back toward the device that is trying to send traffic into the cluster.

In order to send related UDP traffic to the same backing pod, we use conntrack for UDP in iptables. That means UDP packets from the same source IP:port to the service IP:port will go to the same backend until the UDP conntrack entry times out, 180 seconds after the last traffic. http://www.iptables.info/en/connection-state.html#UDPCONNECTIONS

If you want the sessions to be treated as "different", you need to vary the source port being used. I agree that this is not great, but what we can do for UDP services is limited by the constraints of the protocol.
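The workaround described above (varying the source port so conntrack treats each stream as a separate flow) can be sketched with a small Python sender. This is an illustration, not part of the original report; the node address, NodePort number, and payload are placeholders, and `send_from_distinct_ports` is a hypothetical helper name.

```python
import socket


def send_from_distinct_ports(host, port, payload, flows=4):
    """Send the same payload from several distinct source ports.

    Each socket is bound to its own ephemeral port, so conntrack sees
    each datagram as a separate UDP "connection" and the iptables rules
    behind the NodePort can pick a different backend pod per flow.
    Returns the list of source ports used.
    """
    source_ports = []
    for _ in range(flows):
        s = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
        try:
            s.bind(("", 0))  # port 0: let the kernel pick a free port
            s.sendto(payload, (host, port))
            source_ports.append(s.getsockname()[1])
        finally:
            s.close()
    return source_ports


if __name__ == "__main__":
    # "127.0.0.1" and 30999 stand in for a node IP and its NodePort.
    ports = send_from_distinct_ports("127.0.0.1", 30999, b"sflow-sample")
    print(sorted(ports))
```

Reusing one socket (and therefore one source port) for all datagrams would keep every packet on the same conntrack entry, so all traffic would stay pinned to one pod until the 180-second UDP timeout expires.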