Description of problem: A traffic listener pod is created and exposed through a service of type NodePort. Traffic is sent from a client pod to the exposed NodePort on each node IP in turn. Every node except the one the client pod runs on (the source of the traffic) shows an UNREPLIED entry in the conntrack table. The client pod is just a ping pod used to send traffic.
All nodes are supposed to proxy the exposed service because its type is NodePort.
$ oc get pods
NAME           READY   STATUS    RESTARTS   AGE
hello-pod      1/1     Running   0          22h   <<<<Ping pod
udp-rc-lcbst   1/1     Running   0          51m   <<<<Traffic listener pod
$ oc get svc
NAME           TYPE       CLUSTER-IP       EXTERNAL-IP   PORT(S)          AGE
udp-rc-lcbst   NodePort   172.30.154.219   <none>        8080:31963/UDP   105m
$ sudo podman run --rm --network host --privileged docker.io/aosqe/conntrack-tool conntrack -L | grep 31963
udp 17 5 src=172.31.130.146 dst=172.31.139.127 sport=34999 dport=31963 [UNREPLIED] src=172.31.139.127 dst=172.31.130.146 sport=31963 dport=34999 mark=0 secctx=system_u:object_r:unlabeled_t:s0 use=1
udp 17 12 src=172.31.130.146 dst=172.31.159.254 sport=52167 dport=31963 [UNREPLIED] src=172.31.159.254 dst=172.31.130.146 sport=31963 dport=52167 mark=0 secctx=system_u:object_r:unlabeled_t:s0 use=1
udp 17 20 src=172.31.130.146 dst=184.108.40.206 sport=37556 dport=31963 [UNREPLIED] src=220.127.116.11 dst=172.31.130.146 sport=31963 dport=37556 mark=0 secctx=system_u:object_r:unlabeled_t:s0 use=1
udp 17 149 src=172.31.130.146 dst=172.31.130.146 sport=58178 dport=31963 src=10.129.2.23 dst=10.128.2.1 sport=8080 dport=58178 [ASSURED] mark=0 secctx=system_u:object_r:unlabeled_t:s0 use=1
The sudo podman command simply runs the conntrack utility in a container and removes the container once the command has finished.
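To make the failing nodes easier to spot, the conntrack listing can be reduced to the destination IPs that never replied on the NodePort. This is only a sketch: the function name `unreplied_dsts` is hypothetical, and it assumes the `conntrack -L` line layout shown above, where the first `dst=`/`dport=` pair describes the original direction.

```shell
# List destination IPs of UNREPLIED conntrack entries for a given port.
# Reads `conntrack -L` output on stdin. The function name is hypothetical.
unreplied_dsts() {
  port="$1"
  awk -v port="$port" '
    /\[UNREPLIED\]/ {
      dst = ""; dport = ""
      for (i = 1; i <= NF; i++) {
        # Keep only the first dst=/dport= pair (the original direction).
        if (dst == "" && $i ~ /^dst=/)     dst = substr($i, 5)
        if (dport == "" && $i ~ /^dport=/) dport = substr($i, 7)
      }
      if (dport == port) print dst
    }'
}
```

Usage would be to pipe the conntrack command above into it, for example `... conntrack -L | unreplied_dsts 31963`.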
Version-Release number of selected component (if applicable): 4.0.0-0.nightly-2019-04-05-165550
$ oc version --short
Client Version: v4.0.22
Server Version: v1.13.4+ab11434
How reproducible: Always
Steps to Reproduce:
1. Create a traffic listener pod and a ping pod. See additional info below
2. oc expose pod <traffic_listener_pod> --type=NodePort --port=8080 --protocol=UDP
3. Send traffic from the client pod to the NodePort on every node IP, one by one
Actual results: Only the node the client pod runs on replies to the client; the other nodes do not.
Expected results: All nodes reply to the client, because a service of type NodePort is supposed to be exposed on every node.
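Step 3 can be scripted roughly as below. This is a sketch, not the exact procedure used in this report: it assumes ncat is available in the client pod, that the listener echoes the payload back (as `--exec /bin/cat` does), and that a 2-second idle timeout is long enough. The function name `probe_nodeport` and the `PROBE` override variable are hypothetical.

```shell
# Send one UDP datagram to the NodePort on each node IP and report
# whether the payload was echoed back.
# PROBE is a hypothetical override for the sending command; it defaults
# to ncat in UDP mode with a 2-second idle timeout.
probe_nodeport() {
  port="$1"; shift
  for ip in "$@"; do
    # Keep stdin open briefly so a reply can arrive before the sender exits.
    reply="$( (echo ping; sleep 1) | ${PROBE:-ncat -u -i 2} "$ip" "$port" 2>/dev/null)"
    if [ "$reply" = "ping" ]; then
      echo "$ip replied"
    else
      echo "$ip no-reply"
    fi
  done
}
```

For example, `probe_nodeport 31963 172.31.130.146 172.31.139.127` would print one line per node IP.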
Additional info: traffic listener pod template
"command": [ "/usr/bin/ncat", "-u", "-l", "8080","--keep-open", "--exec", "/bin/cat"],
$ oc get svc -oyaml
- apiVersion: v1
- nodePort: 31963
OK, further experiments suggest this might be due to missing node-to-node network connectivity in 4.x. I am not able to ping one node from another, in either direction. Is this a restriction of CoreOS in 4.x?
This looks like an AWS security group issue. From the console I can see that the port range 30000 to 32767 is opened only for TCP. We may need to open it for UDP as well.
Can you help get the iptables and netstat output for your UDP node port?
iptables-save | grep 31963
netstat -lnpu | grep 31963
I think all the related entries should be there.
Yup, we need to open this range for UDP as well, I'll file a PR.
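Until a fixed build is available, the rule could in principle be added by hand with the AWS CLI. This is a sketch only and not the change made in the PR: the wrapper name `open_nodeport_udp` is hypothetical, and the security group ID and source CIDR are placeholders that depend on the cluster.

```shell
# Allow UDP traffic to the NodePort range on a security group.
# Arguments are placeholders: $1 = security group ID, $2 = source CIDR.
open_nodeport_udp() {
  aws ec2 authorize-security-group-ingress \
    --group-id "$1" \
    --protocol udp \
    --port 30000-32767 \
    --cidr "$2"
}
```

It would be called as `open_nodeport_udp <security_group_id> <source_cidr>`, once for each security group attached to the nodes.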
(In reply to Meng Bo from comment #2)
> Looks like an AWS security group issue, from the console I can see we only
> opened the port range from 30000 to 32767 for TCP protocol. Maybe we need
> also open them for UDP.
> To Anurag,
> Can you help get the output about iptables and netstat for your udp node
> iptables-save | grep 31963
> netstat -lnpu | grep 31963
> I think all the related entries should be there.
The iptables-save entries seem to be correct:
$ sudo iptables-save | grep 31326
-A KUBE-NODEPORTS -p udp -m comment --comment "test/udp-rc-ctsj7:" -m udp --dport 31326 -j KUBE-MARK-MASQ
-A KUBE-NODEPORTS -p udp -m comment --comment "test/udp-rc-ctsj7:" -m udp --dport 31326 -j KUBE-SVC-J5HIX5PZU2ZRSTD5
However, netstat doesn't show the expected port range as opened:
$ netstat -lnpu | grep "Proto\|31326"
(Not all processes could be identified, non-owned process info
will not be shown, you would have to be root to see it all.)
Proto Recv-Q Send-Q Local Address           Foreign Address         State       PID/Program name
udp6       0      0 :::31326                :::*                                -
I will have to verify this on the next good build. There has not been a green 4.1 build for 8 days. Thanks.
Verified on 4.1.0-0.nightly-2019-04-18-170154.
The port range 30000-32767 is now allowed for UDP for NodePort services. The test steps from the description now work as expected.
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.
For information on the advisory, and where to find the updated
files, follow the link below.
If the solution does not work for you, open a new bug report.