Description of problem:
When creating services of type NodePort on a standard node configuration, the default iptables rules reject connections to the port exposed by the service.

Version-Release number of selected component (if applicable):
openshift v3.0.2.0-20-g656dc3e

How reproducible:
Always

Steps to Reproduce:
1. Configure a service of type NodePort:

    apiVersion: v1
    kind: Service
    spec:
      ...
      ports:
      - name: 8080-tcp
        nodePort: 30123
        port: 8080
        protocol: TCP
        targetPort: 8080
      type: NodePort
      ...

2. Have a default iptables configuration on the nodes. This includes this last rule in the INPUT chain of the filter table:

    -A INPUT -j REJECT --reject-with icmp-host-prohibited

3. Try to access the port on a node from an external system:

    [openshift@master-29725 ~]$ curl node-d6398.example.com:30123

Actual results:
curl: (7) Failed connect to node-d6398.example.com:30123; No route to host

Expected results:
The kubelet would add ACCEPT rules for the ports that are exposed as part of services of type NodePort.

Additional info:
Adding a rule to accept the defined range of ports for NodePort services (30000-32767 by default) doesn't help, because the traffic actually gets redirected:

    [root@node-d6398 openshift]# iptables -L -t nat | grep 30123
    REDIRECT   tcp  --  anywhere   anywhere   /* demo/example:8080-tcp */ tcp dpt:30123 redir ports 48117
    DNAT       tcp  --  anywhere   anywhere   /* demo/example:8080-tcp */ tcp dpt:30123 to:192.168.55.17:48117

In this example, it's port 48117 that needs to be ACCEPT'ed.
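As a stopgap on the userspace proxy (not a fix), the ephemeral redirect port can be looked up and allowed manually; a later comment in this bug does the same thing via the OS_FIREWALL_ALLOW chain. A minimal sketch, assuming the nodePort 30123 and redirect port 48117 from the output above (the redirect port changes whenever the proxy restarts, so this has to be redone each time):

    # Find the ephemeral port the nodePort is currently redirected to
    iptables -t nat -L -n | grep 30123
    # Temporarily allow that port; -I puts the rule ahead of the final REJECT
    iptables -I INPUT -p tcp -m tcp --dport 48117 -j ACCEPT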
Still fails with

Used:

    > cat > foo
    apiVersion: v1
    kind: List
    items:
    - apiVersion: v1
      kind: Pod
      metadata:
        name: bz-nodeport
        labels:
          bz: bz1280279
      spec:
        containers:
        - name: bz-nodeport-nginx
          image: nginx
          ports:
          - containerPort: 80
    - apiVersion: v1
      kind: Service
      metadata:
        name: bz-nodeport
      spec:
        selector:
          bz: bz1280279
        ports:
        - name: 80-tcp
          nodePort: 30123
          port: 80
          protocol: TCP
          targetPort: 80
        type: NodePort
    ^D
    > oc create -f foo

From one of the nodes to itself:

    > curl rh71-os1.example.com:43220
    [Works]

To another node:

    > curl rh71-os2.example.com:43220
    curl: (7) Failed connect to rh71-os1.example.com:43220; Connection refused
Random brain dump. Maru: ignore if not useful.

There are 4 ports involved here, in Ben's above example:

    port: 80            (how the service describes itself)
    targetPort: 80      (what port in the container to connect to, can be a name)
    nodePort: 30123     (what port all nodes should listen on for this service)
    servicePort: 43220  (what port the proxy listens on for this service)

Traffic coming into the host will have a dest IP of the host IP and a dest port of the nodePort (30123). It will hit the PREROUTING chain of the `nat` table and will then jump to the KUBE-NODEPORT-CONTAINER chain. That chain will rewrite the dest port to the local servicePort (43220). We'll eventually hit the INPUT chain of the `filter` table, where we hit the denial rule in question. (All of this is described in comment #0.)

I dislike the idea of adding an allow rule for 43220, although that would get us working. How strange is it that you do not need to allow the port where traffic enters, but you need to allow a random port...

I have no idea if it works, but maybe we can change not just the dport, but also the dest IP, to 127.0.0.1? That would pass the later INPUT chain.

If that doesn't work, maybe we can somehow mark the packet which comes in via a nodePort with some magic iptables mark which we allow later in filter/INPUT. So traffic directly to the servicePort won't be allowed from outside and only nodePort traffic will be; a sketch of that idea follows below.
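A minimal sketch of the mark idea, assuming nodePort 30123 and an arbitrary mark value 0x1 (both placeholders; this is not what any shipped proxy installs). The mark is set in mangle/PREROUTING, which runs before the nat REDIRECT rewrites the destination port, and the packet mark survives through filter/INPUT:

    # Mark packets arriving on the nodePort (dport is still 30123 at this point)
    iptables -t mangle -A PREROUTING -p tcp -m tcp --dport 30123 -j MARK --set-xmark 0x1/0x1
    # Accept marked packets ahead of the final REJECT in filter/INPUT
    iptables -I INPUT -m mark --mark 0x1/0x1 -j ACCEPT

Traffic aimed directly at the servicePort (43220) never picks up the mark, so it would still hit the REJECT rule.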
Ugh... do we really want to do the mangle/mark thing?

I suppose we could also try redirecting (in the nat table) the proxy port to some other port that we always drop (since you can't drop in the nat table's PREROUTING chain). So we redirect the service port to the proxy, and (before that) the proxy to junk... and then drop junk? A sketch of that is below.

BUT: do we really care if we expose the (transitory) proxy port as well as the proper service port? I agree it's cleaner to disallow it. But is it worth adding more rules to the iptables?
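A rough sketch of that alternative, with the proxy port (43220) and junk port (9999) both hypothetical. A packet that was already redirected from the nodePort does not traverse nat/PREROUTING again, so only connections aimed directly at the proxy port would be shunted to the dropped junk port:

    # Shunt direct connections to the transitory proxy port to a junk port...
    iptables -t nat -I PREROUTING -p tcp -m tcp --dport 43220 -j REDIRECT --to-ports 9999
    # ...and drop that junk port in the filter table (per the note above that you
    # can't drop in nat/PREROUTING)
    iptables -I INPUT -p tcp -m tcp --dport 9999 -j DROP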
Still fails with atomic-openshift-master-3.1.0.4-1.git.2.c5fa845.el7aos.x86_64 (Adding the missing fact to my comment above)
And the curl commands in my reproduction are wrong... they should be to port 30123
I am trying OpenShift v3 with a single master, single node setup. I installed the examples/sample-app for ruby hello openshift and the pods and service are up. When I try to use the NodePort or LoadBalancer options for enabling external access to this frontend service, I keep getting the below:

    [root@openshift-master ~]# curl openshift-node.tidalsoft:31597
    curl: (7) Failed connect to openshift-node.tidalsoft:31597; No route to host

    [root@openshift-master ~]# oc describe service frontend
    Name:                   frontend
    Namespace:              test
    Labels:                 template=application-template-stibuild
    Selector:               name=frontend
    Type:                   NodePort
    IP:                     172.30.252.16
    Port:                   web     5432/TCP
    NodePort:               web     31597/TCP
    Endpoints:              10.1.0.10:8080,10.1.0.13:8080
    Session Affinity:       None
    No events.

When I check the rules on the node:

    [root@openshift-node ~]# iptables -t nat -L | grep 31597
    REDIRECT   tcp  --  anywhere   anywhere   /* test/frontend:web */ tcp dpt:31597 redir ports 39433
    DNAT       tcp  --  anywhere   anywhere   /* test/frontend:web */ tcp dpt:31597 to:10.88.102.48:39433

Hence I added a rule to allow the redirect port 39433:

    [root@openshift-node ~]# iptables -I OS_FIREWALL_ALLOW -p tcp -m tcp --dport 39433 -j ACCEPT

After adding this rule, external access starts working. I am now confused... is this something that's needed for external access, or am I missing some config here? Also I observed that if the node is rebooted, this redir port changes and then I again have to manually add an iptables rule to allow the new redir port.
It seems that not only NodePort services are affected: you hit the same problem when trying to use externalIPs:
https://github.com/kubernetes/kubernetes/blob/release-1.1/docs/user-guide/services.md#external-ips

For example, this service:

    apiVersion: v1
    kind: Service
    metadata:
      creationTimestamp: null
      name: test
    spec:
      externalIPs:
      - 192.168.8.77
      ports:
      - name: 80-tcp
        port: 80
        protocol: TCP
        targetPort: 80
      selector:
        test: something
      sessionAffinity: None
      type: ClusterIP

results in the following rules:

    REDIRECT   tcp  --  0.0.0.0/0   192.168.8.77   /* test/test:80-tcp */ tcp dpt:80 PHYSDEV match ! --physdev-is-in redir ports 55225
    REDIRECT   tcp  --  0.0.0.0/0   192.168.8.77   /* test/test:80-tcp */ tcp dpt:80 ADDRTYPE match dst-type LOCAL redir ports 55225
    DNAT       tcp  --  0.0.0.0/0   192.168.8.77   /* test/test:80-tcp */ tcp dpt:80 ADDRTYPE match dst-type LOCAL to:192.168.8.188:55225

and the redirection to port 55225 gets blocked:

    $ curl 192.168.8.77
    curl: (7) Failed to connect to 192.168.8.77 port 80: No route to host
This bug will be fixed in the 3.1.1 release. 3.1.1 replaces the userspace kube proxy with the iptables kube proxy, and the iptables proxy is compatible with default deny firewall rules.
Can you help change the status to 'ON_QA', since this bug has been fixed? I tested with the following version:

    # oc version
    oc v3.1.1.6
    kubernetes v1.1.0-origin-1107-g4c8e6f4

When type "NodePort" is specified in the service, for example with nodePort 30000, iptables rules like these are added:

    -A KUBE-NODEPORTS -p tcp -m comment --comment "default/hello-pod:http" -m tcp --dport 30000 -j MARK --set-xmark 0x4d415351/0xffffffff
    -A KUBE-NODEPORTS -p tcp -m comment --comment "default/hello-pod:http" -m tcp --dport 30000 -j KUBE-SVC-P7OB3FWBFXAO7AOM

and we can access the service via $node{1,2,3}:$nodePort.

If the service is deleted, the chain is also deleted, and the service can no longer be accessed via the nodePort.
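For reference, a quick way to re-check this from outside the cluster, assuming the example nodePort 30000 above (node name is a placeholder):

    # Confirm the nodePort rules exist in the nat table on a node
    iptables-save -t nat | grep KUBE-NODEPORTS
    # Hit the nodePort on any node from an external host
    curl http://node1.example.com:30000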
Verified this bug according to comment 10
Maru, I was adding Trello cards about creating test cases corresponding to old bugs, and I wasn't sure if this is something that currently gets tested by k8s's own tests, or if it's something we need to add our own test for... ?
(In reply to Dan Winship from comment #12)
> Maru, I was adding Trello cards about creating test cases corresponding to
> old bugs, and I wasn't sure if this is something that currently gets tested
> by k8s's own tests, or if it's something we need to add our own test for... ?

The kube e2e test 'Services should be able to create a functioning NodePort service' should be sufficient when run against a deployment with a default deny firewall (like dind). We're currently only testing deployments with the iptables proxy, though. Do you think it would be useful to target the userspace proxy as well?
Hopefully upstream is testing its own code against the userspace proxy, and many of our changes relative to the upstream networking code only affect openshift-sdn, but we only support the userspace proxy for people running non-openshift-sdn plugins... So it doesn't seem worth doubling the number of network tests so we can test everything against both iptables and userspace. (If there were only a few tests where we'd want to test both ways, then maybe?)
(In reply to Dan Winship from comment #14)
> Hopefully upstream is testing its own code against the userspace proxy, and
> many of our changes relative to the upstream networking code only affect
> openshift-sdn, but we only support the userspace proxy for people running
> non-openshift-sdn plugins... So it doesn't seem worth doubling the number of
> network tests so we can test everything against both iptables and userspace.
> (If there were only a few tests where we'd want to test both ways, then
> maybe?)

The test runner currently runs the same tests against each cluster, but it doesn't have to. It wouldn't take too long to deploy a dind cluster with the userspace proxy and only run the NodePort test against it.