Bug 1280279 - Ports exposed for NodePort services are blocked by default
Summary: Ports exposed for NodePort services are blocked by default
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: Networking
Version: 3.0.0
Hardware: All
OS: Linux
Priority: medium
Severity: medium
Target Milestone: ---
Assignee: Maru Newby
QA Contact: Meng Bo
URL:
Whiteboard:
Depends On:
Blocks: 1267746
 
Reported: 2015-11-11 11:56 UTC by Josep 'Pep' Turro Mauri
Modified: 2019-09-12 09:16 UTC (History)
10 users

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2016-01-29 20:58:34 UTC
Target Upstream Version:




Links:
Origin (Github) 6214 (last updated 2016-01-05 00:35:40 UTC)

Description Josep 'Pep' Turro Mauri 2015-11-11 11:56:52 UTC
Description of problem:

When creating services of type NodePort on a standard node configuration, the default iptables rules reject connections to the port exposed by the service.

Version-Release number of selected component (if applicable):
openshift v3.0.2.0-20-g656dc3e

How reproducible:
Always

Steps to Reproduce:
1. Configure a service of type NodePort:

apiVersion: v1
kind: Service
spec:
  ...
  ports:
  - name: 8080-tcp
    nodePort: 30123
    port: 8080
    protocol: TCP
    targetPort: 8080
  type: NodePort
  ...

2. Have a default iptables configuration on the nodes. This includes the following final rule in the filter table's INPUT chain:

-A INPUT -j REJECT --reject-with icmp-host-prohibited
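For context, a simplified sketch of the default filter/INPUT layout assumed here (OS_FIREWALL_ALLOW is the OpenShift-managed allow chain that also appears later in this bug; the exact stock rule set may differ):

```shell
# Simplified sketch of the assumed default filter/INPUT layout.
iptables -A INPUT -m state --state ESTABLISHED,RELATED -j ACCEPT
iptables -A INPUT -j OS_FIREWALL_ALLOW
# Final catch-all: anything not explicitly accepted is rejected.
iptables -A INPUT -j REJECT --reject-with icmp-host-prohibited
```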

3. Try to access the port on a node from an external system:

[openshift@master-29725 ~]$ curl node-d6398.example.com:30123

Actual results:

curl: (7) Failed connect to node-d6398.example.com:30123; No route to host


Expected results:

The kubelet would add ACCEPT rules for the ports that are exposed as part of services of type NodePort

Additional info:

Adding a rule to accept the defined range of ports for nodeport services (30000-32767 by default) doesn't help, because the traffic actually gets redirected:

[root@node-d6398 openshift]# iptables -L -t nat | grep 30123
REDIRECT   tcp  --  anywhere             anywhere             /* demo/example:8080-tcp */ tcp dpt:30123 redir ports 48117
DNAT       tcp  --  anywhere             anywhere             /* demo/example:8080-tcp */ tcp dpt:30123 to:192.168.55.17:48117

In this example, it is port 48117 that needs to be ACCEPTed.
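Because the redir port is transient, any workaround has to discover it at runtime. A minimal sketch (the sample line mirrors the REDIRECT rule above; the final iptables command is shown only as a comment since it needs root, and the OS_FIREWALL_ALLOW chain name is the OpenShift-managed allow chain):

```shell
#!/bin/sh
# Sketch: pull the transient redirect port out of `iptables -L -t nat`
# output so it can be ACCEPTed explicitly. The sample line mirrors the
# rule shown above; 48117 changes whenever the proxy restarts.
sample='REDIRECT   tcp  --  anywhere             anywhere             /* demo/example:8080-tcp */ tcp dpt:30123 redir ports 48117'
redir=$(printf '%s\n' "$sample" | sed -n 's/.*redir ports \([0-9][0-9]*\).*/\1/p')
echo "redir port: $redir"
# One would then allow it explicitly (requires root):
#   iptables -I OS_FIREWALL_ALLOW -p tcp -m tcp --dport "$redir" -j ACCEPT
```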

Comment 1 Ben Bennett 2015-11-12 16:23:13 UTC
Still fails with

Used:
> cat > foo
apiVersion: v1
kind: List
items:
- apiVersion: v1
  kind: Pod
  metadata:
    name: bz-nodeport
    labels:
      bz: bz1280279
  spec:
    containers:
    - name: bz-nodeport-nginx
      image: nginx
      ports:
      - containerPort: 80
- apiVersion: v1
  kind: Service
  metadata:
    name: bz-nodeport
  spec:
    selector:
      bz: bz1280279
    ports:
    - name: 80-tcp
      nodePort: 30123
      port: 80
      protocol: TCP
      targetPort: 80
    type: NodePort
^D
> oc create -f foo

From one of the nodes to itself:
> curl rh71-os1.example.com:43220
[Works]

To another node:
> curl rh71-os2.example.com:43220
curl: (7) Failed connect to rh71-os1.example.com:43220; Connection refused

Comment 3 Eric Paris 2015-11-12 18:16:06 UTC
Random brain dump. Maru: ignore if not useful.

There are 4 ports involved here, in Ben's above example:

port: 80 (how the service describes itself)
targetPort: 80 (what port in the container to connect to, can be a name)
nodePort: 30123 (what port all nodes should listen to for this service)
servicePort: 43220 (what port the proxy listens to for this service)

Traffic coming into the host will have a dest IP of the host IP and the dest port of the nodePort (30123).

It will hit the PREROUTING chain of the `nat` table and will then jump to the KUBE-NODEPORT-CONTAINER chain. That chain rewrites the dest port to the local servicePort (43220).

We'll eventually hit the INPUT chain of the `filter` table where we hit the denial rule in question. (All of this is described in comment #0)

I dislike the idea of adding an allow rule for the 43220, although that will get us working. How strange is it that you do not need to allow the port where traffic enters, but you need to allow a random port...

I have no idea if it works, but maybe we can change not just the dport but also the dest IP, to 127.0.0.1? That would pass the later INPUT chain.

If that doesn't work, maybe we can somehow mark the packet which comes in via a nodePort with some magic iptables mark which we allow later in the filter/INPUT. So traffic directly to the servicePort won't be allowed from outside and only nodePort traffic will be....
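The mark idea above could look roughly like this (a sketch only; the chain placement, port, and mark value 0x1 are invented for illustration, not the actual fix):

```shell
# Hypothetical sketch of the mark approach; not the shipped fix.
# nat/PREROUTING path: mark traffic arriving on the nodePort before it
# is redirected to the transient proxy port.
iptables -t nat -A KUBE-NODEPORT-CONTAINER -p tcp --dport 30123 \
         -j MARK --set-xmark 0x1/0x1
# filter/INPUT: accept anything carrying the mark, ahead of the
# catch-all REJECT, so only nodePort traffic reaches the proxy port.
iptables -I INPUT 1 -m mark --mark 0x1/0x1 -j ACCEPT
```

Notably, the iptables proxy that eventually resolved this bug (see comment 10) also applies a MARK rule to nodePort traffic.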

Comment 4 Ben Bennett 2015-11-12 18:58:07 UTC
Ugh... do we really want to do the mangle / mark thing?

I suppose we could also try redirecting (in the nat table) the proxy port to some other port that we always drop (since you can't drop in the nat table's PREROUTING chain).  So we redirect the service port to the proxy, and (before that) the proxy to junk... and then drop junk?

BUT Do we really care if we expose the (transitory) proxy port as well as the proper service port?  I agree it's cleaner to disallow it.  But is it worth adding more rules to the iptables?

Comment 5 Ben Bennett 2015-11-16 19:22:27 UTC
Still fails with atomic-openshift-master-3.1.0.4-1.git.2.c5fa845.el7aos.x86_64

(Adding the missing fact to my comment above)

Comment 6 Ben Bennett 2015-11-16 20:36:23 UTC
And the curl commands in my reproduction are wrong... they should be to port 30123

Comment 7 Monisha 2015-12-07 20:06:18 UTC
I am trying OpenShift v3 with a single-master, single-node setup.
I installed the examples/sample-app (ruby hello openshift) and the pods and service are up. When I try to use the NodePort or LoadBalancer options to enable external access to this service's frontend, I keep getting the following:

[root@openshift-master~]# curl openshift-node.tidalsoft:31597
curl: (7) Failed connect to openshift-node.tidalsoft:31597; No route to host

[root@openshift-master~]# oc describe service frontend
Name: frontend
Namespace: test
Labels: template=application-template-stibuild
Selector: name=frontend
Type: NodePort
IP: 172.30.252.16
Port: web 5432/TCP
NodePort: web 31597/TCP
Endpoints: 10.1.0.10:8080,10.1.0.13:8080
Session Affinity: None
No events.

when I check the rules on node :

[root@openshift-node~]# iptables -t nat -L | grep 31597
REDIRECT tcp -- anywhere anywhere /* test/frontend:web */ tcp dpt:31597 redir ports 39433
DNAT     tcp -- anywhere anywhere /* test/frontend:web */ tcp dpt:31597 to:10.88.102.48:39433

Hence I added rules to allow the redirect port 39433

[root@openshift-node~]# iptables -I OS_FIREWALL_ALLOW -p tcp -m tcp --dport 39433 -j ACCEPT

After adding this rule, external access starts working. I am now confused: is this something that's needed for external access, or am I missing some config here?
I also observed that if the node is rebooted, this redir port changes, and I again have to manually add an iptables rule to allow the new redir port.

Comment 8 Josep 'Pep' Turro Mauri 2016-01-11 20:16:37 UTC
It seems that not only NodePort services are affected: you hit the same problem when trying to use externalIPs:

  https://github.com/kubernetes/kubernetes/blob/release-1.1/docs/user-guide/services.md#external-ips

For example this service:

apiVersion: v1
kind: Service
metadata:
  creationTimestamp: null
  name: test
spec:
  externalIPs:
  - 192.168.8.77
  ports:
  - name: 80-tcp
    port: 80
    protocol: TCP
    targetPort: 80
  selector:
    test: something
  sessionAffinity: None
  type: ClusterIP

Results in the following rules:

REDIRECT   tcp  --  0.0.0.0/0            192.168.8.77          /* test/test:80-tcp */ tcp dpt:80 PHYSDEV match ! --physdev-is-in redir ports 55225
REDIRECT   tcp  --  0.0.0.0/0            192.168.8.77          /* test/test:80-tcp */ tcp dpt:80 ADDRTYPE match dst-type LOCAL redir ports 55225
DNAT       tcp  --  0.0.0.0/0            192.168.8.77          /* test/test:80-tcp */ tcp dpt:80 ADDRTYPE match dst-type LOCAL to:192.168.8.188:55225

and the redirection to port 55225 gets blocked.

$ curl 192.168.8.77
curl: (7) Failed to connect to 192.168.8.77 port 80: No route to host

Comment 9 Maru Newby 2016-01-12 18:06:15 UTC
This bug will be fixed in the 3.1.1 release.  3.1.1 replaces the userspace kube proxy with the iptables kube proxy, and the iptables proxy is compatible with default deny firewall rules.

Comment 10 zhaozhanqi 2016-01-28 09:00:09 UTC
Can you help change the status to 'ON_QA' since this bug has been fixed.

I tested with the following version: 

#oc version
oc v3.1.1.6
kubernetes v1.1.0-origin-1107-g4c8e6f4

When type "NodePort" is specified in the service, for example with nodePort 30000, iptables rules like the following are added:
-A KUBE-NODEPORTS -p tcp -m comment --comment "default/hello-pod:http" -m tcp --dport 30000 -j MARK --set-xmark 0x4d415351/0xffffffff
-A KUBE-NODEPORTS -p tcp -m comment --comment "default/hello-pod:http" -m tcp --dport 30000 -j KUBE-SVC-P7OB3FWBFXAO7AOM

We can access the service via $node{1,2,3}:$nodePort.
If the service is deleted, the chain is also deleted and the service can no longer be accessed via the nodePort.

Comment 11 zhaozhanqi 2016-01-29 09:03:13 UTC
Verified this bug according to comment 10

Comment 12 Dan Winship 2016-03-23 13:39:10 UTC
Maru, I was adding Trello cards about creating test cases corresponding to old bugs, and I wasn't sure if this is something that currently gets tested by k8s's own tests, or if it's something we need to add our own test for... ?

Comment 13 Maru Newby 2016-03-23 16:27:01 UTC
(In reply to Dan Winship from comment #12)
> Maru, I was adding Trello cards about creating test cases corresponding to
> old bugs, and I wasn't sure if this is something that currently gets tested
> by k8s's own tests, or if it's something we need to add our own test for... ?

The kube e2e test 'Services should be able to create a functioning NodePort service' should be sufficient when run against a deployment with a default deny firewall (like dind).  We're currently only testing deployments with the iptables proxy, though.  Do you think it would be useful to target the userspace proxy as well?

Comment 14 Dan Winship 2016-03-23 17:22:30 UTC
Hopefully upstream is testing its own code against the userspace proxy, and many of our changes relative to the upstream networking code only affect openshift-sdn, but we only support the userspace proxy for people running non-openshift-sdn plugins... So it doesn't seem worth doubling the number of network tests so we can test everything against both iptables and userspace. (If there were only a few tests where we'd want to test both ways, then maybe.)

Comment 15 Maru Newby 2016-03-23 17:45:01 UTC
(In reply to Dan Winship from comment #14)
> Hopefully upstream is testing its own code against the userspace proxy, and
> many of our changes relative to the upstream networking code only affect
> openshift-sdn, but we only support the userspace proxy for people running
> non-openshift-sdn plugins... So it doesn't seem worth doubling the number of
> network tests so we can test everything against both iptables and userspace.
> (If there were only a few tests where we'd want to test both ways, then
> maybe.)

The test runner currently runs the same tests against each cluster but it doesn't have to.  It wouldn't take too long to deploy a dind cluster with the userspace proxy and only run the NodePort test against it.

