Bug 1769534 - NetworkPolicy object fails when router pods schedule on infra nodes (vSphere)
Summary: NetworkPolicy object fails when router pods schedule on infra nodes (vSphere)
Keywords:
Status: CLOSED DUPLICATE of bug 1768608
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: Networking
Version: 4.2.z
Hardware: Unspecified
OS: Unspecified
Priority: medium
Severity: high
Target Milestone: ---
Target Release: 4.4.0
Assignee: Juan Luis de Sousa-Valadas
QA Contact: zhaozhanqi
URL:
Whiteboard:
Depends On:
Blocks:
 
Reported: 2019-11-06 19:54 UTC by Caden Marchese
Modified: 2023-03-24 15:55 UTC
CC List: 7 users

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2020-02-24 15:43:50 UTC
Target Upstream Version:
Embargoed:



Description Caden Marchese 2019-11-06 19:54:48 UTC
Description of problem:
When creating the NetworkPolicy below in OCP 4.2 to allow ingress only from the router pods and from pods in the same namespace, it works as expected when the router pods are scheduled onto worker nodes, but not when they are scheduled onto dedicated infrastructure nodes.

apiVersion: extensions/v1beta1
kind: NetworkPolicy
metadata:
  name: allow-ingress-and-namespace
spec:
  ingress:
  - from:
    - podSelector: {}
  - from:
    - namespaceSelector:
        matchLabels:
          network.openshift.io/policy-group: ingress
      podSelector:
        matchLabels:
          ingresscontroller.operator.openshift.io/deployment-ingresscontroller: default
  podSelector: {}
  policyTypes:
  - Ingress

Labels for dedicated infrastructure nodes:

server.example.com  Ready    infra    16d   v1.14.6+c07e432da   10.110.6.124   10.110.6.124   Red Hat Enterprise Linux CoreOS 42.80.20191010.0 (Ootpa)   4.18.0-80.11.2.el8_0.x86_64   cri-o://1.14.11-0.23.dev.rhaos4.2.gitc41de67.el8   beta.kubernetes.io/arch=amd64,beta.kubernetes.io/os=linux,kubernetes.io/arch=amd64,kubernetes.io/hostname=server.example.com,kubernetes.io/os=linux,node-role.kubernetes.io/infra=,node.openshift.io/os_id=rhcos

server.example.com   Ready    infra    16d   v1.14.6+c07e432da   10.110.6.125   10.110.6.125   Red Hat Enterprise Linux CoreOS 42.80.20191010.0 (Ootpa)   4.18.0-80.11.2.el8_0.x86_64   cri-o://1.14.11-0.23.dev.rhaos4.2.gitc41de67.el8   beta.kubernetes.io/arch=amd64,beta.kubernetes.io/os=linux,kubernetes.io/arch=amd64,kubernetes.io/hostname=server.example.com,kubernetes.io/os=linux,node-role.kubernetes.io/infra=,node.openshift.io/os_id=rhcos
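
For reference, a minimal sketch of how a node ends up with this role label (the same commands appear in comment #9 below; the node name is a placeholder):

$ oc label node server.example.com node-role.kubernetes.io/worker-
$ oc label node server.example.com node-role.kubernetes.io/infra=""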

Version-Release number of selected component (if applicable):
4.2

Steps to Reproduce:
1. Deploy OpenShift 4.2 with two dedicated infra nodes using above labels
2. Create the above NetworkPolicy
3. Schedule the router pods onto the above infra nodes (a sketch of one way to do this follows below)
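
A minimal sketch of step 3, using the ingresscontroller nodePlacement approach that also appears in comment #9 below:

$ oc patch ingresscontroller/default -n openshift-ingress-operator --type=merge \
    --patch '{"spec":{"nodePlacement":{"nodeSelector":{"matchLabels":{"node-role.kubernetes.io/infra":""}}}}}'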

Actual results:

When openshift-ingress is running on infra:
2019-10-18T01:33:36.834001208Z E1018 01:33:36.833963       1 reflector.go:205] github.com/openshift/router/pkg/router/controller/factory/factory.go:112: Failed to list *v1.Route: the server is currently unable to handle the request (get routes.route.openshift.io)
2019-10-18T01:33:42.017830095Z E1018 01:33:42.017782       1 reflector.go:205] github.com/openshift/router/pkg/router/controller/factory/factory.go:112: Failed to list *v1.Route: the server is currently unable to handle the request (get routes.route.openshift.io)
2019-10-18T01:43:49.982019347Z E1018 01:43:49.980552       1 webhook.go:90] Failed to make webhook authenticator request: Post https://172.30.0.1:443/apis/authentication.k8s.io/v1beta1/tokenreviews: net/http: TLS handshake timeout
2019-10-18T01:43:51.487288912Z E1018 01:43:51.067796       1 webhook.go:90] Failed to make webhook authenticator request: Post https://172.30.0.1:443/apis/authentication.k8s.io/v1beta1/tokenreviews: unexpected EOF

No issues when they are on worker nodes (which is the default, I believe).
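
Where the router pods are actually scheduled can be checked with (the same command used in comment #9):

$ oc get pods -n openshift-ingress -o wide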

Comment 9 Andrew McDermott 2019-11-21 17:51:24 UTC
I am unable to reproduce this issue using openshift-4.2 on AWS, or
with what is master today (Thu 21 Nov 2019) on GCP. I also note that
this was not reproduced in comment #1 either. Am I missing something
obvious in these steps?

$ oc version
Client Version: v4.2.0-alpha.0-85-g75a68b3
Server Version: 4.2.8
Kubernetes Version: v1.14.6+dea7fb9

$ oc new-project netpol

#
# Create an echo service
#

$ oc apply -n netpol -f -<<EOF
apiVersion: apps/v1
kind: Deployment
metadata:
  name: echo
  labels:
    app: echo
spec:
  replicas: 2
  template:
    metadata:
      name: echo
      labels:
        app: echo
    spec:
      containers:
      - name: echo
        image: docker.io/frobware/http-echo:0.2.4-scratch
        imagePullPolicy: Always
        args:
        - "-text=echoecho"
      restartPolicy: Always
  selector:
    matchLabels:
      app: echo
---
apiVersion: v1
kind: Service
metadata:
  name: echo-service
spec:
  selector:
    app: echo
  ports:
  - name: http
    port: 5678
EOF

$ oc expose service echo-service

$ oc get all
NAME                        READY   STATUS    RESTARTS   AGE
pod/echo-79db45b5fb-4dmr5   1/1     Running   0          52s
pod/echo-79db45b5fb-cv2l2   1/1     Running   0          52s

NAME                   TYPE        CLUSTER-IP       EXTERNAL-IP   PORT(S)    AGE
service/echo-service   ClusterIP   172.30.187.191   <none>        5678/TCP   52s

NAME                   READY   UP-TO-DATE   AVAILABLE   AGE
deployment.apps/echo   2/2     2            2           52s

NAME                              DESIRED   CURRENT   READY   AGE
replicaset.apps/echo-79db45b5fb   2         2         2       52s

NAME                                    HOST/PORT                                                        PATH   SERVICES       PORT   TERMINATION   WILDCARD
route.route.openshift.io/echo-service   echo-service-netpol.apps.amcdermo-428.devcluster.openshift.com          echo-service   http                 None

#
# Verify echo service is working
#

$ curl echo-service-netpol.apps.amcdermo-428.devcluster.openshift.com
echoecho

#
# Apply the network policy that allows from ingress only
#

$ oc apply -n netpol -f -<<EOF
apiVersion: extensions/v1beta1
kind: NetworkPolicy
metadata:
  name: allow-ingress
spec:
  ingress:
  - from:
    - namespaceSelector:
        matchLabels:
          network.openshift.io/policy-group: ingress
      podSelector:
        matchLabels:
          ingresscontroller.operator.openshift.io/deployment-ingresscontroller: default
  # the pods that will be subject to this policy - the TARGET
  podSelector: {}
  policyTypes:
  - Ingress
EOF

#
# Accessing this from my desktop is expected to work - and does:
#
$ curl echo-service-netpol.apps.amcdermo-428.devcluster.openshift.com
echoecho

#
# Trying from a POD should fail because of the policy rules
#

$ oc run -n netpol --generator=run-pod/v1 busybox --rm -ti --image=busybox -- /bin/sh
/ # wget --spider --timeout 30 echo-service.netpol:5678
Connecting to echo-service.netpol:5678 (172.30.58.141:5678)
wget: download timed out

#
# Create network policy to allow from same namespace:
#

$ oc apply -n netpol -f -<<EOF
apiVersion: extensions/v1beta1
kind: NetworkPolicy
metadata:
  name: allow-from-same-namespace
spec:
  ingress:
  - from:
    - podSelector: {}
  podSelector: {}
  policyTypes:
  - Ingress
EOF

#
# Repeat the busybox test and note this is now successful:
#

$ oc run -n netpol --generator=run-pod/v1 busybox --rm -ti --image=busybox -- /bin/sh
If you don't see a command prompt, try pressing enter.
/ # wget --spider --timeout 30 echo-service.netpol:5678
Connecting to echo-service.netpol:5678 (172.30.187.191:5678)
remote file exists

#
# Scale-up machinesets so that we label the new nodes as "infra" nodes.
#

$ oc get machinesets --all-namespaces
NAMESPACE               NAME                                   DESIRED   CURRENT   READY   AVAILABLE   AGE
openshift-machine-api   amcdermo-428-x2r5f-worker-us-east-1a   1         1         1       1           3h36m
openshift-machine-api   amcdermo-428-x2r5f-worker-us-east-1b   1         1         1       1           3h36m
openshift-machine-api   amcdermo-428-x2r5f-worker-us-east-1c   1         1         1       1           3h36m
openshift-machine-api   amcdermo-428-x2r5f-worker-us-east-1d   0         0                             3h36m
openshift-machine-api   amcdermo-428-x2r5f-worker-us-east-1e   0         0                             3h36m
openshift-machine-api   amcdermo-428-x2r5f-worker-us-east-1f   0         0                             3h36m

$ oc scale -n openshift-machine-api machinesets/amcdermo-428-x2r5f-worker-us-east-1a --replicas=2
$ oc scale -n openshift-machine-api machinesets/amcdermo-428-x2r5f-worker-us-east-1b --replicas=2

$ oc get nodes --no-headers --sort-by=.metadata.creationTimestamp |cat -n
     1	ip-10-0-154-178.ec2.internal   Ready   master   3h42m   v1.14.6+6ac6aa4b0
     2	ip-10-0-174-223.ec2.internal   Ready   master   3h41m   v1.14.6+6ac6aa4b0
     3	ip-10-0-133-226.ec2.internal   Ready   master   3h41m   v1.14.6+6ac6aa4b0
     4	ip-10-0-128-203.ec2.internal   Ready   worker   3h36m   v1.14.6+6ac6aa4b0
     5	ip-10-0-153-114.ec2.internal   Ready   worker   3h36m   v1.14.6+6ac6aa4b0
     6	ip-10-0-174-174.ec2.internal   Ready   worker   3h36m   v1.14.6+6ac6aa4b0
     7	ip-10-0-133-207.ec2.internal   Ready   worker   75s     v1.14.6+6ac6aa4b0
     8	ip-10-0-149-154.ec2.internal   Ready   worker   57s     v1.14.6+6ac6aa4b0

$ oc label node ip-10-0-133-207.ec2.internal node-role.kubernetes.io/worker-
$ oc label node ip-10-0-133-207.ec2.internal node-role.kubernetes.io/infra=""

$ oc label node ip-10-0-149-154.ec2.internal node-role.kubernetes.io/worker-
$ oc label node ip-10-0-149-154.ec2.internal node-role.kubernetes.io/infra=""

$ oc get nodes --no-headers --sort-by=.metadata.creationTimestamp |cat -n
     1	ip-10-0-154-178.ec2.internal   Ready   master   3h44m   v1.14.6+6ac6aa4b0
     2	ip-10-0-174-223.ec2.internal   Ready   master   3h44m   v1.14.6+6ac6aa4b0
     3	ip-10-0-133-226.ec2.internal   Ready   master   3h44m   v1.14.6+6ac6aa4b0
     4	ip-10-0-128-203.ec2.internal   Ready   worker   3h39m   v1.14.6+6ac6aa4b0
     5	ip-10-0-153-114.ec2.internal   Ready   worker   3h39m   v1.14.6+6ac6aa4b0
     6	ip-10-0-174-174.ec2.internal   Ready   worker   3h38m   v1.14.6+6ac6aa4b0
     7	ip-10-0-133-207.ec2.internal   Ready   infra    3m38s   v1.14.6+6ac6aa4b0
     8	ip-10-0-149-154.ec2.internal   Ready   infra    3m20s   v1.14.6+6ac6aa4b0

#
# These are the nodes the router is currently running on:
#

$ oc get pods -n openshift-ingress -o wide
NAME                              READY   STATUS    RESTARTS   AGE     IP           NODE                           NOMINATED NODE   READINESS GATES
router-default-6844655845-dfswx   1/1     Running   0          3h40m   10.129.2.7   ip-10-0-174-174.ec2.internal   <none>           <none>
router-default-6844655845-z4ltd   1/1     Running   0          3h40m   10.128.2.4   ip-10-0-153-114.ec2.internal   <none>           <none>

### patch-ingress-deployment.yaml
spec:
  nodePlacement:
    nodeSelector:
      matchLabels:
        node-role.kubernetes.io/infra: ""

#
# Patch the deployment so that they should run only on "infra" nodes.
#

$ oc patch ingresscontroller/default -n openshift-ingress-operator --patch "$(cat patch-ingress-deployment.yaml)" --type=merge

$ oc get pods -n openshift-ingress -o wide
NAME                              READY   STATUS    RESTARTS   AGE   IP           NODE                           NOMINATED NODE   READINESS GATES
router-default-5b6b8f5d94-7xbxf   1/1     Running   0          64s   10.131.2.4   ip-10-0-149-154.ec2.internal   <none>           <none>
router-default-5b6b8f5d94-zkt22   1/1     Running   0          43s   10.130.2.4   ip-10-0-133-207.ec2.internal   <none>           <none>

$ oc get node -l node-role.kubernetes.io/infra
NAME                           STATUS   ROLES   AGE   VERSION
ip-10-0-149-154.ec2.internal   Ready    infra   23m   v1.14.6+6ac6aa4b0
ip-10-0-133-207.ec2.internal   Ready    infra   23m   v1.14.6+6ac6aa4b0

##################################################
##################################################

The essence of this bug is:

   When creating the below NetworkPolicy in OCP 4.2 to allow ingress
   only from router pods and pods in same namespace, it works as
   expected when the router pods are scheduled onto worker nodes, but
   not when they are scheduled onto dedicated infrastructure nodes.

However, I still see the network policy working as it did when the
router pods were running on "worker" nodes.

##################################################
##################################################

#
# External access through the router still works:
#

$ curl echo-service-netpol.apps.amcdermo-428.devcluster.openshift.com
echoecho

#
# And repeating the busybox exercise I can still access the
# service/endpoint even though the nodes are "infra" nodes.
#

$ oc run -n netpol --generator=run-pod/v1 busybox --rm -ti --image=busybox -- /bin/sh
If you don't see a command prompt, try pressing enter.
/ # wget --spider --timeout 30 echo-service.netpol:5678
Connecting to echo-service.netpol:5678 (172.30.187.191:5678)
remote file exists

These steps are against openshift-4.2 (AWS) but I have also done this
against master (GCP) and observe the same behaviour.

Comment 10 Andrew McDermott 2019-11-21 18:26:57 UTC
I tried the same steps/procedure in comment #9 on:

$ oc version
Client Version: v4.2.0-alpha.0-85-g75a68b3
Server Version: 4.1.24
Kubernetes Version: v1.13.4+09f0e83

and was not able to reproduce the bug there either.

Comment 17 Borja Aranda 2019-12-05 18:16:54 UTC

*** This bug has been marked as a duplicate of bug 1768608 ***

Comment 18 Robert Bohne 2020-02-19 09:15:36 UTC
Reopening, because bug 1768608 is a doc bug. That is a kind of workaround, not a technical solution: the workaround of allowing the default namespace with netid 0 allows ALL traffic from netid 0, not only the router pods, so it's not really secure!

Comment 19 Dan Mace 2020-02-21 14:59:16 UTC
Reassigning to SDN, as that team owns network policy.

Comment 20 Juan Luis de Sousa-Valadas 2020-02-24 15:43:50 UTC
Hi, this was a docs bug and it's fixed in BZ1768608

TL;DR: If you have endpointPublishingStrategy: HostNetwork and you want to allow that traffic, you must allow the netnamespace with netid 0:
$ oc label namespace default 'network.openshift.io/policy-group=ingress'
and add a namespaceSelector matching network.openshift.io/policy-group=ingress to your NetworkPolicy.

Fixed doc: https://docs.openshift.com/container-platform/4.2/networking/configuring-networkpolicy.html#nw-networkpolicy-multitenant-isolation_configuring-networkpolicy-plugin
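
For example, a minimal sketch of the documented approach (the policy name and the target namespace "netpol" are illustrative):

$ oc label namespace default 'network.openshift.io/policy-group=ingress'
$ oc apply -n netpol -f -<<EOF
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-from-openshift-ingress
spec:
  ingress:
  - from:
    - namespaceSelector:
        matchLabels:
          network.openshift.io/policy-group: ingress
  podSelector: {}
  policyTypes:
  - Ingress
EOF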

*** This bug has been marked as a duplicate of bug 1768608 ***

Comment 21 Juan Luis de Sousa-Valadas 2020-02-24 16:20:36 UTC
Hi Robert,

> Reopening, because bug 1768608 is a doc bug. That is a kind of workaround, not a technical solution: the workaround of allowing the default namespace with netid 0 allows ALL traffic from netid 0, not only the router pods, so it's not really secure!
This is working exactly as intended. Pods with hostNetwork: true share the host's network namespace (as in Linux namespaces) with the container runtime (CRI-O or Docker). Hence, there is no way for the SDN to know which pod the traffic is coming from; we can't even tell whether the traffic comes from a pod at all. If you want something more granular at the NetworkPolicy level, don't use host networking.

If you have router pods on only a small subset of nodes, you may allow traffic from an ipBlock and specify the tun0 IP addresses of those nodes; that restricts the traffic to that subset of nodes. https://kubernetes.io/docs/concepts/services-networking/network-policies/#networkpolicy-resource . I haven't actually tested it, but it should work.
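
An untested sketch of that ipBlock approach (the policy name is illustrative and the CIDRs are placeholders for the actual tun0 addresses of the nodes running router pods):

apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-from-router-node-ips
spec:
  podSelector: {}
  policyTypes:
  - Ingress
  ingress:
  - from:
    - ipBlock:
        cidr: 10.128.2.1/32
    - ipBlock:
        cidr: 10.131.2.1/32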

