Bug 1768608 - NetworkPolicy not applied to router [NEEDINFO]
Summary: NetworkPolicy not applied to router
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: Documentation
Version: 4.2.z
Hardware: Unspecified
OS: Unspecified
Priority: medium
Severity: urgent
Target Milestone: ---
Target Release: 4.4.0
Assignee: Jason Boxman
QA Contact: Xiaoli Tian
Docs Contact: Vikram Goyal
URL:
Whiteboard:
Duplicates: 1749844 1764220 1769534
Depends On:
Blocks:
 
Reported: 2019-11-04 19:47 UTC by Rob Cernich
Modified: 2020-09-09 14:38 UTC
CC List: 24 users

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2020-02-21 17:59:38 UTC
Target Upstream Version:
pbertera: needinfo? (dmace)
fshaikh: needinfo? (dmace)
jdesousa: needinfo? (fshaikh)




Links
System ID Priority Status Summary Last Updated
Github openshift openshift-docs pull 18871 None closed Fix netpol documentation for IngressControllers with hostSubnet 2020-11-17 13:29:32 UTC
Red Hat Issue Tracker MAISTRA-1180 Major Closed ingress gateway NetworkPolicy causes intermittent failures for gateway route 2020-12-02 14:35:24 UTC

Description Rob Cernich 2019-11-04 19:47:41 UTC
Description of problem:

Router cannot access pods in namespaces where NetworkPolicy resources restrict access.  The documented way of allowing ingress from the router does not work.

apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-from-openshift-ingress
spec:
  ingress:
  - from:
    - namespaceSelector:
        matchLabels:
          network.openshift.io/policy-group: ingress
  podSelector: {}
  policyTypes:
  - Ingress

https://docs.openshift.com/container-platform/4.2/networking/configuring-networkpolicy.html#nw-networkpolicy-about_configuring-networkpolicy-plugin


Version-Release number of selected component (if applicable):


How reproducible:

Always.


Steps to Reproduce:
1. Create an application with a Service that is exposed through a Route.
2. Create a NetworkPolicy that restricts ingress to the namespace containing the application (e.g. using the deny-all example NetworkPolicy).
3. Create a NetworkPolicy that allows ingress from the router namespace (as documented above).
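
For reference, the deny-all policy mentioned in step 2 would look along these lines (a sketch based on the documented example; the policy name is arbitrary):

```yaml
# Deny-all ingress policy for the application namespace.
# An empty podSelector matches every pod in the namespace;
# an empty ingress list means no ingress traffic is allowed.
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: deny-by-default
spec:
  podSelector: {}
  ingress: []
  policyTypes:
  - Ingress
```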

Actual results:

Route times out (i.e. the application is no longer accessible).

Expected results:

Route works just as it did before applying network policy.


Additional info:

This bug effectively prevents restricting ingress to namespaces that have routes.  The only workaround is to allow ingress from all namespaces.

FWIW, other pods in the openshift-ingress namespace can access the service/pods with the documented NetworkPolicy (e.g. a deployed busybox container can curl the service/pod, but curling from one of the router pods fails).

Comment 1 Miciah Dashiel Butler Masters 2019-11-04 22:33:40 UTC
I could not reproduce the issue on 4.2.2:

    % oc create -f ~/src/github.com/openshift/origin/examples/hello-openshift/hello-pod.json
    pod/hello-openshift created
    % oc expose pod hello-openshift
    service/hello-openshift exposed
    % oc expose svc hello-openshift
    route.route.openshift.io/hello-openshift exposed
    % oc get routes
    NAME              HOST/PORT                                                                            PATH   SERVICES          PORT   TERMINATION   WILDCARD
    hello-openshift   hello-openshift-default.apps.ci-ln-8xhxhnk-d5d6b.origin-ci-int-aws.dev.rhcloud.com          hello-openshift   8080                 None
    % oc create -f -
    apiVersion: networking.k8s.io/v1
    kind: NetworkPolicy
    metadata:
      name: allow-from-openshift-ingress
    spec:
      ingress:
      - from:
        - namespaceSelector:
            matchLabels:
              network.openshift.io/policy-group: ingress
      podSelector: {}
      policyTypes:
      - Ingress
    networkpolicy.networking.k8s.io/allow-from-openshift-ingress created
    % oc create -f -
    kind: NetworkPolicy
    apiVersion: networking.k8s.io/v1
    metadata:
      name: deny-by-default
    spec:
      podSelector:
      ingress: []
    networkpolicy.networking.k8s.io/deny-by-default created
    % curl -s -o /dev/null -w $'%{http_code}\n' http://hello-openshift-default.apps.ci-ln-8xhxhnk-d5d6b.origin-ci-int-aws.dev.rhcloud.com
    200

I also tried restarting the router:

    % oc -n openshift-ingress delete pods -l ingresscontroller.operator.openshift.io/deployment-ingresscontroller=default
    pod "router-default-858574b9c7-2fff2" deleted
    pod "router-default-858574b9c7-rdg6z" deleted

I still got 200s from the route.

Did I miss a step?

Can you check that the openshift-ingress namespace is labeled correctly? Check `oc get ns/openshift-ingress -o yaml` to make sure the "network.openshift.io/policy-group: ingress" label is present.

Comment 2 Casey Callendrello 2019-11-05 10:51:20 UTC
Chances are you moved your router to HostNetwork?

In accordance with upstream, host-network pods are generally not selectable by NetworkPolicy, since the source IP of their traffic is not guaranteed. As it happens, in OpenShift you can allow host-network pods by allowing the default namespace. This needs to be better documented, but it is not a bug.

Comment 3 Rob Cernich 2019-11-05 15:42:04 UTC
It does appear the router deployment is configured with hostNetwork: true.  The instance was set up through quicklab.  Does the default router installation use host networking, or is this something that quicklab configures as part of its install?

Comment 4 Miciah Dashiel Butler Masters 2019-11-06 08:36:05 UTC
The ingress operator uses a LoadBalancer-type service on AWS, Azure, and GCP and uses host network on other platforms.

Comment 5 Casey Callendrello 2019-11-06 10:03:41 UTC
Interesting.

HostNetwork pods put us in a bit of an awkward place: we can't select them as pods, but thanks to some cute hacks that exist *only in openshift-sdn*, we logically stuff all host-network pods into the "default" namespace.

Which means that any technical solution we come up with has a big fat asterisk: if you're not using openshift-sdn, then it won't work. You also cannot select ingress pods by pod.

I think the solution is for openshift-sdn to label the Default namespace with a special label-selector (network.openshift.io/policy-group: host-network) and update our documentation.
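
If that label existed, the corresponding policy would look something like the following. (Hypothetical: the `network.openshift.io/policy-group: host-network` label is only a proposal at this point in the thread and is not set by any shipped release; the policy name is made up.)

```yaml
# Hypothetical policy allowing ingress from host-network pods,
# assuming openshift-sdn labeled the default namespace with
# network.openshift.io/policy-group: host-network as proposed above.
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-from-host-network
spec:
  ingress:
  - from:
    - namespaceSelector:
        matchLabels:
          network.openshift.io/policy-group: host-network
  podSelector: {}
  policyTypes:
  - Ingress
```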

Comment 6 Rob Cernich 2019-11-06 15:11:17 UTC
FWIW, the router is no longer in the default namespace; it now resides in openshift-ingress.  Also, the documented way to allow ingress from the router namespace is to have your NetworkPolicy key off the network.openshift.io/policy-group=ingress label.

Comment 7 Casey Callendrello 2019-11-06 15:28:36 UTC
Yup, but selecting on the openshift-ingress namespace is ineffectual when the pods are host-network. It's just a nasty footgun as part of Kubernetes network policy.

Comment 8 Rob Cernich 2019-11-06 15:30:30 UTC
Understood, but it renders NetworkPolicy useless for isolating namespaces if you want to expose a route out of one of them.

Comment 9 Ben Bennett 2019-11-08 15:50:16 UTC
This is behaving as expected.

The solution for this is for us to use NodePort services and then we can run without hostnetwork.

Comment 10 Rob Cernich 2019-11-08 17:14:58 UTC
I'm not sure how you can say this is working as expected, when the expected behavior is to use NetworkPolicy to restrict ingress to pods in a namespace while still allowing ingress from the router, as described in the documentation.  It may be that NetworkPolicy rules not applying when hostNetwork=true is "as designed," but that is simply the reason the expected behavior is not seen.  From a user's point of view, I don't really care what's going on behind the scenes; I just want the router to be able to connect to my pods when I'm using NetworkPolicy.

Comment 11 Juan Luis de Sousa-Valadas 2019-11-28 14:49:32 UTC
*** Bug 1749844 has been marked as a duplicate of this bug. ***

Comment 14 Borja Aranda 2019-12-05 18:16:54 UTC
*** Bug 1769534 has been marked as a duplicate of this bug. ***

Comment 15 Daneyon Hansen 2019-12-16 17:11:51 UTC
To clarify https://bugzilla.redhat.com/show_bug.cgi?id=1768608#c13, OCP 4 does not use host networking by default [1]. Host networking [2] can be used with any of the supported providers. As of [3], users have the ability to define the default ingresscontroller created by the installer.

[1] https://github.com/openshift/api/blob/master/operator/v1/types_ingress.go#L75-L77
[2] https://github.com/openshift/api/blob/master/operator/v1/types_ingress.go#L185-L187
[3] https://github.com/openshift/enhancements/blob/master/enhancements/user-defined-default-ingress-controller.md

Comment 16 Fatima 2019-12-31 15:02:04 UTC
Hi team,

Any updates?

Thanks,
Fatima

Comment 17 Daneyon Hansen 2020-01-06 18:33:03 UTC
Reassigning to the sdn team since the router is working as expected. Note that https://github.com/openshift/cluster-ingress-operator/pull/343 added nodePort service type support to the ingresscontroller resource.

Comment 19 Juan Luis de Sousa-Valadas 2020-01-07 10:56:49 UTC
Hi, this is by design, but luckily there is a workaround:

TECHNICAL EXPLANATION:

The main issue here is how the SDN works. The SDN uses VXLAN to encapsulate traffic; VXLAN is essentially Ethernet over UDP, minus the bits that don't make sense in an overlay network (like checksums, which are already covered at lower levels) and plus a field called the VNI (VXLAN Network Identifier).
When you run oc get netnamespace, you will see a NETID field, which is numeric. This number is used to fill the VNI field in the VXLAN packet.

The way the SDN implements namespace selectors is by checking the VNIDs on both sides. So if you are allowing traffic with a namespaceSelector pointing to a namespace with VNI 1234 into a project with VNI 5678, Open vSwitch will have a set of rules which look like:
(NOTE: this is pseudo syntax for clarity)
IF VNI_project1 = 1234 AND VNI_project2 = 5678 THEN allow traffic
IF VNI_project2 = 5678 THEN drop traffic

The problem is that because host-network pods are *not* running on the SDN, the traffic they generate *always* has VNI 0, and you are allowing VNI 1234, not VNI 0. Therefore the flows we really want in this scenario are:
IF VNI_project1 = 0 AND VNI_project2 = 5678 THEN allow traffic
IF VNI_project2 = 5678 THEN drop traffic

WORKAROUND:
(I'm copying and pasting this from the duplicate bug 1749844)
Create a project with VNI 0, and allow traffic from it:
1. Create a project named hn-workaround.
2. Run: oc get netnamespace hn-workaround -o yaml | sed 's/netid:.*/netid: 0/' | oc replace -f -
   (Patches don't seem to work; I haven't investigated why, but oc replace works fine.)
3. Restart every sdn pod. (I also tried creating the netnamespace before the project, but either way seems to need a restart; again, I haven't dug into why.)
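
Equivalently, the patched NetNamespace from step 2 should end up looking roughly like this (a sketch; NetNamespace is an openshift-sdn-specific resource, and hn-workaround is just the example project name from above):

```yaml
# NetNamespace for the workaround project, with its netid forced to 0
# so that traffic from host-network pods (which always carries VNI 0)
# matches namespaceSelectors pointing at this project.
apiVersion: network.openshift.io/v1
kind: NetNamespace
metadata:
  name: hn-workaround
netname: hn-workaround
netid: 0
```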

TODO ENGINEERING:
The CNO should add a netnamespace with netid: 0, like we do when using multitenant, along with proper documentation for it.
https://github.com/openshift/cluster-network-operator/blob/master/bindata/network/openshift-sdn/004-multitenant.yaml

Comment 20 Juan Luis de Sousa-Valadas 2020-01-07 12:59:20 UTC
I see the default project already has netid: 0, so we only need a documentation change.
Here's the pull request for it: https://github.com/openshift/openshift-docs/pull/18871

Comment 21 Glenn Sommer 2020-01-08 10:20:11 UTC
Ours is a UPI installation on VMware, following the OpenShift 4.2 documentation and using the OpenShift-provided ingress HAProxy instances.
We are facing the exact same problem: the NetworkPolicy for ingress does not seem to match the OpenShift ingress pods. (All network policy is set up as per the OpenShift documentation, exactly as Rob mentions.)

I tested the proposed pull-request https://github.com/openshift/openshift-docs/pull/18871 - by adding the label "network.openshift.io/policy-group=ingress" to the 'default' project.
And I can confirm this seems to resolve the issue.
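
In command form, the change tested here is a one-liner (this assumes cluster-admin rights; the label key and value are taken from the proposed docs change in PR 18871):

```shell
# Label the default namespace so the documented
# allow-from-openshift-ingress policy also matches traffic from
# host-network router pods (which openshift-sdn attributes to "default").
oc label namespace default network.openshift.io/policy-group=ingress
```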

Comment 22 Jason Boxman 2020-01-28 22:41:51 UTC
I've created a new PR for the content:

https://github.com/openshift/openshift-docs/pull/19360

Comment 23 Jason Boxman 2020-01-28 22:53:28 UTC
*** Bug 1764220 has been marked as a duplicate of this bug. ***

Comment 24 Juan Luis de Sousa-Valadas 2020-02-24 15:43:50 UTC
*** Bug 1769534 has been marked as a duplicate of this bug. ***

Comment 26 Juan Luis de Sousa-Valadas 2020-09-09 14:38:12 UTC
Hi Rajeeb,

> 1. Why is the label `network.openshift.io/policy-group: ingress` not applied by default when the endpoint publishing strategy is HostNetwork?
That would be ambiguous, because other components also use host networking; we should have a dedicated label for it, called `network.openshift.io/policy-group: hostnetwork` or something like that. I'm not sure which component is responsible for labeling the default namespace, but I'll look into it; that request makes perfect sense.

> why can't OpenShift leave the NETID of the pods to be part of the SDN?

For two reasons:
1. Specification: a pod with hostNetwork: true literally means "put my pod in the root network namespace and don't treat it any differently from the rest of the node." If we treated its traffic differently from traffic coming from the node itself, we wouldn't be complying with the specification, and we wouldn't be creating what you asked for (even if we created what you wanted).
2. Technical: it's impossible for us to distinguish, in the virtual switch, the traffic of two different containers sharing the same network namespace. And even if we could do it technically, we wouldn't, because of the first reason.

> Why does OpenShift change the VNIDs of our pods to 0?
Because hostNetwork: true means the pod runs at host level, and at host level traffic to the SDN carries VNID 0. Strictly speaking, we don't set the VNID for this traffic: 0 is the default value, so technically the other pods get their VNID changed and this one doesn't.


> For consistent behaviour, I believe OpenShift should:
>   1- Better handle the NETIDs of pods when on HostNetwork, to emulate the User Experience when on non-host networks (i.e. setting the correct NETID on pods).
We can't do that, for the two reasons explained above.

>   2- Label the default namespace if that will be the only one with the 0 netID, like <removed> mentioned.
I agree with this; it makes perfect sense, and I will look into it.

If you want a container that runs on the SDN but listens on a node port, you can use a pod with a hostPort; I don't know whether the ingress operator supports that, though. https://kubernetes.io/docs/concepts/cluster-administration/networking/
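
As a sketch, a hostPort pod along these lines stays on the SDN (and therefore keeps its namespace's VNID) while still listening on a node port; the pod name, image, and port numbers here are illustrative only:

```yaml
# Pod that stays on the SDN but exposes a port on the node itself.
# Unlike hostNetwork: true, the pod keeps its own network namespace,
# so NetworkPolicy namespaceSelectors still apply to its traffic.
apiVersion: v1
kind: Pod
metadata:
  name: sdn-pod-with-hostport        # illustrative name
spec:
  containers:
  - name: web
    image: registry.example.com/hello-openshift:latest  # illustrative image
    ports:
    - containerPort: 8080
      hostPort: 8080                 # exposed on the node's own IP
```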

