Description of problem:

Router cannot access pods in namespaces where NetworkPolicy resources restrict access. The documented way of allowing ingress from the router does not work:

apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-from-openshift-ingress
spec:
  ingress:
  - from:
    - namespaceSelector:
        matchLabels:
          network.openshift.io/policy-group: ingress
  podSelector: {}
  policyTypes:
  - Ingress

https://docs.openshift.com/container-platform/4.2/networking/configuring-networkpolicy.html#nw-networkpolicy-about_configuring-networkpolicy-plugin

Version-Release number of selected component (if applicable):

How reproducible:
Always.

Steps to Reproduce:
1. Create an application with a Service that is exposed through a Route.
2. Create a NetworkPolicy that restricts ingress to the namespace containing the application (e.g. using the deny-all example NetworkPolicy).
3. Create a NetworkPolicy that allows ingress from the router namespace (as documented above).

Actual results:
Route times out (i.e. the application is no longer accessible).

Expected results:
Route works just as it did before applying network policy.

Additional info:
This bug effectively prevents restricting ingress to namespaces that have routes. The only workaround is to allow ingress from all namespaces. FWIW, other pods in the openshift-ingress namespace can access the service/pods with the documented NetworkPolicy (e.g. a deployed busybox container can curl the service/pod, but curling from one of the router pods fails).
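For reference, a deny-all policy of the kind referred to in step 2 looks like this (a minimal sketch based on the documented example; the policy name is arbitrary):

```yaml
# Denies all ingress to pods in the namespace where it is created:
# the empty podSelector matches every pod, and no ingress rules are allowed.
kind: NetworkPolicy
apiVersion: networking.k8s.io/v1
metadata:
  name: deny-by-default
spec:
  podSelector: {}
  ingress: []
```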
I could not reproduce the issue on 4.2.2:

% oc create -f ~/src/github.com/openshift/origin/examples/hello-openshift/hello-pod.json
pod/hello-openshift created
% oc expose pod hello-openshift
service/hello-openshift exposed
% oc expose svc hello-openshift
route.route.openshift.io/hello-openshift exposed
% oc get routes
NAME              HOST/PORT                                                                            PATH   SERVICES          PORT   TERMINATION   WILDCARD
hello-openshift   hello-openshift-default.apps.ci-ln-8xhxhnk-d5d6b.origin-ci-int-aws.dev.rhcloud.com          hello-openshift   8080                 None
% oc create -f -
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-from-openshift-ingress
spec:
  ingress:
  - from:
    - namespaceSelector:
        matchLabels:
          network.openshift.io/policy-group: ingress
  podSelector: {}
  policyTypes:
  - Ingress
networkpolicy.networking.k8s.io/allow-from-openshift-ingress created
% oc create -f -
kind: NetworkPolicy
apiVersion: networking.k8s.io/v1
metadata:
  name: deny-by-default
spec:
  podSelector:
  ingress: []
networkpolicy.networking.k8s.io/deny-by-default created
% curl -s -o /dev/null -w $'%{http_code}\n' http://hello-openshift-default.apps.ci-ln-8xhxhnk-d5d6b.origin-ci-int-aws.dev.rhcloud.com
200

I also tried restarting the router:

% oc -n openshift-ingress delete pods -l ingresscontroller.operator.openshift.io/deployment-ingresscontroller=default
pod "router-default-858574b9c7-2fff2" deleted
pod "router-default-858574b9c7-rdg6z" deleted

I still got 200s from the route.

Did I miss a step? Can you check that the openshift-ingress namespace is labeled correctly? Check `oc get ns/openshift-ingress -o yaml` to make sure the "network.openshift.io/policy-group: ingress" label is present.
Chances are your router has been moved to HostNetwork. In accordance with upstream behavior, host-network pods are generally unselectable by NetworkPolicy, since the source IP of their traffic is not guaranteed. It happens that in OpenShift you can allow host-network pods by allowing the default namespace. This needs to be better documented, but it is not a bug.
It does appear the router deployment is configured with hostNetwork: true. The instance was set up through quicklab. Does the default router installation use host network, or would this be something that quicklab is configuring as part of their install?
The ingress operator uses a LoadBalancer-type service on AWS, Azure, and GCP and uses host network on other platforms.
Interesting. HostNetwork pods put us in a bit of an awkward place; we can't select them as pods, but thanks to some cute hacks that are *only in openshift-sdn* we logically stuff all host-network pods into the "default" namespace. Which means that any technical solution we come up with carries a big fat asterisk: if you're not using openshift-sdn, it won't work. You also cannot select ingress pods by pod. I think the solution is for openshift-sdn to label the default namespace with a special label (network.openshift.io/policy-group: host-network) and update our documentation.
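If that proposal were adopted, a policy allowing traffic from host-network pods might look like the following. This is purely hypothetical: the network.openshift.io/policy-group: host-network label is only a suggestion in this comment, not something the product sets today.

```yaml
# Hypothetical sketch: allows ingress from whichever namespace carries the
# proposed host-network label (under openshift-sdn, that would be "default").
kind: NetworkPolicy
apiVersion: networking.k8s.io/v1
metadata:
  name: allow-from-host-network
spec:
  podSelector: {}
  ingress:
  - from:
    - namespaceSelector:
        matchLabels:
          network.openshift.io/policy-group: host-network
  policyTypes:
  - Ingress
```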
FWIW, the router is no longer in the default namespace; it now resides in openshift-ingress. Also, the documented way of allowing ingress from the router namespace is to have your NetworkPolicy key off the network.openshift.io/policy-group=ingress label.
Yup, but selecting on the openshift-ingress namespace is ineffectual when the pods are host-network. It's just a nasty footgun in Kubernetes network policy.
Understood, but it renders NetworkPolicy useless for isolating namespaces if you want to expose a route out of one of them.
This is behaving as expected. The solution for this is for us to use NodePort services and then we can run without hostnetwork.
I'm not sure how you can say this is working as expected, when the expected behavior is to use NetworkPolicy to restrict ingress to pods in a namespace while still allowing ingress from the router, as described in the documentation. It may be that NetworkPolicy rules not applying when hostNetwork=true is "as designed," but that is simply the reason the expected behavior is not being seen. From a user's point of view, I don't really care what's going on behind the scenes; I just want the router to be able to connect to my pods when I'm using NetworkPolicy.
*** Bug 1749844 has been marked as a duplicate of this bug. ***
*** Bug 1769534 has been marked as a duplicate of this bug. ***
To clarify https://bugzilla.redhat.com/show_bug.cgi?id=1768608#c13, OCP 4 does not use host networking by default [1]. Host networking [2] can be used with any of the supported providers. As of [3], users have the ability to define the default ingresscontroller created by the installer. [1] https://github.com/openshift/api/blob/master/operator/v1/types_ingress.go#L75-L77 [2] https://github.com/openshift/api/blob/master/operator/v1/types_ingress.go#L185-L187 [3] https://github.com/openshift/enhancements/blob/master/enhancements/user-defined-default-ingress-controller.md
Hi team, Any updates? Thanks, Fatima
Reassigning to the sdn team since the router is working as expected. Note that https://github.com/openshift/cluster-ingress-operator/pull/343 added nodePort service type support to the ingresscontroller resource.
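Based on that PR, switching the default ingresscontroller to a NodePort service (so the router pods no longer need host networking) might look roughly like this. A sketch only: it assumes the NodePortService endpoint publishing strategy added by cluster-ingress-operator PR 343 is available in your version, and the strategy is typically only settable at ingresscontroller creation time.

```yaml
# Hypothetical IngressController using a NodePort service instead of
# host networking, so router pods stay on the SDN and are selectable
# by NetworkPolicy.
apiVersion: operator.openshift.io/v1
kind: IngressController
metadata:
  name: default
  namespace: openshift-ingress-operator
spec:
  endpointPublishingStrategy:
    type: NodePortService
```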
Hi, this is intended by design, but luckily there is a workaround.

TECHNICAL EXPLANATION:

The main issue here is how the SDN works. The SDN uses VXLAN to encapsulate traffic; VXLAN is essentially Ethernet over UDP, removing the bits that don't make sense in an overlay network (like checksums, which are already covered at lower levels) and adding a field called VNI (VXLAN Network Identifier). When you run oc get netnamespace, you will see a NETID field, which is a numeric field. This numeric field is used to fill the VNI field in the VXLAN packet.

The way the SDN handles namespace selectors is by checking the VNIDs on both sides. So if you are allowing traffic with a namespaceSelector pointing to a namespace with VNI 1234 in a project with VNI 5678, Open vSwitch will have a set of rules which look like this (NOTE: this is pseudo syntax for clarity):

  IF VNI_project1 = 1234 AND VNI_project2 = 5678 THEN allow traffic
  IF VNI_project2 = 5678 THEN drop traffic

The problem is that because host-network pods are *not* running on the SDN, the traffic they generate *always* has VNI 0, and you are allowing VNI 1234, not VNI 0. Therefore the flows we really want in this scenario are:

  IF VNI_project1 = 0 AND VNI_project2 = 5678 THEN allow traffic
  IF VNI_project2 = 5678 THEN drop traffic

WORKAROUND: (I'm copying and pasting this from the duplicate bug 1749844)

Create a project with VNI 0 and allow traffic from it:

1. Create a project hn-workaround.
2. Run:
   oc get netnamespace hn-workaround -o yaml | sed 's/netid:.*/netid: 0/' | oc replace -f -
   (Patches don't seem to work; I haven't bothered enough to investigate why, but oc replace works fine.)
3. Restart every sdn pod. (I also tried creating the netnamespace before the project, but either way seems to need a restart; again, I haven't bothered to understand why.)

TODO ENGINEERING:
The CNO should add a netnamespace with netid: 0, like we do when using multitenant, plus proper documentation for it.
https://github.com/openshift/cluster-network-operator/blob/master/bindata/network/openshift-sdn/004-multitenant.yaml
I see the default project already has netid:0, therefore we only need to make a documentation change. Here's the pull request for it: https://github.com/openshift/openshift-docs/pull/18871
Ours is a UPI installation on VMware, following the OpenShift documentation for 4.2 and using the OpenShift-provided ingress HAProxy routers. We are facing the exact same problem: the NetworkPolicy for ingress does not seem to correctly match the OpenShift ingress pods. (All network policy is set up as per the OpenShift documentation, exactly as Rob is also mentioning.) I tested the proposed pull request https://github.com/openshift/openshift-docs/pull/18871 by adding the label "network.openshift.io/policy-group=ingress" to the 'default' project, and I can confirm this seems to resolve the issue.
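For reference, the change tested above amounts to the default namespace carrying the documented policy-group label. A sketch of the relevant fragment (only the label is the substantive change; everything else is the namespace's normal content):

```yaml
# The default namespace with the documented ingress policy-group label added,
# so that allow-from-openshift-ingress policies match host-network router traffic.
apiVersion: v1
kind: Namespace
metadata:
  name: default
  labels:
    network.openshift.io/policy-group: ingress
```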
I've created a new PR for the content: https://github.com/openshift/openshift-docs/pull/19360
*** Bug 1764220 has been marked as a duplicate of this bug. ***
Hi Rajeeb,

> 1. Why label `network.openshift.io/policy-group: ingress` is not given by default if endpoint publishing strategy is HostNetwork

That would be ambiguous, because other components also use host networking; we should have a dedicated label for it, called `network.openshift.io/policy-group: hostnetwork` or something like that. I'm not quite sure which component is responsible for labeling the default namespace, but I'll look into it; that request makes perfect sense.

> why can't OpenShift leave the NETID of the pods to be part of the SDN?

For two reasons:

1. Specification: When you create a pod with hostNetwork: true, that literally means "put my pod in the root network namespace and don't treat it any differently than the rest of the node." If we treated its traffic any differently from the traffic coming from the node itself, we wouldn't be complying with the specification and we wouldn't be creating what you asked for (even if we created what you wanted).

2. Technical: It's impossible for us to distinguish the traffic of two different containers sharing the same network namespace in the virtual switch. And even if we could do it technically, we wouldn't, because of the first reason.

> Why does OpenShift change the VNIDs of our pods to 0?

Because hostNetwork: true means the pod runs at host level, and at host level the traffic to the SDN goes with VNID 0. Also, for this traffic we don't set the VNID; 0 is the default value, so technically the other pods get their VNID changed and this one doesn't.

> For consistent behaviour, I believe OpenShift should:
> 1. Better handle the NETIDs of pods when on HostNetwork, to emulate the user experience on non-host networks (i.e. setting the correct NETID on pods).

We can't do that, for the two reasons explained above.

> 2. Label the default namespace if that will be the only one with the 0 netID, like <removed> mentioned.

I agree with this; it makes perfect sense, and I will look into it.
If you want a container that runs on the SDN but listens on a node port, you can use a pod with a hostPort for that. I don't know if the ingress operator supports it, though. https://kubernetes.io/docs/concepts/cluster-administration/networking/
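A minimal sketch of the hostPort idea mentioned above (the pod name and image are placeholders; this does not reflect how the ingress operator actually deploys routers):

```yaml
# Hypothetical pod that stays on the SDN (no hostNetwork) but is reachable
# on the node's IP via hostPort, so NetworkPolicy can still select it by pod.
apiVersion: v1
kind: Pod
metadata:
  name: hostport-example
spec:
  containers:
  - name: web
    image: example.com/some-web-image:latest  # placeholder image
    ports:
    - containerPort: 8080
      hostPort: 8080   # exposes port 8080 on the node's IP
```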
Hi. This workaround does not work with OVN-Kubernetes network plugin.
Hi, it's not expected to work; please file a new bug.
https://bugzilla.redhat.com/show_bug.cgi?id=1909777
The needinfo request[s] on this closed bug have been removed as they have been unresolved for 120 days