Bug 1576857
| Field | Value |
| --- | --- |
| Summary | Defined NetworkPolicy Object does not control ingress when target pod is on the same node as router pod |
| Product | OpenShift Container Platform |
| Component | Networking |
| Version | 3.7.0 |
| Status | CLOSED CURRENTRELEASE |
| Severity | medium |
| Priority | medium |
| Reporter | Tom Manor <tmanor> |
| Assignee | Dan Winship <danw> |
| QA Contact | zhaozhanqi <zzhao> |
| CC | aos-bugs, bbennett, cdc, dcbw, hongli, mhepburn, pasik, weliang |
| Target Milestone | --- |
| Target Release | --- |
| Hardware | Unspecified |
| OS | Unspecified |
| Doc Type | No Doc Update |
| Type | Bug |
| Last Closed | 2019-08-05 13:59:11 UTC |
| Attachments | recording of nonworking policy (attachment 1434423) |
Description
Tom Manor
2018-05-10 14:48:24 UTC
Created attachment 1434423: recording of nonworking policy
Note: I missed a step in the reproduction process above. After step 5 (Create new project), add step 5a: verify that the deployed workload pod is on the same node as the router pod. This step is necessary to fully recreate the issue. Apologies for the omission.

I saw the same problem when I tested; it is easy to reproduce. Correct me if I am wrong, but it looks like you are accessing the endpoint via the application URL in the last step? NetworkPolicy does not handle this kind of restriction. It is meant to manage connections inside the cluster, but in your case you access the pod via a route, which means the connection leaves the cluster, reaches the external router, and comes back into the cluster. That kind of connection should be treated like normal external access, so this behaves as expected.

@Meng, that is correct. The documentation certainly does not make that clear. It states that a 'deny-all' policy restricts ALL traffic: "To make a project 'deny by default' add a NetworkPolicy object that matches all pods but accepts no traffic."

    kind: NetworkPolicy
    apiVersion: extensions/v1beta1
    metadata:
      name: deny-by-default
    spec:
      podSelector:
      ingress: []

There is certainly a use case where customers want deployed pods to be accessible by other pods but restricted from the browser (i.e., restricted from the router pod). If the proper way to handle this is by removing the route to the service and pod, then this should probably be treated as a documentation bug/RFE and stated more clearly in the networking section of the documentation.

However, there is still an inconsistency. If NetworkPolicy is not intended to apply to the endpoint URL, then that should hold regardless of where the pod is deployed relative to the router pod. In one case, where the target pod and the router pod are on separate nodes, access to the endpoint is restricted. In the other, where the target pod and the router pod are on the same node, access to the endpoint is allowed. While customers can control where pods are deployed using node selectors, they may not care where pods land. We still need to deal with the inconsistency in endpoint access depending on where workload pods are deployed relative to the router pod. Or am I missing something?

I tried this on my test environment today, and I think some of the points in my previous comment were wrong. As you mentioned, router-to-endpoint traffic should also be restricted by the NetworkPolicy, and that is true. By design, connections from a node to pods on that same node are not managed by NetworkPolicy, so if the router node (infra node) also runs as a compute node, the pods on that node remain reachable from the router on the same node. If the infra node is configured not to run user pods, the problem is not hit.

@danw should confirm when he returns from PTO:

1) We expect that NetworkPolicy applies only from SDN pod <-> SDN pod.

2) We expect that NetworkPolicy does not apply to traffic from the host itself (eg coming into the SDN with .1 address of tun0). This includes hostnetwork pods as they are in the hosts network namespace and thus enter the SDN via tun0.

3) Due to what appears to be an omission/bug (though @danw should confirm), traffic from one node that enters the SDN but is destined for a pod on a remote node will be dropped on that remote node because it has VNID=0 (like all tun0 traffic) but OpenFlow table 80 only allows VNID-mismatched traffic from the local node (for health checking/nodeport/etc).
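To illustrate the use case described above (pods reachable by other pods in the project, but not from the browser via the router), a second policy along the following lines could be paired with the deny-by-default object. This is only a minimal sketch, not something from this bug report: it reuses the extensions/v1beta1 API group from the documentation excerpt quoted above (newer releases use networking.k8s.io/v1), and the policy name is illustrative.

```yaml
# Sketch: allow ingress only from other pods in the same namespace.
# Paired with the deny-by-default policy quoted above, pod-to-pod traffic
# inside the project is allowed while any other ingress stays blocked.
kind: NetworkPolicy
apiVersion: extensions/v1beta1   # networking.k8s.io/v1 on newer clusters
metadata:
  name: allow-same-namespace     # illustrative name
spec:
  podSelector:                   # empty selector: applies to every pod in the namespace
  ingress:
  - from:
    - podSelector: {}            # accept traffic only from pods in this namespace
```

Note that, per the behavior described in this bug, traffic entering the SDN from the node itself (including a host-network router pod on the same node) bypasses such a policy entirely, which is exactly the inconsistency being reported.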
(In reply to Dan Williams from comment #7)

> 1) We expect that NetworkPolicy applies only from SDN pod <-> SDN pod

In theory, this is false: NetworkPolicy applies to all traffic going into a pod. Although, eg, I don't think any implementation applies NetworkPolicy to HostPort traffic, etc.

> 2) We expect that NetworkPolicy does not apply to traffic from the host
> itself (eg coming into the SDN with .1 address of tun0).

Correct. This was explicitly required by the original spec proposal, although I'm not sure it's mentioned anywhere in the current documentation...

> This includes
> hostnetwork pods as they are in the hosts network namespace and thus enter
> the SDN via tun0.

Technically, yes, although I'm not sure anyone ever really considered this case. The "traffic from the node itself" exception was really only to support health checks.

At any rate, even if routers are hostNetwork pods, and hostNetwork pods should bypass NetworkPolicy, I don't think routes should bypass NetworkPolicy. (Which is not really a contradiction, since the fact that routes are implemented by hostNetwork pods is just an implementation detail.) So I agree that there's a bug here.

> 3) Due to what appears to be an omission/bug (though @danw should confirm),
> traffic from one node that enters the SDN but is destined for a pod on a
> remote node will be dropped on that remote node because it has VNID=0 (like
> all tun0 traffic) but OpenFlow table 80 only allows VNID-mismatched traffic
> from the local node (for health checking/nodeport/etc).

That's not an omission or bug, that's the defined behavior.

This is not a regression, and you can work around it by separating the infrastructure and the application pods. Pushing to 3.11 since we don't know how to separate the health check traffic from the router traffic.

I am marking this fixed, mostly because the router pod is no longer in HostNetwork starting with 4.1. While the "problem" isn't fixed, we no longer have significant host-network components besides the apiserver. In fact, I would veto any changes to 3.11 that would fix it. It's too big a change to introduce in a z-stream release.
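For completeness, here is a hedged sketch of the workaround referred to above: keep application pods off the router's (infra) node so that router-to-pod traffic always crosses nodes. The pod name, image, and label below are assumptions, not taken from this bug; use whatever labels actually distinguish compute nodes from infra nodes in the cluster.

```yaml
# Sketch of the scheduling workaround: pin application pods to compute
# nodes so they never share a node with the host-network router pod.
apiVersion: v1
kind: Pod
metadata:
  name: example-app              # hypothetical name
spec:
  nodeSelector:
    region: primary              # assumed label; anything that excludes infra nodes works
  containers:
  - name: app
    image: example/app:latest    # hypothetical image
```

The same nodeSelector can be placed in a deployment's pod template, or applied project-wide via the project's default node selector.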