Created attachment 1434421 [details]
recording of working policy
Description of problem:
When workload pods are deployed onto the same node as a router pod and you define a NetworkPolicy (such as a deny-all), traffic is still allowed to the endpoint; traffic does, however, appear restricted on the service.
Discovered while testing BZ 1569244
Version-Release number of selected component (if applicable):
Steps to Reproduce:
Using CEE QuickLabs and per the OCP documentation:
1. Updated master-config.yaml on all masters:
2. Updated node-config.yaml on all masters:
3. Updated node-config.yaml on all nodes:
4. Performed restarts of services as defined in BZ 1569244 (Comment #4), since the documented list of services to restart is incomplete:
systemctl restart iptables
systemctl restart openvswitch
systemctl restart docker
systemctl restart atomic-openshift-master-api.service
systemctl restart atomic-openshift-master-controllers.service
On masters and nodes:
systemctl restart atomic-openshift-node.service
5. Create new project (simple helloworld PHP using console, for example)
6. oc project <new project>
7. Create deny-all NetworkPolicy on project
8. Execute curl on service IP/port
9. Execute curl on endpoint URL
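The deny-all policy from step 7 can be expressed as a minimal NetworkPolicy (the name is illustrative): an empty podSelector matches every pod in the project, and omitting all ingress rules means no traffic is accepted.

```yaml
kind: NetworkPolicy
apiVersion: networking.k8s.io/v1
metadata:
  name: deny-by-default
spec:
  # Empty podSelector selects all pods in the namespace
  podSelector: {}
  # No ingress rules are listed, so all inbound traffic is denied
```

Applied with, e.g., `oc create -f deny-by-default.yaml` in the project under test.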
Actual results:
NetworkPolicy works on the service IP/port, but does not work on the endpoint URL

Expected results:
NetworkPolicy should work on both the service IP/port and the endpoint URL
Attached are two recordings, one showing the steps reflected above (not working) and one showing the workload pod deployed to a different node than the router pod (working). As a workaround, if workloads are deployed to a separate node from the router pods, the NetworkPolicy works as expected on both the service IP/port and the endpoint URL.
Created attachment 1434423 [details]
recording of nonworking policy
I missed a step in the process above.
After step 5, Create new project
5a. Verify that the deployed workload pod is on the same node as the router pod
This step is necessary to recreate the issue fully. Apologies for missing this step.
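One way to verify step 5a is to compare the NODE column for the two pods (commands are illustrative; in 3.x the router typically runs in the default namespace):

```shell
# Node hosting the router pod
oc get pods -n default -o wide | grep router
# Node hosting the workload pod; the NODE columns should match to reproduce
oc get pods -n <new project> -o wide
```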
I saw the same problem when I tested; it is easy to reproduce.
Correct me if I am wrong.
It looks like you are accessing the endpoint via the app URL in the last step?
The network policy will not apply this kind of restriction.
It is used to manage connections inside the cluster, but in your case you access the pod via a route, which means the connection goes out of the cluster to the external router and then comes back into the cluster again.
We should treat this kind of connection like normal external access, so it is working as expected.
That is correct.
The documentation certainly does not make that clear. It states that a 'deny-all' policy restricts ALL traffic.
"To make a project "deny by default" add a NetworkPolicy object that matches all pods but accepts no traffic.
There is certainly the use case where customers want deployed pods to be accessible by other pods but restricted from the browser (i.e., restricted from the router pod). If the proper way to handle this is by removing the route to the service and pod, then this should probably be treated as a documentation bug/RFE and be stated more clearly in the networking section of the documentation.
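If route/router traffic is treated as external access, the use case above (pod-to-pod allowed, external denied) could be sketched with the standard allow-same-namespace pattern; the name is illustrative:

```yaml
kind: NetworkPolicy
apiVersion: networking.k8s.io/v1
metadata:
  name: allow-same-namespace
spec:
  podSelector: {}
  ingress:
  - from:
    # An empty podSelector under "from" matches all pods in this namespace;
    # traffic entering via the router (external) is not matched and is denied
    - podSelector: {}
```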
However, there is still an inconsistency. If you are telling me that NetworkPolicy is not intended to work on the endpoint URL, then that should hold regardless of where the pod is deployed relative to the router pod. In one case, where the target pod and router pod are on separate nodes, access to the endpoint is restricted. In the other case, where the target pod and router pod are on the same node, access to the endpoint is allowed.
While customers can control where pods are deployed using NodeSelectors, they may not care where pods are deployed. We still need to deal with the inconsistency of endpoint access based on where workload pods are deployed relative to the router pod.
Or am I missing something?
I tried on my test env today; I think some of the points in my previous comment are wrong.
As you mentioned, the router-to-endpoint path should also be restricted by the network policy. And that is true.
In the design of the network policy, connections from a node to pods local to that node are not managed by NetworkPolicy.
So if the router node (infra node) is also running as a compute node, then only the pods on that node will be accessible by the router on the same node.
But if we configure the infra nodes not to run user pods, the problem will not be hit.
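For reference, the workaround of keeping user pods off the infra nodes can be enforced cluster-wide with a default node selector in master-config.yaml (3.x syntax; the exact label is an assumption and depends on how the nodes in the cluster are labeled):

```yaml
projectConfig:
  # New projects schedule pods only onto nodes carrying this label,
  # keeping user workloads off the infra/router nodes
  defaultNodeSelector: "node-role.kubernetes.io/compute=true"
```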
@danw should confirm when he returns from PTO:
1) We expect that NetworkPolicy applies only from SDN pod <-> SDN pod
2) We expect that NetworkPolicy does not apply to traffic from the host itself (e.g. coming into the SDN with the .1 address of tun0). This includes hostnetwork pods, as they are in the host's network namespace and thus enter the SDN via tun0.
3) Due to what appears to be an omission/bug (though @danw should confirm), traffic from one node that enters the SDN but is destined for a pod on a remote node will be dropped on that remote node because it has VNID=0 (like all tun0 traffic) but OpenFlow table 80 only allows VNID-mismatched traffic from the local node (for health checking/nodeport/etc).
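For anyone debugging this, the table 80 rules described in point 3 can be dumped on a node (assuming the SDN bridge is named br0, as in the 3.x openshift-sdn plugin):

```shell
# Dump the OpenFlow rules in table 80 (NetworkPolicy/VNID enforcement)
ovs-ofctl -O OpenFlow13 dump-flows br0 table=80
```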
(In reply to Dan Williams from comment #7)
> 1) We expect that NetworkPolicy applies only from SDN pod <-> SDN pod
In theory, this is false: NetworkPolicy applies to all traffic going into a pod. Although, eg, I don't think any implementation applies NetworkPolicy to HostPort traffic, etc.
> 2) We expect that NetworkPolicy does not apply to traffic from the host
> itself (eg coming into the SDN with .1 address of tun0).
Correct. This was explicitly required by the original spec proposal, although I'm not sure it's mentioned anywhere in the current documentation...
> This includes
> hostnetwork pods as they are in the hosts network namespace and thus enter
> the SDN via tun0.
Technically, yes, although I'm not sure anyone ever really considered this case. The "traffic from the node itself" exception was really only to support health checks.
At any rate, even if routers are hostNetwork pods, and hostNetwork pods should bypass NetworkPolicy, I don't think routes should bypass NetworkPolicy. (Which is not really a contradiction, since the fact that routes are implemented by hostNetwork pods is just an implementation detail.)
So I agree that there's a bug here.
> 3) Due to what appears to be an omission/bug (though @danw should confirm),
> traffic from one node that enters the SDN but is destined for a pod on a
> remote node will be dropped on that remote node because it has VNID=0 (like
> all tun0 traffic) but OpenFlow table 80 only allows VNID-mismatched traffic
> from the local node (for health checking/nodeport/etc).
That's not an omission or bug, that's the defined behavior.
This is not a regression, and you can work around it by separating the infrastructure and the application pods. Pushing to 3.11 since we don't know how to separate the health check traffic from the router traffic.
I am marking this fixed, mostly because the router pod is no longer in HostNetwork starting with 4.1. While the "problem" isn't fixed, we no longer have significant host-network components besides the apiserver.
In fact, I would veto any changes to 3.11 that would fix it. It's too big a change to introduce in a z-stream release.