Description of problem:
OVN-Kubernetes EgressFirewall blocks the API server. Every egress firewall must allow essential accesses like the API endpoints.

Version-Release number of selected component (if applicable):
4.9.0-0.nightly-2021-08-14-065522

How reproducible:
Always

Steps to Reproduce:
1. Create a namespace "test" and a pod in it. Before creating the egress firewall, the API server can be accessed:

$ oc rsh -n test hello-pod
/ # curl -k https://172.30.0.1 -I
HTTP/2 403
audit-id: 3ad34d91-7053-49f6-b078-23070401be10
cache-control: no-cache, private
content-type: application/json
x-content-type-options: nosniff
x-kubernetes-pf-flowschema-uid: 193f6e51-617f-49a8-b0da-ead5794486bf
x-kubernetes-pf-prioritylevel-uid: 2b134631-418d-424b-991f-e9756abdb9ef
content-length: 234
date: Tue, 03 Aug 2021 06:16:16 GMT
/ #

2. Create an egress firewall with a rule denying 0.0.0.0/0:

$ oc get egressfirewall -n test -o yaml
.....
spec:
  egress:
  - to:
      cidrSelector: 0.0.0.0/0
    type: Deny
status:
  status: EgressFirewall Rules applied
..........

$ oc rsh -n test hello-pod
/ # curl -k https://172.30.0.1 -I --connect-timeout 5
curl: (28) Connection timed out after 5001 milliseconds

Actual results:
The egress firewall blocks access to the API server.

Expected results:
The egress firewall should not block access to the API server.

Additional info:
Workaround: add the API service endpoints to the allow rules.

$ oc get ep -n default
NAME         ENDPOINTS                                           AGE
kubernetes   10.0.50.67:6443,10.0.53.46:6443,10.0.77.215:6443    4h21m

Add the endpoints' IP subnet to an allow rule:

........
spec:
  egress:
  - to:
      cidrSelector: 10.0.0.0/16
    type: Allow
  - to:
      cidrSelector: 0.0.0.0/0
    type: Deny
status:
  status: EgressFirewall Rules applied
........

3. The API service can now be accessed:

$ oc rsh -n test hello-pod
/ # curl -k https://172.30.0.1 -I --connect-timeout 5
HTTP/2 403
audit-id: a5db328c-6762-4d01-95d8-7e65f4f011a9
cache-control: no-cache, private
content-type: application/json
x-content-type-options: nosniff
x-kubernetes-pf-flowschema-uid: 193f6e51-617f-49a8-b0da-ead5794486bf
x-kubernetes-pf-prioritylevel-uid: 2b134631-418d-424b-991f-e9756abdb9ef
content-length: 234
date: Tue, 03 Aug 2021 06:51:01 GMT
Hello Team,

Any update on this, please? It has been more than a month since the customer reported this bug, and there is still no visible progress. Thanks for providing your inputs on this.

Regards,
IMMANUVEL
Hello Team,

Can anyone please update me on the status of this Bugzilla? The customer is asking for an update. Is there an ETA for the fix?

Regards,
Mridul Markandey
The openshift-sdn implementation of egress firewall does not implicitly allow access to node IPs. OVN-Kube will have functional parity with SDN features. Access to any external IP (including node IPs) must be enabled explicitly in the egress firewall rules.
If the reason for closing this is feature parity, please note that the openshift-sdn egress firewall IS NOT applied to service cluster IPs. A sample ofproto trace for openshift-sdn while trying to access 172.30.0.1 when an egressnetworkpolicy blocks it:

$ ovs-appctl ofproto/trace br0 ip,nw_dst=172.30.0.1,nw_src=10.129.2.5,in_port=6
Flow: ip,in_port=6,vlan_tci=0x0000,dl_src=00:00:00:00:00:00,dl_dst=00:00:00:00:00:00,nw_src=10.129.2.5,nw_dst=172.30.0.1,nw_proto=0,nw_tos=0,nw_ecn=0,nw_ttl=0

bridge("br0")
-------------
 0. ct_state=-trk,ip, priority 1000
    ct(table=0)
    drop
     -> A clone of the packet is forked to recirculate. The forked pipeline will be resumed at table 0.
     -> Sets the packet to an untracked state, and clears all the conntrack fields.

Final flow: unchanged
Megaflow: recirc_id=0,ct_state=-trk,eth,ip,in_port=6,nw_frag=no
Datapath actions: ct,recirc(0x96b99)

===============================================================================
recirc(0x96b99) - resume conntrack with default ct_state=trk|new (use --ct-next to customize)
===============================================================================
Flow: recirc_id=0x96b99,ct_state=new|trk,eth,ip,in_port=6,vlan_tci=0x0000,dl_src=00:00:00:00:00:00,dl_dst=00:00:00:00:00:00,nw_src=10.129.2.5,nw_dst=172.30.0.1,nw_proto=0,nw_tos=0,nw_ecn=0,nw_ttl=0

bridge("br0")
-------------
    thaw
        Resuming from table 0
 0. ip, priority 100
    goto_table:20
20. ip,in_port=6,nw_src=10.129.2.5, priority 100
    load:0x29f9e9->NXM_NX_REG0[]
    goto_table:21
21. priority 0
    goto_table:30
30. ip,nw_dst=172.30.0.0/16, priority 100
    goto_table:60
60. priority 200
    output:2

Final flow: recirc_id=0x96b99,ct_state=new|trk,eth,ip,reg0=0x29f9e9,in_port=6,vlan_tci=0x0000,dl_src=00:00:00:00:00:00,dl_dst=00:00:00:00:00:00,nw_src=10.129.2.5,nw_dst=172.30.0.1,nw_proto=0,nw_tos=0,nw_ecn=0,nw_ttl=0
Megaflow: recirc_id=0x96b99,ct_state=-rpl+trk,eth,ip,in_port=6,nw_src=10.129.2.5,nw_dst=172.30.0.0/16,nw_frag=no

This is because the service IP path in the OVS tables does not traverse table 101, where the egress network policy is applied:

$ ovs-ofctl -O OpenFlow13 dump-flows br0 table=101
 cookie=0x0, duration=3.060s, table=101, n_packets=0, n_bytes=0, priority=2,ip,reg0=0x29f9e9,nw_dst=1.2.3.0/24 actions=output:tun0
 cookie=0x0, duration=3.060s, table=101, n_packets=0, n_bytes=0, priority=1,ip,reg0=0x29f9e9 actions=drop
 cookie=0x0, duration=1289745.457s, table=101, n_packets=78758, n_bytes=63245038, priority=0 actions=output:tun0

So if the reason to reject this is feature parity, service IPs must not be blocked (either always or optionally). Otherwise, feature parity cannot be used as an argument to reject this.

Thanks and regards.
Or maybe that's a bug to solve on the openshift-sdn side, though
Thanks Pablo for clarifying. After talking with Dan Winship, this behavior is expected on openshift-sdn, so we need to make OVN behave the same. I'll re-open the upstream PR and suggest that we fix pod -> service traffic backed by node IPs, not direct pod access to arbitrary node IPs.
Hello Team,

Is there any update? The case was reopened quite a while back.

BR,
ESWAR
I've been looking at potential solutions for this. Some background on egress firewall:

1. In shared gateway mode, egress firewall is implemented as ACLs applied to the join switch.
2. In local gateway mode, egress firewall is implemented as ACLs applied to the worker switch.

A typical OVNK topology looks like this:

pod-----worker switch----ovn_cluster_router---join switch---gateway router---<external network>

With OVNK, service DNAT happens on the worker switch. Bypassing egress firewall for packets destined for a service is possible in local gateway mode, because we can either evaluate the ACLs before the packet is load balanced, or evaluate the CT state on the packet in the worker switch post-DNAT (if ct.dnat == true, then bypass egress firewall). However, in shared gateway mode, by the time the packet hits the join switch we have no idea whether it has been DNAT'ed: the CT states are cleared. Additionally, there are other egress firewall bugs for which we need to read CT state to know what to do with a packet.

I think the best path forward here is to consolidate egress firewall in both gateway modes onto the worker switch. This will take more effort than a simple bug fix, so it is going to take more time. As a workaround for now, I would suggest poking holes in the egress firewall for the specific k8s node backends that need connectivity.

I'll provide an ETA or status update once I confirm with the rest of the OVNK community that the above approach is acceptable.
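To make the suggested workaround concrete, here is a sketch of an EgressFirewall that pokes /32 holes for the node backends before the deny-all rule. The node IPs are taken from the `oc get ep -n default` output earlier in this bug and are placeholders; substitute your own cluster's node IPs.

```yaml
# Workaround sketch (illustrative): allow the specific node IPs that back the
# kubernetes API service, then deny all other egress traffic.
apiVersion: k8s.ovn.org/v1
kind: EgressFirewall
metadata:
  name: default
  namespace: test
spec:
  egress:
  - to:
      cidrSelector: 10.0.50.67/32   # example node IP backing the API server
    type: Allow
  - to:
      cidrSelector: 10.0.53.46/32   # example node IP
    type: Allow
  - to:
      cidrSelector: 10.0.77.215/32  # example node IP
    type: Allow
  - to:
      cidrSelector: 0.0.0.0/0
    type: Deny
```

Rule order matters: the Allow rules must precede the catch-all Deny, and the list has to be maintained by hand as nodes are added or removed, which is the limitation discussed below.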
After some further discussion with the OVNK upstream community, as well as internal discussions about SDN behavior, we have decided that the current OVN behavior is in fact the originally desired function of egress firewall. The fact that OpenShift SDN allows services to bypass the egress firewall and reach endpoints that should be blocked is a side effect of the SDN implementation, not the desired behavior.

The current method to allow access to host IPs is to manually create allow rules in egress firewalls for those specific hosts, or a generic CIDR rule that encompasses the host networks. However, we understand that configuring this manually isn't ideal, as nodes can come and go, and allowing an entire CIDR may not be acceptable when a user does not want pods to reach other hosts on the external network. Therefore, we will add a node selector to the egress firewall destination spec in OVN. That way a user can specify a label applied to one or more nodes, and the Kubernetes node IPs will be included in that rule. The existing OpenShift SDN behavior will not change.
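As a sketch of how the proposed node selector could be used (field names follow the upstream proposal; treat them as illustrative until the feature ships), a rule selecting nodes by label replaces the hand-maintained list of node IPs:

```yaml
# Illustrative sketch of the proposed nodeSelector destination: traffic to the
# IPs of nodes matching the label is allowed; all other egress is denied.
apiVersion: k8s.ovn.org/v1
kind: EgressFirewall
metadata:
  name: default
  namespace: test
spec:
  egress:
  - to:
      nodeSelector:
        matchLabels:
          node-role.kubernetes.io/master: ""
    type: Allow
  - to:
      cidrSelector: 0.0.0.0/0
    type: Deny
```

With this, OVN-Kubernetes would resolve the selector to the current set of matching node IPs and keep the rule up to date as nodes join or leave, avoiding both stale /32 rules and overly broad CIDR allows.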
I pushed a patch adding this behavior upstream: https://github.com/ovn-org/ovn-kubernetes/pull/3002

We will target 4.12 for adding this into OCP and track it in JIRA: https://issues.redhat.com/browse/SDN-3098

For now, as previously mentioned, the workaround is to manually add rules allowing access to the required node IPs.