Today, OVN allows all traffic that is not explicitly forbidden by an ACL. OpenStack works the other way around, blocking all traffic that is not explicitly allowed. To get this behavior from OVN, OpenStack artificially creates a Port Group in the NB database to which all ports are added. In large deployments this port group may contain several thousand ports and it gets updated very frequently, causing a lot of transactions in the NB and, hence, the SB database, as well as load on northd, expansion of the associated Address Sets, insertion/deletion of OpenFlow rules, and so on. It would be beneficial to add a configuration setting to OVN that defines the default allow/deny behavior in the ACL stage, so that all layers are spared this burden by simply installing a low-priority flow that drops traffic.
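For illustration, a minimal sketch of how the new knob is exercised in the reproducers later in this bug (the option lives in NB_Global.options and is named default_acl_drop): once it is set to true, only explicitly allowed traffic passes and everything else hits the low-priority drop flow.

# Enable default-deny behavior in the ACL stages.
ovn-nbctl set NB_Global . options:default_acl_drop=true
# Punch a hole for the traffic that should still be allowed, e.g. IPv4 TCP on switch "ls".
ovn-nbctl acl-add ls from-lport 1 "ip4 && tcp" allow-related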
RFC with the initial implementation posted upstream for discussion: https://patchwork.ozlabs.org/project/ovn/list/?series=293812&state=* This is an RFC because we need to figure out the best default behavior for some of the edge cases.
As the feature is still in development, there is no accurate reproducer for the moment; I will try to automate one after the feature is ready. Clearing the needinfo for now.
Setting coverage- until the patches are ready.
V1 posted for review: https://patchwork.ozlabs.org/project/ovn/list/?series=295774&state=*
V1 was accepted but introduced a bug; this follow-up is needed too: https://patchwork.ozlabs.org/project/ovn/list/?series=296961&state=*
It works well for ACLs with combinations of L3 and L4 protocols. I tested it with the following ACLs after setting default_acl_drop to true:

ovn-nbctl acl-add ls from-lport 1 "ip4 && tcp" allow
ovn-nbctl acl-add ls from-lport 1 "ip4 && tcp" allow-related
ovn-nbctl --apply-after-lb acl-add ls from-lport 1 "ip4 && tcp" allow
ovn-nbctl --apply-after-lb acl-add ls from-lport 1 "ip4 && tcp" allow-related

All of those worked as expected. However, it is not working with these ACLs:

ovn-nbctl acl-add ls from-lport 1 "ip4" allow
ovn-nbctl acl-add ls from-lport 1 "ip6" allow
ovn-nbctl acl-add ls from-lport 1 "icmp4" allow
ovn-nbctl acl-add ls from-lport 1 "icmp6" allow

With these ACLs, all the traffic is being dropped, whether IPv4 or IPv6. Am I missing something? Here is the reproducer:

systemctl start ovn-northd
ovn-nbctl set-connection ptcp:6641
ovn-sbctl set-connection ptcp:6642
systemctl start openvswitch
ovs-vsctl set open . external_ids:system-id=hv1
# IP address configuration on the physical interface
ifconfig ens1f0 42.42.42.1 netmask 255.0.0.0
ovs-vsctl set open . external_ids:ovn-remote=tcp:42.42.42.1:6642
ovs-vsctl set open . external_ids:ovn-encap-type=geneve
ovs-vsctl set open . external_ids:ovn-encap-ip=42.42.42.1
systemctl restart ovn-controller

ovn-nbctl ls-add ls
ovn-nbctl lsp-add ls lsp1 -- lsp-set-addresses lsp1 00:00:00:00:00:01
ovn-nbctl lsp-add ls lsp2 -- lsp-set-addresses lsp2 00:00:00:00:00:02

ip netns add vm1
ovs-vsctl add-port br-int vm1 -- set interface vm1 type=internal
ip link set vm1 netns vm1
ip netns exec vm1 ip link set vm1 address 00:00:00:00:00:01
ip netns exec vm1 ip addr add 42.42.42.2/24 dev vm1
ip netns exec vm1 ip -6 addr add 2001::2/64 dev vm1
ip netns exec vm1 ip link set vm1 up
ip netns exec vm1 ip link set lo up
ovs-vsctl set Interface vm1 external_ids:iface-id=lsp1

ip netns add vm2
ovs-vsctl add-port br-int vm2 -- set interface vm2 type=internal
ip link set vm2 netns vm2
ip netns exec vm2 ip link set vm2 address 00:00:00:00:00:02
ip netns exec vm2 ip addr add 42.42.42.3/24 dev vm2
ip netns exec vm2 ip -6 addr add 2001::3/64 dev vm2
ip netns exec vm2 ip link set vm2 up
ip netns exec vm2 ip link set lo up
ovs-vsctl set Interface vm2 external_ids:iface-id=lsp2

ovn-nbctl --wait=hv set NB_Global . options:default_acl_drop=true
ovn-nbctl acl-add ls from-lport 1 "ip4" allow
ip netns exec vm2 ping 42.42.42.2 -c 3    ## ping failed
ip netns exec vm2 ping6 2001::2 -c 3      ## ping failed

ovn-nbctl acl-del ls
ovn-nbctl acl-add ls from-lport 1 "icmp4" allow
ip netns exec vm2 ping 42.42.42.2 -c 3    ## ping failed
ip netns exec vm2 ping6 2001::2 -c 3      ## ping failed
It's a bit confusing but there are actually 3 ACL stages in the switch pipeline:
1. ingress pipeline, before load balancer (all ACLs added with "from-lport").
2. ingress pipeline, after load balancer (all ACLs added with "--apply-after-lb from-lport").
3. egress pipeline (all ACLs added with "to-lport").

If NB_Global.options:default_acl_drop is set to "true", we need to punch holes in all these stages. In your case there are two issues:
- IP traffic gets dropped in stages "2" and "3" above.
- ARP/ND traffic gets dropped in all stages.

I added the following ACLs on top of your config and traffic flows fine then:

ovn-nbctl --apply-after-lb acl-add ls from-lport 1 1 allow   # Allow everything in the ingress pipeline after LB.
ovn-nbctl acl-add ls to-lport 1 1 allow                      # Allow everything in the egress pipeline.
ovn-nbctl acl-add ls from-lport 2 "arp || nd" allow          # Allow all ARP/ND packets in the ingress pipeline.

Regards,
Dumitru
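As a side note, ovn-trace can help pinpoint which of the three stages drops a given packet. A sketch for the reproducer above (values taken from the vm2 -> vm1 ping; the exact trace output depends on the OVN version):

ovn-trace ls 'inport == "lsp2" && eth.src == 00:00:00:00:00:02 && eth.dst == 00:00:00:00:00:01 && ip4.src == 42.42.42.3 && ip4.dst == 42.42.42.2 && ip.ttl == 64 && icmp4'
# In the output, check the ls_in_acl, ls_in_acl_after_lb and ls_out_acl tables:
# the stage where the trace ends in a drop is the one that still needs a punch hole.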
Thanks a lot Dumitru for the explanation. Until that point, the test scenario was working. However, for the "allow-related" action, it is again showing the same problem. Consider the same simple network of one LS and 2 VMs:

ovn-nbctl acl-del ls
ovn-nbctl --wait=sb acl-add ls from-lport 100 "ip" allow-related
ovn-nbctl --apply-after-lb acl-add ls from-lport 100 1 allow
ovn-nbctl acl-add ls to-lport 100 1 allow
ovn-nbctl acl-add ls from-lport 200 "arp || nd" allow
ovn-nbctl --wait=hv sync
ip netns exec vm2 ping 42.42.42.2 -c 3    # ping failed
ip netns exec vm2 ping6 2001::2 -c 3      # ping failed

I also tried this way:

ovn-nbctl acl-del ls
ovn-nbctl --wait=sb acl-add ls from-lport 100 "ip" allow-related
ovn-nbctl --apply-after-lb acl-add ls from-lport 100 1 allow-related
ovn-nbctl acl-add ls to-lport 100 1 allow-related
ovn-nbctl acl-add ls from-lport 200 "arp || nd" allow-related
ovn-nbctl --wait=hv sync
ip netns exec vm2 ping 42.42.42.2 -c 3    # ping failed
ip netns exec vm2 ping6 2001::2 -c 3      # ping failed

I also tried sending TCP packets, but no luck:

echo "abcdefg" >> send.pkt
ip netns exec vm1 ncat -l 2345 > tcp1.pkt &
ip netns exec vm2 ncat 42.42.42.2 2345 < send.pkt

I think these two flows are blocking the traffic:

table=8 (ls_in_acl          ), priority=1    , match=(ip && !ct.est), action=(drop;)
table=4 (ls_out_acl         ), priority=1    , match=(ip && !ct.est), action=(drop;)

The complete dump-flows output is below:

ovn-sbctl dump-flows | grep -E "ls_.*_acl"
table=4 (ls_in_pre_acl      ), priority=110  , match=(eth.dst == $svc_monitor_mac), action=(next;)
table=4 (ls_in_pre_acl      ), priority=110  , match=(eth.mcast), action=(next;)
table=4 (ls_in_pre_acl      ), priority=110  , match=(nd || nd_rs || nd_ra || mldv1 || mldv2 || (udp && udp.src == 546 && udp.dst == 547)), action=(next;)
table=4 (ls_in_pre_acl      ), priority=100  , match=(ip), action=(reg0[0] = 1; next;)
table=4 (ls_in_pre_acl      ), priority=0    , match=(1), action=(next;)
table=7 (ls_in_acl_hint     ), priority=7    , match=(ct.new && !ct.est), action=(reg0[7] = 1; reg0[9] = 1; next;)
table=7 (ls_in_acl_hint     ), priority=6    , match=(!ct.new && ct.est && !ct.rpl && ct_mark.blocked == 1), action=(reg0[7] = 1; reg0[9] = 1; next;)
table=7 (ls_in_acl_hint     ), priority=5    , match=(!ct.trk), action=(reg0[8] = 1; reg0[9] = 1; next;)
table=7 (ls_in_acl_hint     ), priority=4    , match=(!ct.new && ct.est && !ct.rpl && ct_mark.blocked == 0), action=(reg0[8] = 1; reg0[10] = 1; next;)
table=7 (ls_in_acl_hint     ), priority=3    , match=(!ct.est), action=(reg0[9] = 1; next;)
table=7 (ls_in_acl_hint     ), priority=2    , match=(ct.est && ct_mark.blocked == 1), action=(reg0[9] = 1; next;)
table=7 (ls_in_acl_hint     ), priority=1    , match=(ct.est && ct_mark.blocked == 0), action=(reg0[10] = 1; next;)
table=7 (ls_in_acl_hint     ), priority=0    , match=(1), action=(next;)
table=8 (ls_in_acl          ), priority=65532, match=(!ct.est && ct.rel && !ct.new && !ct.inv && ct_mark.blocked == 0), action=(next;)
table=8 (ls_in_acl          ), priority=65532, match=(ct.est && !ct.rel && !ct.new && !ct.inv && ct.rpl && ct_mark.blocked == 0), action=(reg0[9] = 0; reg0[10] = 0; next;)
table=8 (ls_in_acl          ), priority=65532, match=(ct.inv || (ct.est && ct.rpl && ct_mark.blocked == 1)), action=(drop;)
table=8 (ls_in_acl          ), priority=65532, match=(nd || nd_ra || nd_rs || mldv1 || mldv2), action=(next;)
table=8 (ls_in_acl          ), priority=34000, match=(eth.dst == $svc_monitor_mac), action=(next;)
table=8 (ls_in_acl          ), priority=1200 , match=(reg0[7] == 1 && (arp || nd)), action=(reg0[1] = 1; next;)
table=8 (ls_in_acl          ), priority=1200 , match=(reg0[8] == 1 && (arp || nd)), action=(next;)
table=8 (ls_in_acl          ), priority=1100 , match=(reg0[7] == 1 && (ip)), action=(reg0[1] = 1; next;)
table=8 (ls_in_acl          ), priority=1100 , match=(reg0[8] == 1 && (ip)), action=(next;)
table=8 (ls_in_acl          ), priority=1    , match=(ip && !ct.est), action=(drop;)
table=8 (ls_in_acl          ), priority=1    , match=(ip && ct.est && ct_mark.blocked == 1), action=(reg0[1] = 1; next;)
table=8 (ls_in_acl          ), priority=0    , match=(1), action=(drop;)
table=12(ls_in_acl_after_lb ), priority=1100 , match=(reg0[7] == 1 && (1)), action=(reg0[1] = 1; next;)
table=12(ls_in_acl_after_lb ), priority=1100 , match=(reg0[8] == 1 && (1)), action=(next;)
table=12(ls_in_acl_after_lb ), priority=0    , match=(1), action=(drop;)
table=1 (ls_out_pre_acl     ), priority=110  , match=(eth.mcast), action=(next;)
table=1 (ls_out_pre_acl     ), priority=110  , match=(eth.src == $svc_monitor_mac), action=(next;)
table=1 (ls_out_pre_acl     ), priority=110  , match=(nd || nd_rs || nd_ra || mldv1 || mldv2 || (udp && udp.src == 546 && udp.dst == 547)), action=(next;)
table=1 (ls_out_pre_acl     ), priority=100  , match=(ip), action=(reg0[0] = 1; next;)
table=1 (ls_out_pre_acl     ), priority=0    , match=(1), action=(next;)
table=3 (ls_out_acl_hint    ), priority=7    , match=(ct.new && !ct.est), action=(reg0[7] = 1; reg0[9] = 1; next;)
table=3 (ls_out_acl_hint    ), priority=6    , match=(!ct.new && ct.est && !ct.rpl && ct_mark.blocked == 1), action=(reg0[7] = 1; reg0[9] = 1; next;)
table=3 (ls_out_acl_hint    ), priority=5    , match=(!ct.trk), action=(reg0[8] = 1; reg0[9] = 1; next;)
table=3 (ls_out_acl_hint    ), priority=4    , match=(!ct.new && ct.est && !ct.rpl && ct_mark.blocked == 0), action=(reg0[8] = 1; reg0[10] = 1; next;)
table=3 (ls_out_acl_hint    ), priority=3    , match=(!ct.est), action=(reg0[9] = 1; next;)
table=3 (ls_out_acl_hint    ), priority=2    , match=(ct.est && ct_mark.blocked == 1), action=(reg0[9] = 1; next;)
table=3 (ls_out_acl_hint    ), priority=1    , match=(ct.est && ct_mark.blocked == 0), action=(reg0[10] = 1; next;)
table=3 (ls_out_acl_hint    ), priority=0    , match=(1), action=(next;)
table=4 (ls_out_acl         ), priority=65532, match=(!ct.est && ct.rel && !ct.new && !ct.inv && ct_mark.blocked == 0), action=(next;)
table=4 (ls_out_acl         ), priority=65532, match=(ct.est && !ct.rel && !ct.new && !ct.inv && ct.rpl && ct_mark.blocked == 0), action=(next;)
table=4 (ls_out_acl         ), priority=65532, match=(ct.inv || (ct.est && ct.rpl && ct_mark.blocked == 1)), action=(drop;)
table=4 (ls_out_acl         ), priority=65532, match=(nd || nd_ra || nd_rs || mldv1 || mldv2), action=(next;)
table=4 (ls_out_acl         ), priority=34000, match=(eth.src == $svc_monitor_mac), action=(next;)
table=4 (ls_out_acl         ), priority=1100 , match=(reg0[7] == 1 && (1)), action=(reg0[1] = 1; next;)
table=4 (ls_out_acl         ), priority=1100 , match=(reg0[8] == 1 && (1)), action=(next;)
table=4 (ls_out_acl         ), priority=1    , match=(ip && !ct.est), action=(drop;)
table=4 (ls_out_acl         ), priority=1    , match=(ip && ct.est && ct_mark.blocked == 1), action=(reg0[1] = 1; next;)
table=4 (ls_out_acl         ), priority=0    , match=(1), action=(drop;)
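As a side note, those two priority=1 flows can be correlated with what is actually installed on the hypervisor to confirm they are the ones dropping packets. A sketch (the cookie value is hypothetical; the OpenFlow cookie corresponds to the leading 32 bits of the logical flow UUID printed with --uuid):

ovn-sbctl --uuid dump-flows ls | grep -E 'ls_(in|out)_acl.*priority=1 '
# Suppose the UUID of a drop flow starts with 12345678-...; then on the chassis:
ovs-ofctl dump-flows br-int | grep 'cookie=0x12345678'    # check n_packets on the drop rule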
(In reply to Ehsan Elahi from comment #11)
> Thanks a lot Dumitru for the explanation. Until that point, the test
> scenario was working. However, for "allow-related" action, it is again
> showing the same problem. Consider the same simple network of one LS and 2
> VMs:
>
> ovn-nbctl acl-del ls
> ovn-nbctl --wait=sb acl-add ls from-lport 100 "ip" allow-related
> ovn-nbctl --apply-after-lb acl-add ls from-lport 100 1 allow
> ovn-nbctl acl-add ls to-lport 100 1 allow
> ovn-nbctl acl-add ls from-lport 200 "arp || nd" allow
> ovn-nbctl --wait=hv sync
> ip netns exec vm2 ping 42.42.42.2 -c 3 # ping failed
> ip netns exec vm2 ping6 2001::2 -c 3 # ping failed
>

It seems there's a bug with apply-after-lb ACLs. I'm working on a potential fix. I'll post it upstream as soon as possible.
I posted a patch that should fix the apply-after-lb related issue: https://patchwork.ozlabs.org/project/ovn/list/?series=336872&state=* I'll update the BZ once the fix is accepted and backported.
V2 posted for review: https://patchwork.ozlabs.org/project/ovn/list/?series=337271&state=*
V2 merged and backported; the next downstream builds (22.06, 22.09, 22.12) will also include this fix.
ovn22.12 fast-datapath-rhel-9 clone created at https://bugzilla.redhat.com/show_bug.cgi?id=2163616
ovn22.09 fast-datapath-rhel-8 clone created at https://bugzilla.redhat.com/show_bug.cgi?id=2163617
ovn22.09 fast-datapath-rhel-9 clone created at https://bugzilla.redhat.com/show_bug.cgi?id=2163618
ovn22.06 fast-datapath-rhel-8 clone created at https://bugzilla.redhat.com/show_bug.cgi?id=2163619
ovn22.06 fast-datapath-rhel-9 clone created at https://bugzilla.redhat.com/show_bug.cgi?id=2163620
Hi Dumitru,

I created a topology as below:

#        -------------------------- LR --------------------------
#        | 42.42.42.0           | 77.77.77.0           | 66.66.66.0
#   ------LS1------        ------LS2------        ------LS3------
#   |             |        |             |        |             |
#  VM11          VM12     VM21          VM22     VM31          VM32
#
# vm11, vm21, vm31 are on HV1
# vm12, vm22, vm32 are on HV0

The complete reproducer can be seen at http://pastebin.test.redhat.com/1091694

I noted that if NB_Global . options:default_acl_drop=true and there is one ACL configured on one LS (say ls1), then traffic is affected only on that logical switch and all traffic on the other switches keeps flowing normally, until you configure an ACL on the other switches. This seems logical. The problem arises when I create a port group with only one logical port (vm12) as its member. Then, even after adding the punch-hole ACLs for the traffic, all the traffic is blocked on the whole logical switch (ls1):

ovn-nbctl pg-add pg vm12
ovn-nbctl acl-add pg from-lport 1001 "inport==@pg && ip4" allow
ip netns exec vm11 ping 42.42.42.12 -c 3   # ping failed as expected
ip netns exec vm11 ping6 2001::12 -c 3     # ping failed as expected
ip netns exec vm11 ping 77.77.77.22 -c 3   # ping failed, should pass as the src and dst ports are not part of the port group pg
ip netns exec vm11 ping6 2002::22 -c 3     # ping failed, should pass as the src and dst ports are not part of the port group pg

ovn-nbctl --apply-after-lb acl-add ls1 from-lport 1 1 allow   # Allow everything in the ingress pipeline after LB.
ovn-nbctl acl-add ls1 to-lport 1 1 allow                      # Allow everything in the egress pipeline.
ovn-nbctl acl-add ls1 from-lport 2 "arp || nd" allow          # Allow all ARP/ND packets in the ingress pipeline.
ip netns exec vm11 ping 42.42.42.12 -c 3   # ping failed
ip netns exec vm11 ping6 2001::12 -c 3     # ping failed
ip netns exec vm11 ping 77.77.77.22 -c 3   # ping failed
ip netns exec vm11 ping6 2002::22 -c 3     # ping failed

ovn-nbctl acl-del ls1
ovn-nbctl --apply-after-lb acl-add pg from-lport 1 1 allow
ovn-nbctl acl-add pg to-lport 1 1 allow
ovn-nbctl acl-add pg from-lport 2 "arp || nd" allow
ip netns exec vm11 ping 42.42.42.12 -c 3   # ping failed
ip netns exec vm11 ping6 2001::12 -c 3     # ping failed
ip netns exec vm11 ping 77.77.77.22 -c 3   # ping failed
ip netns exec vm11 ping6 2002::22 -c 3     # ping failed

Am I missing something?
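One additional check that might help narrow this down: compare which ACLs end up attached to the switch itself versus the port group, and what logical flows northd generates for ls1 (ovn-nbctl acl-list accepts either a switch or a port group name):

ovn-nbctl acl-list ls1                            # ACLs applied directly to the logical switch
ovn-nbctl acl-list pg                             # ACLs applied via the port group
ovn-sbctl dump-flows ls1 | grep -E "ls_.*_acl"    # resulting ACL-stage logical flows for ls1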
The bz can be reproduced on any ovn release without setting default_acl_drop to true. The reproducer can be found in comment 9 and comment 11.

Verified on:
ovn22.12-central-22.12.0-20.el8fdp.x86_64
openvswitch-selinux-extra-policy-1.0-29.el8fdp.noarch
openvswitch2.17-2.17.0-74.el8fdp.x86_64
ovn22.12-22.12.0-20.el8fdp.x86_64
ovn22.12-host-22.12.0-20.el8fdp.x86_64
This product has been discontinued or is no longer tracked in Red Hat Bugzilla.