Bug 1947807 - [OVN][RFE] Change OVN to drop traffic by default based on a configuration setting
Keywords:
Status: CLOSED UPSTREAM
Alias: None
Product: Red Hat Enterprise Linux Fast Datapath
Classification: Red Hat
Component: ovn22.12
Version: FDP 21.I
Hardware: Unspecified
OS: Unspecified
Priority: high
Severity: medium
Target Milestone: ---
Assignee: Dumitru Ceara
QA Contact: Ehsan Elahi
URL:
Whiteboard:
Depends On:
Blocks:
 
Reported: 2021-04-09 09:47 UTC by Daniel Alvarez Sanchez
Modified: 2025-02-10 04:00 UTC
CC List: 10 users

Fixed In Version: ovn22.12-22.12.0-18.el8fdp
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2025-02-10 04:00:33 UTC
Target Upstream Version:
Embargoed:




Links:
Red Hat Issue Tracker FD-1221 (last updated 2022-04-05 11:22:07 UTC)

Description Daniel Alvarez Sanchez 2021-04-09 09:47:56 UTC
Today, OVN allows all the traffic that is not explicitly forbidden by an ACL.

OpenStack works the other way around, blocking all traffic that is not explicitly allowed. In order to get this behavior from OVN, OpenStack artificially creates a Port Group in the NB database to which all ports are added.

In large deployments, this port group may contain several thousand ports and gets updated very frequently, causing a lot of transactions in the NB and, hence, the SB database, as well as load on northd, expansion of the associated Address Sets, insertion/deletion of OpenFlow rules, etcetera.

It would be beneficial to add a configuration setting to OVN that defines the default allow/deny behavior in the ACL stage, so that all layers are spared this burden by simply having a low-priority flow that drops traffic.
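
Roughly, this is the difference between the current workaround and the requested setting (a minimal sketch; the port group name, port names and matches below are illustrative rather than exactly what OpenStack configures, while the option name is the one used later in this bug, see comment 9):

# Current workaround: a catch-all port group that every switch port has to be
# added to, plus low-priority drop ACLs attached to it.
ovn-nbctl pg-add pg_drop_all lsp1 lsp2        # must be kept in sync with every LSP
ovn-nbctl acl-add pg_drop_all from-lport 1 "inport == @pg_drop_all && ip" drop
ovn-nbctl acl-add pg_drop_all to-lport 1 "outport == @pg_drop_all && ip" drop

# Requested behavior: a single NB option, no per-port bookkeeping.
ovn-nbctl set NB_Global . options:default_acl_drop=true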

Comment 1 Dumitru Ceara 2022-04-06 14:35:05 UTC
RFC with initial implementation posted upstream for discussion:
https://patchwork.ozlabs.org/project/ovn/list/?series=293812&state=*

This is an RFC because we need to figure out the best default behavior for some of the edge cases.

Comment 3 Jianlin Shi 2022-04-12 02:58:12 UTC
As the feature is still in development, there is no accurate reproducer for the moment; I will try to automate it after the feature is ready. Clearing the needinfo for now.

Comment 4 Jianlin Shi 2022-04-12 08:59:54 UTC
Setting coverage- until the patches are ready.

Comment 5 Dumitru Ceara 2022-04-19 13:38:41 UTC
V1 posted for review: https://patchwork.ozlabs.org/project/ovn/list/?series=295774&state=*

Comment 6 Dumitru Ceara 2022-04-26 15:38:30 UTC
V1 was accepted but introduced a bug; this follow-up is needed too:
https://patchwork.ozlabs.org/project/ovn/list/?series=296961&state=*

Comment 9 Ehsan Elahi 2023-01-10 19:18:27 UTC
It works well with ACLs that combine L3 and L4 protocols. I tested it with the following ACLs, with default_acl_drop set to true:
ovn-nbctl acl-add ls from-lport 1 "ip4 && tcp" allow
ovn-nbctl acl-add ls from-lport 1 "ip4 && tcp" allow-related
ovn-nbctl --apply-after-lb acl-add ls from-lport 1 "ip4 && tcp" allow
ovn-nbctl --apply-after-lb acl-add ls from-lport 1 "ip4 && tcp" allow-related

All those worked as expected. 

However, it is not working with these ACLs:
ovn-nbctl acl-add ls from-lport 1 "ip4" allow
ovn-nbctl acl-add ls from-lport 1 "ip6" allow
ovn-nbctl acl-add ls from-lport 1 "icmp4" allow
ovn-nbctl acl-add ls from-lport 1 "icmp6" allow

With these ACLs, all traffic is being dropped, both IPv4 and IPv6.

Am I missing something?

Here is the reproducer:

systemctl start ovn-northd
ovn-nbctl set-connection ptcp:6641
ovn-sbctl set-connection ptcp:6642
systemctl start openvswitch
ovs-vsctl set open . external_ids:system-id=hv1
# IP address configuration on the physical interface
ifconfig ens1f0 42.42.42.1 netmask 255.0.0.0
ovs-vsctl set open . external_ids:ovn-remote=tcp:42.42.42.1:6642
ovs-vsctl set open . external_ids:ovn-encap-type=geneve
ovs-vsctl set open . external_ids:ovn-encap-ip=42.42.42.1
systemctl restart ovn-controller

ovn-nbctl ls-add ls
ovn-nbctl lsp-add ls lsp1 -- lsp-set-addresses lsp1 00:00:00:00:00:01
ovn-nbctl lsp-add ls lsp2 -- lsp-set-addresses lsp2 00:00:00:00:00:02

ip netns add vm1
ovs-vsctl add-port br-int vm1 -- set interface vm1 type=internal
ip link set vm1 netns vm1
ip netns exec vm1 ip link set vm1 address 00:00:00:00:00:01
ip netns exec vm1 ip addr add 42.42.42.2/24 dev vm1
ip netns exec vm1 ip -6 addr add 2001::2/64 dev vm1
ip netns exec vm1 ip link set vm1 up
ip netns exec vm1 ip link set lo up
ovs-vsctl set Interface vm1 external_ids:iface-id=lsp1

ip netns add vm2
ovs-vsctl add-port br-int vm2 -- set interface vm2 type=internal
ip link set vm2 netns vm2
ip netns exec vm2 ip link set vm2 address 00:00:00:00:00:02
ip netns exec vm2 ip addr add 42.42.42.3/24 dev vm2
ip netns exec vm2 ip -6 addr add 2001::3/64 dev vm2
ip netns exec vm2 ip link set vm2 up
ip netns exec vm2 ip link set lo up
ovs-vsctl set Interface vm2 external_ids:iface-id=lsp2

ovn-nbctl --wait=hv set NB_Global . options:default_acl_drop=true
ovn-nbctl acl-add ls from-lport 1 "ip4" allow
ip netns exec vm2 ping 42.42.42.2 -c 3  ## ping failed
ip netns exec vm2 ping6 2001::2 -c 3  ## ping failed

ovn-nbctl acl-del ls
ovn-nbctl acl-add ls from-lport 1 "icmp4" allow
ip netns exec vm2 ping 42.42.42.2 -c 3  ## ping failed
ip netns exec vm2 ping6 2001::2 -c 3  ## ping failed

Comment 10 Dumitru Ceara 2023-01-11 15:53:55 UTC
It's a bit confusing, but there are actually three ACL stages in the logical switch pipeline:

1. ingress pipeline, before load balancing (all ACLs added with "from-lport").
2. ingress pipeline, after load balancing (all ACLs added with "--apply-after-lb" and "from-lport").
3. egress pipeline (all ACLs added with "to-lport").

If NB_Global.options:default_acl_drop is set to "true", we need to punch holes in all these stages.

In your case there are two issues:
- IP traffic gets dropped in stages "2" and "3" above.
- ARP/ND traffic gets dropped in all stages.

I added the following ACLs on top of your config and traffic flows fine then:

  ovn-nbctl --apply-after-lb acl-add ls from-lport 1 1 allow # Allow everything in the ingress pipeline after LB.
  ovn-nbctl acl-add ls to-lport 1 1 allow                    # Allow everything in the egress pipeline.
  ovn-nbctl acl-add ls from-lport 2 "arp || nd" allow        # Allow all ARP/ND packets in the ingress pipeline.

Regards,
Dumitru

Comment 11 Ehsan Elahi 2023-01-12 19:02:06 UTC
Thanks a lot, Dumitru, for the explanation. Up to that point, the test scenario was working. However, with the "allow-related" action it is again showing the same problem. Consider the same simple network of one LS and two VMs:

ovn-nbctl acl-del ls
ovn-nbctl --wait=sb acl-add ls from-lport 100 "ip" allow-related
ovn-nbctl --apply-after-lb acl-add ls from-lport 100 1 allow
ovn-nbctl acl-add ls to-lport 100 1 allow
ovn-nbctl acl-add ls from-lport 200 "arp || nd" allow
ovn-nbctl --wait=hv sync
ip netns exec vm2 ping 42.42.42.2 -c 3   # ping failed
ip netns exec vm2 ping6 2001::2 -c 3  # ping failed

I also tried this way:

ovn-nbctl acl-del ls
ovn-nbctl --wait=sb acl-add ls from-lport 100 "ip" allow-related
ovn-nbctl --apply-after-lb acl-add ls from-lport 100 1 allow-related
ovn-nbctl acl-add ls to-lport 100 1 allow-related
ovn-nbctl acl-add ls from-lport 200 "arp || nd" allow-related
ovn-nbctl --wait=hv sync
ip netns exec vm2 ping 42.42.42.2 -c 3   # ping failed
ip netns exec vm2 ping6 2001::2 -c 3  # ping failed

I also tried sending TCP packets, but no luck.

echo \"abcdefg\" >> send.pkt
ip netns exec vm1 ncat  -l 2345 > tcp1.pkt &
ip netns exec vm2 ncat  42.42.42.2 2345 < send.pkt

I think these two flows are blocking the traffic:
table=8 (ls_in_acl          ), priority=1    , match=(ip && !ct.est), action=(drop;)
table=4 (ls_out_acl         ), priority=1    , match=(ip && !ct.est), action=(drop;)
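
To confirm exactly where the packet is dropped, ovn-trace can walk the logical pipeline for a synthetic ICMP echo request; a minimal sketch against this reproducer (run it where the southbound DB is reachable; --ct=new asks ovn-trace to treat the connection as new instead of its default established state, and the exact output will differ):

# Trace an ICMP echo request from vm2 (lsp2) to vm1 through switch "ls".
ovn-trace --ct=new ls 'inport == "lsp2" && eth.src == 00:00:00:00:00:02 && eth.dst == 00:00:00:00:00:01 && ip4.src == 42.42.42.3 && ip4.dst == 42.42.42.2 && ip.ttl == 64 && icmp4.type == 8'

The detailed output prints, for each ls_in_*/ls_out_* stage, the logical flow that matched, which makes it easy to see which drop flow terminates the packet.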

Complete dump-flows is here below:

ovn-sbctl dump-flows | grep -E "ls_.*_acl"

  table=4 (ls_in_pre_acl      ), priority=110  , match=(eth.dst == $svc_monitor_mac), action=(next;)
  table=4 (ls_in_pre_acl      ), priority=110  , match=(eth.mcast), action=(next;)
  table=4 (ls_in_pre_acl      ), priority=110  , match=(nd || nd_rs || nd_ra || mldv1 || mldv2 || (udp && udp.src == 546 && udp.dst == 547)), action=(next;)
  table=4 (ls_in_pre_acl      ), priority=100  , match=(ip), action=(reg0[0] = 1; next;)
  table=4 (ls_in_pre_acl      ), priority=0    , match=(1), action=(next;)
  table=7 (ls_in_acl_hint     ), priority=7    , match=(ct.new && !ct.est), action=(reg0[7] = 1; reg0[9] = 1; next;)
  table=7 (ls_in_acl_hint     ), priority=6    , match=(!ct.new && ct.est && !ct.rpl && ct_mark.blocked == 1), action=(reg0[7] = 1; reg0[9] = 1; next;)
  table=7 (ls_in_acl_hint     ), priority=5    , match=(!ct.trk), action=(reg0[8] = 1; reg0[9] = 1; next;)
  table=7 (ls_in_acl_hint     ), priority=4    , match=(!ct.new && ct.est && !ct.rpl && ct_mark.blocked == 0), action=(reg0[8] = 1; reg0[10] = 1; next;)
  table=7 (ls_in_acl_hint     ), priority=3    , match=(!ct.est), action=(reg0[9] = 1; next;)
  table=7 (ls_in_acl_hint     ), priority=2    , match=(ct.est && ct_mark.blocked == 1), action=(reg0[9] = 1; next;)
  table=7 (ls_in_acl_hint     ), priority=1    , match=(ct.est && ct_mark.blocked == 0), action=(reg0[10] = 1; next;)
  table=7 (ls_in_acl_hint     ), priority=0    , match=(1), action=(next;)
  table=8 (ls_in_acl          ), priority=65532, match=(!ct.est && ct.rel && !ct.new && !ct.inv && ct_mark.blocked == 0), action=(next;)
  table=8 (ls_in_acl          ), priority=65532, match=(ct.est && !ct.rel && !ct.new && !ct.inv && ct.rpl && ct_mark.blocked == 0), action=(reg0[9] = 0; reg0[10] = 0; next;)
  table=8 (ls_in_acl          ), priority=65532, match=(ct.inv || (ct.est && ct.rpl && ct_mark.blocked == 1)), action=(drop;)
  table=8 (ls_in_acl          ), priority=65532, match=(nd || nd_ra || nd_rs || mldv1 || mldv2), action=(next;)
  table=8 (ls_in_acl          ), priority=34000, match=(eth.dst == $svc_monitor_mac), action=(next;)
  table=8 (ls_in_acl          ), priority=1200 , match=(reg0[7] == 1 && (arp || nd)), action=(reg0[1] = 1; next;)
  table=8 (ls_in_acl          ), priority=1200 , match=(reg0[8] == 1 && (arp || nd)), action=(next;)
  table=8 (ls_in_acl          ), priority=1100 , match=(reg0[7] == 1 && (ip)), action=(reg0[1] = 1; next;)
  table=8 (ls_in_acl          ), priority=1100 , match=(reg0[8] == 1 && (ip)), action=(next;)
  table=8 (ls_in_acl          ), priority=1    , match=(ip && !ct.est), action=(drop;)
  table=8 (ls_in_acl          ), priority=1    , match=(ip && ct.est && ct_mark.blocked == 1), action=(reg0[1] = 1; next;)
  table=8 (ls_in_acl          ), priority=0    , match=(1), action=(drop;)
  table=12(ls_in_acl_after_lb ), priority=1100 , match=(reg0[7] == 1 && (1)), action=(reg0[1] = 1; next;)
  table=12(ls_in_acl_after_lb ), priority=1100 , match=(reg0[8] == 1 && (1)), action=(next;)
  table=12(ls_in_acl_after_lb ), priority=0    , match=(1), action=(drop;)
  table=1 (ls_out_pre_acl     ), priority=110  , match=(eth.mcast), action=(next;)
  table=1 (ls_out_pre_acl     ), priority=110  , match=(eth.src == $svc_monitor_mac), action=(next;)
  table=1 (ls_out_pre_acl     ), priority=110  , match=(nd || nd_rs || nd_ra || mldv1 || mldv2 || (udp && udp.src == 546 && udp.dst == 547)), action=(next;)
  table=1 (ls_out_pre_acl     ), priority=100  , match=(ip), action=(reg0[0] = 1; next;)
  table=1 (ls_out_pre_acl     ), priority=0    , match=(1), action=(next;)
  table=3 (ls_out_acl_hint    ), priority=7    , match=(ct.new && !ct.est), action=(reg0[7] = 1; reg0[9] = 1; next;)
  table=3 (ls_out_acl_hint    ), priority=6    , match=(!ct.new && ct.est && !ct.rpl && ct_mark.blocked == 1), action=(reg0[7] = 1; reg0[9] = 1; next;)
  table=3 (ls_out_acl_hint    ), priority=5    , match=(!ct.trk), action=(reg0[8] = 1; reg0[9] = 1; next;)
  table=3 (ls_out_acl_hint    ), priority=4    , match=(!ct.new && ct.est && !ct.rpl && ct_mark.blocked == 0), action=(reg0[8] = 1; reg0[10] = 1; next;)
  table=3 (ls_out_acl_hint    ), priority=3    , match=(!ct.est), action=(reg0[9] = 1; next;)
  table=3 (ls_out_acl_hint    ), priority=2    , match=(ct.est && ct_mark.blocked == 1), action=(reg0[9] = 1; next;)
  table=3 (ls_out_acl_hint    ), priority=1    , match=(ct.est && ct_mark.blocked == 0), action=(reg0[10] = 1; next;)
  table=3 (ls_out_acl_hint    ), priority=0    , match=(1), action=(next;)
  table=4 (ls_out_acl         ), priority=65532, match=(!ct.est && ct.rel && !ct.new && !ct.inv && ct_mark.blocked == 0), action=(next;)
  table=4 (ls_out_acl         ), priority=65532, match=(ct.est && !ct.rel && !ct.new && !ct.inv && ct.rpl && ct_mark.blocked == 0), action=(next;)
  table=4 (ls_out_acl         ), priority=65532, match=(ct.inv || (ct.est && ct.rpl && ct_mark.blocked == 1)), action=(drop;)
  table=4 (ls_out_acl         ), priority=65532, match=(nd || nd_ra || nd_rs || mldv1 || mldv2), action=(next;)
  table=4 (ls_out_acl         ), priority=34000, match=(eth.src == $svc_monitor_mac), action=(next;)
  table=4 (ls_out_acl         ), priority=1100 , match=(reg0[7] == 1 && (1)), action=(reg0[1] = 1; next;)
  table=4 (ls_out_acl         ), priority=1100 , match=(reg0[8] == 1 && (1)), action=(next;)
  table=4 (ls_out_acl         ), priority=1    , match=(ip && !ct.est), action=(drop;)
  table=4 (ls_out_acl         ), priority=1    , match=(ip && ct.est && ct_mark.blocked == 1), action=(reg0[1] = 1; next;)
  table=4 (ls_out_acl         ), priority=0    , match=(1), action=(drop;)

Comment 12 Dumitru Ceara 2023-01-13 15:16:01 UTC
(In reply to Ehsan Elahi from comment #11)
> Thanks a lot Dumitru for the explanation. Until that point, the test
> scenario was working. However, for "allow-related" action, it is again
> showing the same problem. Consider the same simple network of one LS and 2
> VMs:
> 
> ovn-nbctl acl-del ls
> ovn-nbctl --wait=sb acl-add ls from-lport 100 "ip" allow-related
> ovn-nbctl --apply-after-lb acl-add ls from-lport 100 1 allow
> ovn-nbctl acl-add ls to-lport 100 1 allow
> ovn-nbctl acl-add ls from-lport 200 "arp || nd" allow
> ovn-nbctl --wait=hv sync
> ip netns exec vm2 ping 42.42.42.2 -c 3   # ping failed
> ip netns exec vm2 ping6 2001::2 -c 3  # ping failed
> 

It seems there's a bug with apply-after-lb ACLs.  I'm working on a potential fix.  I'll post it upstream as soon as possible.

Comment 13 Dumitru Ceara 2023-01-16 15:03:35 UTC
I posted a patch that should fix the apply-after-lb related issue:

https://patchwork.ozlabs.org/project/ovn/list/?series=336872&state=*

I'll update the BZ once the fix is accepted and backported.

Comment 14 Dumitru Ceara 2023-01-18 12:50:51 UTC
V2 posted for review: https://patchwork.ozlabs.org/project/ovn/list/?series=337271&state=*

Comment 15 Dumitru Ceara 2023-01-23 15:25:07 UTC
V2 merged and backported; the next downstream builds (22.06, 22.09, 22.12) will also include this fix.

Comment 16 OVN Bot 2023-01-24 05:08:31 UTC
ovn22.12 fast-datapath-rhel-9 clone created at https://bugzilla.redhat.com/show_bug.cgi?id=2163616
ovn22.09 fast-datapath-rhel-8 clone created at https://bugzilla.redhat.com/show_bug.cgi?id=2163617
ovn22.09 fast-datapath-rhel-9 clone created at https://bugzilla.redhat.com/show_bug.cgi?id=2163618
ovn22.06 fast-datapath-rhel-8 clone created at https://bugzilla.redhat.com/show_bug.cgi?id=2163619
ovn22.06 fast-datapath-rhel-9 clone created at https://bugzilla.redhat.com/show_bug.cgi?id=2163620

Comment 20 Ehsan Elahi 2023-02-14 11:30:00 UTC
Hi Dumitru,
I created a topology as below:

# ----------------------------LR-----------------------
# | 42.42.42.0      77.77.77.0 |           66.66.66.0 |
# ------LS1------       ------LS2-------       -----LS3------
# |             |       |              |       |            |
# |             |       |              |       |            |
# VM11         VM12    VM21          VM22     VM31         VM32

# vm11, vm21, vm31 are on HV1
# vm12, vm22, vm32 are on HV0

The complete reproducer can be seen at http://pastebin.test.redhat.com/1091694

I noted that if NB_Global.options:default_acl_drop=true and an ACL is configured on one logical switch (say ls1), then traffic is affected only on that logical switch; all traffic on the other switches runs as normal until ACLs are configured on those switches as well.
This seems logical.
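
A quick way to see which switches actually received the default drop is to look at the priority=0 flows of the ACL stages per datapath (the grep pattern below is just one way to filter the dump-flows output; based on the behavior described here, ls1 is expected to show action=(drop;) while the other switches still show action=(next;)):

ovn-sbctl dump-flows ls1 | grep -E 'ls_(in|out)_acl +\).*priority=0'
ovn-sbctl dump-flows ls2 | grep -E 'ls_(in|out)_acl +\).*priority=0'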

The problem arises when I create a port group with only one logical port (vm12) as its member. Then, even after punching holes for the traffic, all the traffic is blocked on the whole logical switch (ls1).

ovn-nbctl pg-add pg vm12
ovn-nbctl acl-add pg from-lport 1001 "inport==@pg && ip4" allow
 
ip netns exec vm11 ping 42.42.42.12 -c 3  # ping failed as expected
ip netns exec vm11 ping6 2001::12 -c 3 # ping failed as expected
ip netns exec vm11 ping 77.77.77.22 -c 3  # ping failed; should have passed, as src and dst ports are not part of the port group pg
ip netns exec vm11 ping6 2002::22 -c 3 # ping failed; should have passed, as src and dst ports are not part of the port group pg

ovn-nbctl --apply-after-lb acl-add ls1 from-lport 1 1 allow # Allow everything in the ingress pipeline after LB.
ovn-nbctl acl-add ls1 to-lport 1 1 allow                    # Allow everything in the egress pipeline.
ovn-nbctl acl-add ls1 from-lport 2 "arp || nd" allow        # Allow all ARP/ND packets in the ingress pipeline.
ip netns exec vm11 ping 42.42.42.12 -c 3  # ping failed
ip netns exec vm11 ping6 2001::12 -c 3 # ping failed
ip netns exec vm11 ping 77.77.77.22 -c 3  # ping failed
ip netns exec vm11 ping6 2002::22 -c 3 # ping failed
 
ovn-nbctl acl-del ls1
 
ovn-nbctl --apply-after-lb acl-add pg from-lport 1 1 allow
ovn-nbctl acl-add pg to-lport 1 1 allow
ovn-nbctl acl-add pg from-lport 2 "arp || nd" allow
ip netns exec vm11 ping 42.42.42.12 -c 3  # ping failed
ip netns exec vm11 ping6 2001::12 -c 3 # ping failed
ip netns exec vm11 ping 77.77.77.22 -c 3  # ping failed
ip netns exec vm11 ping6 2002::22 -c 3 # ping failed

Am I missing something?

Comment 21 Ehsan Elahi 2023-03-03 07:45:00 UTC
The BZ can be reproduced on any OVN release without setting default_acl_drop to true. The reproducer can be found in comment 9 and comment 11.
Verified on:

ovn22.12-central-22.12.0-20.el8fdp.x86_64
openvswitch-selinux-extra-policy-1.0-29.el8fdp.noarch
openvswitch2.17-2.17.0-74.el8fdp.x86_64
ovn22.12-22.12.0-20.el8fdp.x86_64
ovn22.12-host-22.12.0-20.el8fdp.x86_64

Comment 23 Red Hat Bugzilla 2025-02-10 04:00:33 UTC
This product has been discontinued or is no longer tracked in Red Hat Bugzilla.

