Bug 2186059 - Router and Neighbor Advertisement not working when all traffic is blocked for a port and all ACLs are stateless
Summary: Router and Neighbor Advertisement not working when all traffic is blocked for a port and all ACLs are stateless
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Enterprise Linux Fast Datapath
Classification: Red Hat
Component: ovn23.03
Version: FDP 23.B
Hardware: Unspecified
OS: Unspecified
Priority: high
Severity: high
Target Milestone: ---
Target Release: ---
Assignee: Ihar Hrachyshka
QA Contact: Ehsan Elahi
URL:
Whiteboard:
Depends On:
Blocks: 1827598 2149731 2210279
 
Reported: 2023-04-12 01:49 UTC by Ihar Hrachyshka
Modified: 2023-08-29 12:37 UTC
CC List: 12 users

Fixed In Version: ovn22.12-22.12.0-50.el8fdp
Doc Type: If docs needed, set a value
Doc Text:
Clone Of: 2149731
Environment:
Last Closed: 2023-08-21 02:08:08 UTC
Target Upstream Version:
Embargoed:




Links
System ID                                Last Updated
Red Hat Issue Tracker FD-2814            2023-04-12 01:51:38 UTC
Red Hat Product Errata RHBA-2023:4684    2023-08-21 02:08:21 UTC

Description Ihar Hrachyshka 2023-04-12 01:49:05 UTC
+++ This bug was initially created as a clone of Bug #2149731 +++

Description of problem:
When a stateless security group is attached to an instance, the instance fails to obtain an IPv6 address via SLAAC or stateless DHCPv6. An explicit rule is required to allow ICMPv6 traffic.

Checked with a custom security group (only egress traffic allowed) as well as with the default security group (egress, and ingress from the same SG, allowed).



Version-Release number of selected component (if applicable):
RHOS-17.1-RHEL-9-20221115.n.2
Red Hat Enterprise Linux release 9.1 (Plow)

How reproducible:
100%


Steps to Reproduce:
openstack network create net_dual_slaac
openstack subnet create --subnet-range 10.100.1.0/24 --network net_dual_slaac subnet_dual_slaac
openstack subnet create --subnet-range 2001:0:0:1::0/64 --ip-version 6 --ipv6-ra-mode slaac --ipv6-address-mode slaac --network net_dual_slaac subnet_dual_slaac_ipv6
openstack router create router_test_boot
EXT_NET=`openstack network list --external -f value -c Name`
openstack router set --external-gateway $EXT_NET router_test_boot
openstack router add subnet router_test_boot subnet_dual_slaac
openstack security group create --stateless test_sg
openstack server create --image <IMG> --flavor <FLAV> --network net_dual_slaac --security-group test_sg vm_1

Actual results:
Only an IPv4 address appears on the instance.


Expected results:
An IPv6 address is also expected on the instance.

Additional info:
Can be worked around by adding an ICMPv6 ingress rule:
# openstack security group rule create --ingress --protocol icmpv6 test_sg
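
After adding the rule, SLAAC should complete inside the guest. A minimal verification sketch (the interface name eth0 is an assumption; adjust for the guest image):

# Inside the instance, confirm a global IPv6 address was autoconfigured:
ip -6 addr show dev eth0 scope global
# Expected: an inet6 address under the 2001:0:0:1::/64 prefix, flagged "dynamic".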

--- Additional comment from Ihar Hrachyshka on 2022-12-06 19:48:39 UTC ---

DHCPv6 should work by default for stateless SGs, same as for stateful.

--- Additional comment from Eran Kuris on 2023-03-02 10:29:52 UTC ---

Hi Ihar,
can you give an update on the fix for this issue?

--- Additional comment from Ihar Hrachyshka on 2023-03-28 12:36:54 UTC ---

Status update:

1) patches are posted upstream;
2) upstream reviewers (Slawek and Rodolfo) suggested that this topic needs more elaboration and discussion, since they don't necessarily agree with the assumption that both metadata and IPv6 DHCP should work by default for stateless SGs (I disagree);
3) they suggested discussing this topic during the vPTG this week; specifically, this Wednesday at 9am EST;
4) once we have a resolution on what can be implemented upstream, I will adjust the existing patches (if needed) this Friday.

Note that the above suggests we may not get the bug fixed as expected by the test plan, at least upstream, so we may have to adjust the test plan. The discussion this Wednesday should clarify what's possible upstream.

--- Additional comment from Ihar Hrachyshka on 2023-04-12 01:47:28 UTC ---

I now believe that this bug is not for Neutron to fix (though that is technically possible). It is an inconsistency between "pure stateless" and "mixed-stateful" networks in the OVN northd implementation. It should be fixed by https://patchwork.ozlabs.org/project/ovn/list/?series=350425 (currently under review).

This bug should probably become a test tracker, with a clone filed against the ovn component where the actual fix belongs.

Comment 1 Ihar Hrachyshka 2023-04-19 19:55:24 UTC
The fix merged upstream. Backports pending.

Comment 2 Ihar Hrachyshka 2023-04-19 20:00:40 UTC
For OpenStack needs, the fix should be backported down to 22.12 at least.

Comment 3 Ihar Hrachyshka 2023-04-21 12:38:43 UTC
For reference, the final fix: https://github.com/ovn-org/ovn/commit/071cd7385f4aaf6e0e4635aa16a84e174b53d4ef

A test scenario is included with the fix, but to test it manually:

0) create an LS with an IPv6-addressed port
1) define an ACL that blocks all traffic for the port
2) initiate the NS / RS / MLD exchanges for the port and observe that it still receives replies from neighbors / the router regardless of the blocking ACL (a condensed sketch follows)
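
For reference, a condensed ovn-nbctl sketch of that scenario (names are illustrative; the full reproducer is in comment 6 below):

# Logical switch with one IPv6-addressed port, then drop all of its traffic:
ovn-nbctl ls-add ls
ovn-nbctl lsp-add ls lsp1
ovn-nbctl lsp-set-addresses lsp1 "00:00:00:00:00:01 2000::2"
ovn-nbctl acl-add ls from-lport 1000 'inport == "lsp1"' drop
ovn-nbctl acl-add ls to-lport 1000 'outport == "lsp1"' drop
# With the fix, NS/RS/MLD from lsp1 are still answered despite the drop ACLs.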

Comment 6 Ehsan Elahi 2023-07-18 08:23:18 UTC
Reproduced on:
openvswitch2.17-2.17.0-102.el8fdp.x86_64
ovn22.12-22.12.0-37.el8fdp.x86_64

Here is the reproducer:

######### ipv6_ns_na.py ########
#!/usr/bin/python
# Send an ICMPv6 Neighbor Solicitation for 2000::3 from vm1 (lsp1).
from scapy.all import Ether, IPv6, ICMPv6ND_NS, sendp
p = Ether(dst='ff:ff:ff:ff:ff:ff', src='00:00:00:00:00:01')/IPv6(src="2000::2", dst="ff02::1:ff00:2")/ICMPv6ND_NS(tgt='2000::3')
sendp(p, iface="vm1")

######### ipv6_rs_ra.py ########
#!/usr/bin/python
# Send an ICMPv6 Router Solicitation from vm1.
from scapy.all import Ether, IPv6, ICMPv6ND_RS, sendp
p = Ether(dst='ff:ff:ff:ff:ff:ff', src='00:00:00:00:00:01')/IPv6(src="2000::2", dst="fe80::1234")/ICMPv6ND_RS()
sendp(p, iface="vm1")

######### ipv6_mld.py ########
#!/usr/bin/python
# Send an MLDv2 Multicast Listener Query from vm1.
from scapy.all import Ether, IPv6, ICMPv6MLQuery2, sendp
p = Ether(dst='ff:ff:ff:ff:ff:ff', src='00:00:00:00:00:01')/IPv6(src="fe80::1", dst="fe80::1234")/ICMPv6MLQuery2()
sendp(p, iface="vm1")
##########################################################################

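# Single-node OVN setup: northd, the NB/SB databases and ovn-controller all
# run on this host; ovn-controller connects back to the local southbound DB.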
systemctl start ovn-northd
ovn-nbctl set-connection ptcp:6641
ovn-sbctl set-connection ptcp:6642
systemctl start openvswitch
ovs-vsctl set open . external_ids:system-id=hv1
ovs-vsctl set open . external_ids:ovn-remote=tcp:127.0.0.1:6642
ovs-vsctl set open . external_ids:ovn-encap-type=geneve
ovs-vsctl set open . external_ids:ovn-encap-ip=127.0.0.1
systemctl start ovn-controller

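# Logical topology: router rtr (sending RAs in slaac mode) connected to
# switch ls, which has two VM ports, lsp1 and lsp2.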
ovn-nbctl lr-add rtr
ovn-nbctl lrp-add rtr rtr-ls 00:00:00:00:01:00 42.42.42.1 2000::1/64
ovn-nbctl set Logical_Router_Port rtr-ls ipv6_ra_configs:address_mode="slaac"

ovn-nbctl ls-add ls
ovn-nbctl lsp-add ls ls-rtr
ovn-nbctl lsp-set-addresses ls-rtr "00:00:00:00:01:00 42.42.42.1/24 2000::1/64"
ovn-nbctl lsp-set-type ls-rtr router
ovn-nbctl lsp-set-options ls-rtr router-port=rtr-ls
ovn-nbctl lsp-add ls lsp1
ovn-nbctl lsp-set-addresses lsp1 "00:00:00:00:00:01 42.42.42.2 2000::2"

ovn-nbctl lsp-add ls lsp2
ovn-nbctl lsp-set-addresses lsp2 "00:00:00:00:00:02 42.42.42.3 2000::3"
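# Back lsp1/lsp2 with OVS internal ports moved into namespaces vm1/vm2.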
ip netns add vm1
ovs-vsctl add-port br-int vm1 -- set interface vm1 type=internal
ip link set vm1 netns vm1
ip netns exec vm1 ip link set vm1 address 00:00:00:00:00:01
ip netns exec vm1 ip addr add 42.42.42.2/24 dev vm1
ip netns exec vm1 ip -6 addr add 2000::2/64 dev vm1
ip netns exec vm1 ip link set vm1 up
ip netns exec vm1 ip route add default via 42.42.42.1
ip netns exec vm1 ip -6 route add default via 2000::1
ovs-vsctl set Interface vm1 external_ids:iface-id=lsp1
ip netns add vm2
ovs-vsctl add-port br-int vm2 -- set interface vm2 type=internal
ip link set vm2 netns vm2
ip netns exec vm2 ip link set vm2 address 00:00:00:00:00:02
ip netns exec vm2 ip addr add 42.42.42.3/24 dev vm2
ip netns exec vm2 ip -6 addr add 2000::3/64 dev vm2
ip netns exec vm2 ip link set vm2 up
ip netns exec vm2 ip link set lo up
ip netns exec vm2 ip route add default via 42.42.42.1
ip netns exec vm2 ip -6 route add default via 2000::1
ovs-vsctl set Interface vm2 external_ids:iface-id=lsp2

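# Baseline: with no ACLs in place, vm1 should reach vm2 over IPv4 and IPv6.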
ip netns exec vm1 ping -c 3 42.42.42.3
ip netns exec vm1 ping6 2000::3 -c3

# forbid all traffic for the ports
ovn-nbctl acl-add ls from-lport 1000 "inport == \"lsp1\"" drop
ovn-nbctl --apply-after-lb acl-add ls from-lport 1000 "inport == \"lsp1\"" drop
ovn-nbctl acl-add ls to-lport 1000 "outport == \"lsp1\"" drop

ovn-nbctl acl-add ls from-lport 1000 "inport == \"lsp2\"" drop
ovn-nbctl --apply-after-lb acl-add ls from-lport 1000 "inport == \"lsp2\"" drop
ovn-nbctl acl-add ls to-lport 1000 "outport == \"lsp2\"" drop

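# Wait until the hypervisor has installed the flows for the new ACLs.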
ovn-nbctl --wait=hv sync
## pings should fail this time.
ip netns exec vm1 ping -c 3 42.42.42.3   
ip netns exec vm1 ping6 2000::3 -c3

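# Capture on vm2 while vm1 injects the NS, RS and MLD probes from the
# scapy scripts above; read each capture back afterwards.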
ip netns exec vm2 tcpdump -U -i any  -w ipv6_ns_na.pcap&
ip netns exec vm1 python3 ipv6_ns_na.py
sleep 3
pkill tcpdump
tcpdump -r ipv6_ns_na.pcap -nnle

ip netns exec vm2 tcpdump -U -i any  -w ipv6_rs_ra.pcap&
ip netns exec vm1 python3 ipv6_rs_ra.py
sleep 3
pkill tcpdump
tcpdump -r ipv6_rs_ra.pcap -nnle

ip netns exec vm2 tcpdump -U -i any  -w ipv6_mld.pcap&
ip netns exec vm1 python3 ipv6_mld.py
sleep 3
pkill tcpdump
tcpdump -r ipv6_mld.pcap -nnle

################# RESULTS ####################

#### Non-fixed version #####
## No higher priority flows are created for ND, RS, or MLD traffic
# ovn-sbctl dump-flows | grep -E "ls_.*_acl"
  table=4 (ls_in_pre_acl      ), priority=110  , match=(eth.dst == $svc_monitor_mac), action=(next;)
  table=4 (ls_in_pre_acl      ), priority=0    , match=(1), action=(next;)
  table=7 (ls_in_acl_hint     ), priority=0    , match=(1), action=(next;)
  table=8 (ls_in_acl          ), priority=34000, match=(eth.dst == $svc_monitor_mac), action=(next;)
  table=8 (ls_in_acl          ), priority=2000 , match=(inport == "lsp1"), action=(/* drop */)
  table=8 (ls_in_acl          ), priority=2000 , match=(inport == "lsp2"), action=(/* drop */)
  table=8 (ls_in_acl          ), priority=0    , match=(1), action=(next;)
  table=17(ls_in_acl_after_lb ), priority=0    , match=(1), action=(next;)
  table=1 (ls_out_pre_acl     ), priority=110  , match=(eth.src == $svc_monitor_mac), action=(next;)
  table=1 (ls_out_pre_acl     ), priority=0    , match=(1), action=(next;)
  table=3 (ls_out_acl_hint    ), priority=0    , match=(1), action=(next;)
  table=4 (ls_out_acl         ), priority=34000, match=(eth.src == $svc_monitor_mac), action=(next;)
  table=4 (ls_out_acl         ), priority=2000 , match=(outport == "lsp1"), action=(/* drop */)
  table=4 (ls_out_acl         ), priority=2000 , match=(outport == "lsp2"), action=(/* drop */)
  table=4 (ls_out_acl         ), priority=0    , match=(1), action=(next;)

############ Nothing is received at vm2 as all the traffic was blocked. Empty dump. 
# tcpdump -r ipv6_ns_na.pcap -nnle
  reading from file ipv6_ns_na.pcap, link-type LINUX_SLL (Linux cooked v1)
  dropped privs to tcpdump
  [1]+  Done                    ip netns exec vm2 tcpdump -U -i any -w ipv6_ns_na.pcap


##### Fixed version ##########

Verified on:
openvswitch2.17-2.17.0-102.el8fdp.x86_64
ovn22.12-22.12.0-73.el8fdp.x86_64

## Higher priority flows are created for ND, RS, and MLD traffic
ovn-sbctl dump-flows | grep -E "ls_.*_acl"
  table=4 (ls_in_pre_acl      ), priority=110  , match=(eth.dst == $svc_monitor_mac), action=(next;)
  table=4 (ls_in_pre_acl      ), priority=0    , match=(1), action=(next;)
  table=7 (ls_in_acl_hint     ), priority=0    , match=(1), action=(next;)
  table=8 (ls_in_acl          ), priority=65532, match=(nd || nd_ra || nd_rs || mldv1 || mldv2), action=(next;)
  table=8 (ls_in_acl          ), priority=34000, match=(eth.dst == $svc_monitor_mac), action=(next;)
  table=8 (ls_in_acl          ), priority=2000 , match=(inport == "lsp1"), action=(/* drop */)
  table=8 (ls_in_acl          ), priority=2000 , match=(inport == "lsp2"), action=(/* drop */)
  table=8 (ls_in_acl          ), priority=0    , match=(1), action=(next;)
  table=17(ls_in_acl_after_lb ), priority=65532, match=(nd || nd_ra || nd_rs || mldv1 || mldv2), action=(next;)
  table=17(ls_in_acl_after_lb ), priority=2000 , match=(inport == "lsp1"), action=(/* drop */)
  table=17(ls_in_acl_after_lb ), priority=2000 , match=(inport == "lsp2"), action=(/* drop */)
  table=17(ls_in_acl_after_lb ), priority=0    , match=(1), action=(next;)
  table=1 (ls_out_pre_acl     ), priority=110  , match=(eth.src == $svc_monitor_mac), action=(next;)
  table=1 (ls_out_pre_acl     ), priority=0    , match=(1), action=(next;)
  table=3 (ls_out_acl_hint    ), priority=0    , match=(1), action=(next;)
  table=4 (ls_out_acl         ), priority=65532, match=(nd || nd_ra || nd_rs || mldv1 || mldv2), action=(next;)
  table=4 (ls_out_acl         ), priority=34000, match=(eth.src == $svc_monitor_mac), action=(next;)
  table=4 (ls_out_acl         ), priority=2000 , match=(outport == "lsp1"), action=(/* drop */)
  table=4 (ls_out_acl         ), priority=2000 , match=(outport == "lsp2"), action=(/* drop */)
  table=4 (ls_out_acl         ), priority=0    , match=(1), action=(next;)
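
A quick way to check whether a build carries the fix (a sketch; "ls" is the switch name from the reproducer above):

# The fix adds priority-65532 flows that bypass user ACLs for ND/RA/MLD:
ovn-sbctl dump-flows ls | grep -c 'nd || nd_ra || nd_rs || mldv1 || mldv2'
# Expect 3 on a fixed build (ls_in_acl, ls_in_acl_after_lb, ls_out_acl); 0 otherwise.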

##### Tcpdump for NS, RS and MLD packets received at VM2 from VM1. 

# tcpdump -r ipv6_ns_na.pcap -nnle
  reading from file ipv6_ns_na.pcap, link-type LINUX_SLL (Linux cooked v1)
  dropped privs to tcpdump
  12:38:31.129093   B 00:00:00:00:00:01 ethertype IPv6 (0x86dd), length 80: 2000::2 > ff02::1:ff00:2: ICMP6, neighbor solicitation, who has 2000::3, length 24
  12:38:31.129112 Out 00:00:00:00:00:02 ethertype IPv6 (0x86dd), length 88: 2000::3 > 2000::2: ICMP6, neighbor advertisement, tgt is 2000::3, length 32
  12:38:36.129401 Out 00:00:00:00:00:02 ethertype IPv6 (0x86dd), length 88: fe80::200:ff:fe00:2 > 2000::2: ICMP6, neighbor solicitation, who has 2000::2, length 32
  12:38:36.130179  In 00:00:00:00:00:01 ethertype IPv6 (0x86dd), length 88: 2000::2 > fe80::200:ff:fe00:2: ICMP6, neighbor advertisement, tgt is 2000::2, length 32
  [1]+  Done                    ip netns exec vm2 tcpdump -U -i any -w ipv6_ns_na.pcap

# tcpdump -r ipv6_rs_ra.pcap -nnle
  reading from file ipv6_rs_nd.pcap, link-type LINUX_SLL (Linux cooked v1)
  dropped privs to tcpdump
  13:15:52.536703   B 00:00:00:00:00:01 ethertype IPv6 (0x86dd), length 64: 2000::2 > fe80::1234: ICMP6, router solicitation, length 8

# tcpdump -r ipv6_mld.pcap -nnle
  reading from file ipv6_mld.pcap, link-type LINUX_SLL (Linux cooked v1)
  dropped privs to tcpdump
  18:54:45.175487   B 00:00:00:00:00:01 ethertype IPv6 (0x86dd), length 84: fe80::1 > fe80::1234: ICMP6, multicast listener query v2 [gaddr ::], length 28

Comment 8 errata-xmlrpc 2023-08-21 02:08:08 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (ovn23.03 bug fix and enhancement update), and where to find the updated files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2023:4684

