Bug 1847570 - [RFE] Add support for BFD from OVN routers to other directly connected L3 devices
Summary: [RFE] Add support for BFD from OVN routers to other directly connected L3 dev...
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Enterprise Linux Fast Datapath
Classification: Red Hat
Component: OVN
Version: RHEL 8.0
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: urgent
Target Milestone: ---
Target Release: ---
Assignee: lorenzo bianconi
QA Contact: ying xu
URL:
Whiteboard:
Depends On:
Blocks:
 
Reported: 2020-06-16 15:49 UTC by Tim Rozet
Modified: 2021-03-15 14:36 UTC
CC List: 3 users

Fixed In Version: ovn2.13-20.12.0-4.el7fdn
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2021-03-15 14:36:02 UTC
Target Upstream Version:
Embargoed:




Links
System ID Private Priority Status Summary Last Updated
Red Hat Product Errata RHBA-2021:0836 0 None None None 2021-03-15 14:36:23 UTC

Description Tim Rozet 2020-06-16 15:49:20 UTC
Description of problem:
Today, an OVN router can be configured with a pool of ECMP routes, for example:
ovn-nbctl --policy=src-ip --ecmp lr-route-add GR_node1 10.0.0.6/32 1.1.1.3
ovn-nbctl --policy=src-ip --ecmp lr-route-add GR_node1 10.0.0.6/32 1.1.1.4


Consider the following external diagram:

                                                                                                
                                                                                                
                                                                                  +------------+
                                                                                  |EX router 1 |
                                                                   External Net   |1.1.1.3     |
+----------+               +----------+             +----------+              ----|            |
| 10.0.0.6 |               | OVN DR   |             |OVN GR    |1.1.1.1------/    +------------+
| POD      |---------------|          |-------------|          |----/                           
|          |               |          |             |          | --------\       +------------+ 
+----------+               +----------+             +----------+          -------|            | 
                                                                                 |  EX router2| 
                                                                                 |  1.1.1.4   | 
                                                                                 +------------+ 
                                                                                                
                                                                                                
                                                                                                
                                                                                                
This allows traffic coming from 10.0.0.6 to be hashed and load-balanced to either 1.1.1.3 or 1.1.1.4. These destinations may be on an external network, or may be ports attached to another OVN node. The only requirement is that the endpoints are in a different subnet, so that traffic *must* flow through an OVN router to be balanced by ECMP.

The problem with this setup is that if 1.1.1.3 or 1.1.1.4 suffers a network outage, the OVN router has no way to detect it and remove that endpoint from its ECMP pool. Other routers can be configured with Bidirectional Forwarding Detection (BFD) alongside routing protocols between directly connected neighbors; this is even possible on static routes (like those shown above). The request is to support the same configuration on an OVN router (GR or DR), so that it can bring up a BFD session with a directly connected neighbor. The configuration would look like this:

1. Enable BFD on the router's L3 interface (as typically done on other routers) and specify the BFD timers (somewhat optional for this use case).
2. Enable BFD on the routes programmed into the router:
ovn-nbctl --policy=src-ip --ecmp lr-route-add GR_node1 10.0.0.6/32 1.1.1.3 bfd
ovn-nbctl --policy=src-ip --ecmp lr-route-add GR_node1 10.0.0.6/32 1.1.1.4 bfd

Upon receiving this configuration, OVN would then send BFD messages to 1.1.1.3 and 1.1.1.4. This would require the .3 and .4 endpoints to also have BFD configured and point to the OVN Router IP.

Since the OVN "router" is really a logical construct and these are control-plane packets, I think ovn-controller will need to handle processing this control traffic. Then, when an outage is detected on a peer, OpenFlow needs to stop using that endpoint in its ECMP group.
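For reference, the mechanism that eventually shipped (per the verification in comment 4 below) attaches BFD sessions to static routes through a dedicated northbound `BFD` table rather than a bare `bfd` keyword on the route. A minimal sketch of that style of configuration for the diagram above — the names `GR_node1` and `GR_node1-public` are illustrative, not from the original report:

```shell
# Sketch only: assumes a gateway router GR_node1 whose port GR_node1-public
# faces the two external next hops. Create one BFD session per next hop.
bfd1=$(ovn-nbctl create bfd logical_port=GR_node1-public dst_ip=1.1.1.3 \
       min_tx=250 min_rx=250 detect_mult=10)
bfd2=$(ovn-nbctl create bfd logical_port=GR_node1-public dst_ip=1.1.1.4 \
       min_tx=250 min_rx=250 detect_mult=10)

# Attach the sessions to the ECMP routes; a route whose BFD session goes
# down is withdrawn from the ECMP group until the session comes back up.
ovn-nbctl --policy=src-ip --ecmp --bfd=$bfd1 lr-route-add GR_node1 10.0.0.6/32 1.1.1.3
ovn-nbctl --policy=src-ip --ecmp --bfd=$bfd2 lr-route-add GR_node1 10.0.0.6/32 1.1.1.4
```

This is a configuration fragment against a live OVN northbound database, not a standalone script.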

Comment 4 ying xu 2021-02-18 10:05:14 UTC
Verified on version:
# rpm -qa|grep ovn
ovn2.13-host-20.12.0-15.el8fdp.x86_64
ovn2.13-central-20.12.0-15.el8fdp.x86_64
ovn2.13-20.12.0-15.el8fdp.x86_64


topo:
vm1-----network1----router1-------public------external-----pc
           |
           vm2

script:
		ovn-nbctl ls-add network1
		ovn-nbctl lsp-add network1 vm1
		ovn-nbctl lsp-set-addresses vm1 "40:44:00:00:00:01 192.168.1.11 2000::11"
		ovn-nbctl lsp-add network1 vm2
		ovn-nbctl lsp-set-addresses vm2 "40:44:00:00:00:02 192.168.1.12 2000::12"

		ovn-nbctl ls-add public
		ovn-nbctl lsp-add public public-localnet
		ovn-nbctl lsp-set-type public-localnet localnet
		ovn-nbctl lsp-set-addresses public-localnet unknown
		ovn-nbctl lsp-set-options public-localnet network_name=external
		ovs-vsctl add-br br-labNet
		ovs-vsctl set Open_vSwitch . external-ids:ovn-bridge-mappings=external:br-labNet
		ovs-vsctl add-port br-labNet ha_veth0
		ip link set br-labNet up

		ovn-nbctl lr-add router1
		ovn-nbctl lrp-add router1 router1-net1 40:44:00:00:00:04 192.168.1.1/24 2000::1/64
		ovn-nbctl lsp-add network1 net1-router1
		ovn-nbctl lsp-set-type net1-router1 router
		ovn-nbctl lsp-set-addresses net1-router1 router
		ovn-nbctl lsp-set-options net1-router1 router-port=router1-net1

		ovn-nbctl lrp-add router1 router1-net2 40:44:00:00:00:05 192.168.2.1/24 2001::1/64
		ovn-nbctl ls-add network2
		ovn-nbctl lsp-add network2 net2-router1
		ovn-nbctl lsp-set-type net2-router1 router
		ovn-nbctl lsp-set-addresses net2-router1 router
		ovn-nbctl lsp-set-options net2-router1 router-port=router1-net2

		ovn-nbctl lrp-add router1 router1-public 40:44:00:00:00:06 172.16.1.1/24 2002::1/64
		ovn-nbctl lsp-add public public-router1
		ovn-nbctl lsp-set-type public-router1 router
		ovn-nbctl lsp-set-addresses public-router1 router
		ovn-nbctl lsp-set-options public-router1 router-port=router1-public

		ovn-nbctl --id=@gc0 create Gateway_Chassis name=public-gw1 chassis_name=hv1 priority=20 -- --id=@gc1 create Gateway_Chassis name=public-gw2 chassis_name=hv0 priority=10 -- set Logical_Router_Port router1-public 'gateway_chassis=[@gc0,@gc1]'


		ovs-vsctl add-port br-int vm1 -- set interface vm1 type=internal
		ip netns add vm1
		ip link set vm1 netns vm1
		ip netns exec vm1 ip link set lo up
		ip netns exec vm1 ip link set vm1 up
		ip netns exec vm1 ip link set vm1 address 40:44:00:00:00:01
		ip netns exec vm1 ip addr add 192.168.1.11/24 dev vm1
		ip netns exec vm1 ip -6 addr add 2000::11/64 dev vm1
		ip netns exec vm1 ip route add default via 192.168.1.1 dev vm1
		ip netns exec vm1 ip -6 route add default via 2000::1 dev vm1
		ovs-vsctl set Interface vm1 external_ids:iface-id=vm1

		ovs-vsctl add-port br-int vm2 -- set interface vm2 type=internal
		ip netns add vm2
		ip link set vm2 netns vm2
		ip netns exec vm2 ip link set lo up
		ip netns exec vm2 ip link set vm2 up
		ip netns exec vm2 ip link set vm2 address 40:44:00:00:00:02
		ip netns exec vm2 ip addr add 192.168.1.12/24 dev vm2
		ip netns exec vm2 ip -6 addr add 2000::12/64 dev vm2
		ip netns exec vm2 ip route add default via 192.168.1.1 dev vm2
		ip netns exec vm2 ip -6 route add default via 2000::1 dev vm2
		ovs-vsctl set Interface vm2 external_ids:iface-id=vm2

		ip netns add external
		ip link add ha_veth0 type veth peer name ha_veth0_p netns external
		ip netns exec external ip link set lo up
		ip netns exec external ip link set ha_veth0_p up
		ip link set ha_veth0 up
		ip netns exec external ip addr add 172.16.1.50/24 dev ha_veth0_p
		ip netns exec external ip addr add 172.16.1.51/24 dev ha_veth0_p
		ip netns exec external ip -6 addr add 2002::50/64 dev ha_veth0_p
		ip netns exec external ip -6 addr add 2002::51/64 dev ha_veth0_p
		ip link add veth0 type veth peer name veth0_peer

		ip link set up dev veth0
		ip link set veth0_peer netns external
		ip netns exec external ip link set up dev veth0_peer
		ip netns exec external ip addr add 192.168.100.1/24 dev veth0_peer
		ip netns exec external ip -6 addr add 2003::1/24 dev veth0_peer
		ip addr add 192.168.100.2/24 dev veth0
		ip -6 addr add 2003::2/24 dev veth0
		ip route add 172.16.1.0/24 via 192.168.100.1
		ip -6 route add 2002::/64 via 2003::1
		ip netns exec external ip route add default via 172.16.1.1
		ip netns exec external ip -6 route add default via 2002::1
		ip netns exec external sysctl net.ipv4.ip_forward=1
		ip netns exec external sysctl net.ipv6.conf.all.forwarding=1

		uuid=$(ovn-nbctl create bfd logical_port=router1-public dst_ip=172.16.1.50 min_tx=250 min_rx=250 detect_mult=10)
		uuid1=$(ovn-nbctl create bfd logical_port=router1-public dst_ip="2002\:\:50" min_tx=250 min_rx=250 detect_mult=10)
		uuid2=$(ovn-nbctl create bfd logical_port=router1-public dst_ip="2002\:\:51" min_tx=250 min_rx=250 detect_mult=10)
		uuid3=$(ovn-nbctl create bfd logical_port=router1-public dst_ip=172.16.1.51 min_tx=250 min_rx=250 detect_mult=10)

                ovn-nbctl  --ecmp lr-route-add router1 192.168.100.0/24 172.16.1.50
                ovn-nbctl --bfd=$uuid3 --ecmp lr-route-add router1 192.168.100.0/24 172.16.1.51
                ovn-nbctl  --ecmp lr-route-add router1 2003::/24 2002::50
                ovn-nbctl --bfd=$uuid2 --ecmp lr-route-add router1 2003::/24 2002::51
                route_uuid=$(ovn-nbctl --bare --columns _uuid find logical_router_static_route nexthop=172.16.1.50)
                route_uuid1=$(ovn-nbctl --bare --columns _uuid find logical_router_static_route nexthop="2002\:\:50")
                ovn-nbctl set logical_router_static_route $route_uuid bfd=$uuid
                ovn-nbctl set logical_router_static_route $route_uuid1 bfd=$uuid1
                ovn-nbctl --wait=hv sync
                ip netns exec external bfdd-beacon --listen=0.0.0.0
                ip netns exec external bfdd-control allow 172.16.1.1
                sleep 5
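With the timers used above (min_tx=min_rx=250 ms, detect_mult=10), the worst-case failure detection time per RFC 5880 is detect_mult multiplied by the negotiated receive interval — a quick sanity check of the arithmetic:

```shell
#!/bin/sh
# Worst-case BFD detection time = detect_mult * negotiated rx interval
# (RFC 5880). Values match the ovn-nbctl create bfd commands above.
min_rx_ms=250
detect_mult=10
detect_time_ms=$((min_rx_ms * detect_mult))
echo "BFD detection time: ${detect_time_ms} ms"
```

So a failed next hop is withdrawn roughly 2.5 seconds after it stops answering, which is why the `sleep 5` above is enough for the status to settle.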
# ovn-nbctl list bfd
_uuid               : 5e0ce2bc-d556-4e35-b860-7b44e7c61f47
detect_mult         : 10
dst_ip              : "172.16.1.51"
external_ids        : {}
logical_port        : router1-public
min_rx              : 250
min_tx              : 250
options             : {}
status              : up          ----------------------------------nexthop is up

_uuid               : 5df64215-4ef1-4e08-b2b9-82b9ec5b987c
detect_mult         : 10
dst_ip              : "2002::51"
external_ids        : {}
logical_port        : router1-public
min_rx              : 250
min_tx              : 250
options             : {}
status              : down

_uuid               : c2b026d3-3f86-4fea-841d-a2361f0651f4
detect_mult         : 10
dst_ip              : "172.16.1.50"
external_ids        : {}
logical_port        : router1-public
min_rx              : 250
min_tx              : 250
options             : {}
status              : up        -------------------------------------nexthop is up

_uuid               : 9ac508df-cca7-4fd9-8638-30388236293f
detect_mult         : 10
dst_ip              : "2002::50"
external_ids        : {}
logical_port        : router1-public
min_rx              : 250
min_tx              : 250
options             : {}
status              : down

Ping from vm1 to the PC:
# ip netns exec vm1 ping 192.168.100.2
PING 192.168.100.2 (192.168.100.2) 56(84) bytes of data.
64 bytes from 192.168.100.2: icmp_seq=1 ttl=62 time=1.24 ms
64 bytes from 192.168.100.2: icmp_seq=2 ttl=62 time=0.074 ms
^C
--- 192.168.100.2 ping statistics ---
2 packets transmitted, 2 received, 0% packet loss, time 1001ms
rtt min/avg/max/mdev = 0.074/0.657/1.241/0.584 ms


# ovn-sbctl dump-flows router1 | grep '192.168.100.0'
  table=10(lr_in_ip_routing   ), priority=49   , match=(ip4.dst == 192.168.100.0/24), action=(ip.ttl--; flags.loopback = 1; reg8[0..15] = 1; reg8[16..31] = select(1, 2);)

Now, take one next hop (172.16.1.50) down:
# ovn-sbctl dump-flows router1 | grep '192.168.100.0'
  table=10(lr_in_ip_routing   ), priority=49   , match=(ip4.dst == 192.168.100.0/24), action=(ip.ttl--; reg8[0..15] = 0; reg0 = 172.16.1.51; reg1 = 172.16.1.1; eth.src = 40:44:00:00:00:06; outport = "router1-public"; flags.loopback = 1; next;)

# ovn-nbctl list bfd
_uuid               : 5e0ce2bc-d556-4e35-b860-7b44e7c61f47
detect_mult         : 10
dst_ip              : "172.16.1.51"
external_ids        : {}
logical_port        : router1-public
min_rx              : 250
min_tx              : 250
options             : {}
status              : up

_uuid               : 5df64215-4ef1-4e08-b2b9-82b9ec5b987c
detect_mult         : 10
dst_ip              : "2002::51"
external_ids        : {}
logical_port        : router1-public
min_rx              : 250
min_tx              : 250
options             : {}
status              : down

_uuid               : c2b026d3-3f86-4fea-841d-a2361f0651f4
detect_mult         : 10
dst_ip              : "172.16.1.50"
external_ids        : {}
logical_port        : router1-public
min_rx              : 250
min_tx              : 250
options             : {}
status              : down   --------------------------------nexthop down now

_uuid               : 9ac508df-cca7-4fd9-8638-30388236293f
detect_mult         : 10
dst_ip              : "2002::50"
external_ids        : {}
logical_port        : router1-public
min_rx              : 250
min_tx              : 250
options             : {}
status              : down

# ip netns exec vm1 ping 192.168.100.2
PING 192.168.100.2 (192.168.100.2) 56(84) bytes of data.
64 bytes from 192.168.100.2: icmp_seq=1 ttl=62 time=1.05 ms
64 bytes from 192.168.100.2: icmp_seq=2 ttl=62 time=0.080 ms
64 bytes from 192.168.100.2: icmp_seq=3 ttl=62 time=0.067 ms
^C
--- 192.168.100.2 ping statistics ---
3 packets transmitted, 3 received, 0% packet loss, time 2022ms
rtt min/avg/max/mdev = 0.067/0.399/1.052/0.461 ms


Now take BFD down for all next hops (stop the BFD listener; the links themselves stay up):
# ip netns exec external bfdd-control stop 172.16.1.1
stopping

# ovn-nbctl list bfd
_uuid               : 5e0ce2bc-d556-4e35-b860-7b44e7c61f47
detect_mult         : 10
dst_ip              : "172.16.1.51"
external_ids        : {}
logical_port        : router1-public
min_rx              : 250
min_tx              : 250
options             : {}
status              : down         -------------------all down

_uuid               : 5df64215-4ef1-4e08-b2b9-82b9ec5b987c
detect_mult         : 10
dst_ip              : "2002::51"
external_ids        : {}
logical_port        : router1-public
min_rx              : 250
min_tx              : 250
options             : {}
status              : down

_uuid               : c2b026d3-3f86-4fea-841d-a2361f0651f4
detect_mult         : 10
dst_ip              : "172.16.1.50"
external_ids        : {}
logical_port        : router1-public
min_rx              : 250
min_tx              : 250
options             : {}
status              : down             -----------------------------all down

_uuid               : 9ac508df-cca7-4fd9-8638-30388236293f
detect_mult         : 10
dst_ip              : "2002::50"
external_ids        : {}
logical_port        : router1-public
min_rx              : 250
min_tx              : 250
options             : {}
status              : down

# ovn-sbctl dump-flows router1 | grep '192.168.100.0'   -------------------------no flows about 192.168.100.0
[root@dell-per740-53 load_balance]# 

The ping fails at this point:
# ip netns exec vm1 ping 192.168.100.2
PING 192.168.100.2 (192.168.100.2) 56(84) bytes of data.
^C
--- 192.168.100.2 ping statistics ---
3 packets transmitted, 0 received, 100% packet loss, time 2069ms
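The status transitions above can also be tracked without manually re-running `ovn-nbctl list bfd`. A minimal watch loop, assuming a live OVN deployment with `ovn-nbctl` on the PATH (not runnable standalone):

```shell
# Sketch: poll the NB BFD table until every session reports "up", or time out
# after 30 seconds. Assumes the BFD rows created in the script above exist.
for i in $(seq 1 30); do
    not_up=$(ovn-nbctl --bare --columns status list bfd | awk 'NF && $1 != "up"' | wc -l)
    if [ "$not_up" -eq 0 ]; then
        echo "all BFD sessions up"
        break
    fi
    sleep 1
done
```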

Comment 6 errata-xmlrpc 2021-03-15 14:36:02 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (ovn2.13 bug fix and enhancement update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2021:0836

