Bug 1978796 - ECMP routes with invalid next hops still result in OF groups getting programmed
Summary: ECMP routes with invalid next hops still result in OF groups getting programmed
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Enterprise Linux Fast Datapath
Classification: Red Hat
Component: OVN
Version: RHEL 8.0
Hardware: Unspecified
OS: Unspecified
Priority: high
Severity: medium
Target Milestone: ---
Target Release: ---
Assignee: lorenzo bianconi
QA Contact: Jianlin Shi
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2021-07-02 18:31 UTC by Tim Rozet
Modified: 2022-12-15 00:21 UTC

Fixed In Version: ovn2.13-20.12.0-172.el8fdp
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Clones: 1991793
Environment:
Last Closed: 2022-12-15 00:21:16 UTC
Target Upstream Version:
Embargoed:


Links
System                  ID              Private  Priority  Status  Summary  Last Updated
Red Hat Issue Tracker   FD-1410         0        None      None    None     2021-08-10 03:47:11 UTC
Red Hat Product Errata  RHBA-2022:9044  0        None      None    None     2022-12-15 00:21:52 UTC

Internal Links: 1991793

Description Tim Rozet 2021-07-02 18:31:30 UTC
Description of problem:
When an ECMP route has a next hop that is not local to the router, northd warns that the route is invalid and refuses to program some of its flows; however, the path is still added to the OpenFlow group. Consider this example:

[root@master-1 ~]# ovn-nbctl lr-route-list GR_worker-148
IPv4 Routes
         
            192.168.8.129              10.75.69.166 src-ip ecmp ecmp-symmetric-reply
            192.168.8.129                198.19.3.7 src-ip ecmp ecmp-symmetric-reply


The GR in this case is on the 198.19.3.x network, which makes 10.75.69.166 an invalid next hop. northd warns about this:

2021-07-02T14:40:13Z|104189|ovn_northd|WARN|No path for static route 192.168.8.129; next hop 10.75.69.166

However, the OF group still has 2 paths:

[root@worker-148 ~]# ovs-ofctl dump-groups br-int 140
NXST_GROUP_DESC reply (xid=0x2):
 group_id=140,type=select,selection_method=dp_hash,bucket=bucket_id:0,weight:100,actions=load:0x1->OXM_OF_PKT_REG4[48..63],resubmit(,19),bucket=bucket_id:1,weight:100,actions=load:0x2->OXM_OF_PKT_REG4[48..63],resubmit(,19)

If bucket 0 is chosen, the packet is routed through the 198.19.3.7 next hop and forwarding works. If bucket 1 is chosen, the packet is dropped because there is no matching flow in OpenFlow table 19:

    group:140
     -> using bucket 1
    bucket 1
            set_field:0x2000000000000/0xffff000000000000->xreg4
            resubmit(,19)
        19. No match.
            drop
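
The set_field line in the trace is the same write as bucket 1's load action, only printed as value/mask on the full 64-bit register. A minimal Python sketch of that arithmetic follows; the helper name load_to_set_field is made up for illustration and is not part of OVS or OVN:

```python
def load_to_set_field(value: int, start: int, end: int) -> str:
    """Render load:VALUE->FIELD[start..end] as the value/mask pair shown in a trace."""
    width = end - start + 1
    if value >= 1 << width:
        raise ValueError("value does not fit in the bit range")
    mask = ((1 << width) - 1) << start   # ones over bits start..end
    return f"{value << start:#x}/{mask:#x}"


# Bucket 1 of group 140: load:0x2->OXM_OF_PKT_REG4[48..63]
print(load_to_set_field(0x2, 48, 63))  # prints 0x2000000000000/0xffff000000000000
```

So bucket 1 writes ECMP member id 2 into bits 48..63 of xreg4, and the drop happens because nothing in table 19 matches that member id.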

In the lflows we can see that there are 2 paths:
table=10(lr_in_ip_routing   ), priority=64   , match=(ip4.src == 192.168.8.129/32), action=(ip.ttl--; flags.loopback = 1; reg8[0..15] = 3; reg8[16..31] = select(1, 2);)

However, table 11 has a matching flow for only one of them:
 table=11(lr_in_ip_routing_ecmp), priority=100  , match=(reg8[0..15] == 3 && reg8[16..31] == 1), action=(reg0 = 198.19.3.7; reg1 = 198.19.2.61; eth.src = 98:03:9b:8f:15:ac; outport = "rtoe-GR_worker-148"; next;)
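
The mismatch between the two tables can be checked mechanically. The following is a rough Python sketch, not an OVN tool: it scans lflow-list style text for select() members that have no matching lr_in_ip_routing_ecmp flow. The function name missing_ecmp_members and the regular expressions are assumptions based only on the two logical flows quoted above.

```python
import re

# The two logical flows quoted in the description.
SAMPLE = """
table=10(lr_in_ip_routing   ), priority=64   , match=(ip4.src == 192.168.8.129/32), action=(ip.ttl--; flags.loopback = 1; reg8[0..15] = 3; reg8[16..31] = select(1, 2);)
table=11(lr_in_ip_routing_ecmp), priority=100  , match=(reg8[0..15] == 3 && reg8[16..31] == 1), action=(reg0 = 198.19.3.7; reg1 = 198.19.2.61; eth.src = 98:03:9b:8f:15:ac; outport = "rtoe-GR_worker-148"; next;)
"""

SELECT_RE = re.compile(r"reg8\[0\.\.15\] = (\d+); reg8\[16\.\.31\] = select\(([^)]*)\)")
MEMBER_RE = re.compile(r"reg8\[0\.\.15\] == (\d+) && reg8\[16\.\.31\] == (\d+)")


def missing_ecmp_members(lflows: str) -> dict:
    """Map ECMP group id -> member ids that select() may pick but no ECMP-stage flow matches."""
    selected, programmed = {}, {}
    for line in lflows.splitlines():
        sel = SELECT_RE.search(line)
        if sel:
            members = {int(m) for m in sel.group(2).split(",")}
            selected.setdefault(int(sel.group(1)), set()).update(members)
        hit = MEMBER_RE.search(line)
        if hit:
            programmed.setdefault(int(hit.group(1)), set()).add(int(hit.group(2)))
    gaps = {}
    for group, members in selected.items():
        missing = members - programmed.get(group, set())
        if missing:
            gaps[group] = missing
    return gaps


print(missing_ecmp_members(SAMPLE))  # prints {3: {2}}
```

For the flows above it reports member 2 of ECMP group 3, which is the path that ends in the table 19 drop.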



Version-Release number of selected component (if applicable):
ovn2.13-20.12.0-24.el8fdp.x86_64

Comment 5 Jianlin Shi 2021-08-12 08:57:28 UTC
Tried with the following script:

systemctl start openvswitch
systemctl start ovn-northd
ovn-nbctl set-connection ptcp:6641
ovn-sbctl set-connection ptcp:6642
ovs-vsctl set open . external_ids:system-id=hv1 external_ids:ovn-remote=tcp:127.0.0.1:6642 external_ids:ovn-encap-type=geneve external_ids:ovn-encap-ip=127.0.0.1
systemctl restart ovn-controller

ovn-nbctl ls-del ls
ovn-nbctl lr-del rtr

ovn-nbctl \
    -- lr-add rtr \
    -- lrp-add rtr rtr-ls 00:00:00:00:01:00 42.42.42.1/24 4242::1/64 \
    -- ls-add ls \
    -- lsp-add ls ls-rtr \
    -- lsp-set-addresses ls-rtr 00:00:00:00:01:00 \
    -- lsp-set-type ls-rtr router \
    -- lsp-set-options ls-rtr router-port=rtr-ls \
    -- lsp-add ls vm1 -- lsp-set-addresses vm1 00:00:00:00:00:01 \
    -- lsp-add ls vm2 -- lsp-set-addresses vm2 00:00:00:00:00:02

ovs-vsctl add-port br-int vm1 -- set interface vm1 type=internal
ovs-vsctl set Interface vm1 external_ids:iface-id=vm1
ovs-vsctl add-port br-int vm2 -- set interface vm2 type=internal
ovs-vsctl set Interface vm2 external_ids:iface-id=vm2

ip netns add vm1
ip link set vm1 netns vm1
ip netns exec vm1 ip link set vm1 address 00:00:00:00:00:01
ip netns exec vm1 ip addr add 42.42.42.2/24 dev vm1
ip netns exec vm1 ip addr add 4242::2/64 dev vm1
ip netns exec vm1 ip link set vm1 up
ip netns exec vm1 ip r a default via 42.42.42.1
ip netns exec vm1 ip -6 route add default via 4242::1
ip netns add vm2
ip link set vm2 netns vm2
ip netns exec vm2 ip link set vm2 address 00:00:00:00:00:02
ip netns exec vm2 ip addr add 42.42.42.3/24 dev vm2
ip netns exec vm2 ip addr add 4242::3/64 dev vm2
ip netns exec vm2 ip link set vm2 up
ip netns exec vm2 ip r a default via 42.42.42.1
ip netns exec vm2 ip -6 route add default via 4242::1

ovn-nbctl --ecmp-symmetric-reply lr-route-add rtr 1.0.0.1  42.42.42.2
ovn-nbctl --ecmp-symmetric-reply lr-route-add rtr 1.0.0.1  2.1.1.1

ovn-nbctl --wait=sb sync
ovn-sbctl lflow-list rtr | grep table=10
ovn-sbctl lflow-list rtr | grep lr_in_ip_routing_ecmp

result on ovn-2021-21.06.0-12.el8:

[root@wsfd-advnetlab18 bz1978796]# rpm -qa | grep -E "openvswitch2.15|ovn-2021"
ovn-2021-host-21.06.0-12.el8fdp.x86_64
python3-openvswitch2.15-2.15.0-32.el8fdp.x86_64
ovn-2021-central-21.06.0-12.el8fdp.x86_64
openvswitch2.15-2.15.0-32.el8fdp.x86_64
ovn-2021-21.06.0-12.el8fdp.x86_64

+ ovn-sbctl lflow-list rtr
+ grep table=10
  table=10(lr_in_ip_routing   ), priority=550  , match=(nd_rs || nd_ra), action=(drop;)
  table=10(lr_in_ip_routing   ), priority=129  , match=(inport == "rtr-ls" && ip6.dst == fe80::/64), action=(ip.ttl--; reg8[0..15] = 0; xxreg0 = ip6.dst; xxreg1 = fe80::200:ff:fe00:100; eth.src = 00:00:00:00:01:00; outport = "rtr-ls"; flags.loopback = 1; next;)
  table=10(lr_in_ip_routing   ), priority=129  , match=(ip6.dst == 4242::/64), action=(ip.ttl--; reg8[0..15] = 0; xxreg0 = ip6.dst; xxreg1 = 4242::1; eth.src = 00:00:00:00:01:00; outport = "rtr-ls"; flags.loopback = 1; next;)
  table=10(lr_in_ip_routing   ), priority=65   , match=(ip4.dst == 1.0.0.1/32), action=(ip.ttl--; flags.loopback = 1; reg8[0..15] = 1; reg8[16..31] = select(1, 2);)
  table=10(lr_in_ip_routing   ), priority=49   , match=(ip4.dst == 42.42.42.0/24), action=(ip.ttl--; reg8[0..15] = 0; reg0 = ip4.dst; reg1 = 42.42.42.1; eth.src = 00:00:00:00:01:00; outport = "rtr-ls"; flags.loopback = 1; next;)
+ ovn-sbctl lflow-list rtr
+ grep lr_in_ip_routing_ecmp
  table=11(lr_in_ip_routing_ecmp), priority=150  , match=(reg8[0..15] == 0), action=(next;)
  table=11(lr_in_ip_routing_ecmp), priority=100  , match=(reg8[0..15] == 1 && reg8[16..31] == 1), action=(reg0 = 42.42.42.2; reg1 = 42.42.42.1; eth.src = 00:00:00:00:01:00; outport = "rtr-ls"; next;)

result on ovn-2021-21.06.0-18.el8:

[root@wsfd-advnetlab18 bz1978796]# rpm -qa | grep -E "openvswitch2.15|ovn-2021"
python3-openvswitch2.15-2.15.0-32.el8fdp.x86_64
ovn-2021-21.06.0-18.el8fdp.x86_64
ovn-2021-host-21.06.0-18.el8fdp.x86_64
openvswitch2.15-2.15.0-32.el8fdp.x86_64
ovn-2021-central-21.06.0-18.el8fdp.x86_64

+ ovn-sbctl lflow-list rtr
+ grep table=10
  table=10(lr_in_ip_routing   ), priority=550  , match=(nd_rs || nd_ra), action=(drop;)
  table=10(lr_in_ip_routing   ), priority=129  , match=(inport == "rtr-ls" && ip6.dst == fe80::/64), action=(ip.ttl--; reg8[0..15] = 0; xxreg0 = ip6.dst; xxreg1 = fe80::200:ff:fe00:100; eth.src = 00:00:00:00:01:00; outport = "rtr-ls"; flags.loopback = 1; next;)
  table=10(lr_in_ip_routing   ), priority=129  , match=(ip6.dst == 4242::/64), action=(ip.ttl--; reg8[0..15] = 0; xxreg0 = ip6.dst; xxreg1 = 4242::1; eth.src = 00:00:00:00:01:00; outport = "rtr-ls"; flags.loopback = 1; next;)
  table=10(lr_in_ip_routing   ), priority=65   , match=(ip4.dst == 1.0.0.1/32), action=(ip.ttl--; reg8[0..15] = 0; reg0 = 42.42.42.2; reg1 = 42.42.42.1; eth.src = 00:00:00:00:01:00; outport = "rtr-ls"; flags.loopback = 1; next;)
  table=10(lr_in_ip_routing   ), priority=49   , match=(ip4.dst == 42.42.42.0/24), action=(ip.ttl--; reg8[0..15] = 0; reg0 = ip4.dst; reg1 = 42.42.42.1; eth.src = 00:00:00:00:01:00; outport = "rtr-ls"; flags.loopback = 1; next;)
+ ovn-sbctl lflow-list rtr
+ grep lr_in_ip_routing_ecmp
  table=11(lr_in_ip_routing_ecmp), priority=150  , match=(reg8[0..15] == 0), action=(next;)

@lorenzo.bianconi Is this the expected result?

Comment 6 lorenzo bianconi 2021-08-12 09:18:01 UTC
Yes, this is correct. With the updated version of OVN, the second ECMP route is discarded (invalid next hop), so only the first one remains and it is installed as a regular route:

table=10(lr_in_ip_routing   ), priority=65   , match=(ip4.dst == 1.0.0.1/32), action=(ip.ttl--; reg8[0..15] = 0; reg0 = 42.42.42.2; reg1 = 42.42.42.1; eth.src = 00:00:00:00:01:00; outport = "rtr-ls"; flags.loopback = 1; next;)

If you add another ECMP route with a valid next hop, OVN will configure the select() action for the valid routes.
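
The behaviour described here can be modelled in a few lines of Python. This is a simplified sketch, not northd's actual code: it assumes a next hop is valid only if it falls inside one of the router port's networks, and the function name plan_prefix and the member numbering 1..N are illustrative assumptions.

```python
import ipaddress


def plan_prefix(port_networks, next_hops):
    """Return (action, valid_next_hops) for the routes that share one prefix."""
    nets = [ipaddress.ip_interface(n).network for n in port_networks]
    # Drop next hops that no router port network contains.
    valid = [nh for nh in next_hops
             if any(ipaddress.ip_address(nh) in net for net in nets)]
    if not valid:
        return "none", valid
    if len(valid) == 1:
        return "single", valid           # plain route, no select()
    members = ", ".join(str(i) for i in range(1, len(valid) + 1))
    return f"select({members})", valid   # ECMP over the valid next hops only


# Topology from Comment 5: rtr-ls has 42.42.42.1/24 and 4242::1/64.
ports = ["42.42.42.1/24", "4242::1/64"]
print(plan_prefix(ports, ["42.42.42.2", "2.1.1.1"]))
print(plan_prefix(ports, ["42.42.42.2", "2.1.1.1", "42.42.42.3"]))
```

With the two routes from the test script only 42.42.42.2 survives, which matches the single regular route in the fixed output. Adding a hypothetical third route via 42.42.42.3 would bring back a select() over the two valid next hops.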

Comment 7 Jianlin Shi 2021-08-12 09:33:53 UTC
also verified on ovn2.13-20.12.0-173.el8:

[root@wsfd-advnetlab18 bz1978796]# rpm -qa | grep -E "openvswitch2.15|ovn2.13"
ovn2.13-central-20.12.0-173.el8fdp.x86_64
python3-openvswitch2.15-2.15.0-32.el8fdp.x86_64
ovn2.13-20.12.0-173.el8fdp.x86_64
ovn2.13-host-20.12.0-173.el8fdp.x86_64
openvswitch2.15-2.15.0-32.el8fdp.x86_64

+ ovn-nbctl --ecmp-symmetric-reply lr-route-add rtr 1.0.0.1 42.42.42.2
+ ovn-nbctl --ecmp-symmetric-reply lr-route-add rtr 1.0.0.1 2.1.1.1
+ ovn-nbctl --wait=sb sync
+ ovn-sbctl lflow-list rtr
+ grep table=10
  table=10(lr_in_ip_routing   ), priority=550  , match=(nd_rs || nd_ra), action=(drop;)
  table=10(lr_in_ip_routing   ), priority=129  , match=(inport == "rtr-ls" && ip6.dst == fe80::/64), action=(ip.ttl--; reg8[0..15] = 0; xxreg0 = ip6.dst; xxreg1 = fe80::200:ff:fe00:100; eth.src = 00:00:00:00:01:00; outport = "rtr-ls"; flags.loopback = 1; next;)
  table=10(lr_in_ip_routing   ), priority=129  , match=(ip6.dst == 4242::/64), action=(ip.ttl--; reg8[0..15] = 0; xxreg0 = ip6.dst; xxreg1 = 4242::1; eth.src = 00:00:00:00:01:00; outport = "rtr-ls"; flags.loopback = 1; next;)
  table=10(lr_in_ip_routing   ), priority=65   , match=(ip4.dst == 1.0.0.1/32), action=(ip.ttl--; reg8[0..15] = 0; reg0 = 42.42.42.2; reg1 = 42.42.42.1; eth.src = 00:00:00:00:01:00; outport = "rtr-ls"; flags.loopback = 1; next;)
  table=10(lr_in_ip_routing   ), priority=49   , match=(ip4.dst == 42.42.42.0/24), action=(ip.ttl--; reg8[0..15] = 0; reg0 = ip4.dst; reg1 = 42.42.42.1; eth.src = 00:00:00:00:01:00; outport = "rtr-ls"; flags.loopback = 1; next;)
+ ovn-sbctl lflow-list rtr
+ grep lr_in_ip_routing_ecmp
  table=11(lr_in_ip_routing_ecmp), priority=150  , match=(reg8[0..15] == 0), action=(next;)

Comment 8 Jianlin Shi 2021-08-12 09:35:32 UTC
also verified on ovn2.13-20.12.0-173.el7:

[root@wsfd-advnetlab16 bz1978796]# rpm -qa | grep -E "openvswitch2.13|ovn2.13"
ovn2.13-central-20.12.0-173.el7fdp.x86_64
openvswitch2.13-2.13.0-102.el7fdp.x86_64
ovn2.13-20.12.0-173.el7fdp.x86_64
ovn2.13-host-20.12.0-173.el7fdp.x86_64
python3-openvswitch2.13-2.13.0-102.el7fdp.x86_64

Comment 10 errata-xmlrpc 2022-12-15 00:21:16 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (ovn2.13 bug fix and enhancement update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2022:9044

