Bug 1978796

Summary: ECMP routes with invalid next hops still result in OF groups getting programmed
Product: Red Hat Enterprise Linux Fast Datapath
Reporter: Tim Rozet <trozet>
Component: OVN
Assignee: lorenzo bianconi <lorenzo.bianconi>
Status: CLOSED ERRATA
QA Contact: Jianlin Shi <jishi>
Severity: medium
Docs Contact:
Priority: high
Version: RHEL 8.0
CC: bhershbe, ctrautma, dcbw, jiji, kfida, lorenzo.bianconi, yjoseph
Target Milestone: ---
Target Release: ---
Hardware: Unspecified
OS: Unspecified
Whiteboard:
Fixed In Version: ovn2.13-20.12.0-172.el8fdp
Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of:
: 1991793 (view as bug list)
Environment:
Last Closed: 2022-12-15 00:21:16 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:

Description Tim Rozet 2021-07-02 18:31:30 UTC
Description of problem:
When an ECMP route has a non-local next hop, northd reports the next hop as invalid and declines to program some of the flows. However, the path is still added to the OpenFlow group. Consider this example:

[root@master-1 ~]# ovn-nbctl lr-route-list GR_worker-148
IPv4 Routes
         
            192.168.8.129              10.75.69.166 src-ip ecmp ecmp-symmetric-reply
            192.168.8.129                198.19.3.7 src-ip ecmp ecmp-symmetric-reply


The GR in this case is on the 198.19.3.x network, which makes 10.75.69.166 an invalid next hop. Northd warns about this:

2021-07-02T14:40:13Z|104189|ovn_northd|WARN|No path for static route 192.168.8.129; next hop 10.75.69.166
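
To find every route affected by this condition, the northd log can be scanned for that warning. The following is a minimal sketch, assuming the message always has the wording of the single line quoted above (other ovn-northd versions may phrase it differently). The function name `find_unroutable_next_hops` is hypothetical and not part of any OVN tool.

```python
import re

# Pattern modeled on the one warning line quoted in this report (assumption:
# the message format is stable across the log being scanned).
NO_PATH_RE = re.compile(
    r"\|WARN\|No path for static route (?P<prefix>[^;\s]+); next hop (?P<nexthop>\S+)"
)


def find_unroutable_next_hops(log_lines):
    """Return (prefix, next_hop) pairs that northd reported as having no path."""
    hits = []
    for line in log_lines:
        match = NO_PATH_RE.search(line)
        if match:
            hits.append((match.group("prefix"), match.group("nexthop")))
    return hits


sample_log = [
    "2021-07-02T14:40:13Z|104189|ovn_northd|WARN|"
    "No path for static route 192.168.8.129; next hop 10.75.69.166",
]
print(find_unroutable_next_hops(sample_log))
```

Each pair returned names a route whose next hop should not appear as a member of any ECMP group.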

However, the OF group still has 2 paths:

[root@worker-148 ~]# ovs-ofctl dump-groups br-int 140
NXST_GROUP_DESC reply (xid=0x2):
 group_id=140,type=select,selection_method=dp_hash,bucket=bucket_id:0,weight:100,actions=load:0x1->OXM_OF_PKT_REG4[48..63],resubmit(,19),bucket=bucket_id:1,weight:100,actions=load:0x2->OXM_OF_PKT_REG4[48..63],resubmit(,19)

If bucket 0 is chosen, the ECMP route will go through the 198.x next hop and work. If bucket 1 is chosen, the packet is dropped because there is no matching flow in table 19:

    group:140
     -> using bucket 1
    bucket 1
            set_field:0x2000000000000/0xffff000000000000->xreg4
            resubmit(,19)
        19. No match.
            drop
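
The masked `set_field` in the trace can be related back to the `load:0x2->OXM_OF_PKT_REG4[48..63]` action of bucket 1 with a little bit arithmetic. The sketch below assumes the mask is a single contiguous run of bits, and relies on xreg4 overlaying reg8 (upper 32 bits) and reg9 (lower 32 bits). The helper name `decode_masked_field` is hypothetical.

```python
def decode_masked_field(value, mask):
    """Return (low_bit, high_bit, field_value) for a contiguous value/mask pair."""
    if mask == 0:
        raise ValueError("mask must be non-zero")
    low_bit = (mask & -mask).bit_length() - 1  # position of the lowest set mask bit
    high_bit = mask.bit_length() - 1           # position of the highest set mask bit
    return low_bit, high_bit, (value & mask) >> low_bit


# Value and mask copied from the trace output above.
low, high, member = decode_masked_field(0x2000000000000, 0xFFFF000000000000)
print(f"xreg4[{low}..{high}] = {member}")

# Bits 32..63 of xreg4 are bits 0..31 of reg8, so subtract 32 to get the
# reg8 bit range used in the logical flows.
print(f"reg8[{low - 32}..{high - 32}] = {member}")
```

The decoded value is the ECMP member id that the next table is expected to match on.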

In the lflows we can see that there are 2 paths:
table=10(lr_in_ip_routing   ), priority=64   , match=(ip4.src == 192.168.8.129/32), action=(ip.ttl--; flags.loopback = 1; reg8[0..15] = 3; reg8[16..31] = select(1, 2);)

However, in table 11 only one path has a matching flow:
 table=11(lr_in_ip_routing_ecmp), priority=100  , match=(reg8[0..15] == 3 && reg8[16..31] == 1), action=(reg0 = 198.19.3.7; reg1 = 198.19.2.61; eth.src = 98:03:9b:8f:15:ac; outport = "rtoe-GR_worker-148"; next;)
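
The mismatch between the two tables can be checked mechanically. The following is a rough sketch that assumes the logical flow text has the shape of the two lines quoted above; it is a diagnostic aid, not something OVN ships. The function name `missing_ecmp_members` is hypothetical.

```python
import re

# select() action in lr_in_ip_routing: group id plus the list of member ids.
SELECT_RE = re.compile(
    r"reg8\[0\.\.15\] = (?P<group>\d+); reg8\[16\.\.31\] = select\((?P<members>[^)]*)\)"
)
# Match in lr_in_ip_routing_ecmp: one (group id, member id) pair per flow.
MEMBER_RE = re.compile(
    r"reg8\[0\.\.15\] == (?P<group>\d+) && reg8\[16\.\.31\] == (?P<member>\d+)"
)


def missing_ecmp_members(lflow_lines):
    """Map ECMP group id -> sorted member ids that are selected but never matched."""
    selected, matched = {}, {}
    for line in lflow_lines:
        sel = SELECT_RE.search(line)
        if sel:
            ids = {int(x) for x in sel.group("members").split(",") if x.strip()}
            selected.setdefault(int(sel.group("group")), set()).update(ids)
        mem = MEMBER_RE.search(line)
        if mem:
            matched.setdefault(int(mem.group("group")), set()).add(int(mem.group("member")))
    result = {}
    for group, ids in selected.items():
        missing = ids - matched.get(group, set())
        if missing:
            result[group] = sorted(missing)
    return result


# The two logical flows quoted in this report.
lflows = [
    'table=10(lr_in_ip_routing   ), priority=64   , match=(ip4.src == 192.168.8.129/32), '
    'action=(ip.ttl--; flags.loopback = 1; reg8[0..15] = 3; reg8[16..31] = select(1, 2);)',
    'table=11(lr_in_ip_routing_ecmp), priority=100  , match=(reg8[0..15] == 3 && reg8[16..31] == 1), '
    'action=(reg0 = 198.19.3.7; reg1 = 198.19.2.61; eth.src = 98:03:9b:8f:15:ac; '
    'outport = "rtoe-GR_worker-148"; next;)',
]
print(missing_ecmp_members(lflows))
```

For the flows above, the result reports member 2 of group 3 as selected but unmatched, which corresponds to the bucket that drops traffic.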



Version-Release number of selected component (if applicable):
ovn2.13-20.12.0-24.el8fdp.x86_64

Comment 5 Jianlin Shi 2021-08-12 08:57:28 UTC
Tried with the following script:

systemctl start openvswitch
systemctl start ovn-northd
ovn-nbctl set-connection ptcp:6641
ovn-sbctl set-connection ptcp:6642
ovs-vsctl set open . external_ids:system-id=hv1 external_ids:ovn-remote=tcp:127.0.0.1:6642 external_ids:ovn-encap-type=geneve external_ids:ovn-encap-ip=127.0.0.1
systemctl restart ovn-controller

ovn-nbctl ls-del ls
ovn-nbctl lr-del rtr

ovn-nbctl \
    -- lr-add rtr \
    -- lrp-add rtr rtr-ls 00:00:00:00:01:00 42.42.42.1/24 4242::1/64 \
    -- ls-add ls \
    -- lsp-add ls ls-rtr \
    -- lsp-set-addresses ls-rtr 00:00:00:00:01:00 \
    -- lsp-set-type ls-rtr router \
    -- lsp-set-options ls-rtr router-port=rtr-ls \
    -- lsp-add ls vm1 -- lsp-set-addresses vm1 00:00:00:00:00:01 \
    -- lsp-add ls vm2 -- lsp-set-addresses vm2 00:00:00:00:00:02

ovs-vsctl add-port br-int vm1 -- set interface vm1 type=internal
ovs-vsctl set Interface vm1 external_ids:iface-id=vm1
ovs-vsctl add-port br-int vm2 -- set interface vm2 type=internal
ovs-vsctl set Interface vm2 external_ids:iface-id=vm2

ip netns add vm1
ip link set vm1 netns vm1
ip netns exec vm1 ip link set vm1 address 00:00:00:00:00:01
ip netns exec vm1 ip addr add 42.42.42.2/24 dev vm1
ip netns exec vm1 ip addr add 4242::2/64 dev vm1
ip netns exec vm1 ip link set vm1 up
ip netns exec vm1 ip r a default via 42.42.42.1
ip netns exec vm1 ip -6 route add default via 4242::1
ip netns add vm2
ip link set vm2 netns vm2
ip netns exec vm2 ip link set vm2 address 00:00:00:00:00:02
ip netns exec vm2 ip addr add 42.42.42.3/24 dev vm2
ip netns exec vm2 ip addr add 4242::3/64 dev vm2
ip netns exec vm2 ip link set vm2 up
ip netns exec vm2 ip r a default via 42.42.42.1
ip netns exec vm2 ip -6 route add default via 4242::1

ovn-nbctl --ecmp-symmetric-reply lr-route-add rtr 1.0.0.1  42.42.42.2
ovn-nbctl --ecmp-symmetric-reply lr-route-add rtr 1.0.0.1  2.1.1.1

ovn-nbctl --wait=sb sync
ovn-sbctl lflow-list rtr | grep table=10
ovn-sbctl lflow-list rtr | grep lr_in_ip_routing_ecmp

Result on ovn-2021-21.06.0-12.el8:

[root@wsfd-advnetlab18 bz1978796]# rpm -qa | grep -E "openvswitch2.15|ovn-2021"
ovn-2021-host-21.06.0-12.el8fdp.x86_64
python3-openvswitch2.15-2.15.0-32.el8fdp.x86_64
ovn-2021-central-21.06.0-12.el8fdp.x86_64
openvswitch2.15-2.15.0-32.el8fdp.x86_64
ovn-2021-21.06.0-12.el8fdp.x86_64

+ ovn-sbctl lflow-list rtr
+ grep table=10
  table=10(lr_in_ip_routing   ), priority=550  , match=(nd_rs || nd_ra), action=(drop;)
  table=10(lr_in_ip_routing   ), priority=129  , match=(inport == "rtr-ls" && ip6.dst == fe80::/64), action=(ip.ttl--; reg8[0..15] = 0; xxreg0 = ip6.dst; xxreg1 = fe80::200:ff:fe00:100; eth.src = 00:00:00:00:01:00; outport = "rtr-ls"; flags.loopback = 1; next;)
  table=10(lr_in_ip_routing   ), priority=129  , match=(ip6.dst == 4242::/64), action=(ip.ttl--; reg8[0..15] = 0; xxreg0 = ip6.dst; xxreg1 = 4242::1; eth.src = 00:00:00:00:01:00; outport = "rtr-ls"; flags.loopback = 1; next;)
  table=10(lr_in_ip_routing   ), priority=65   , match=(ip4.dst == 1.0.0.1/32), action=(ip.ttl--; flags.loopback = 1; reg8[0..15] = 1; reg8[16..31] = select(1, 2);)
  table=10(lr_in_ip_routing   ), priority=49   , match=(ip4.dst == 42.42.42.0/24), action=(ip.ttl--; reg8[0..15] = 0; reg0 = ip4.dst; reg1 = 42.42.42.1; eth.src = 00:00:00:00:01:00; outport = "rtr-ls"; flags.loopback = 1; next;)
+ ovn-sbctl lflow-list rtr
+ grep lr_in_ip_routing_ecmp
  table=11(lr_in_ip_routing_ecmp), priority=150  , match=(reg8[0..15] == 0), action=(next;)
  table=11(lr_in_ip_routing_ecmp), priority=100  , match=(reg8[0..15] == 1 && reg8[16..31] == 1), action=(reg0 = 42.42.42.2; reg1 = 42.42.42.1; eth.src = 00:00:00:00:01:00; outport = "rtr-ls"; next;)

Result on ovn-2021-21.06.0-18.el8:

[root@wsfd-advnetlab18 bz1978796]# rpm -qa | grep -E "openvswitch2.15|ovn-2021"
python3-openvswitch2.15-2.15.0-32.el8fdp.x86_64
ovn-2021-21.06.0-18.el8fdp.x86_64
ovn-2021-host-21.06.0-18.el8fdp.x86_64
openvswitch2.15-2.15.0-32.el8fdp.x86_64
ovn-2021-central-21.06.0-18.el8fdp.x86_64

+ ovn-sbctl lflow-list rtr
+ grep table=10
  table=10(lr_in_ip_routing   ), priority=550  , match=(nd_rs || nd_ra), action=(drop;)
  table=10(lr_in_ip_routing   ), priority=129  , match=(inport == "rtr-ls" && ip6.dst == fe80::/64), action=(ip.ttl--; reg8[0..15] = 0; xxreg0 = ip6.dst; xxreg1 = fe80::200:ff:fe00:100; eth.src = 00:00:00:00:01:00; outport = "rtr-ls"; flags.loopback = 1; next;)
  table=10(lr_in_ip_routing   ), priority=129  , match=(ip6.dst == 4242::/64), action=(ip.ttl--; reg8[0..15] = 0; xxreg0 = ip6.dst; xxreg1 = 4242::1; eth.src = 00:00:00:00:01:00; outport = "rtr-ls"; flags.loopback = 1; next;)
  table=10(lr_in_ip_routing   ), priority=65   , match=(ip4.dst == 1.0.0.1/32), action=(ip.ttl--; reg8[0..15] = 0; reg0 = 42.42.42.2; reg1 = 42.42.42.1; eth.src = 00:00:00:00:01:00; outport = "rtr-ls"; flags.loopback = 1; next;)
  table=10(lr_in_ip_routing   ), priority=49   , match=(ip4.dst == 42.42.42.0/24), action=(ip.ttl--; reg8[0..15] = 0; reg0 = ip4.dst; reg1 = 42.42.42.1; eth.src = 00:00:00:00:01:00; outport = "rtr-ls"; flags.loopback = 1; next;)
+ ovn-sbctl lflow-list rtr
+ grep lr_in_ip_routing_ecmp
  table=11(lr_in_ip_routing_ecmp), priority=150  , match=(reg8[0..15] == 0), action=(next;)

@lorenzo.bianconi Is this the expected result?

Comment 6 lorenzo bianconi 2021-08-12 09:18:01 UTC
Yes, this is correct. With the updated version of OVN, the second ECMP route is discarded (invalid next hop), so only the first route remains and it is installed as a regular route:

table=10(lr_in_ip_routing   ), priority=65   , match=(ip4.dst == 1.0.0.1/32), action=(ip.ttl--; reg8[0..15] = 0; reg0 = 42.42.42.2; reg1 = 42.42.42.1; eth.src = 00:00:00:00:01:00; outport = "rtr-ls"; flags.loopback = 1; next;)

If you add another ECMP route with a valid next hop, OVN will configure the select() action for the two valid routes.
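
The behavior described in this comment can be modeled in a few lines. This is a toy sketch, not ovn-northd's code: it assumes a next hop is valid when it falls inside the subnet of some router port, and the function name `plan_route` and its output strings are purely illustrative. The addresses come from the reproducer script in comment 5.

```python
import ipaddress


def plan_route(prefix, next_hops, port_addresses):
    """Model how a route might be installed after invalid next hops are dropped."""
    networks = [ipaddress.ip_interface(addr).network for addr in port_addresses]
    # Keep only next hops that are on-link for at least one router port
    # (assumption: this approximates the validity check).
    valid = [
        hop for hop in next_hops
        if any(ipaddress.ip_address(hop) in net for net in networks)
    ]
    if not valid:
        return "no route"
    if len(valid) == 1:
        return f"{prefix} via {valid[0]}"  # a single valid path: regular route
    members = ", ".join(str(i) for i in range(1, len(valid) + 1))
    return f"{prefix} select({members})"   # two or more valid paths: ECMP


ports = ["42.42.42.1/24", "4242::1/64"]  # addresses of rtr-ls in the reproducer
print(plan_route("1.0.0.1/32", ["42.42.42.2", "2.1.1.1"], ports))
print(plan_route("1.0.0.1/32", ["42.42.42.2", "42.42.42.3"], ports))
```

In the first call, 2.1.1.1 is not on any router port subnet, so only one path survives and no select() is needed. In the second call, both next hops are on-link and the route stays an ECMP route.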

Comment 7 Jianlin Shi 2021-08-12 09:33:53 UTC
Also verified on ovn2.13-20.12.0-173.el8:

[root@wsfd-advnetlab18 bz1978796]# rpm -qa | grep -E "openvswitch2.15|ovn2.13"
ovn2.13-central-20.12.0-173.el8fdp.x86_64
python3-openvswitch2.15-2.15.0-32.el8fdp.x86_64
ovn2.13-20.12.0-173.el8fdp.x86_64
ovn2.13-host-20.12.0-173.el8fdp.x86_64
openvswitch2.15-2.15.0-32.el8fdp.x86_64

+ ovn-nbctl --ecmp-symmetric-reply lr-route-add rtr 1.0.0.1 42.42.42.2
+ ovn-nbctl --ecmp-symmetric-reply lr-route-add rtr 1.0.0.1 2.1.1.1
+ ovn-nbctl --wait=sb sync
+ ovn-sbctl lflow-list rtr
+ grep table=10
  table=10(lr_in_ip_routing   ), priority=550  , match=(nd_rs || nd_ra), action=(drop;)
  table=10(lr_in_ip_routing   ), priority=129  , match=(inport == "rtr-ls" && ip6.dst == fe80::/64), action=(ip.ttl--; reg8[0..15] = 0; xxreg0 = ip6.dst; xxreg1 = fe80::200:ff:fe00:100; eth.src = 00:00:00:00:01:00; outport = "rtr-ls"; flags.loopback = 1; next;)
  table=10(lr_in_ip_routing   ), priority=129  , match=(ip6.dst == 4242::/64), action=(ip.ttl--; reg8[0..15] = 0; xxreg0 = ip6.dst; xxreg1 = 4242::1; eth.src = 00:00:00:00:01:00; outport = "rtr-ls"; flags.loopback = 1; next;)
  table=10(lr_in_ip_routing   ), priority=65   , match=(ip4.dst == 1.0.0.1/32), action=(ip.ttl--; reg8[0..15] = 0; reg0 = 42.42.42.2; reg1 = 42.42.42.1; eth.src = 00:00:00:00:01:00; outport = "rtr-ls"; flags.loopback = 1; next;)
  table=10(lr_in_ip_routing   ), priority=49   , match=(ip4.dst == 42.42.42.0/24), action=(ip.ttl--; reg8[0..15] = 0; reg0 = ip4.dst; reg1 = 42.42.42.1; eth.src = 00:00:00:00:01:00; outport = "rtr-ls"; flags.loopback = 1; next;)
+ ovn-sbctl lflow-list rtr
+ grep lr_in_ip_routing_ecmp
  table=11(lr_in_ip_routing_ecmp), priority=150  , match=(reg8[0..15] == 0), action=(next;)

Comment 8 Jianlin Shi 2021-08-12 09:35:32 UTC
Also verified on ovn2.13-20.12.0-173.el7:

[root@wsfd-advnetlab16 bz1978796]# rpm -qa | grep -E "openvswitch2.13|ovn2.13"
ovn2.13-central-20.12.0-173.el7fdp.x86_64
openvswitch2.13-2.13.0-102.el7fdp.x86_64
ovn2.13-20.12.0-173.el7fdp.x86_64
ovn2.13-host-20.12.0-173.el7fdp.x86_64
python3-openvswitch2.13-2.13.0-102.el7fdp.x86_64

Comment 10 errata-xmlrpc 2022-12-15 00:21:16 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (ovn2.13 bug fix and enhancement update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2022:9044