Bug 2155306 - [RFE][ovn] Add ARP/NDP Proxy capabilities
Summary: [RFE][ovn] Add ARP/NDP Proxy capabilities
Keywords:
Status: ASSIGNED
Alias: None
Product: Red Hat Enterprise Linux Fast Datapath
Classification: Red Hat
Component: ovn22.12
Version: FDP 22.L
Hardware: Unspecified
OS: Unspecified
high
urgent
Target Milestone: ---
Assignee: Quique Llorente
QA Contact: Jianlin Shi
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2022-12-20 17:31 UTC by Daniel Alvarez Sanchez
Modified: 2023-02-15 09:00 UTC (History)

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed:
Target Upstream Version:
Embargoed:


Links
System ID Private Priority Status Summary Last Updated
Red Hat Issue Tracker FD-2555 0 None None None 2022-12-20 17:46:03 UTC

Description Daniel Alvarez Sanchez 2022-12-20 17:31:53 UTC
OpenStack recently added support for pure L3 deployments using BGP [0]. This architecture relies on building an extra layer to route all the traffic coming out of the VMs to the leaf nodes and vice versa.


The current implementation is based on kernel networking, where routing and ARP/NDP proxying are done by the kernel. However, to cover customers that require acceleration such as HWOL/OVS-DPDK, we need to move this functionality into OVS.


We're currently prototyping with an architecture (attaching image to this BZ) where a small OVN cluster is running on every OSP compute node. This local OVN cluster will have 3 elements:

1. Logical Switch (localnet) connecting to the integration bridge of OSP
2. Logical Router that (ECMP) routes the traffic from the OSP workloads towards the two leafs (each leaf has a /30 network)
3. Logical Switch (localnet) connecting to an external OVS bridge where the two NICs are added



The OpenStack workloads do not know of this routing layer so the L2 connectivity is simulated by responding to ARP/NDP requests locally. For the purpose of the PoC, we're injecting ARP responder flows in the intermediate br-conn but, ideally, the local OVN cluster should do it (especially for NDP where we currently do not have the ability to do it without a controller action).

Worth mentioning that for the purpose of the PoC we're using the multibridge feature (not merged yet) here [1] which allows us to run two separate ovn-controller instances in the same host.


Example of arping from a local OpenStack VM to an external destination:

[root@vm-provider ~]# ip a sh eth0
2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc fq_codel state UP group default qlen 1000
    link/ether fa:16:3e:80:95:72 brd ff:ff:ff:ff:ff:ff
    altname enp3s0
    inet 172.16.100.160/24 brd 172.16.100.255 scope global dynamic noprefixroute eth0


[root@vm-provider ~]# ip r get 8.8.8.8
8.8.8.8 via 172.16.100.1 dev eth0 src 172.16.100.160 uid 0

[root@vm-provider ~]# arping 172.16.100.1 -c1
ARPING 172.16.100.1 from 172.16.100.160 eth0
Unicast reply from 172.16.100.1 [40:44:00:00:00:06]  1.414ms
Sent 1 probes (1 broadcast(s))
Received 1 response(s)



The br-conn bridge will hijack this request and reply with the MAC address of the OVN LS (40:44:00:00:00:06): 



# ovs-ofctl dump-flows br-conn
 cookie=0xbadcafe, duration=4550.493s, table=0, n_packets=93, n_bytes=3906, priority=100,arp,arp_tpa=172.16.100.1,arp_op=1 actions=move:NXM_OF_ETH_SRC[]->NXM_OF_ETH_DST[],mod_dl_src:40:44:00:00:00:06,load:0x2->NXM_OF_ARP_OP[],move:NXM_NX_ARP_SHA[]->NXM_NX_ARP_THA[],load:0x404400000006->NXM_NX_ARP_SHA[],move:NXM_OF_ARP_SPA[]->NXM_OF_ARP_TPA[],load:0xac106401->NXM_OF_ARP_SPA[],IN_PORT
 cookie=0x0, duration=363766.342s, table=0, n_packets=100623, n_bytes=27886427, priority=0 actions=NORMAL
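
For readers less used to OpenFlow syntax, here is a minimal Python sketch of what the responder flow above does to a matching ARP request. The ArpFrame class and the arp_responder function are hypothetical names made up for illustration; only the field rewrites mirror the actions in the flow.

```python
from dataclasses import dataclass
from typing import Optional

ARP_REQUEST = 1
ARP_REPLY = 2


@dataclass(frozen=True)
class ArpFrame:
    """Hypothetical, simplified view of an Ethernet frame carrying ARP."""
    eth_src: str
    eth_dst: str
    op: int
    sha: str  # sender hardware address
    spa: str  # sender protocol address
    tha: str  # target hardware address
    tpa: str  # target protocol address


def arp_responder(frame: ArpFrame, proxy_mac: str, proxy_ip: str) -> Optional[ArpFrame]:
    """Mimic the br-conn flow: answer ARP requests for proxy_ip with proxy_mac."""
    # Match part of the flow: arp, arp_op=1, arp_tpa=<proxy_ip>
    if frame.op != ARP_REQUEST or frame.tpa != proxy_ip:
        return None  # would fall through to the priority=0 NORMAL flow
    return ArpFrame(
        eth_src=proxy_mac,      # mod_dl_src
        eth_dst=frame.eth_src,  # move ETH_SRC -> ETH_DST
        op=ARP_REPLY,           # load 0x2 -> ARP_OP
        sha=proxy_mac,          # load proxy MAC -> ARP_SHA
        spa=proxy_ip,           # load 0xac106401 (172.16.100.1) -> ARP_SPA
        tha=frame.sha,          # move ARP_SHA -> ARP_THA
        tpa=frame.spa,          # move ARP_SPA -> ARP_TPA
    )


request = ArpFrame(
    eth_src="fa:16:3e:80:95:72",
    eth_dst="ff:ff:ff:ff:ff:ff",
    op=ARP_REQUEST,
    sha="fa:16:3e:80:95:72",
    spa="172.16.100.160",
    tha="00:00:00:00:00:00",
    tpa="172.16.100.1",
)
reply = arp_responder(request, "40:44:00:00:00:06", "172.16.100.1")
```

The reply goes back out of the port it came in on (the IN_PORT action), which is not modelled here.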



[root@vm-provider ~]# ip nei get 172.16.100.1 dev eth0
172.16.100.1 dev eth0 lladdr 40:44:00:00:00:06 REACHABLE



After this, the traffic will be processed by the OVN LR in br-bgp and routed to the leafs with a default route:


# ovn-nbctl lr-route-list bgp-router
IPv4 Routes
Route Table <main>:
                0.0.0.0/0                100.64.0.5 dst-ip ecmp
                0.0.0.0/0                100.65.3.5 dst-ip ecmp
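
As a rough illustration of how two ECMP default routes can spread traffic across the leafs, the sketch below hashes a flow's 5-tuple to pick one next hop. This is only an assumption about the general technique: the hash (SHA-256 here) and the pick_next_hop helper are hypothetical and do not describe how OVN actually selects an ECMP member.

```python
import hashlib

# Next hops taken from the route list above.
NEXT_HOPS = ["100.64.0.5", "100.65.3.5"]


def pick_next_hop(src_ip, dst_ip, proto, src_port, dst_port, next_hops=NEXT_HOPS):
    """Pick one next hop per flow so packets of the same flow stay on one path."""
    key = f"{src_ip}|{dst_ip}|{proto}|{src_port}|{dst_port}".encode()
    digest = hashlib.sha256(key).digest()
    index = int.from_bytes(digest[:4], "big") % len(next_hops)
    return next_hops[index]


hop = pick_next_hop("172.16.100.160", "8.8.8.8", "udp", 40000, 53)
```

The point is only that the choice is deterministic per flow; different flows from the same VM may leave through different leafs.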


The goal of this RFE is to request ARP/NDP Proxy functionality to be added to OVN.




# ovn-nbctl show
switch 96c723c4-1cbd-40d8-90b5-049bf62ac461 (bgp-conn)
    port conn-bgp-router
        type: router
        router-port: bgp-router-public
    port bgp-conn-localnet
        type: localnet
        addresses: ["unknown"]

switch da474a04-bd47-4b95-bfc3-3112ddbb9431 (bgp-ex)
    port bgp-ex-localnet
        type: localnet
        addresses: ["unknown"]
    port bgp-portbinding
    port ex-bgp-router-2
        type: router
        router-port: bgp-router-ex-2
    port ex-bgp-router-1
        type: router
        router-port: bgp-router-ex-1

router fcd73758-e940-4af4-9ae1-c2e98357e281 (bgp-router)
    port bgp-router-ex-1
        mac: "52:54:00:9e:ac:43"
        networks: ["100.65.3.6/30"]
    port bgp-router-public
        mac: "40:44:00:00:00:06"
        networks: ["172.16.100.1/24"]
    port bgp-router-ex-2
        mac: "52:54:00:4e:f1:eb"
        networks: ["100.64.0.6/30"]




# ip a sh enp2s0
3: enp2s0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc fq_codel master ovs-system state UP group default qlen 1000
    link/ether 52:54:00:9e:ac:43 brd ff:ff:ff:ff:ff:ff
    inet 100.65.3.6/30 scope global enp2s0
       valid_lft forever preferred_lft forever
    inet6 fe80::5054:ff:fe9e:ac43/64 scope link
       valid_lft forever preferred_lft forever


# ip a sh enp3s0
4: enp3s0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc fq_codel master ovs-system state UP group default qlen 1000
    link/ether 52:54:00:4e:f1:eb brd ff:ff:ff:ff:ff:ff
    inet 100.64.0.6/30 scope global enp3s0
       valid_lft forever preferred_lft forever
    inet6 fe80::5054:ff:fe4e:f1eb/64 scope link
       valid_lft forever preferred_lft forever


# ovs-ofctl dump-flows br-ex
 cookie=0xbadcaf2, duration=554.855s, table=0, n_packets=537, n_bytes=52626, priority=100,ip,in_port=enp2s0,nw_dst=172.16.100.0/24 actions=mod_dl_dst:52:54:00:9e:ac:43,output:"patch-bgp-ex-lo"
 cookie=0xbadcaf2, duration=547.097s, table=0, n_packets=549, n_bytes=53940, priority=100,ip,in_port=enp3s0,nw_dst=172.16.100.0/24 actions=mod_dl_dst:52:54:00:4e:f1:eb,output:"patch-bgp-ex-lo"
 cookie=0x0, duration=364594.904s, table=0, n_packets=6216556, n_bytes=770838838, priority=0 actions=NORMAL




[0] https://developers.redhat.com/articles/2022/09/22/learn-about-new-bgp-capabilities-red-hat-openstack-17
[1] https://patchwork.ozlabs.org/project/ovn/list/?series=330752

Comment 1 Daniel Alvarez Sanchez 2022-12-20 17:55:38 UTC
Created attachment 1933800 [details]
arch

Comment 2 Numan Siddique 2022-12-20 23:27:22 UTC
Thanks for the detailed explanation and the diagram (very helpful).

So if I understand correctly, you don't want to add the OpenFlow ARP responder flow in br-conn, right?
Instead, the ARP packet from br-int will enter br-conn (via patch port) and go from br-conn to br-bgp (managed by the local OVN),
with the ARP responder flows added by the local OVN.  Correct me if I'm wrong.


So IMO OVN should add these logical flows in the logical switch bgp-conn pipeline. I think it should
be straightforward to add this feature in OVN.

Right now we don't add ARP responder flows in the logical switch pipeline for the router ip (of the router it is connected to).
Instead the packet would enter the router pipeline and the arp responder flows there would reply.


In your case, if you don't add the arp responder flows in br-conn, what happens ?  I suppose the ARP request packet would enter local
OVN integration bridge br-bgp in the bgp-conn logical switch pipeline and then it would enter the router pipeline bgp-router.
And the ARP responder flows there should respond.   If this works as expected, probably we don't need to add anything in OVN.

Maybe what I'm saying is wrong (or there is a bug).  Can you please check what happens ?

Thanks
Numan

Comment 3 Daniel Alvarez Sanchez 2022-12-21 09:11:12 UTC
Thanks a lot Numan!

(In reply to Numan Siddique from comment #2)
> Thanks for the detailed explanation and the diagram (very helpful).
> 
> So If I understand correctly, you don't want to add the openflow arp
> responder flow in br-conn right ?

Exactly. There are two reasons for this:

1) Since we use OVN for pretty much everything else, it makes sense for us to not have to add extra flows of this sort 'manually'.
2) For IPv6 we can't do this and we'd need a controller action. We'd rather keep the dependency on ovn-controller than add an extra one on the OVN BGP Agent.



> Instead the ARP pkt from br-int will enter br-conn (via patch port) and
> from br-conn to br-bgp (managed by local OVN)
> and the ARP responder flows added by local OVN.  Correct me If I'm wrong.
> 
> 
> So IMO OVN should add these logical flows in the logical switch bgp-conn
> pipeline. I think it should
> be straightforward to add this feature in OVN.
> 
> Right now we don't add ARP responder flows in the logical switch pipeline
> for the router ip (of the router it is connected to).
> Instead the packet would enter the router pipeline and the arp responder
> flows there would reply.
> 
> 
> In your case, if you don't add the arp responder flows in br-conn, what
> happens ?  I suppose the ARP request packet would enter local
> OVN integration bridge br-bgp in the bgp-conn logical switch pipeline and
> then it would enter the router pipeline bgp-router.

Yes, I think this is correct.

> And the ARP responder flows there should respond.   If this works as
> expected, probably we don't need to add anything in OVN.

If we add entries to the Static_MAC_Binding table perhaps it works out of the box. The problem is that we'd need to respond to *every* IP.
The way that ARP Proxy works in the kernel (sysctl -w net.ipv4.conf.br-conn.proxy_arp=1) is that we give br-conn a loopback IP address (1.1.1.1/32) and then *all* the ARP/NDP requests will be answered by br-conn with its own MAC address.

The ask is to add some magic flows to the router pipeline (likely with less prio than the current ARP responder flows for known addresses) that respond to ARP/NDP requests with the MAC address of the port where we toggle the feature on regardless of the requested address.
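
To make the ask concrete, here is a small sketch of the intended lookup order: specific responder entries for known addresses win, and a lower-priority catch-all answers everything else with the MAC of the port where the feature is enabled. The priorities, build_arp_flows and resolve are hypothetical names and values chosen for illustration, not OVN's actual logical flows.

```python
KNOWN_PRIORITY = 100  # hypothetical priority for known-address responders
PROXY_PRIORITY = 50   # hypothetical, lower priority for the catch-all proxy


def build_arp_flows(known_bindings, proxy_mac=None):
    """Return (priority, match_ip, mac) tuples; match_ip None means 'any IP'."""
    flows = [(KNOWN_PRIORITY, ip, mac) for ip, mac in known_bindings.items()]
    if proxy_mac is not None:
        flows.append((PROXY_PRIORITY, None, proxy_mac))
    return flows


def resolve(flows, target_ip):
    """Return the MAC of the highest-priority flow matching target_ip, if any."""
    matching = [f for f in flows if f[1] is None or f[1] == target_ip]
    if not matching:
        return None
    return max(matching, key=lambda f: f[0])[2]


flows = build_arp_flows(
    {"172.16.100.1": "40:44:00:00:00:06"},
    proxy_mac="40:44:00:00:00:06",
)
answer_known = resolve(flows, "172.16.100.1")
answer_any = resolve(flows, "172.16.100.77")
```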

Please let me know if the above makes sense. I can also show you live in a setup :)

I can probably mark this as public and link it in the upstream ML for wider discussion if you think it's good to go.

Comment 4 Dumitru Ceara 2022-12-21 10:50:51 UTC
(In reply to Daniel Alvarez Sanchez from comment #3)
> The ask is to add some magic flows to the router pipeline (likely with less
> prio than the current ARP responder flows for known addresses) that respond
> to ARP/NDP requests with the MAC address of the port where we toggle the
> feature on regardless of the requested address.

Is something like this what you had in mind (per LRP knob to enable replying
to any ARP by default)?

https://github.com/dceara/ovn/commit/e46ea3fbc7088ac009480e2883968383404b79e2


Comment 5 Daniel Alvarez Sanchez 2022-12-21 12:14:16 UTC
(In reply to Dumitru Ceara from comment #4)

> 
> Is something like this what you had in mind (per LRP knob to enable replying
> to any ARP by default)?
> 
> https://github.com/dceara/ovn/commit/e46ea3fbc7088ac009480e2883968383404b79e2

Wow! That was fast :)
Exactly what I had in mind yes. Thanks!

The per-LRP knob is what I thought of but it can be problematic if more than one LRP would have it on?
Another option is to have the knob at router level but then we need a way to specify which MAC address we should respond with.

Fine with me the per-LRP approach and let the user handle the fact that only one LRP should have it on for a given router.

Comment 6 Dumitru Ceara 2022-12-21 12:25:44 UTC
(In reply to Daniel Alvarez Sanchez from comment #5)
> The per-LRP knob is what I thought of but it can be problematic if more than
> one LRP would have it on?

Why is it problematic?

> Another option is to have the knob at router level but then we need a way to
> specify which MAC address we should respond with.
> 
> Fine with me the per-LRP approach and let the user handle the fact that only
> one LRP should have it on for a given router.

I don't think we need that restriction.  Different LRPs correspond to different
subnets.  Unless I'm missing something I think we should be ok with a per-lrp
option.

Comment 7 Daniel Alvarez Sanchez 2022-12-21 13:02:35 UTC
(In reply to Dumitru Ceara from comment #6)
> I don't think we need that restriction.  Different LRPs correspond to
> different
> subnets.  Unless I'm missing something I think we should be ok with a per-lrp
> option.

Most likely it's me missing something :) 
ARP requests are broadcasted so they will reach all LRPs. The way that arp proxy works in the kernel (from what I've seen at least) is that the device will reply to all the requests regardless of the source (as long as the device has an IP address configured - eg. 1.1.1.1/32 is the one we use).

Comment 8 Dumitru Ceara 2022-12-21 13:09:02 UTC
(In reply to Daniel Alvarez Sanchez from comment #7)
> Most likely it's me missing something :) 
> ARP requests are broadcasted so they will reach all LRPs. The way that arp

Hmm, how?  ARP reqs are broadcasted in the L2 domain so they only reach what
LRPs are connected to the logical switch where the ARPs are received.  I don't
think it's a valid configuration to have two LRPs from *the same router*
connected to the same logical switch.  Moreover, I don't think it's a valid
config to enable proxy-arp on two LRPs (different LRs) connected to the same
switch.  It's up to the user to avoid that IMO.

> proxy works in the kernel (from what I've seen at least) is that the device
> will reply to all the requests regardless of the source (as long as the
> device has an IP address configured - eg. 1.1.1.1/32 is the one we use).

I didn't test it but I'm quite sure that if you connect two linux interfaces
to a bridge and enable proxy-arp on both then they will both reply to ARP
reqs received on that bridge.

I think the PoC code I shared above implements that same behavior.

But if we agree that the OVN proxy-ARP implementation should match the
kernel behavior then we probably have a good enough "spec" to work with.

Comment 9 Daniel Alvarez Sanchez 2022-12-21 17:11:10 UTC
(In reply to Dumitru Ceara from comment #8)
> > Most likely it's me missing something :) 
> > ARP requests are broadcasted so they will reach all LRPs. The way that arp
> 
> Hmm, how?  ARP reqs are broadcasted in the L2 domain so they only reach what
> LRPs are connected to the logical switch where the ARPs are received.  I
> don't
> think it's a valid configuration to have two LRPs from *the same router*
> connected to the same logical switch.  

Uhm we're doing this actually :)

switch da474a04-bd47-4b95-bfc3-3112ddbb9431 (bgp-ex)
    port ex-bgp-router-neutron
        type: router
        router-port: bgp-router-neutron-ex
    port bgp-ex-localnet
        type: localnet
        addresses: ["unknown"]
    port bgp-portbinding
    port ex-bgp-router-2
        type: router
        router-port: bgp-router-ex-2
    port ex-bgp-router-1
        type: router
        router-port: bgp-router-ex-1
router fcd73758-e940-4af4-9ae1-c2e98357e281 (bgp-router)
    port bgp-router-public
        mac: "40:44:00:00:00:06"
        networks: ["1.1.1.1/32"]
    port bgp-router-ex-1
        mac: "52:54:00:9e:ac:43"
        networks: ["100.65.3.6/30"]
    port bgp-router-ex-2
        mac: "52:54:00:4e:f1:eb"
        networks: ["100.64.0.6/30"]


By connecting two LRPs here we are doing ECMP out of the node. We can probably have one single LRP with the two networks but we wanted this config to match the MAC addresses of the NIC.
Why would this be wrong?



> Moreover, I don't think it's a valid
> config to enable proxy-arp on two LRPs (different LRs) connected to the same
> switch.  It's up to the user to avoid that IMO.

I think this is what I was saying with:

"Fine with me the per-LRP approach and let the user handle the fact that only
one LRP should have it on for a given router."




> 
> > proxy works in the kernel (from what I've seen at least) is that the device
> > will reply to all the requests regardless of the source (as long as the
> > device has an IP address configured - eg. 1.1.1.1/32 is the one we use).
> 
> I didn't test it but I'm quite sure that if you connect two linux interfaces
> to a bridge and enable proxy-arp on both then they will both reply to ARP
> reqs received on that bridge.
> 
> I think the PoC code I shared above implements that same behavior.
> 
> But if we agree that the OVN proxy-ARP implementation should match the
> kernel behavior then we probably have a good enough "spec" to work with.


Great!
Thanks again!

Comment 10 Dumitru Ceara 2022-12-22 09:25:23 UTC
(In reply to Daniel Alvarez Sanchez from comment #9)
> > > Most likely it's me missing something :) 
> > > ARP requests are broadcasted so they will reach all LRPs. The way that arp
> > 
> > Hmm, how?  ARP reqs are broadcasted in the L2 domain so they only reach what
> > LRPs are connected to the logical switch where the ARPs are received.  I
> > don't
> > think it's a valid configuration to have two LRPs from *the same router*
> > connected to the same logical switch.  
> 
> Uhm we're doing this actually :)
> 
> switch da474a04-bd47-4b95-bfc3-3112ddbb9431 (bgp-ex)
>     port ex-bgp-router-neutron
>         type: router
>         router-port: bgp-router-neutron-ex
>     port bgp-ex-localnet
>         type: localnet
>         addresses: ["unknown"]
>     port bgp-portbinding
>     port ex-bgp-router-2
>         type: router
>         router-port: bgp-router-ex-2
>     port ex-bgp-router-1
>         type: router
>         router-port: bgp-router-ex-1
> router fcd73758-e940-4af4-9ae1-c2e98357e281 (bgp-router)
>     port bgp-router-public
>         mac: "40:44:00:00:00:06"
>         networks: ["1.1.1.1/32"]
>     port bgp-router-ex-1
>         mac: "52:54:00:9e:ac:43"
>         networks: ["100.65.3.6/30"]
>     port bgp-router-ex-2
>         mac: "52:54:00:4e:f1:eb"
>         networks: ["100.64.0.6/30"]
> 
> 
> By connecting two LRPs here we are doing ECMP out of the node. We can
> probably have one single LRP with the two networks but we wanted this config
> to match the MAC addresses of the NIC.
> Why would this be wrong?
> 

Oh, I see now.  It's not.  I was over constraining things (at most one subnet
per LS).

> 
> 
> > Moreover, I don't think it's a valid
> > config to enable proxy-arp on two LRPs (different LRs) connected to the same
> > switch.  It's up to the user to avoid that IMO.
> 
> I think this is what I was saying with:
> 
> "Fine with me the per-LRP approach and let the user handle the fact that only
> one LRP should have it on for a given router."
> 

Thanks!  While looking a bit at what the kernel should be doing, AFAICT,
proxy-arp should only reply to ARPs targeting IPs the host can reach
(through its own routing table).  I wonder if OVN should behave in the same
way or if it's good enough to "blindly" reply to all ARP reqs for destinations
that are not owned by the LR or VIFs attached to it.
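
A small sketch of the stricter variant discussed above, where a proxy reply is only sent if the target is reachable through the routing table and is not an address already owned locally. The should_proxy helper and its arguments are hypothetical; this is an assumption about how such a check could look, not kernel or OVN code.

```python
import ipaddress


def should_proxy(target_ip, routes, owned_ips):
    """Reply on behalf of target_ip only if some route covers it and we don't own it."""
    addr = ipaddress.ip_address(target_ip)
    owned = {ipaddress.ip_address(ip) for ip in owned_ips}
    if addr in owned:
        return False  # the regular ARP responder for owned addresses handles this
    for prefix in routes:
        net = ipaddress.ip_network(prefix)
        if net.version == addr.version and addr in net:
            return True
    return False


# With a default route, as in the bgp-router route list, every IPv4 target is reachable.
reachable = should_proxy("8.8.8.8", ["0.0.0.0/0"], {"172.16.100.1"})
```

Note that with a default route installed, the "reachable" check and the "blind" reply end up behaving the same for IPv4, so the difference only matters on routers without a default route.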

Comment 13 Dumitru Ceara 2023-02-15 08:42:02 UTC
Summarizing our offline follow-up, we should probably extend the current LSP.options:arp_proxy option [0] to include:

a. ipv6
b. subnets instead of host IPs
c. proxy mac address different from the one of the LRP (the logical router pipeline must also be updated to support routing traffic with dmac == proxy-mac-address)

[0] https://github.com/ovn-org/ovn/blob/24cd3267c452f6b687e8c03344693709b1c7ae9f/ovn-nb.xml#L994
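
To illustrate points a-c, here is a sketch of how an extended option value could be interpreted. The value format used here (an optional leading MAC followed by IPv4/IPv6 addresses or prefixes) is purely hypothetical, as are parse_arp_proxy and proxy_mac_for; the actual syntax would be decided upstream.

```python
import ipaddress
import re

MAC_RE = re.compile(r"^(?:[0-9a-fA-F]{2}:){5}[0-9a-fA-F]{2}$")


def parse_arp_proxy(value, lrp_mac):
    """Parse a hypothetical '[MAC] PREFIX [PREFIX ...]' option value.

    Falls back to the LRP MAC when no proxy MAC is given (point c).
    Host IPs are accepted too and become /32 or /128 networks (point b).
    """
    tokens = value.split()
    mac = lrp_mac
    if tokens and MAC_RE.match(tokens[0]):
        mac = tokens.pop(0).lower()
    networks = [ipaddress.ip_network(t, strict=False) for t in tokens]
    return mac, networks


def proxy_mac_for(target_ip, mac, networks):
    """Return the MAC to answer with for an ARP/ND target, or None if not proxied."""
    addr = ipaddress.ip_address(target_ip)
    for net in networks:
        if net.version == addr.version and addr in net:
            return mac
    return None


mac, nets = parse_arp_proxy(
    "40:44:00:00:00:07 172.16.100.0/24 2001:db8::/64",
    lrp_mac="40:44:00:00:00:06",
)
v4_answer = proxy_mac_for("172.16.100.20", mac, nets)
v6_answer = proxy_mac_for("2001:db8::1", mac, nets)
```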

