
Bug 2242830

Summary: `localnet_learn_fdb` may cause packet loss in OSP17.1
Product: Red Hat Enterprise Linux Fast Datapath
Reporter: Daniel Alvarez Sanchez <dalvarez>
Component: ovn22.12
Assignee: xsimonar
Status: CLOSED ERRATA
QA Contact: Ehsan Elahi <eelahi>
Severity: high
Priority: high
Version: FDP 23.I
CC: bcafarel, chrisw, ctrautma, dalvarez, ekuris, hakhande, jiji, jishi, migawa, mmichels, rhayakaw, rhos-maint, scohen, xsimonar
Target Milestone: ---
Target Release: ---
Hardware: x86_64
OS: Linux
Whiteboard:
Fixed In Version: ovn22.12-22.12.1-55.el9fdp
Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of: 2242439
Environment:
Last Closed: 2024-01-24 11:16:54 UTC
Type: ---
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:
Bug Depends On:
Bug Blocks: 2242439

Comment 10 xsimonar 2023-10-19 11:06:19 UTC
Hello

Thanks for the detailed information.

I think we understand the root cause of the issue.

Let's suppose we have the following setup: two hypervisors (hv1, hv2), one logical switch (ls) with two vifs (vif1, vif2) and a localnet port (ln). vif1 is on hv1 and vif2 on hv2.

When vif1 (mac1) sends a packet (e.g. ping1) to vif2 (mac2):
- mac1 is recorded in the FDB when the packet enters ls on hv1 (with in_port = vif1)
- mac1 is recorded again in the FDB when the packet enters ls on hv2 (with in_port = ln) -> this overwrites the previous entry.

The next time a packet with dst=mac1 is received by ln on hv1 (e.g. the reply packet for ping1), the packet is dropped.

We should avoid having the localnet port learn the MAC (i.e. write it to the FDB) when that MAC is already in the FDB for a vif port.
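
Until the fix lands, one way to sidestep the race entirely is to not let the localnet port learn MACs at all. A minimal sketch, assuming the localnet port is named ln_port as in the reproducer later in this bug:

~~~
# Sketch: turn FDB learning off on the localnet port (ln_port is a
# placeholder for the real localnet port name). With the option off, the
# localnet port no longer writes to the FDB, so it cannot overwrite the
# vif entries.
ovn-nbctl set logical_switch_port ln_port options:localnet_learn_fdb=false
~~~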

Thanks
Xavier

Comment 13 xsimonar 2023-10-23 15:41:21 UTC
Hi Ryo

Thanks for following up on this.

Based on your previous diagram, and supposing we lose packets on the right node while pinging left to right (10.1.10.251 -> 10.1.10.253, fa:16:3e:43:a0:1b -> fa:16:3e:d2:45:de):

1) ovn-sbctl find fdb mac="fa\:16\:3e\:d2\:45\:de"
  This will give the port_key stored for the mac entry in the fdb table

2) ovn-sbctl get port_binding 55d8727d-199c-45b2-9c24-2bf067f7fbbf tunnel_key
  This will give the tunnel_key for the port on the right node.

They should be the same, i.e. the mac entry "fa\:16\:3e\:d2\:45\:de" should be stored in the db with the tunnel_key of port 55d8727d-199c-45b2-9c24-2bf067f7fbbf as its port_key.

Furthermore,
3) ovn-sbctl find port_binding tunnel_key=xxx
  where xxx is the port_key returned in step 1 (e.g. ovn-sbctl find port_binding tunnel_key=1), will tell you which port name is recorded.
  If my understanding is correct, we will see that the type of that port is localnet.
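
Putting the three steps together, a minimal sketch (the MAC and port uuid are the ones from your diagram; tunnel_key=1 stands in for whatever step 1 returns):

~~~
# 1) port_key currently recorded in the FDB for the destination MAC
ovn-sbctl find fdb mac="fa\:16\:3e\:d2\:45\:de"

# 2) tunnel_key of the vif's port_binding on the right node; it should
#    match the port_key from step 1 if the FDB entry is correct
ovn-sbctl get port_binding 55d8727d-199c-45b2-9c24-2bf067f7fbbf tunnel_key

# 3) which port currently owns the port_key returned in step 1
#    (a port of type localnet here would confirm the overwrite)
ovn-sbctl find port_binding tunnel_key=1
~~~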
  
Please let me know if this helps, or if anything is unclear.

Thanks
Xavier

Comment 16 xsimonar 2023-10-25 10:26:05 UTC
Hi Ryo

The output is unexpected. But the issue itself is somewhat random, as you initially pointed out: some pings were going through, some were dropped.
Do we know for sure that in the last run (in comment 15) the ping request did not go through? Note that the ping request might have gone through while the ping reply was dropped.
In that case (request OK, reply dropped), the customer would see the ping failing (no reply) and the output of the commands would be as in comment 15. We might also see this output some time after the ping failed (e.g. after a packet has been sent from ip_dst=10.1.10.251).

Both the vif and the localnet port are fighting for the same MAC. Whichever wins determines whether the packet is received or dropped, and this applies to both the ping request and the ping reply.
If an interface (such as the VM owning ip_dst=10.1.10.251) sends a packet, the vif will try to write the MAC in the db (with port=vif). If that packet is received on another node (by the localnet port), then the localnet port will try to overwrite this MAC entry (with port=localnet). If the packet does not reach another node, then the vif will keep owning the MAC. Then, when a packet is sent towards that vif (so towards ip_dst=10.1.10.251), if the entry in the fdb refers to the localnet port, the packet is dropped.

If you send a few pings, I would expect to see the behaviour described in comment 10 for at least some of the packets.

Another way to confirm it would be (a consolidated sketch follows this list):
- send a few pings
- check using tcpdump that the echo request packets do not reach the destination
- when running 'ovn-sbctl find fdb mac="fa\:16\:3e\:72\:f9\:84"' as in comment 15, note the uuid of that fdb entry (cc998093-51c0-4fad-b7d0-c31041e0fd8c in this case)
- then you can grep directly in ovnsb_db.db and check what the changes for this entry are, e.g. grep cc998093-51c0-4fad-b7d0-c31041e0fd8c ovnsb_db.db
- I would expect, if packets are lost, either to see the wrong port_key in the entry, or to see the port_key changing
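
A consolidated sketch of those checks (the MAC and entry uuid are the ones from comment 15; the database path may differ on your deployment):

~~~
# send a few pings and capture on the destination with tcpdump, then:
ovn-sbctl find fdb mac="fa\:16\:3e\:72\:f9\:84"
# take the _uuid from the output (cc998093-... in comment 15) and inspect
# its history in the SB database file; a wrong or flapping port_key while
# packets were in flight points at the localnet overwrite
grep cc998093-51c0-4fad-b7d0-c31041e0fd8c ovnsb_db.db
~~~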

In the sos_report, I could see the entry for "fa:16:3e:d2:45:de" changing hundreds of times in 20 minutes.

Thanks
Xavier

Comment 18 xsimonar 2023-10-26 11:55:33 UTC
Hi Ryo

Thanks again for all the information.

I merged both sets of information, the ping capture and the FDB log, supposing the timezone is Japan. This is what it gives (I omitted the FDB entries between 09:59:26 and 09:59:33 as they add no value, and copied only the port_key and date from the FDB records for clarity):
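
To line the two sources up, I converted the FDB _date fields, which are epoch milliseconds, to local time. A minimal sketch of that conversion, assuming GNU date and the Japan timezone:

~~~
# _date 1698281984687 is epoch milliseconds; convert to JST wall clock
TZ=Asia/Tokyo date -d @1698281984.687 '+%H:%M:%S.%N'
# prints 09:59:44.687000000, matching the FDB line in the merge below
~~~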

09:59:26.835    {"port_key":1}},"_comment":"ovn-controller","_date":1698281966835,
...
09:59:33.261    {"port_key":1}},"_comment":"ovn-controller","_date":1698281973261,
09:59:43.666447 fa:16:3e:d2:45:de > Broadcast, ethertype ARP (0x0806), length 56: Request who-has 10.1.10.251 tell 10.1.10.253, length 42
09:59:43.666674 fa:16:3e:72:f9:84 > fa:16:3e:d2:45:de, ethertype ARP (0x0806), length 42: Reply 10.1.10.251 is-at fa:16:3e:72:f9:84, length 28
09:59:43.670    {"port_key":4}},"_comment":"ovn-controller","_date":1698281983670
09:59:44.683492 fa:16:3e:d2:45:de > Broadcast, ethertype ARP (0x0806), length 56: Request who-has 10.1.10.251 tell 10.1.10.253, length 42
09:59:44.683655 fa:16:3e:72:f9:84 > fa:16:3e:d2:45:de, ethertype ARP (0x0806), length 42: Reply 10.1.10.251 is-at fa:16:3e:72:f9:84, length 28
09:59:44.685070 fa:16:3e:d2:45:de > fa:16:3e:72:f9:84, ethertype IPv4 (0x0800), length 98: 10.1.10.253 > 10.1.10.251: ICMP echo request, id 23, seq 1, length 64
09:59:44.685076 fa:16:3e:d2:45:de > fa:16:3e:72:f9:84, ethertype IPv4 (0x0800), length 98: 10.1.10.253 > 10.1.10.251: ICMP echo request, id 23, seq 2, length 64
09:59:44.685329 fa:16:3e:72:f9:84 > fa:16:3e:d2:45:de, ethertype IPv4 (0x0800), length 98: 10.1.10.251 > 10.1.10.253: ICMP echo reply, id 23, seq 1, length 64
09:59:44.685343 fa:16:3e:72:f9:84 > fa:16:3e:d2:45:de, ethertype IPv4 (0x0800), length 98: 10.1.10.251 > 10.1.10.253: ICMP echo reply, id 23, seq 2, length 64
09:59:44.687    {"port_key":1}},"_comment":"ovn-controller","_date":1698281984687
09:59:50.156470 fa:16:3e:72:f9:84 > fa:16:3e:d2:45:de, ethertype ARP (0x0806), length 42: Request who-has 10.1.10.253 tell 10.1.10.251, length 28
09:59:50.157    {"port_key":4}},"_comment":"ovn-controller","_date":1698281990157,
09:59:51.180440 fa:16:3e:72:f9:84 > fa:16:3e:d2:45:de, ethertype ARP (0x0806), length 42: Request who-has 10.1.10.253 tell 10.1.10.251, length 28
09:59:51.182    {"port_key":1}},"_comment":"ovn-controller","_date":1698281991182,
09:59:52.204359 fa:16:3e:72:f9:84 > fa:16:3e:d2:45:de, ethertype ARP (0x0806), length 42: Request who-has 10.1.10.253 tell 10.1.10.251, length 28
09:59:52.205    {"port_key":4}},"_comment":"ovn-controller","_date":1698281992205,

First ping happens at 09:59:44.685070. With one ping per second, the last (5th) ping would have been around 09:59:48.685.
This first ping succeeds, as the fdb entry points to port_key 4, i.e. the vif.
The second ping, a few microseconds later, succeeds as well for the same reason.
At 09:59:44.687, the FDB entry changed to port_key 1 (i.e. the localnet port, from what I saw earlier in the logs). This happened when ping reply 1 (and/or ping request 2) reached 10.1.10.253.
From this moment on the fdb entry is wrong, until 09:59:50.157, and packets are dropped. By 09:59:50.157 it is too late, as the last (5th) ping happened around 09:59:48.685.
Then ARP requests were sent and the port_key came back to 4 in the fdb, so it looks correct by the time we issue the "ovn-sbctl find fdb ...".

So this seems to be in line with what I initially expected in comment 10: the port_key is sometimes (often) wrong in the fdb when the packet reaches br-int, and the packet gets dropped.

Thanks
Xavier

Comment 21 xsimonar 2023-11-02 09:15:13 UTC
Hi Ryo

I am working on an OVN patch, but it will take a while until it reaches OpenStack.

Unfortunately, I do not think that manually writing the VM's MAC in the fdb would work, as the issue is that the localnet port keeps overwriting this entry. When a packet leaves vm1, vm1's MAC is properly entered in the fdb. However, when this packet hits, for instance, vm2 on a different host, the localnet port will overwrite the fdb entry for vm1's MAC, and vm1 would no longer receive packets from vm2 (or other VMs).
So, if we overwrite the entry manually for vm1, it would work ... until one packet leaves vm1.
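
For illustration only, a sketch of such a manual write, built from the FDB columns visible in the reproducer output further down (the mac, dp_key and port_key values are placeholders to be replaced with the real MAC, datapath key and vif tunnel key):

~~~
# Sketch: manually pin vm1's MAC to its vif's tunnel key in the SB FDB.
# The localnet port on another host overwrites this row as soon as vm1's
# traffic is seen there, which is why this is not a workable workaround.
ovn-sbctl create FDB mac='"00:00:01:01:01:01"' dp_key=1 port_key=3
~~~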

Is the VM sending packets only from that MAC (and receiving packets only towards that MAC)?

Thanks
Xavier

Comment 22 Ryo Hayakawa 2023-11-07 02:31:11 UTC
Hello Xavier,

Thank you so much for the detailed explanation.
Please let me ask you an additional question about writing FDB entries manually.

> Unfortunately, I do not think that manually writing the VM's MAC in the fdb would work, as the issue is that the localnet port keeps overwriting this entry. When a packet leaves vm1, vm1's MAC is properly entered in the fdb. However, when this packet hits, for instance, vm2 on a different host, the localnet port will overwrite the fdb entry for vm1's MAC, and vm1 would no longer receive packets from vm2 (or other VMs).

Will the above behavior occur even if localnet_learn_fdb is False?

I may be misunderstanding, but I guess that the part that overwrites the FDB entries is lines 5700 to 5716 of northd/northd.c [1].


If my guess is correct, I suppose that when localnet_learn_fdb is false, the program does not go through that part, because line 5699 has the localnet_can_learn_mac() check.

I would appreciate it if you could give me advice on the above from your expert perspective.


[1] northd/northd.c of ovn-22.12.1:
~~~
   5689 static void
   5690 build_lswitch_learn_fdb_op(
   5691         struct ovn_port *op, struct hmap *lflows,
   5692         struct ds *actions, struct ds *match)
   5693 {
   5694     if (!op->nbsp) {
   5695         return;
   5696     }
   5697 
   5698     if (!op->n_ps_addrs && op->has_unknown && (!strcmp(op->nbsp->type, "") ||
   5699         (lsp_is_localnet(op->nbsp) && localnet_can_learn_mac(op->nbsp)))) {
   5700         ds_clear(match);
   5701         ds_clear(actions);
   5702         ds_put_format(match, "inport == %s", op->json_key);
   5703         ds_put_format(actions, REGBIT_LKUP_FDB
   5704                       " = lookup_fdb(inport, eth.src); next;");
   5705         ovn_lflow_add_with_lport_and_hint(lflows, op->od,
   5706                                           S_SWITCH_IN_LOOKUP_FDB, 100,
   5707                                           ds_cstr(match), ds_cstr(actions),
   5708                                           op->key, &op->nbsp->header_);
   5709 
   5710         ds_put_cstr(match, " && "REGBIT_LKUP_FDB" == 0");
   5711         ds_clear(actions);
   5712         ds_put_cstr(actions, "put_fdb(inport, eth.src); next;");
   5713         ovn_lflow_add_with_lport_and_hint(lflows, op->od, S_SWITCH_IN_PUT_FDB,
   5714                                           100, ds_cstr(match),
   5715                                           ds_cstr(actions), op->key,
   5716                                           &op->nbsp->header_);
   5717     }
   5718 }
   5719 
~~~
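
For reference, this is how I would check the effective value of the option on a port (a sketch; ln_port is a placeholder for the actual localnet port name):

~~~
# Prints "true" when FDB learning is enabled on the localnet port; with
# --if-exists, nothing is printed when the key is not set at all.
ovn-nbctl --if-exists get logical_switch_port ln_port options:localnet_learn_fdb
~~~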

Best regards,
Ryo Hayakawa

Comment 32 Ehsan Elahi 2023-12-07 22:43:29 UTC
Here is the reproducer:

################# On HV1: ####################
systemctl start ovn-northd
ovn-nbctl set-connection ptcp:6641
ovn-sbctl set-connection ptcp:6642
systemctl start openvswitch
ovs-vsctl set open . external_ids:system-id=hv1
ifconfig ens1f0 192.168.20.1 netmask 255.255.255.0
ovs-vsctl set open . external_ids:ovn-remote=tcp:192.168.20.1:6642
ovs-vsctl set open . external_ids:ovn-encap-type=geneve
ovs-vsctl set open . external_ids:ovn-encap-ip=192.168.20.1
ovs-vsctl set open . external_ids:ovn-monitor-all=true
systemctl start ovn-controller

ovn-nbctl ls-add ls0

ovn-nbctl lsp-add ls0 ln_port
ovn-nbctl lsp-set-addresses ln_port unknown
ovn-nbctl lsp-set-type ln_port localnet
ovn-nbctl lsp-set-options ln_port network_name=physnet1
ovn-nbctl set logical_switch_port ln_port options:localnet_learn_fdb=true

ovn-nbctl lsp-add ls0 vif1
ovn-nbctl lsp-set-addresses vif1 unknown
ovn-nbctl lsp-add ls0 vif2
ovn-nbctl lsp-set-addresses vif2 unknown

ovs-vsctl add-br br-phys

ovs-vsctl set open . external_ids:ovn-bridge-mappings=physnet1:br-phys

ovs-vsctl -- add-port br-int vif1 -- set Interface vif1 type=internal -- set Interface vif1 external_ids:iface-id=vif1 ofport-request=1
ip netns add vif1
ip link set vif1 netns vif1
ip netns exec vif1 ip link set vif1 address 00:00:01:01:01:01
ip netns exec vif1 ip addr add 192.168.20.1/24 dev vif1
ip netns exec vif1 ip link set vif1 up

ovs-vsctl add-port br-phys ens1f1
ovs-vsctl set Interface ens1f1 ofport-request=2
ip link set br-phys up
ip link set ens1f1 up
ovn-nbctl --wait=hv sync

######### HV0 ###########
systemctl start ovn-northd
systemctl start openvswitch
ovs-vsctl set open . external_ids:system-id=hv0
ifconfig ens1f0 192.168.20.2 netmask 255.255.255.0
ovs-vsctl set open . external_ids:ovn-remote=tcp:192.168.20.1:6642
ovs-vsctl set open . external_ids:ovn-encap-type=geneve
ovs-vsctl set open . external_ids:ovn-encap-ip=192.168.20.2
ovs-vsctl set open . external_ids:ovn-monitor-all=true
systemctl start ovn-controller

ovs-vsctl add-br br-phys
ovs-vsctl set open . external_ids:ovn-bridge-mappings=physnet1:br-phys

ovs-vsctl -- add-port br-int vif2 
ovs-vsctl set Interface vif2 type=internal -- set Interface vif2 external_ids:iface-id=vif2 ofport-request=1
ip netns add vif2
ip link set vif2 netns vif2
ip netns exec vif2 ip link set vif2 address 00:00:01:01:01:02
ip netns exec vif2 ip addr add 192.168.20.2/24 dev vif2
ip netns exec vif2 ip link set vif2 up

ovs-vsctl -- add-port br-phys ens1f1 -- set interface ens1f1 ofport-request=2
ip link set br-phys up
ip link set ens1f1 up

################# On HV1 (with non-fixed OVN) ####################
[root@dell-per740-81 bz_2242830]# ovn-sbctl find fdb mac="00\:00\:01\:01\:01\:02"
_uuid               : 19124a9a-2945-499f-8bee-cf41bd1c7535
dp_key              : 1
mac                 : "00:00:01:01:01:02"
port_key            : 3
[root@dell-per740-81 bz_2242830]# grep 19124a9a-2945-499f-8bee-cf41bd1c7535 /var/lib/ovn/ovnsb_db.db |wc -l
6
[root@dell-per740-81 bz_2242830]# grep 19124a9a-2945-499f-8bee-cf41bd1c7535 /var/lib/ovn/ovnsb_db.db
{"_date":1701985614782,"_is_diff":true,"FDB":{"19124a9a-2945-499f-8bee-cf41bd1c7535":{"mac":"00:00:01:01:01:02","dp_key":1,"port_key":1}},"_comment":"ovn-controller"}
{"_date":1701985615498,"_is_diff":true,"FDB":{"19124a9a-2945-499f-8bee-cf41bd1c7535":{"port_key":3}},"_comment":"ovn-controller"}
{"_date":1701985615562,"_is_diff":true,"FDB":{"19124a9a-2945-499f-8bee-cf41bd1c7535":{"port_key":1}},"_comment":"ovn-controller"}
{"_date":1701985616586,"_is_diff":true,"FDB":{"19124a9a-2945-499f-8bee-cf41bd1c7535":{"port_key":3}},"_comment":"ovn-controller"}
{"_date":1701985616802,"_is_diff":true,"FDB":{"19124a9a-2945-499f-8bee-cf41bd1c7535":{"port_key":1}},"_comment":"ovn-controller"}
{"_date":1701985617226,"_is_diff":true,"FDB":{"19124a9a-2945-499f-8bee-cf41bd1c7535":{"port_key":3}},"_comment":"ovn-controller"}
[root@dell-per740-81 bz_2242830]# ip netns exec vif1 ping 192.168.20.2 -c30 -q
PING 192.168.20.2 (192.168.20.2) 56(84) bytes of data.

--- 192.168.20.2 ping statistics ---
30 packets transmitted, 13 received, 56.6667% packet loss, time 29678ms
rtt min/avg/max/mdev = 0.076/0.134/0.550/0.078 ms
[root@dell-per740-81 bz_2242830]# grep 19124a9a-2945-499f-8bee-cf41bd1c7535 /var/lib/ovn/ovnsb_db.db |wc -l
18

<================= Packet loss + port_key changing frequently ===================>
<================= Reproduced on ================================================>
[root@dell-per740-81 bz_2242830]# rpm -qa | grep -E 'ovn|openvswitch'
openvswitch-selinux-extra-policy-1.0-34.el9fdp.noarch
openvswitch2.17-2.17.0-125.el9fdp.x86_64
ovn22.12-22.12.1-50.el9fdp.x86_64
ovn22.12-central-22.12.1-50.el9fdp.x86_64
ovn22.12-host-22.12.1-50.el9fdp.x86_64

<================= Verified on ================================================>
[root@dell-per740-81 bz_2242830]# rpm -qa | grep -E 'ovn|openvswitch'
openvswitch-selinux-extra-policy-1.0-34.el9fdp.noarch
openvswitch2.17-2.17.0-125.el9fdp.x86_64
ovn22.12-22.12.1-55.el9fdp.x86_64
ovn22.12-central-22.12.1-55.el9fdp.x86_64
ovn22.12-host-22.12.1-55.el9fdp.x86_64

################# On HV1 (with fixed OVN) ####################
[root@dell-per740-81 bz_2242830]# ovn-sbctl find fdb mac="00\:00\:01\:01\:01\:02"
_uuid               : c80e4511-e5ed-48ae-827e-f03305a6c3a9
dp_key              : 1
mac                 : "00:00:01:01:01:02"
port_key            : 3
[root@dell-per740-81 bz_2242830]# grep c80e4511-e5ed-48ae-827e-f03305a6c3a9 /var/lib/ovn/ovnsb_db.db
{"_date":1701986634562,"_is_diff":true,"FDB":{"c80e4511-e5ed-48ae-827e-f03305a6c3a9":{"mac":"00:00:01:01:01:02","dp_key":1,"port_key":3}},"_comment":"ovn-controller"}
[root@dell-per740-81 bz_2242830]# ip netns exec vif1 ping 192.168.20.2 -c30 -q
PING 192.168.20.2 (192.168.20.2) 56(84) bytes of data.

--- 192.168.20.2 ping statistics ---
30 packets transmitted, 30 received, 0% packet loss, time 29727ms
rtt min/avg/max/mdev = 0.092/0.131/0.516/0.072 ms
[root@dell-per740-81 bz_2242830]# grep c80e4511-e5ed-48ae-827e-f03305a6c3a9 /var/lib/ovn/ovnsb_db.db
{"_date":1701986634562,"_is_diff":true,"FDB":{"c80e4511-e5ed-48ae-827e-f03305a6c3a9":{"mac":"00:00:01:01:01:02","dp_key":1,"port_key":3}},"_comment":"ovn-controller"}

<================= No packet loss + port_key not changing ===============>

Comment 37 errata-xmlrpc 2024-01-24 11:16:54 UTC
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA.

For information on the advisory (ovn22.12 bug fix and enhancement update), and where to find the updated files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2024:0393