Bug 1896993

Summary: ipv6 RS flooding: ofproto_dpif_xlate(handler70)|WARN|over 4096 resubmit
Product: Red Hat Enterprise Linux Fast Datapath
Component: ovn2.13
Version: FDP 20.I
Status: NEW
Severity: unspecified
Priority: medium
Reporter: Jianlin Shi <jishi>
Assignee: OVN Team <ovnteam>
QA Contact: Jianlin Shi <jishi>
CC: ctrautma, jishi, mmichels, ralongi
Hardware: Unspecified
OS: Unspecified
Type: Bug

Description Jianlin Shi 2020-11-12 01:39:30 UTC
Description of problem:
ipv6 RS flooding: ofproto_dpif_xlate(handler70)|WARN|over 4096 resubmit

Version-Release number of selected component (if applicable):
ovn2.13.0-20.09.0-10

How reproducible:
Always

Steps to Reproduce:
https://bugzilla.redhat.com/show_bug.cgi?id=1894478#c5

Actual results:
[root@wsfd-advnetlab19 ~]# grep 4096 /var/log/openvswitch/ovs-vswitchd.log                            
2020-11-11T10:13:39.047Z|00001|ofproto_dpif_xlate(handler70)|WARN|over 4096 resubmit actions on bridge br-int while processing icmp6,in_port=2,vlan_tci=0x0000,dl_src=00:00:00:00:00:01,dl_dst=33:33:00:00:00:02,ipv6_src=fe80::200:ff:fe00:1,ipv6_dst=ff02::2,ipv6_label=0x00000,nw_tos=0,nw_ecn=0,nw_ttl=255,icmp_type=133,icmp_code=0
2020-11-11T10:13:40.068Z|00001|ofproto_dpif_xlate(handler56)|WARN|over 4096 resubmit actions on bridge br-int while processing icmp6,in_port=LOCAL,vlan_tci=0x0000,dl_src=00:00:00:00:00:01,dl_dst=33:33:00:00:00:02,ipv6_src=fe80::c50:caff:feed:5840,ipv6_dst=ff02::2,ipv6_label=0x00000,nw_tos=0,nw_ecn=0,nw_ttl=255,icmp_type=133,icmp_code=0

Expected results:
No "over 4096 resubmit" WARN is logged for IPv6 RS packets.

Additional info:

Comment 1 Mark Michelson 2021-11-19 14:41:59 UTC
Hi, we're trying to prioritize older issues. This is reported against an old version of ovn2.13. I have a couple of questions: does this still happen with current ovn2.13? Does this still happen with current ovn-2021?

Comment 2 Jianlin Shi 2021-11-22 03:40:21 UTC
(In reply to Mark Michelson from comment #1)
> Hi, we're trying to prioritize older issues. This is reported against an old
> version of ovn2.13. I have a couple of questions: does this still happen
> with current ovn2.13? Does this still happen with current ovn-2021?

The issue still exists on the latest ovn-2021 version, ovn-2021-21.09.1-20:

+ grep 4096 /var/log/openvswitch/ovs-vswitchd.log
2021-11-22T03:37:43.290Z|00001|ofproto_dpif_xlate(handler3)|WARN|over 4096 resubmit actions on bridge br-int while processing icmp6,in_port=1,vlan_tci=0x0000,dl_src=00:00:00:00:00:01,dl_dst=33:33:00:00:00:02,ipv6_src=fe80::200:ff:fe00:1,ipv6_dst=ff02::2,ipv6_label=0x00000,nw_tos=0,nw_ecn=0,nw_ttl=255,icmp_type=133,icmp_code=0
2021-11-22T03:37:44.812Z|00001|ofproto_dpif_xlate(handler4)|WARN|over 4096 resubmit actions on bridge br-int while processing icmp6,in_port=LOCAL,vlan_tci=0x0000,dl_src=00:00:00:00:00:01,dl_dst=33:33:00:00:00:02,ipv6_src=fe80::e820:e7ff:fe17:a442,ipv6_dst=ff02::2,ipv6_label=0x00000,nw_tos=0,nw_ecn=0,nw_ttl=255,icmp_type=133,icmp_code=0
2021-11-22T03:38:15.031Z|00002|ofproto_dpif_xlate(handler3)|WARN|over 4096 resubmit actions on bridge br-int while processing icmp6,in_port=1,vlan_tci=0x0000,dl_src=00:00:00:00:00:01,dl_dst=33:33:00:00:00:02,ipv6_src=fe80::200:ff:fe00:1,ipv6_dst=ff02::2,ipv6_label=0x00000,nw_tos=0,nw_ecn=0,nw_ttl=255,icmp_type=133,icmp_code=0
2021-11-22T03:38:21.675Z|00001|ofproto_dpif_xlate(handler5)|WARN|over 4096 resubmit actions on bridge br-int while processing icmp6,in_port=LOCAL,vlan_tci=0x0000,dl_src=00:00:00:00:00:01,dl_dst=33:33:00:00:00:02,ipv6_src=fe80::e820:e7ff:fe17:a442,ipv6_dst=ff02::2,ipv6_label=0x00000,nw_tos=0,nw_ecn=0,nw_ttl=255,icmp_type=133,icmp_code=0
[root@dell-per740-12 bz1896993]# rpm -qa | grep -E "openvswitch2.15|ovn-2021"
python3-openvswitch2.15-2.15.0-53.el8fdp.x86_64
ovn-2021-21.09.1-20.el8fdp.x86_64
openvswitch2.15-2.15.0-53.el8fdp.x86_64
ovn-2021-central-21.09.1-20.el8fdp.x86_64
ovn-2021-host-21.09.1-20.el8fdp.x86_64
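
For anyone triaging a larger log, the grep output above can be summarized per ingress port with a short script. This is only a sketch: the regular expression is inferred from the WARN lines pasted in this report, not from an OVS log format specification, and the helper names (`parse_flow`, `summarize`) are made up for illustration.

```python
import re
from collections import Counter

# Pattern inferred from the WARN lines quoted in this bug report.
WARN_RE = re.compile(
    r"^(?P<ts>[^|]+)\|(?P<seq>\d+)\|ofproto_dpif_xlate\((?P<thread>[^)]+)\)"
    r"\|WARN\|over (?P<limit>\d+) resubmit actions on bridge (?P<bridge>\S+) "
    r"while processing (?P<flow>.+)$"
)


def parse_flow(flow):
    """Split 'icmp6,in_port=2,...' into a dict; bare tokens map to True."""
    fields = {}
    for token in flow.split(","):
        key, sep, value = token.partition("=")
        fields[key] = value if sep else True
    return fields


def summarize(lines):
    """Count resubmit-limit warnings per (bridge, in_port, icmp_type)."""
    counts = Counter()
    for line in lines:
        match = WARN_RE.match(line.strip())
        if match is None:
            continue  # not a resubmit-limit warning
        flow = parse_flow(match.group("flow"))
        key = (match.group("bridge"), flow.get("in_port"), flow.get("icmp_type"))
        counts[key] += 1
    return counts
```

To use it, pass an open log file, for example `summarize(open("/var/log/openvswitch/ovs-vswitchd.log"))`. In the output above, every warning has icmp_type=133 (Router Solicitation) and arrives either on a numbered port or on LOCAL.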

Comment 3 Mark Michelson 2022-10-10 18:14:26 UTC
This issue came up during our OVN meeting today. As far as we understand, this issue is likely still present in OVN. RS packets are still broadcast to all connected routers from a logical switch. This is likely what results in the resubmit limit being hit. We should be handling RS packets the same as ARP and ND packets: target them to the owning router of an address instead of doing a broadcast.
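
To illustrate why a broadcast can hit the limit while a targeted delivery would not, here is a toy model. It is not how OVS actually accounts for resubmits: it simply assumes each copy of the RS costs a fixed number of resubmits and that the copies add up within one translation. The function names and the example numbers are hypothetical; only the 4096 limit comes from the log message.

```python
RESUBMIT_LIMIT = 4096  # value taken from the WARN message in this report


def resubmits_when_flooded(num_routers: int, resubmits_per_copy: int) -> int:
    """Total resubmits if one RS is copied to every connected router.

    Both arguments are hypothetical inputs; the real per-copy cost depends
    on the logical flows that OVN installs.
    """
    return num_routers * resubmits_per_copy


def resubmits_when_targeted(resubmits_per_copy: int) -> int:
    """Total resubmits if the RS is delivered to a single router only."""
    return resubmits_per_copy


def exceeds_limit(total: int, limit: int = RESUBMIT_LIMIT) -> bool:
    """True when the total would trigger the resubmit-limit warning."""
    return total > limit
```

With the made-up inputs of 100 connected routers and 50 resubmits per copy, the flooded case needs 5000 resubmits and crosses the limit, while the targeted case stays at 50.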

@jishi, the linked reproducer in comment 1 does not actually have the list of OVN commands that were used to cause this issue. Can you please provide those? Thank you.

Comment 4 Jianlin Shi 2022-10-10 23:57:06 UTC
(In reply to Mark Michelson from comment #3)
> This issue came up during our OVN meeting today. As far as we understand,
> this issue is likely still present in OVN. RS packets are still broadcast to
> all connected routers from a logical switch. This is likely what results in
> the resubmit limit being hit. We should be handling RS packets the same as
> ARP and ND packets: target them to the owning router of an address instead
> of doing a broadcast.
> 
> @jishi, the linked reproducer in comment 1 does not actually have
> the list of OVN commands that were used to cause this issue. Can you please
> provide those? Thank you.

The reproducer is listed in https://bugzilla.redhat.com/show_bug.cgi?id=1894048#c7