Bug 1756945 - [OVN] pinging vm floatinip is failing at scale
Summary: [OVN] pinging vm floatinip is failing at scale
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Enterprise Linux Fast Datapath
Classification: Red Hat
Component: ovn2.11
Version: RHEL 7.6
Hardware: Unspecified
OS: Unspecified
unspecified
high
Target Milestone: ---
: ---
Assignee: Dumitru Ceara
QA Contact: ying xu
URL:
Whiteboard:
Depends On:
Blocks: 1776712
 
Reported: 2019-09-30 09:21 UTC by anil venkata
Modified: 2020-02-03 02:42 UTC (History)

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
: 1776712 (view as bug list)
Environment:
Last Closed: 2020-01-21 17:02:44 UTC
Target Upstream Version:
Embargoed:



Links
System | ID | Private | Priority | Status | Summary | Last Updated
Red Hat Product Errata | RHBA-2020:0190 | 0 | None | None | None | 2020-01-21 17:02:56 UTC

Description anil venkata 2019-09-30 09:21:11 UTC
Description of problem:

Running the Rally scenario test [1], which creates 400 VMs (each VM has its own network and router)
 and pings their floating IPs.
Environment:
1) OSP 13 with OVS 2.11 (OVS patch provided by the OVN team), with increased probe intervals
2) Bare-metal OSP deployment with 3 controllers and 6 compute nodes
3) Increased timeout (12000 sec) for VM boot and ping
Ping was successful for the first 280 VMs and failed for some of the VMs after that.

Numan debugged manually and saw that ARP could not be resolved for the router gateway IP
while pinging the corresponding VM floating IPs.

[1] https://github.com/cloud-bulldozer/browbeat/blob/master/rally/rally-plugins/netcreate-boot/netcreate_nova_boot_fip_ping.py

Comment 1 anil venkata 2019-09-30 09:37:16 UTC
While debugging on the setup, Numan noticed:

1) For the failed ARP, the ARP broadcast packet (for resolving the router gateway IP), entering the router pipeline from the switch (external network) pipeline, does not hit table 34 after hitting table 32.
He thinks the flow below (table 32) has so many resubmit actions that vswitchd is unable to process them.

 cookie=0x0, duration=253438.565s, table=32, n_packets=148549, n_bytes=8905092, idle_age=65534, hard_age=65534, priority=100,reg15=0xffff,metadata=0x1 actions=load:0x91->NXM_NX_REG15[],resubmit(,34),load:0x21->NXM_NX_REG15[],resubmit(,34),load:0xdb->NXM_NX_REG15[],resubmit(,34),load:0x88->NXM_NX_REG15[],resubmit(,34),load:0x50->NXM_NX_REG15[],resubmit(,34),load:0x132->NXM_NX_REG15[],resubmit(,34),load:0x54->NXM_NX_REG15[],resubmit(,34),load:0xda->NXM_NX_REG15[],resubmit(,34),load:0x43->NXM_NX_REG15[],resubmit(,34),load:0x52->NXM_NX_REG15[],resubmit(,34),load:0x42->NXM_NX_REG15[],resubmit(,34),load:0xd5->NXM_NX_REG15[],resubmit(,34),load:0x15b->NXM_NX_REG15[],resubmit(,34),load:0x4d->NXM_NX_REG15[],resubmit(,34),load:0xe8->NXM_NX_REG15[],resubmit(,34),load:0x133->NXM_NX_REG15[],resubmit(,34),load:0x6c->NXM_NX_REG15[],resubmit(,34),load:0x8d->NXM_NX_REG15[],resubmit(,34),load:0xfc->NXM_NX_REG15[],resubmit(,34),load:0xd6->NXM_NX_REG15[],resubmit(,34),load:0xe->NXM_NX_REG15[],resubmit(,34),load:0xe3->NXM_NX_REG15[],resubmit(,34),load:0x169->NXM_NX_REG15[],resubmit(,34),load:0x14e->NXM_NX_REG15[],resubmit(,34),load:0x147->NXM_NX_REG15[],resubmit(,34),load:0x14->NXM_NX_REG15[],resubmit(,34),load:0x44->NXM_NX_REG15[],resubmit(,34),load:0x3d->NXM_NX_REG15[],resubmit(,34),load:0x3a->NXM_NX_REG15[],resubmit(,34),load:0xcb->NXM_NX_REG15[],resubmit(,34),load:0x165->NXM_NX_REG15[],resubmit(,34),load:0x144->NXM_NX_REG15[],resubmit(,34),load:0x151->NXM_NX_REG15[],resubmit(,34),load:0x5e->NXM_NX_REG15[],resubmit(,34),load:0x109->NXM_NX_REG15[],resubmit(,34),load:0xb9->NXM_NX_REG15[],resubmit(,34),load:0x100->NXM_NX_REG15[],resubmit(,34),load:0xcc->NXM_NX_REG15[],resubmit(,34),load:0x113->NXM_NX_REG15[],resubmit(,34),load:0x123->NXM_NX_REG15[],resubmit(,34),load:0x86->NXM_NX_REG15[],resubmit(,34),load:0xdf->NXM_NX_REG15[],resubmit(,34),load:0x82->NXM_NX_REG15[],resubmit(,34),load:0x10f->NXM_NX_REG15[],resubmit(,34),load:0x11f->NXM_NX_REG15[],resubmit(,34),load:0x114->NXM_NX_REG15[],
resubmit(,34),load:0x46->NXM_NX_REG15[],resubmit(,34),load:0x10d->NXM_NX_REG15[],resubmit(,34),load:0xf5->NXM_NX_REG15[],resubmit(,34),load:0x7->NXM_NX_REG15[],resubmit(,34),load:0x4c->NXM_NX_REG15[],resubmit(,34),load:0x51->NXM_NX_REG15[],resubmit(,34),load:0x33->NXM_NX_REG15[],resubmit(,34),load:0x12->NXM_NX_REG15[],resubmit(,34),load:0x16d->NXM_NX_REG15[],resubmit(,34),load:0xb8->NXM_NX_REG15[],resubmit(,34),load:0x48->NXM_NX_REG15[],resubmit(,34),load:0xba->NXM_NX_REG15[],resubmit(,34),load:0x10->NXM_NX_REG15[],resubmit(,34),load:0x6->NXM_NX_REG15[],resubmit(,34),load:0x1b->NXM_NX_REG15[],resubmit(,34),load:0x120->NXM_NX_REG15[],resubmit(,34),load:0xc4->NXM_NX_REG15[],resubmit(,34),load:0x14d->NXM_NX_REG15[],resubmit(,34),load:0xf4->NXM_NX_REG15[],resubmit(,34),load:0xf->NXM_NX_REG15[],resubmit(,34),load:0x129->NXM_NX_REG15[],resubmit(,34),load:0x8c->NXM_NX_REG15[],resubmit(,34),load:0x85->NXM_NX_REG15[],resubmit(,34),load:0x110->NXM_NX_REG15[],resubmit(,34),load:0x93->NXM_NX_REG15[],resubmit(,34),load:0xe5->NXM_NX_REG15[],resubmit(,34),load:0x3b->NXM_NX_REG15[],resubmit(,34),load:0x62->NXM_NX_REG15[],resubmit(,34),load:0x39->NXM_NX_REG15[],resubmit(,34),load:0xce->NXM_NX_REG15[],resubmit(,34),load:0xe2->NXM_NX_REG15[],resubmit(,34),load:0x49->NXM_NX_REG15[],resubmit(,34),load:0x12d->NXM_NX_REG15[],resubmit(,34),load:0x115->NXM_NX_REG15[],resubmit(,34),load:0x24->NXM_NX_REG15[],resubmit(,34),load:0x8a->NXM_NX_REG15[],resubmit(,34),load:0x135->NXM_NX_REG15[],resubmit(,34),load:0x79->NXM_NX_REG15[],resubmit(,34),load:0xee->NXM_NX_REG15[],resubmit(,34),load:0x22->NXM_NX_REG15[],resubmit(,34),load:0x10b->NXM_NX_REG15[],resubmit(,34),load:0x67->NXM_NX_REG15[],resubmit(,34),load:0xb7->NXM_NX_REG15[],resubmit(,34),load:0x25->NXM_NX_REG15[],resubmit(,34),load:0xf7->NXM_NX_REG15[],resubmit(,34),load:0x98->NXM_NX_REG15[],resubmit(,34),load:0x1d->NXM_NX_REG15[],resubmit(,34),load:0xa4->NXM_NX_REG15[],resubmit(,34),load:0x36->NXM_NX_REG15[],resubmit(,34),load:0x1e->NXM_NX_R
EG15[],resubmit(,34),load:0x155->NXM_NX_REG15[],resubmit(,34),load:0x157->NXM_NX_REG15[],resubmit(,34),load:0x136->NXM_NX_REG15[],resubmit(,34),load:0x3f->NXM_NX_REG15[],resubmit(,34),load:0x63->NXM_NX_REG15[],resubmit(,34),load:0x15c->NXM_NX_REG15[],resubmit(,34),load:0x8->NXM_NX_REG15[],resubmit(,34),load:0x16f->NXM_NX_REG15[],resubmit(,34),load:0x167->NXM_NX_REG15[],resubmit(,34),load:0xa7->NXM_NX_REG15[],resubmit(,34),load:0xf9->NXM_NX_REG15[],resubmit(,34),load:0xa2->NXM_NX_REG15[],resubmit(,34),load:0x108->NXM_NX_REG15[],resubmit(,34),load:0x3c->NXM_NX_REG15[],resubmit(,34),load:0x13f->NXM_NX_REG15[],resubmit(,34),load:0x31->NXM_NX_REG15[],resubmit(,34),load:0x89->NXM_NX_REG15[],resubmit(,34),load:0x35->NXM_NX_REG15[],resubmit(,34),load:0xfd->NXM_NX_REG15[],resubmit(,34),load:0xdc->NXM_NX_REG15[],resubmit(,34),load:0xca->NXM_NX_REG15[],resubmit(,34),load:0xd8->NXM_NX_REG15[],resubmit(,34),load:0x29->NXM_NX_REG15[],resubmit(,34),load:0x13e->NXM_NX_REG15[],resubmit(,34),load:0x4->NXM_NX_REG15[],resubmit(,34),load:0x60->NXM_NX_REG15[],resubmit(,34),load:0x15->NXM_NX_REG15[],resubmit(,34),load:0x71->NXM_NX_REG15[],resubmit(,34),load:0x47->NXM_NX_REG15[],resubmit(,34),load:0x9b->NXM_NX_REG15[],resubmit(,34),load:0xa8->NXM_NX_REG15[],resubmit(,34),load:0x20->NXM_NX_REG15[],resubmit(,34),load:0x7a->NXM_NX_REG15[],resubmit(,34),load:0x9->NXM_NX_REG15[],resubmit(,34),load:0x11d->NXM_NX_REG15[],resubmit(,34),load:0x11a->NXM_NX_REG15[],resubmit(,34),load:0x111->NXM_NX_REG15[],resubmit(,34),load:0x168->NXM_NX_REG15[],resubmit(,34),load:0x156->NXM_NX_REG15[],resubmit(,34),load:0x118->NXM_NX_REG15[],resubmit(,34),load:0x77->NXM_NX_REG15[],resubmit(,34),load:0x130->NXM_NX_REG15[],resubmit(,34),load:0x15a->NXM_NX_REG15[],resubmit(,34),load:0x96->NXM_NX_REG15[],resubmit(,34),load:0x107->NXM_NX_REG15[],resubmit(,34),load:0x58->NXM_NX_REG15[],resubmit(,34),load:0x122->NXM_NX_REG15[],resubmit(,34),load:0x10a->NXM_NX_REG15[],resubmit(,34),load:0x4f->NXM_NX_REG15[],resubmit(,34),lo
ad:0xfb->NXM_NX_REG15[],resubmit(,34),load:0x83->NXM_NX_REG15[],resubmit(,34),load:0xc6->NXM_NX_REG15[],resubmit(,34),load:0x145->NXM_NX_REG15[],resubmit(,34),load:0xd7->NXM_NX_REG15[],resubmit(,34),load:0x2f->NXM_NX_REG15[],resubmit(,34),load:0x40->NXM_NX_REG15[],resubmit(,34),load:0xc3->NXM_NX_REG15[],resubmit(,34),load:0xf8->NXM_NX_REG15[],resubmit(,34),load:0x128->NXM_NX_REG15[],resubmit(,34),load:0x164->NXM_NX_REG15[],resubmit(,34),load:0x6f->NXM_NX_REG15[],resubmit(,34),load:0x6e->NXM_NX_REG15[],resubmit(,34),load:0x119->NXM_NX_REG15[],resubmit(,34),load:0x32->NXM_NX_REG15[],resubmit(,34),load:0xbb->NXM_NX_REG15[],resubmit(,34),load:0xed->NXM_NX_REG15[],resubmit(,34),load:0xec->NXM_NX_REG15[],resubmit(,34),load:0x9a->NXM_NX_REG15[],resubmit(,34),load:0xfa->NXM_NX_REG15[],resubmit(,34),load:0x73->NXM_NX_REG15[],resubmit(,34),load:0xa5->NXM_NX_REG15[],resubmit(,34),load:0x16->NXM_NX_REG15[],resubmit(,34),load:0xd0->NXM_NX_REG15[],resubmit(,34),load:0x9c->NXM_NX_REG15[],resubmit(,34),load:0x149->NXM_NX_REG15[],resubmit(,34),load:0x64->NXM_NX_REG15[],resubmit(,34),load:0x16c->NXM_NX_REG15[],resubmit(,34),load:0x7c->NXM_NX_REG15[],resubmit(,34),load:0xf1->NXM_NX_REG15[],resubmit(,34),load:0x138->NXM_NX_REG15[],resubmit(,34),load:0x121->NXM_NX_REG15[],resubmit(,34),load:0x153->NXM_NX_REG15[],resubmit(,34),load:0xd3->NXM_NX_REG15[],resubmit(,34),load:0x65->NXM_NX_REG15[],resubmit(,34),load:0x12c->NXM_NX_REG15[],resubmit(,34),load:0xa6->NXM_NX_REG15[],resubmit(,34),load:0x1f->NXM_NX_REG15[],resubmit(,34),load:0x7e->NXM_NX_REG15[],resubmit(,34),load:0x13c->NXM_NX_REG15[],resubmit(,34),load:0x14c->NXM_NX_REG15[],resubmit(,34),load:0x13->NXM_NX_REG15[],resubmit(,34),load:0xb2->NXM_NX_REG15[],resubmit(,34),load:0xaf->NXM_NX_REG15[],resubmit(,34),load:0x1a->NXM_NX_REG15[],resubmit(,34),load:0x7f->NXM_NX_REG15[],resubmit(,34),load:0x12f->NXM_NX_REG15[],resubmit(,34),load:0x37->NXM_NX_REG15[],resubmit(,34),load:0x154->NXM_NX_REG15[],resubmit(,34),load:0x127->NXM_NX_REG15[],r
esubmit(,34),load:0x26->NXM_NX_REG15[],resubmit(,34),load:0x59->NXM_NX_REG15[],resubmit(,34),load:0x14b->NXM_NX_REG15[],resubmit(,34),load:0xb5->NXM_NX_REG15[],resubmit(,34),load:0x104->NXM_NX_REG15[],resubmit(,34),load:0xc7->NXM_NX_REG15[],resubmit(,34),load:0x162->NXM_NX_REG15[],resubmit(,34),load:0xe0->NXM_NX_REG15[],resubmit(,34),load:0x11->NXM_NX_REG15[],resubmit(,34),load:0x75->NXM_NX_REG15[],resubmit(,34),load:0xf2->NXM_NX_REG15[],resubmit(,34),load:0x12e->NXM_NX_REG15[],resubmit(,34),load:0xae->NXM_NX_REG15[],resubmit(,34),load:0x90->NXM_NX_REG15[],resubmit(,34),load:0x13d->NXM_NX_REG15[],resubmit(,34),load:0x106->NXM_NX_REG15[],resubmit(,34),load:0x15e->NXM_NX_REG15[],resubmit(,34),load:0x28->NXM_NX_REG15[],resubmit(,34),load:0x23->NXM_NX_REG15[],resubmit(,34),load:0x12a->NXM_NX_REG15[],resubmit(,34),load:0x2d->NXM_NX_REG15[],resubmit(,34),load:0xe6->NXM_NX_REG15[],resubmit(,34),load:0xab->NXM_NX_REG15[],resubmit(,34),load:0x16b->NXM_NX_REG15[],resubmit(,34),load:0x19->NXM_NX_REG15[],resubmit(,34),load:0xfe->NXM_NX_REG15[],resubmit(,34),load:0x55->NXM_NX_REG15[],resubmit(,34),load:0x134->NXM_NX_REG15[],resubmit(,34),load:0xa->NXM_NX_REG15[],resubmit(,34),load:0x13a->NXM_NX_REG15[],resubmit(,34),load:0x11e->NXM_NX_REG15[],resubmit(,34),load:0x131->NXM_NX_REG15[],resubmit(,34),load:0x5f->NXM_NX_REG15[],resubmit(,34),load:0xf0->NXM_NX_REG15[],resubmit(,34),load:0x14a->NXM_NX_REG15[],resubmit(,34),load:0x7b->NXM_NX_REG15[],resubmit(,34),load:0x4e->NXM_NX_REG15[],resubmit(,34),load:0x53->NXM_NX_REG15[],resubmit(,34),load:0xe9->NXM_NX_REG15[],resubmit(,34),load:0xff->NXM_NX_REG15[],resubmit(,34),load:0xc->NXM_NX_REG15[],resubmit(,34),load:0xde->NXM_NX_REG15[],resubmit(,34),load:0x143->NXM_NX_REG15[],resubmit(,34),load:0x13b->NXM_NX_REG15[],resubmit(,34),load:0xcd->NXM_NX_REG15[],resubmit(,34),load:0xad->NXM_NX_REG15[],resubmit(,34),load:0x146->NXM_NX_REG15[],resubmit(,34),load:0xc0->NXM_NX_REG15[],resubmit(,34),load:0xcf->NXM_NX_REG15[],resubmit(,34),load:0x69->N
XM_NX_REG15[],resubmit(,34),load:0x1c->NXM_NX_REG15[],resubmit(,34),load:0x5d->NXM_NX_REG15[],resubmit(,34),load:0x117->NXM_NX_REG15[],resubmit(,34),load:0xc2->NXM_NX_REG15[],resubmit(,34),load:0x163->NXM_NX_REG15[],resubmit(,34),load:0xf6->NXM_NX_REG15[],resubmit(,34),load:0x56->NXM_NX_REG15[],resubmit(,34),load:0x61->NXM_NX_REG15[],resubmit(,34),load:0x116->NXM_NX_REG15[],resubmit(,34),load:0x10c->NXM_NX_REG15[],resubmit(,34),load:0x160->NXM_NX_REG15[],resubmit(,34),load:0x4b->NXM_NX_REG15[],resubmit(,34),load:0x16a->NXM_NX_REG15[],resubmit(,34),load:0xe1->NXM_NX_REG15[],resubmit(,34),load:0x8e->NXM_NX_REG15[],resubmit(,34),load:0x66->NXM_NX_REG15[],resubmit(,34),load:0xdd->NXM_NX_REG15[],resubmit(,34),load:0x140->NXM_NX_REG15[],resubmit(,34),load:0x34->NXM_NX_REG15[],resubmit(,34),load:0xd1->NXM_NX_REG15[],resubmit(,34),load:0x148->NXM_NX_REG15[],resubmit(,34),load:0x125->NXM_NX_REG15[],resubmit(,34),load:0x124->NXM_NX_REG15[],resubmit(,34),load:0x101->NXM_NX_REG15[],resubmit(,34),load:0x5->NXM_NX_REG15[],resubmit(,34),load:0x5b->NXM_NX_REG15[],resubmit(,34),load:0xbe->NXM_NX_REG15[],resubmit(,34),load:0x142->NXM_NX_REG15[],resubmit(,34),load:0x17->NXM_NX_REG15[],resubmit(,34),load:0x41->NXM_NX_REG15[],resubmit(,34),load:0xa3->NXM_NX_REG15[],resubmit(,34),load:0xd4->NXM_NX_REG15[],resubmit(,34),load:0x81->NXM_NX_REG15[],resubmit(,34),load:0x94->NXM_NX_REG15[],resubmit(,34),load:0xc9->NXM_NX_REG15[],resubmit(,34),load:0xef->NXM_NX_REG15[],resubmit(,34),load:0xb4->NXM_NX_REG15[],resubmit(,34),load:0x139->NXM_NX_REG15[],resubmit(,34),load:0x126->NXM_NX_REG15[],resubmit(,34),load:0x12b->NXM_NX_REG15[],resubmit(,34),load:0x10e->NXM_NX_REG15[],resubmit(,34),load:0xb1->NXM_NX_REG15[],resubmit(,34),load:0x30->NXM_NX_REG15[],resubmit(,34),load:0x74->NXM_NX_REG15[],resubmit(,34),load:0x5a->NXM_NX_REG15[],resubmit(,34),load:0x92->NXM_NX_REG15[],resubmit(,34),load:0xa0->NXM_NX_REG15[],resubmit(,34),load:0x6a->NXM_NX_REG15[],resubmit(,34),load:0x5c->NXM_NX_REG15[],resubmit(,3
4),load:0xb6->NXM_NX_REG15[],resubmit(,34),load:0xc8->NXM_NX_REG15[],resubmit(,34),load:0xaa->NXM_NX_REG15[],resubmit(,34),load:0xeb->NXM_NX_REG15[],resubmit(,34),load:0x150->NXM_NX_REG15[],resubmit(,34),load:0x72->NXM_NX_REG15[],resubmit(,34),load:0xa1->NXM_NX_REG15[],resubmit(,34),load:0x87->NXM_NX_REG15[],resubmit(,34),load:0x84->NXM_NX_REG15[],resubmit(,34),load:0x3->NXM_NX_REG15[],resubmit(,34),load:0x159->NXM_NX_REG15[],resubmit(,34),load:0x9d->NXM_NX_REG15[],resubmit(,34),load:0x105->NXM_NX_REG15[],resubmit(,34),load:0x158->NXM_NX_REG15[],resubmit(,34),load:0xc5->NXM_NX_REG15[],resubmit(,34),load:0xbc->NXM_NX_REG15[],resubmit(,34),load:0x8f->NXM_NX_REG15[],resubmit(,34),load:0x6b->NXM_NX_REG15[],resubmit(,34),load:0x4a->NXM_NX_REG15[],resubmit(,34),load:0xd2->NXM_NX_REG15[],resubmit(,34),load:0x152->NXM_NX_REG15[],resubmit(,34),load:0x6d->NXM_NX_REG15[],resubmit(,34),load:0x7d->NXM_NX_REG15[],resubmit(,34),load:0x80->NXM_NX_REG15[],resubmit(,34),load:0xa9->NXM_NX_REG15[],resubmit(,34),load:0xbf->NXM_NX_REG15[],resubmit(,34),load:0x68->NXM_NX_REG15[],resubmit(,34),load:0x161->NXM_NX_REG15[],resubmit(,34),load:0x170->NXM_NX_REG15[],resubmit(,34),load:0xbd->NXM_NX_REG15[],resubmit(,34),load:0x166->NXM_NX_REG15[],resubmit(,34),load:0x171->NXM_NX_REG15[],resubmit(,34),load:0x18->NXM_NX_REG15[],resubmit(,34),load:0xd->NXM_NX_REG15[],resubmit(,34),load:0x15d->NXM_NX_REG15[],resubmit(,34),load:0xe4->NXM_NX_REG15[],resubmit(,34),load:0x2c->NXM_NX_REG15[],resubmit(,34),load:0x97->NXM_NX_REG15[],resubmit(,34),load:0xb->NXM_NX_REG15[],resubmit(,34),load:0x16e->NXM_NX_REG15[],resubmit(,34),load:0x137->NXM_NX_REG15[],resubmit(,34),load:0x38->NXM_NX_REG15[],resubmit(,34),load:0xb0->NXM_NX_REG15[],resubmit(,34),load:0xd9->NXM_NX_REG15[],resubmit(,34),load:0x15f->NXM_NX_REG15[],resubmit(,34),load:0x9e->NXM_NX_REG15[],resubmit(,34),load:0x70->NXM_NX_REG15[],resubmit(,34),load:0x112->NXM_NX_REG15[],resubmit(,34),load:0x76->NXM_NX_REG15[],resubmit(,34),load:0x9f->NXM_NX_REG15[],
resubmit(,34),load:0x11c->NXM_NX_REG15[],resubmit(,34),load:0x27->NXM_NX_REG15[],resubmit(,34),load:0x14f->NXM_NX_REG15[],resubmit(,34),load:0xac->NXM_NX_REG15[],resubmit(,34),load:0xe7->NXM_NX_REG15[],resubmit(,34),load:0x99->NXM_NX_REG15[],resubmit(,34),load:0x3e->NXM_NX_REG15[],resubmit(,34),load:0x141->NXM_NX_REG15[],resubmit(,34),load:0x57->NXM_NX_REG15[],resubmit(,34),load:0x2e->NXM_NX_REG15[],resubmit(,34),load:0x2a->NXM_NX_REG15[],resubmit(,34),load:0x78->NXM_NX_REG15[],resubmit(,34),load:0xea->NXM_NX_REG15[],resubmit(,34),load:0x11b->NXM_NX_REG15[],resubmit(,34),load:0x2b->NXM_NX_REG15[],resubmit(,34),load:0x95->NXM_NX_REG15[],resubmit(,34),load:0xc1->NXM_NX_REG15[],resubmit(,34),load:0xf3->NXM_NX_REG15[],resubmit(,34),load:0x45->NXM_NX_REG15[],resubmit(,34),load:0xb3->NXM_NX_REG15[],resubmit(,34),load:0x103->NXM_NX_REG15[],resubmit(,34),load:0x8b->NXM_NX_REG15[],resubmit(,34),load:0x102->NXM_NX_REG15[],resubmit(,34),load:0xffff->NXM_NX_REG15[],resubmit(,33)
 cookie=0x0, duration=251368.835s, table=32, n_packets=0, n_bytes=0, idle_age=65534, hard_age=65534, priority=100,reg15=0xffff,metadata=0x2e0 actions=load:0x2->NXM_NX_REG15[],resubmit(,34),load:0xffff->NXM_NX_REG15[],load:0x2e0->NXM_NX_TUN_ID[0..23],set_field:0xffff->tun_metadata0,move:NXM_NX_REG14[0..14]->NXM_NX_TUN_METADATA0[16..30],output:22
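
To put a number on "many resubmit actions", the actions of a dumped flow can be counted mechanically. The following is a minimal Python sketch, not part of the original debugging session: the helper name `count_resubmits` and the shortened `sample` flow are illustrative, and it assumes the flow text comes from an `ovs-ofctl dump-flows` style dump that may have been line-wrapped when pasted.

```python
import re
from collections import Counter

def count_resubmits(flow):
    """Count resubmit actions in one dumped flow, grouped by target table."""
    # Pasted dumps are often wrapped mid-token, so join the lines first.
    joined = flow.replace("\n", "")
    _, _, actions = joined.partition("actions=")
    return Counter(re.findall(r"resubmit\(,(\d+)\)", actions))

# Shortened stand-in for the first table 32 flow above (3 patch-port entries).
sample = (
    "table=32, priority=100,reg15=0xffff,metadata=0x1 "
    "actions=load:0x91->NXM_NX_REG15[],resubmit(,34),"
    "load:0x21->NXM_NX_REG15[],r\nesubmit(,34),"
    "load:0xdb->NXM_NX_REG15[],resubmit(,34),"
    "load:0xffff->NXM_NX_REG15[],resubmit(,33)"
)

counts = count_resubmits(sample)
print(sum(counts.values()), dict(counts))
```

Running the same helper on the full flow text gives the number of patch ports the broadcast is replicated to before any per-router pipeline work starts.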

Comment 2 Dumitru Ceara 2019-10-01 08:27:59 UTC
Broadcast packets need to be forwarded to all logical ports and in OVN that is done by using the MC_FLOOD multicast group which translates to flows in tables 32-34.
In table 32 the flows for a datapath's MC_FLOOD group resubmit the packet on *each* patch port to allow it to be processed by all connected logical routers.
In table 33 the flows for a datapath's MC_FLOOD group resubmit the packet on *each* local VIF to allow the egress pipeline of the logical switch to be executed.

Unfortunately, on the egress pipeline we might have flows that use the out-port as part of the match (e.g., OUT ACL, QOS marking) which means that we need to execute the egress pipeline for each "copy" of the original packet as the final result might be different based on the final egress port.

In this bug, the problem is that the logical switch on which the packet is initially received is connected to ~300 logical routers, so the table 32 entry resubmits the packet on all patch ports leading to the logical routers. Each individual pipeline then performs more resubmits: 10-20 resubmits are to be expected in the logical router and egress switch pipelines. This leads to more than 4K resubmits for a single broadcast packet. OVS has a hard limit of 4K resubmits to protect against a packet looping forever in the pipeline, and it drops the packet once the limit is hit.

For example, a flatter topology with a single logical switch and more than 300 VIFs connected to it would have the same issue.
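
The arithmetic behind the limit can be sketched roughly as follows. This is a simplified model, not how OVS actually accounts for resubmits: it assumes one resubmit per destination port in the flood flow plus a fixed number of further resubmits in each destination's pipeline, and it takes the "4K" limit above to be 4096. The function names are hypothetical.

```python
MAX_RESUBMITS = 4096  # assumed value of the "4K" hard limit described above

def estimated_resubmits(fanout, per_pipeline):
    """Rough total for one broadcast packet.

    fanout:       number of ports the flood flow resubmits to
                  (patch ports to routers, or local VIFs)
    per_pipeline: further resubmits inside each destination pipeline
    """
    return fanout * (1 + per_pipeline)

def exceeds_limit(fanout, per_pipeline):
    """True if the simplified estimate goes past the resubmit limit."""
    return estimated_resubmits(fanout, per_pipeline) > MAX_RESUBMITS

for per_pipeline in (10, 15, 20):
    total = estimated_resubmits(300, per_pipeline)
    print(per_pipeline, total, exceeds_limit(300, per_pipeline))
```

Under this model the outcome for ~300 routers depends on where in the 10-20 range the per-pipeline count falls, which fits the failures appearing only once the deployment had grown to a few hundred routers.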

Comment 3 Dumitru Ceara 2019-10-01 12:53:07 UTC
Upstream discussion about the issue: https://mail.openvswitch.org/pipermail/ovs-discuss/2019-October/049323.html

Comment 6 ying xu 2020-01-09 05:44:01 UTC
Topology:

vm0---------s0-----------r1-------s1----------vm1
             |
             rn
             |
             sn
             |
             vmn    (n>=400)
s0 is connected to N routers (N >= 400), and every router is connected to its own switch sn. s0 is also connected to vm0 on the provider network. Create a dnat_and_snat entry for vm1,
then from vm0 send an ARP request for the DNAT (external) IP of vm1.
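
A topology of this shape could be generated with a script along these lines. This is only a sketch of one possible setup, not the actual QA script: the port names, MAC addresses and subnets are hypothetical (chosen to line up with the addresses visible in the output below), and binding the VIFs to OVS ports or namespaces is left out. It only prints `ovn-nbctl` commands and does not run them.

```python
import ipaddress

EXT_BASE = ipaddress.ip_address("172.16.1.0")  # hypothetical router IPs on s0 (/16)
INT_BASE = ipaddress.ip_address("173.0.0.0")   # hypothetical per-router /24 subnets

def connect(router, switch, mac, cidr):
    """Commands that attach a logical router port to a logical switch."""
    rp, sp = f"{router}-{switch}", f"{switch}-{router}"
    return [
        f"ovn-nbctl lrp-add {router} {rp} {mac} {cidr}",
        f"ovn-nbctl lsp-add {switch} {sp}",
        f"ovn-nbctl lsp-set-type {sp} router",
        f"ovn-nbctl lsp-set-addresses {sp} router",
        f"ovn-nbctl lsp-set-options {sp} router-port={rp}",
    ]

def topology_commands(n):
    """Commands for s0 plus n routers, each with its own switch and one VIF."""
    cmds = [
        "ovn-nbctl ls-add s0",
        "ovn-nbctl lsp-add s0 vm0",
        'ovn-nbctl lsp-set-addresses vm0 "00:00:00:00:00:64 172.16.0.100"',
    ]
    for i in range(1, n + 1):
        suffix = f"{i >> 8:02x}:{i & 0xff:02x}"
        gw_ip = INT_BASE + i * 256 + 1   # router address in the tenant subnet
        vm_ip = INT_BASE + i * 256 + 2   # VIF address in the tenant subnet
        cmds.append(f"ovn-nbctl lr-add r{i}")
        cmds += connect(f"r{i}", "s0", f"00:de:ad:ff:{suffix}", f"{EXT_BASE + i}/16")
        cmds.append(f"ovn-nbctl ls-add s{i}")
        cmds += connect(f"r{i}", f"s{i}", f"00:de:ad:fe:{suffix}", f"{gw_ip}/24")
        cmds.append(f"ovn-nbctl lsp-add s{i} vm{i}")
        cmds.append(f'ovn-nbctl lsp-set-addresses vm{i} "00:00:01:00:{suffix} {vm_ip}"')
    # NAT entry matching the lr-nat-list output below.
    cmds.append(
        "ovn-nbctl lr-nat-add r1 dnat_and_snat 172.16.0.200 173.0.1.2 "
        "vm1 00:00:00:01:02:03"
    )
    return cmds

for cmd in topology_commands(1):
    print(cmd)
```

Calling `topology_commands(400)` yields the full command list for the N >= 400 case described above.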

reproduced on ovn2.11.1-21
kernel-kernel-networking-openvswitch-ovn-common-1.0-6.noarch
kernel-kernel-networking-openvswitch-ovn-basic-1.0-16.noarch
ovn2.11-2.11.1-20.el7fdp.x86_64
ovn2.11-host-2.11.1-20.el7fdp.x86_64
kernel-kernel-networking-openvswitch-ovn-qos-1.0-1.noarch
ovn2.11-central-2.11.1-20.el7fdp.x86_64

[root@dell-per730-57 qos]# ovn-nbctl lr-nat-list r1
TYPE             EXTERNAL_IP        LOGICAL_IP            EXTERNAL_MAC         LOGICAL_PORT
dnat_and_snat    172.16.0.200       173.0.1.2             00:00:00:01:02:03    vm1

[root@dell-per730-57 qos]# ip netns exec vm0 ping 172.16.0.200
PING 172.16.0.200 (172.16.0.200) 56(84) bytes of data.
From 172.16.0.100 icmp_seq=1 Destination Host Unreachable
From 172.16.0.100 icmp_seq=2 Destination Host Unreachable
From 172.16.0.100 icmp_seq=3 Destination Host Unreachable
From 172.16.0.100 icmp_seq=4 Destination Host Unreachable
^C
--- 172.16.0.200 ping statistics ---
4 packets transmitted, 0 received, +4 errors, 100% packet loss, time 2999ms
pipe 4
[root@dell-per730-57 qos]# ip netns exec vm0 ip nei
172.16.0.200 dev vm0  FAILED    ------------------------------------------------>>arp failed
fe80::202:c9ff:fe52:2727 dev vm0 lladdr 00:02:c9:52:27:27 router STALE



verified on ovn2.11-24
[root@dell-per730-57 qos]# rpm -qa|grep ovn
kernel-kernel-networking-openvswitch-ovn-common-1.0-6.noarch
kernel-kernel-networking-openvswitch-ovn-basic-1.0-16.noarch
ovn2.11-2.11.1-24.el7fdp.x86_64
ovn2.11-host-2.11.1-24.el7fdp.x86_64
kernel-kernel-networking-openvswitch-ovn-qos-1.0-1.noarch
ovn2.11-central-2.11.1-24.el7fdp.x86_64

[root@dell-per730-57 qos]# ovn-nbctl lr-nat-list r1
TYPE             EXTERNAL_IP        LOGICAL_IP            EXTERNAL_MAC         LOGICAL_PORT
dnat_and_snat    172.16.0.200       173.0.1.2             00:00:00:01:02:03    vm1

[root@dell-per730-57 qos]# ip netns exec vm0 ping 172.16.0.200
PING 172.16.0.200 (172.16.0.200) 56(84) bytes of data.
^C
--- 172.16.0.200 ping statistics ---
13 packets transmitted, 0 received, 100% packet loss, time 12000ms

[root@dell-per730-57 qos]# ip netns exec vm0 ip nei
172.16.0.200 dev vm0 lladdr 00:de:ad:ff:00:01 REACHABLE   ----------------------->>got the arp
fe80::202:c9ff:fe52:2727 dev vm0 lladdr 00:02:c9:52:27:27 router STALE

Comment 8 errata-xmlrpc 2020-01-21 17:02:44 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2020:0190

