Bug 1837595

Summary: [4.4] Fix handling of large openflow messages due to Services with many Endpoints
Product: OpenShift Container Platform
Component: Networking
Networking sub component: ovn-kubernetes
Reporter: Dan Williams <dcbw>
Assignee: Dan Williams <dcbw>
QA Contact: Anurag saxena <anusaxen>
CC: aconstan, bbennett, ctrautma, dcbw, jishi, mmichels, rkhan, tredaelli, zzhao
Status: CLOSED DUPLICATE
Severity: medium
Priority: urgent
Version: 4.4
Target Release: 4.4.z
Hardware: Unspecified
OS: Unspecified
Clone Of: 1837593
Bug Depends On: 1837593
Last Closed: 2020-06-11 15:12:43 UTC

Description Dan Williams 2020-05-19 17:29:18 UTC
+++ This bug was initially created as a clone of Bug #1837593 +++

+++ This bug was initially created as a clone of Bug #1779854 +++

Description of problem:
OVN can potentially create overly large OpenFlow flows depending on configuration. One example is a load balancer with a very large number of backends. Other potential triggers are extra-large ACLs or very large multicast groups.

When the generated OpenFlow message exceeds 65535 bytes, its length cannot be expressed in the 16-bit length field of the OpenFlow header. The length is then truncated to the low-order 16 bits of the actual message length, which results in an incomplete flow being installed.
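
As a rough illustration of the truncation described above, the following Python sketch packs an 8-byte OpenFlow header (version, type, length, xid) and shows what happens to a length that does not fit in 16 bits. The helper names and the example values (version 4, type 15, a 70000-byte message) are assumptions chosen for illustration; this is not code from OVN or Open vSwitch.

```python
import struct

# OpenFlow header layout: version (1 byte), type (1 byte),
# length (2 bytes), xid (4 bytes), all in network byte order.
OFP_HEADER = struct.Struct("!BBHL")
MAX_OFP_LENGTH = 0xFFFF  # largest value the 16-bit length field can hold


def truncated_length(actual_length: int) -> int:
    """Return what a 16-bit length field holds for a given real length."""
    return actual_length & MAX_OFP_LENGTH


def pack_header(version: int, msg_type: int, actual_length: int, xid: int) -> bytes:
    """Pack a header, silently truncating the length the way the bug does."""
    return OFP_HEADER.pack(version, msg_type, truncated_length(actual_length), xid)


# Illustrative values only: a hypothetical 70000-byte message.
header = pack_header(4, 15, 70000, 1)
_, _, advertised, _ = OFP_HEADER.unpack(header)
print(f"actual=70000 advertised={advertised}")
```

A receiver that trusts the advertised length stops reading early and then tries to parse the remaining bytes as a new message.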

How reproducible:
Every time.

Steps to Reproduce:
Open the OVN sandbox and run the following script:

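# Build a comma-separated list of 1270 backends (5 subnets x 254 hosts).
rm -f /tmp/ips
for i in $(seq 1 5); do for j in $(seq 1 254); do echo "192.169.$i.$j:80" >> /tmp/ips; done; done
IPS=$(paste -sd, /tmp/ips)
# Create a logical switch and port, and bind the port on br-int.
ovn-nbctl ls-add ls1
ovn-nbctl lsp-add ls1 ls1-p1
ovs-vsctl add-port br-int p1 -- set Interface p1 external_ids:iface-id=ls1-p1
# Create the load balancer with all backends and attach it to the switch.
ovn-nbctl lb-add lb0 172.172.0.1:8080 "${IPS}"
ovn-nbctl ls-lb-add ls1 lb0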

Then check sandbox/ovn-controller.log to see the errors generated.
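
To get a feel for the scale of the reproducer, the backend list can also be generated in Python. This is only a sketch that mirrors the nested loops above (5 subnets, 254 hosts each, port 80); the variable names are made up for illustration.

```python
# Mirror the reproducer's nested loops: 192.169.<1..5>.<1..254>:80
backends = [
    f"192.169.{i}.{j}:80"
    for i in range(1, 6)
    for j in range(1, 255)
]

# The same comma-separated string the script passes to `ovn-nbctl lb-add`.
ips = ",".join(backends)

print(len(backends), "backends")
print("first:", backends[0], "last:", backends[-1])
```

The load balancer therefore ends up with 1270 backends on a single VIP.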

Actual results:
You'll see errors in ovn-controller's log such as:
2019-12-04T20:51:52.556Z|00015|ofp_msgs|WARN|unknown OpenFlow message (version 0, type 64)
2019-12-04T20:51:52.556Z|00016|ofctrl|INFO|OpenFlow error: OFPT_ERROR (xid=0xffffffff): OFPBRC_BAD_VERSION
***decode error: OFPBRC_BAD_TYPE***
00000000  00 40 00 64 ff ff ff ff-ff ff ff ff 00 00 00 00 |.@.d............|
00000010  ff ff 00 30 00 00 23 20-00 23 00 01 00 01 1a 04 |...0..# .#......|
00000020  00 0f 13 00 00 00 00 00-ff ff 00 18 00 00 23 20 |..............# |
00000030  00 24 00 00 00 02 00 11-c0 a9 04 df 00 50 00 00 |.$...........P..|
00000040  00 40 00 64 ff ff ff ff-ff ff ff ff 00 00 00 00 |.@.d............|
00000050  ff ff 00 30 00 00 23 20-00 23 00 01 00 01 1a 04 |...0..# .#......|
00000060  00 0f 13 00                                     |....            |
2019-12-04T20:51:52.556Z|00017|vconn_stream|ERR|send: Broken pipe
2019-12-04T20:51:52.556Z|00018|rconn|WARN|unix:/home/putnopvut/ovn-copy/tutorial/sandbox/br-int.mgmt: connection dropped (Broken pipe)
2019-12-04T20:51:53.557Z|00019|rconn|INFO|unix:/home/putnopvut/ovn-copy/tutorial/sandbox/br-int.mgmt: connecting...
2019-12-04T20:51:53.558Z|00020|rconn|INFO|unix:/home/putnopvut/ovn-copy/tutorial/sandbox/br-int.mgmt: connected
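
The hex dump in the log can be read against the warning lines around it. The Python sketch below interprets the first eight dumped bytes as an OpenFlow header and pulls an address and port out of the row at offset 0x30. The byte offsets used for the address and port are an inference from the dump itself, not from a documented structure layout, so treat them as an assumption.

```python
import ipaddress
import struct

# Row 00000000 of the dump, copied from the log above.
row_00 = bytes.fromhex("00 40 00 64 ff ff ff ff ff ff ff ff 00 00 00 00")
# Row 00000030 of the dump.
row_30 = bytes.fromhex("00 24 00 00 00 02 00 11 c0 a9 04 df 00 50 00 00")

# Read the first 8 bytes as an OpenFlow header: version, type, length, xid.
version, msg_type, length, xid = struct.unpack("!BBHL", row_00[:8])
print(f"version={version} type={msg_type} length={length} xid={xid:#x}")

# Assumed offsets: bytes 8..11 look like an IPv4 address, bytes 12..13 a port.
addr = ipaddress.IPv4Address(row_30[8:12])
(port,) = struct.unpack("!H", row_30[12:14])
print(f"backend={addr}:{port}")
```

The decoded header values match the "version 0, type 64" warning and the xid=0xffffffff in the error line, and the address and port match one of the backends created by the reproducer. This is consistent with the receiver parsing from the middle of the oversized message, although the log alone does not prove it.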

Expected results:
I would expect the OpenFlow message to be split into chunks and sent to ovs-vswitchd, allowing the large flow to be installed if possible. If flows in excess of 65535 bytes are not possible, then the flow itself should be split into multiple flows.
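
One generic way to keep each message under the limit is to batch fixed-size records greedily. The Python sketch below is only an illustration of that idea; it is not the upstream patch, and the function name, the 16-byte header, and the 64-byte record size are hypothetical values picked for the example.

```python
from typing import Iterator, List, Sequence

MAX_OFP_LENGTH = 0xFFFF  # largest length a 16-bit field can express


def split_into_messages(
    records: Sequence[bytes],
    header_len: int,
    max_len: int = MAX_OFP_LENGTH,
) -> Iterator[List[bytes]]:
    """Greedily group records so header + records never exceeds max_len."""
    batch: List[bytes] = []
    size = header_len
    for rec in records:
        if header_len + len(rec) > max_len:
            raise ValueError("a single record does not fit in one message")
        if size + len(rec) > max_len:
            # Current message is full: emit it and start a new one.
            yield batch
            batch, size = [], header_len
        batch.append(rec)
        size += len(rec)
    if batch:
        yield batch


# Hypothetical sizes: 1270 records of 64 bytes, 16-byte per-message header.
records = [bytes(64)] * 1270
batches = list(split_into_messages(records, header_len=16))
print([len(b) for b in batches])
```

With these assumed sizes the 1270 records no longer fit in one message but do fit in two, each with a length the header can express.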

--- Additional comment from Dan Williams on 2020-05-12 11:34:25 CDT ---

Upstream patch: http://patchwork.ozlabs.org/project/openvswitch/patch/20200507212442.315956-1-mmichels@redhat.com/

--- Additional comment from Mark Michelson on 2020-05-15 15:52:47 CDT ---

Patch has been backported to ovn2.13 FDN.

Comment 1 Dan Williams 2020-06-11 15:12:43 UTC

*** This bug has been marked as a duplicate of bug 1845706 ***