Bug 1954853

Summary: ovn-controller high memory usage (6.7GB)
Product: Red Hat Enterprise Linux Fast Datapath
Reporter: Tim Rozet <trozet>
Component: OVN
Assignee: OVN Team <ovnteam>
Status: CLOSED CURRENTRELEASE
QA Contact: ying xu <yinxu>
Severity: high
Docs Contact:
Priority: urgent
Version: RHEL 8.0
CC: andcosta, ctrautma, dcbw, dceara, jiji
Target Milestone: ---
Target Release: ---
Hardware: Unspecified
OS: Unspecified
Whiteboard:
Fixed In Version: ovn2.13-20.12.0-135.fdp8 ovn-2021-21.03.0-13.fdp8
Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2023-03-13 07:08:32 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:
Bug Depends On:
Bug Blocks: 1953613, 1956358

Description Tim Rozet 2021-04-28 22:11:58 UTC
Description of problem:
On an OCP deployment, we see ovn-controller using about 41% of the system's RAM:
[trozet@fedora process]$ cat ps_auxwwwm  | egrep 'USER|ovn-controller'
USER         PID %CPU %MEM    VSZ   RSS TTY      STAT START   TIME COMMAND
root        3824  0.0  0.0 143732  2324 ?        -    Apr19   0:11 /usr/libexec/crio/conmon -b /var/run/containers/storage/overlay-containers/6d11511d3356979d09e19079b3a52f7f1c58c8353945337f30aa80c21f37b03f/userdata -c 6d11511d3356979d09e19079b3a52f7f1c58c8353945337f30aa80c21f37b03f --exit-dir /var/run/crio/exits -l /var/log/pods/openshift-ovn-kubernetes_ovnkube-node-g8n2c_27d64f48-a34c-4cfb-acf2-1e6ac0c9605e/ovn-controller/0.log --log-level info -n k8s_ovn-controller_ovnkube-node-g8n2c_openshift-ovn-kubernetes_27d64f48-a34c-4cfb-acf2-1e6ac0c9605e_0 -P /var/run/containers/storage/overlay-containers/6d11511d3356979d09e19079b3a52f7f1c58c8353945337f30aa80c21f37b03f/userdata/conmon-pidfile -p /var/run/containers/storage/overlay-containers/6d11511d3356979d09e19079b3a52f7f1c58c8353945337f30aa80c21f37b03f/userdata/pidfile --persist-dir /var/lib/containers/storage/overlay-containers/6d11511d3356979d09e19079b3a52f7f1c58c8353945337f30aa80c21f37b03f/userdata -r /usr/bin/runc --runtime-arg --root=/run/runc --socket-dir-path /var/run/crio -u 6d11511d3356979d09e19079b3a52f7f1c58c8353945337f30aa80c21f37b03f -s
root        3983 28.3 40.9 6981276 6719588 ?     -    Apr19 2819:43 ovn-controller unix:/var/run/openvswitch/db.sock -vfile:off --no-chdir --pidfile=/var/run/ovn/ovn-controller.pid -p /ovn-cert/tls.key -c /ovn-cert/tls.crt -C /ovn-ca/ca-bundle.crt -vconsole:info
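
To put the ps figures in more familiar units, here is a minimal Python sketch that parses the ovn-controller line above. It assumes the usual procps convention that VSZ and RSS are reported in KiB and that %MEM is RSS divided by physical memory. The command column is shortened for readability, and the total-RAM figure is only a rough estimate because %MEM is rounded to one decimal place.

```python
# ovn-controller line from the ps output above (command column shortened).
line = ("root        3983 28.3 40.9 6981276 6719588 ?     -    Apr19 2819:43 "
        "ovn-controller unix:/var/run/openvswitch/db.sock")

# ps aux columns: USER PID %CPU %MEM VSZ RSS TTY STAT START TIME COMMAND
fields = line.split(None, 10)
user, pid, cpu_pct, mem_pct, vsz_kib, rss_kib = fields[:6]

# Assumption: RSS is in KiB, so divide by 1024**2 to get GiB.
rss_gib = int(rss_kib) / 1024**2

# Rough estimate of installed RAM, derived from the rounded %MEM column.
est_total_gib = rss_gib / (float(mem_pct) / 100)

print(f"pid {pid}: RSS {rss_gib:.2f} GiB, "
      f"estimated total RAM {est_total_gib:.1f} GiB")
```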


Additionally, ovn-controller logs repeated OpenFlow errors:

2021-04-26T08:40:59.094134419Z 2021-04-26T08:40:59Z|318134|ofctrl|INFO|OpenFlow error: OFPT_ERROR (xid=0x2320): OFPBRC_BAD_VERSION
2021-04-26T08:40:59.094134419Z ***decode error: OFPBRC_BAD_TYPE***
2021-04-26T08:40:59.094134419Z 00000000  ff ff 00 10 00 00 23 20-00 22 00 02 00 00 00 f1 |......# ."......|
2021-04-26T08:40:59.094164758Z 2021-04-26T08:40:59Z|318135|ofctrl|INFO|OpenFlow error: OFPT_ERROR (xid=0x2320): OFPBRC_BAD_VERSION
2021-04-26T08:40:59.094164758Z ***decode error: OFPBRC_BAD_TYPE***
2021-04-26T08:40:59.094164758Z 00000000  ff ff 00 10 00 00 23 20-00 22 00 02 00 00 00 f1 |......# ."......|
2021-04-26T08:40:59.094199271Z 2021-04-26T08:40:59Z|318136|ofctrl|INFO|OpenFlow error: OFPT_ERROR (xid=0x2320): OFPBRC_BAD_VERSION
2021-04-26T08:40:59.094199271Z ***decode error: OFPBRC_BAD_TYPE***
2021-04-26T08:40:59.094199271Z 00000000  ff ff 00 10 00 00 23 20-00 22 00 02 00 00 00 f1 |......# ."......|
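
The 16 bytes in the hex dump can be read against the standard 8-byte OpenFlow header layout (version, type, length, xid, all big-endian). The Python sketch below assumes the dump shows the start of the message that was rejected. It does not try to interpret the 8 bytes after the header, and the variable names are illustrative only.

```python
import struct

# Bytes copied from the hex dump in the ofctrl log lines above.
dump = "ff ff 00 10 00 00 23 20 00 22 00 02 00 00 00 f1"
raw = bytes.fromhex(dump)

# OpenFlow header: 1-byte version, 1-byte type, 2-byte length, 4-byte xid.
version, msg_type, length, xid = struct.unpack("!BBHI", raw[:8])

print(f"version=0x{version:02x} type=0x{msg_type:02x} "
      f"length={length} xid=0x{xid:x}")
print("payload after header:", raw[8:].hex(" "))
```

Read this way, the header carries version 0xff and type 0xff, which matches the "unknown OpenFlow message (version 255, type 255)" warnings and the xid=0x2320 shown in the logs.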

This starts after OVS gets a bad OF packet and ends the connection:

2021-04-25T18:13:51.653255903Z 2021-04-25T18:13:51.643Z|57494|connmgr|INFO|br-int<->unix#979216: sending OFPBMC_BAD_LEN error reply to OFPT_FLOW_MOD message
2021-04-25T18:13:51.653255903Z 2021-04-25T18:13:51.643Z|57495|vconn_stream|ERR|received too-short ofp_header (0 bytes)
2021-04-25T18:13:51.653255903Z 2021-04-25T18:13:51.643Z|57496|rconn|WARN|br-int<->unix#979216: connection dropped (Protocol error)
2021-04-25T18:13:51.653255903Z 2021-04-25T18:13:51.643Z|57497|connmgr|INFO|br-int<->unix#979216: 4005 flow_mods in the last 17 s (344 adds, 3642 deletes, 19 modifications)
2021-04-25T18:13:53.003691314Z 2021-04-25T18:13:53.001Z|57498|ofp_msgs|WARN|unknown OpenFlow message (version 255, type 255)
2021-04-25T18:13:53.003691314Z 2021-04-25T18:13:53.001Z|57499|vconn|ERR|unix#1015512: received OpenFlow version 0xff != expected 06
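
As a quick sanity check on the connmgr summary line above, the Python sketch below extracts the flow_mod counters and computes the rate and the share of deletes. The regular expression is an assumption based on this one log line, not a documented OVS log format.

```python
import re

# connmgr summary line from the OVS log above (timestamps stripped).
line = ("br-int<->unix#979216: 4005 flow_mods in the last 17 s "
        "(344 adds, 3642 deletes, 19 modifications)")

# Hypothetical pattern that matches the wording of this specific message.
pattern = (r"(\d+) flow_mods in the last (\d+) s "
           r"\((\d+) adds, (\d+) deletes, (\d+) modifications\)")
match = re.search(pattern, line)
total, seconds, adds, deletes, mods = map(int, match.groups())

rate = total / seconds            # flow_mods per second over the window
delete_share = deletes / total    # fraction of flow_mods that were deletes

print(f"{total} flow_mods in {seconds} s: {rate:.0f}/s, "
      f"{delete_share:.0%} deletes")
```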

This is happening on multiple ovn-controllers in the cluster. Initially, ovn-controller complains:
2021-04-25T18:13:51.871355080Z 2021-04-25T18:13:51Z|15499|ofctrl|WARN|req_cfg regressed from 1547 to 0
2021-04-25T18:13:52.763487507Z 2021-04-25T18:13:52Z|15500|rconn|INFO|unix:/var/run/openvswitch/br-int.mgmt: connecting...
2021-04-25T18:13:52.766052647Z 2021-04-25T18:13:52Z|15501|rconn|INFO|unix:/var/run/openvswitch/br-int.mgmt: connected
2021-04-25T18:13:53.002004331Z 2021-04-25T18:13:53Z|15502|ofp_msgs|WARN|unknown OpenFlow message (version 255, type 255)
2021-04-25T18:13:53.002112470Z 2021-04-25T18:13:53Z|15503|ofctrl|INFO|OpenFlow error: OFPT_ERROR (xid=0x2320): OFPBRC_BAD_VERSION
2021-04-25T18:13:53.002112470Z ***decode error: OFPBRC_BAD_TYPE***
2021-04-25T18:13:53.002112470Z 00000000  ff ff 00 10 00 00 23 20-00 22 00 02 00 00 00 11 |......# ."......|
2021-04-25T18:13:53.002164034Z 2021-04-25T18:13:53Z|15504|ofctrl|INFO|OpenFlow error: OFPT_ERROR (xid=0x2320): OFPBRC_BAD_VERSION
2021-04-25T18:13:53.002164034Z ***decode error: OFPBRC_BAD_TYPE***
2021-04-25T18:13:53.002164034Z 00000000  ff ff 00 10 00 00 23 20-00 22 00 02 00 00 00 14 |......# ."......|
2021-04-25T18:13:53.002220215Z 2021-04-25T18:13:53Z|15505|ofctrl|INFO|OpenFlow error: OFPT_ERROR (xid=0x2320): OFPBRC_BAD_VERSION
2021-04-25T18:13:53.002220215Z ***decode error: OFPBRC_BAD_TYPE***
2021-04-25T18:13:53.002220215Z 00000000  ff ff 00 10 00 00 23 20-00 22 00 02 00 00 00 14 |......# ."......|
2021-04-25T18:13:53.002778877Z 2021-04-25T18:13:53Z|15506|ofctrl|IN

Version-Release number of selected component (if applicable):
ovn2.13-20.12.0-24.el8fdp.x86_64

How reproducible:
Unknown; I need to see if I can reproduce this. It happens after an upgrade from 4.7.3 to 4.7.6.

Comment 13 Red Hat Bugzilla 2023-09-18 00:26:13 UTC
The needinfo request[s] on this closed bug have been removed as they have been unresolved for 120 days