Bug 1954853 - ovn-controller high memory usage (6.7GB)
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: Red Hat Enterprise Linux Fast Datapath
Classification: Red Hat
Component: OVN
Version: RHEL 8.0
Hardware: Unspecified
OS: Unspecified
Priority: urgent
Severity: high
Target Milestone: ---
Target Release: ---
Assignee: OVN Team
QA Contact: ying xu
URL:
Whiteboard:
Depends On:
Blocks: 1953613 1956358
Reported: 2021-04-28 22:11 UTC by Tim Rozet
Modified: 2023-09-18 00:26 UTC (History)

Fixed In Version: ovn2.13-20.12.0-135.fdp8 ovn-2021-21.03.0-13.fdp8
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2023-03-13 07:08:32 UTC
Target Upstream Version:
Embargoed:


Links
System ID Private Priority Status Summary Last Updated
Red Hat Issue Tracker FD-1283 0 None None None 2022-08-06 04:58:58 UTC

Description Tim Rozet 2021-04-28 22:11:58 UTC
Description of problem:
On an OCP deployment, ovn-controller is using about 41% of the system's RAM:
[trozet@fedora process]$ cat ps_auxwwwm  | egrep 'USER|ovn-controller'
USER         PID %CPU %MEM    VSZ   RSS TTY      STAT START   TIME COMMAND
root        3824  0.0  0.0 143732  2324 ?        -    Apr19   0:11 /usr/libexec/crio/conmon -b /var/run/containers/storage/overlay-containers/6d11511d3356979d09e19079b3a52f7f1c58c8353945337f30aa80c21f37b03f/userdata -c 6d11511d3356979d09e19079b3a52f7f1c58c8353945337f30aa80c21f37b03f --exit-dir /var/run/crio/exits -l /var/log/pods/openshift-ovn-kubernetes_ovnkube-node-g8n2c_27d64f48-a34c-4cfb-acf2-1e6ac0c9605e/ovn-controller/0.log --log-level info -n k8s_ovn-controller_ovnkube-node-g8n2c_openshift-ovn-kubernetes_27d64f48-a34c-4cfb-acf2-1e6ac0c9605e_0 -P /var/run/containers/storage/overlay-containers/6d11511d3356979d09e19079b3a52f7f1c58c8353945337f30aa80c21f37b03f/userdata/conmon-pidfile -p /var/run/containers/storage/overlay-containers/6d11511d3356979d09e19079b3a52f7f1c58c8353945337f30aa80c21f37b03f/userdata/pidfile --persist-dir /var/lib/containers/storage/overlay-containers/6d11511d3356979d09e19079b3a52f7f1c58c8353945337f30aa80c21f37b03f/userdata -r /usr/bin/runc --runtime-arg --root=/run/runc --socket-dir-path /var/run/crio -u 6d11511d3356979d09e19079b3a52f7f1c58c8353945337f30aa80c21f37b03f -s
root        3983 28.3 40.9 6981276 6719588 ?     -    Apr19 2819:43 ovn-controller unix:/var/run/openvswitch/db.sock -vfile:off --no-chdir --pidfile=/var/run/ovn/ovn-controller.pid -p /ovn-cert/tls.key -c /ovn-cert/tls.crt -C /ovn-ca/ca-bundle.crt -vconsole:info
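
For readers who want the memory figure in more familiar units, the following is a minimal Python sketch that pulls %MEM and RSS out of a `ps aux`-style line like the ovn-controller line above. The helper names (`parse_ps_aux_line`, `kib_to_gib`) are hypothetical, the command column is shortened here for readability, and the sketch assumes ps reports RSS in KiB, which is the usual procps behavior.

```python
# Hypothetical helpers for reading one line of `ps aux` output.
# Assumption: VSZ and RSS are reported in KiB (usual procps behavior).

def parse_ps_aux_line(line):
    """Split a `ps aux` row into named fields; COMMAND keeps its spaces."""
    # Columns: USER PID %CPU %MEM VSZ RSS TTY STAT START TIME COMMAND
    fields = line.split(None, 10)
    return {
        "user": fields[0],
        "pid": int(fields[1]),
        "cpu_pct": float(fields[2]),
        "mem_pct": float(fields[3]),
        "vsz_kib": int(fields[4]),
        "rss_kib": int(fields[5]),
        "command": fields[10],
    }


def kib_to_gib(kib):
    """Convert KiB to GiB (1 GiB = 1024 * 1024 KiB)."""
    return kib / (1024 * 1024)


# Numeric columns copied from the report; command shortened for the example.
SAMPLE = ("root        3983 28.3 40.9 6981276 6719588 ?     -    Apr19 "
          "2819:43 ovn-controller unix:/var/run/openvswitch/db.sock")

if __name__ == "__main__":
    row = parse_ps_aux_line(SAMPLE)
    print(f"pid {row['pid']}: {row['mem_pct']}% of RAM, "
          f"RSS {kib_to_gib(row['rss_kib']):.2f} GiB")
```

Under the KiB assumption, the RSS value of 6719588 corresponds to roughly 6.4 GiB of resident memory.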


Additionally, ovn-controller logs repeated OpenFlow errors:

2021-04-26T08:40:59.094134419Z 2021-04-26T08:40:59Z|318134|ofctrl|INFO|OpenFlow error: OFPT_ERROR (xid=0x2320): OFPBRC_BAD_VERSION
2021-04-26T08:40:59.094134419Z ***decode error: OFPBRC_BAD_TYPE***
2021-04-26T08:40:59.094134419Z 00000000  ff ff 00 10 00 00 23 20-00 22 00 02 00 00 00 f1 |......# ."......|
2021-04-26T08:40:59.094164758Z 2021-04-26T08:40:59Z|318135|ofctrl|INFO|OpenFlow error: OFPT_ERROR (xid=0x2320): OFPBRC_BAD_VERSION
2021-04-26T08:40:59.094164758Z ***decode error: OFPBRC_BAD_TYPE***
2021-04-26T08:40:59.094164758Z 00000000  ff ff 00 10 00 00 23 20-00 22 00 02 00 00 00 f1 |......# ."......|
2021-04-26T08:40:59.094199271Z 2021-04-26T08:40:59Z|318136|ofctrl|INFO|OpenFlow error: OFPT_ERROR (xid=0x2320): OFPBRC_BAD_VERSION
2021-04-26T08:40:59.094199271Z ***decode error: OFPBRC_BAD_TYPE***
2021-04-26T08:40:59.094199271Z 00000000  ff ff 00 10 00 00 23 20-00 22 00 02 00 00 00 f1 |......# ."......|
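
The hex dump lines above can be read against the fixed 8-byte OpenFlow header layout (version, type, length, xid, all in network byte order). The following is a minimal Python sketch of that decoding; the function names are hypothetical, and it assumes the dump is the start of the offending message echoed back in the error, which the log does not state explicitly. Only the dump portion of the log line (without the timestamp prefix) is passed in.

```python
import struct

# Wire version bytes defined by the OpenFlow specifications.
OFP_VERSIONS = {
    0x01: "1.0", 0x02: "1.1", 0x03: "1.2",
    0x04: "1.3", 0x05: "1.4", 0x06: "1.5",
}


def parse_hexdump_line(line):
    """Return the raw bytes of one 'OFFSET  xx xx ... |ascii|' dump line."""
    hex_part = line.split("|", 1)[0]                  # drop the ASCII column
    tokens = hex_part.replace("-", " ").split()[1:]   # drop the offset
    return bytes(int(tok, 16) for tok in tokens)


def decode_ofp_header(data):
    """Decode the 8-byte OpenFlow header at the start of `data`."""
    version, msg_type, length, xid = struct.unpack("!BBHI", data[:8])
    return {
        "version": version,
        "version_name": OFP_VERSIONS.get(version, "unknown"),
        "type": msg_type,
        "length": length,
        "xid": xid,
    }


# Dump portion of one log line from the report.
DUMP = '00000000  ff ff 00 10 00 00 23 20-00 22 00 02 00 00 00 f1 |......# ."......|'

if __name__ == "__main__":
    hdr = decode_ofp_header(parse_hexdump_line(DUMP))
    print(f"version=0x{hdr['version']:02x} ({hdr['version_name']}), "
          f"type={hdr['type']}, length={hdr['length']}, xid=0x{hdr['xid']:x}")
```

Read this way, the header carries version 0xff and type 0xff with xid 0x2320, which matches the xid in the OFPT_ERROR lines and is not a valid OpenFlow version byte.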

These errors start after OVS receives a malformed OpenFlow message and drops the connection:

2021-04-25T18:13:51.653255903Z 2021-04-25T18:13:51.643Z|57494|connmgr|INFO|br-int<->unix#979216: sending OFPBMC_BAD_LEN error reply to OFPT_FLOW_MOD message
2021-04-25T18:13:51.653255903Z 2021-04-25T18:13:51.643Z|57495|vconn_stream|ERR|received too-short ofp_header (0 bytes)
2021-04-25T18:13:51.653255903Z 2021-04-25T18:13:51.643Z|57496|rconn|WARN|br-int<->unix#979216: connection dropped (Protocol error)
2021-04-25T18:13:51.653255903Z 2021-04-25T18:13:51.643Z|57497|connmgr|INFO|br-int<->unix#979216: 4005 flow_mods in the last 17 s (344 adds, 3642 deletes, 19 modifications)
2021-04-25T18:13:53.003691314Z 2021-04-25T18:13:53.001Z|57498|ofp_msgs|WARN|unknown OpenFlow message (version 255, type 255)
2021-04-25T18:13:53.003691314Z 2021-04-25T18:13:53.001Z|57499|vconn|ERR|unix#1015512: received OpenFlow version 0xff != expected 06

This is happening on multiple ovn-controller instances in the cluster. Initially, ovn-controller logs the following:
2021-04-25T18:13:51.871355080Z 2021-04-25T18:13:51Z|15499|ofctrl|WARN|req_cfg regressed from 1547 to 0
2021-04-25T18:13:52.763487507Z 2021-04-25T18:13:52Z|15500|rconn|INFO|unix:/var/run/openvswitch/br-int.mgmt: connecting...
2021-04-25T18:13:52.766052647Z 2021-04-25T18:13:52Z|15501|rconn|INFO|unix:/var/run/openvswitch/br-int.mgmt: connected
2021-04-25T18:13:53.002004331Z 2021-04-25T18:13:53Z|15502|ofp_msgs|WARN|unknown OpenFlow message (version 255, type 255)
2021-04-25T18:13:53.002112470Z 2021-04-25T18:13:53Z|15503|ofctrl|INFO|OpenFlow error: OFPT_ERROR (xid=0x2320): OFPBRC_BAD_VERSION
2021-04-25T18:13:53.002112470Z ***decode error: OFPBRC_BAD_TYPE***
2021-04-25T18:13:53.002112470Z 00000000  ff ff 00 10 00 00 23 20-00 22 00 02 00 00 00 11 |......# ."......|
2021-04-25T18:13:53.002164034Z 2021-04-25T18:13:53Z|15504|ofctrl|INFO|OpenFlow error: OFPT_ERROR (xid=0x2320): OFPBRC_BAD_VERSION
2021-04-25T18:13:53.002164034Z ***decode error: OFPBRC_BAD_TYPE***
2021-04-25T18:13:53.002164034Z 00000000  ff ff 00 10 00 00 23 20-00 22 00 02 00 00 00 14 |......# ."......|
2021-04-25T18:13:53.002220215Z 2021-04-25T18:13:53Z|15505|ofctrl|INFO|OpenFlow error: OFPT_ERROR (xid=0x2320): OFPBRC_BAD_VERSION
2021-04-25T18:13:53.002220215Z ***decode error: OFPBRC_BAD_TYPE***
2021-04-25T18:13:53.002220215Z 00000000  ff ff 00 10 00 00 23 20-00 22 00 02 00 00 00 14 |......# ."......|
2021-04-25T18:13:53.002778877Z 2021-04-25T18:13:53Z|15506|ofctrl|IN

Version-Release number of selected component (if applicable):
ovn2.13-20.12.0-24.el8fdp.x86_64

How reproducible:
Unknown; I still need to check whether I can reproduce this. It happens after an upgrade from 4.7.3 to 4.7.6.

Comment 13 Red Hat Bugzilla 2023-09-18 00:26:13 UTC
The needinfo request[s] on this closed bug have been removed as they have been unresolved for 120 days

