Description of problem:
On a 4.10 OpenShift cluster with ~250 worker nodes, we launched around 35-60 pods per node. All the ovn-controller instances across the cluster seem to be using a reasonable amount of memory except on two nodes. Once the pods are launched they are not deleted, and the cluster is in a steady state.
Over time, the memory of ovn-controller on those two nodes keeps increasing without bound, with no activity happening in the cluster. I suspect a memory leak here.
Looking at the memory stats on one of the problem nodes we see:
[root@worker000-fc640 ~]# ovn-appctl -t ovn-controller memory/show
lflow-cache-entries-cache-expr:16030 lflow-cache-entries-cache-matches:9621 lflow-cache-size-KB:65688 ofctrl_desired_flow_usage-KB:37379 ofctrl_installed_flow_usage-KB:28470 ofctrl_sb_flow_ref_usage-KB:9361
Comparing it to a node which is NOT exhibiting this leak, I don't spot many differences.
Node without the leak:
sh-4.4# ovn-appctl -t ovn-controller memory/show
lflow-cache-entries-cache-expr:15804 lflow-cache-entries-cache-matches:9547 lflow-cache-size-KB:64684 ofctrl_desired_flow_usage-KB:37304 ofctrl_installed_flow_usage-KB:28425 ofctrl_sb_flow_ref_usage-KB:9326
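To confirm the growth is unbounded rather than one-time cache warm-up, it helps to sample process RSS alongside the lflow-cache counters over time. A minimal sketch (the loop, interval, and output format are illustrative, not from this report; run it on the node or inside the ovn-controller container):

```shell
# Sample ovn-controller RSS and lflow-cache stats every 5 minutes.
while true; do
    ts=$(date -u +%FT%TZ)
    rss_kb=$(awk '/VmRSS/ {print $2}' "/proc/$(pidof ovn-controller)/status")
    stats=$(ovn-appctl -t ovn-controller memory/show)
    echo "$ts rss-KB:$rss_kb $stats"
    sleep 300
done
```

If RSS keeps climbing while lflow-cache-size-KB and the ofctrl usage counters stay roughly flat (as in the two outputs above), the growth is in memory that memory/show does not account for, which points away from the lflow cache.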
Version-Release number of selected component (if applicable):
[kni@e16-h12-b02-fc640 web-burner]$ oc rsh -c ovn-controller ovnkube-node-qj8dq
sh-4.4# rpm -qa | grep ovn
How reproducible:
Only reproducible on some nodes
Steps to Reproduce:
1. Deploy a large cluster
2. Launch a few pods
3. Remain at steady state and watch ovn-controller memory grow without bound on some nodes
Actual results:
ovn-controller memory on some nodes keeps growing without bounds, indicating a memory leak
Expected results:
Memory should stay within reasonable bounds and not grow at steady state
Placed DBs and provided Numan access to those.
Created attachment 1830071 [details]
DBs and conf.db of worker node
perf record output shows the pinctrl0 thread is hot on the CPU:
Event count (approx.): 6723064170
# Overhead Command Shared Object Symbol
# ........ .............. ................... ...........................................
1.97% ovn_pinctrl0 libpthread-2.28.so [.] __pthread_rwlock_wrlock
1.84% ovn_pinctrl0 libpthread-2.28.so [.] __pthread_rwlock_rdlock
1.77% ovn_pinctrl0 libpthread-2.28.so [.] __pthread_rwlock_unlock
1.75% ovn_pinctrl0 [kernel.kallsyms] [k] copy_user_enhanced_fast_string
1.62% ovn_pinctrl0 libc-2.28.so [.] _int_malloc
1.53% ovn_pinctrl0 [kernel.kallsyms] [k] avc_has_perm
1.16% ovn_pinctrl0 [kernel.kallsyms] [k] _raw_spin_lock
1.09% ovn_pinctrl0 libc-2.28.so [.] malloc
0.96% ovn_pinctrl0 libc-2.28.so [.] __memmove_avx_unaligned_erms
0.96% ovn_pinctrl0 libc-2.28.so [.] _int_free
0.80% ovn_pinctrl0 ovn-controller [.] 0x00000000000b87c1
0.75% ovn_pinctrl0 [kernel.kallsyms] [k] copy_user_generic_unrolled
0.71% ovn_pinctrl0 libc-2.28.so [.] __memset_avx2_unaligned_erms
0.69% ovn_pinctrl0 libpthread-2.28.so [.] __pthread_enable_asynccancel
0.69% ovn_pinctrl0 libc-2.28.so [.] __memcmp_avx2_movbe
0.67% ovn_pinctrl0 ovn-controller [.] 0x000000000011c8fd
0.65% ovn_pinctrl0 [kernel.kallsyms] [k] find_vma
0.61% ovn_pinctrl0 ovn-controller [.] 0x00000000000469bb
0.61% ovn_pinctrl0 [kernel.kallsyms] [k] skb_set_owner_w
0.60% ovn_pinctrl0 ovn-controller [.] 0x00000000000bdfbb
0.58% ovn_pinctrl0 ovn-controller [.] 0x00000000000b87a6
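The bare offsets in the ovn-controller rows (e.g. 0x00000000000b87c1) can usually be symbolized by installing the matching debuginfo package before reporting. A sketch of how output like the above is typically collected and resolved (commands are illustrative; the exact debuginfo package name depends on the installed OVN build):

```shell
# Record ~30s of on-CPU samples for all ovn-controller threads,
# including ovn_pinctrl0, with call graphs.
perf record -g -p "$(pidof ovn-controller)" -- sleep 30

# Install debuginfo so ovn-controller frames resolve to function
# names instead of raw offsets, e.g. on RHEL-based nodes:
#   dnf debuginfo-install ovn
perf report --stdio --sort comm,dso,symbol
```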
As discussed above, the solution is to use OVN's meter functionality to rate-limit packet-in traffic to ovn-controller, configured by ovn-kubernetes. This should be done for BFD and chk-pkt-len at least.
Opened Upstream PR: https://github.com/ovn-org/ovn-kubernetes/pull/2752
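For reference, OVN meters are created through the northbound database; a hedged sketch of the building block involved (the meter name, rate, and burst below are illustrative, and how the meter gets attached to the BFD/packet-in paths depends on the OVN version and on what ovn-kubernetes configures):

```shell
# Create a meter that drops packet-in traffic exceeding
# 100 packets/s with a burst of 50 (numbers are examples only).
ovn-nbctl meter-add packet-in-limit drop 100 pktps 50

# Inspect the configured meters.
ovn-nbctl list meter
```

The point of the drop action is that excess packet-ins are discarded in the datapath instead of being queued to the ovn_pinctrl0 thread, which is what the perf profile above shows saturating the CPU.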
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA.
For information on the advisory (Important: OpenShift Container Platform 4.11.0 bug fix and security update), and where to find the updated files, follow the link below.
If the solution does not work for you, open a new bug report.