Bug 1975345

Summary: Northd: Need a config option to specify number of threads
Product: Red Hat Enterprise Linux Fast Datapath
Reporter: Tim Rozet <trozet>
Component: OVN
Assignee: Mark Michelson <mmichels>
Status: CLOSED ERRATA
QA Contact: Jianlin Shi <jishi>
Severity: high
Priority: urgent
Version: RHEL 8.0
CC: ctrautma, dcbw, dceara, fdangelo, jiji, mmichels, nusiddiq
Target Release: FDP 22.B
Hardware: Unspecified
OS: Unspecified
Whiteboard: perfscale-ovn
Fixed In Version: ovn-2021-21.12.0-30
Last Closed: 2022-03-30 16:28:33 UTC
Type: Bug

Description Tim Rozet 2021-06-23 13:23:25 UTC
Description of problem:
When multi-threading is enabled for northd, the number of threads used is determined by the number of vCPUs on the host. We would like to be able to specify how many threads are used.

Comment 2 Mark Michelson 2022-01-20 15:47:04 UTC
Moving this back to "ASSIGNED" since the approach offered in the initial patch was not accepted.

Comment 3 Mark Michelson 2022-01-20 21:21:38 UTC
I have an interesting update.

ovn-northd was outfitted with a --dummy-numa option back in June 2021, which lets you start ovn-northd with a dummy NUMA configuration. For instance, you could start ovn-northd with "--dummy-numa=0,0,0,0" and it would fool OVS/OVN into thinking that there are four CPU cores on NUMA node 0. This would then allow the parallelization code to operate with a pool size of 4.
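
As a rough illustration of how such a value reads (a sketch based only on the description above, not on the OVS source): each comma-separated entry stands for one CPU core, and its value is the NUMA node that core is assigned to. The helper name dummy_numa_summary is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: summarize a --dummy-numa value.
# Assumption: each comma-separated entry is one core, and its value is the
# id of the NUMA node that core belongs to.
dummy_numa_summary() {
    printf '%s\n' "$1" | tr ',' '\n' | sort -n | uniq -c |
    awk '{ printf "node %s: %d core(s)\n", $2, $1; total += $1 }
         END { printf "total: %d node(s), %d core(s)\n", NR, total }'
}

dummy_numa_summary "0,0,0,0"
dummy_numa_summary "1,3,5,7"
```

Under that assumption, "0,0,0,0" describes one node with four cores, while "1,3,5,7" describes four nodes with one core each. Both give a total of four cores.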

The problem is, this option doesn't work. In fact, NUMA and core detection in OVN's parallel hmap code has never worked. The reason is a missing call to ovs_numa_init(), which tells OVS to gather the NUMA and core information of the system in question.

I am submitting a fix upstream and will update this issue when the patch is posted.

Comment 7 Jianlin Shi 2022-03-02 03:06:46 UTC
Hi Mark,

I started ovn-northd with --dummy-numa:

openvsw+   37777  0.0  0.0 212008  4672 ?        S<l  22:05   0:00 ovn-northd --user openvswitch:openvswitch -vconsole:emer -vsyslog:err -vfile:info --ovnnb-db=unix:/run/ovn/ovnnb_db.sock --ovnsb-db=unix:/run/ovn/ovnsb_db.sock --dummy-numa=1,3,5,7 --no-chdir --log-file=/var/log/ovn/ovn-northd.log --pidfile=/run/ovn/ovn-northd.pid --detach --monitor

But how can I check whether the parameter takes effect for ovn-northd?

Comment 8 Mark Michelson 2022-03-03 14:09:32 UTC
First, enable parallelization in northd by setting the following in the database:

    ovn-nbctl set NB_Global . options:use_parallel_build=true

Now, check ovn-northd.log. You should see something like:

2022-03-03T14:07:48.796Z|00011|ovs_numa|INFO|Discovered 1 CPU cores on NUMA node 7
2022-03-03T14:07:48.796Z|00012|ovs_numa|INFO|Discovered 1 CPU cores on NUMA node 1
2022-03-03T14:07:48.796Z|00013|ovs_numa|INFO|Discovered 1 CPU cores on NUMA node 5
2022-03-03T14:07:48.796Z|00014|ovs_numa|INFO|Discovered 1 CPU cores on NUMA node 3
2022-03-03T14:07:48.796Z|00015|ovs_numa|INFO|Discovered 4 NUMA nodes and 4 CPU cores
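
To check this from a script, something like the following could be used. It is only a sketch: the function name numa_core_count is made up here, and the default log path is the one from the ovn-northd command line in comment 7. It reads the most recent ovs_numa summary line and prints the number of CPU cores that were discovered.

```shell
#!/bin/sh
# Hypothetical helper: print the core count from the most recent ovs_numa
# summary line ("Discovered N NUMA nodes and M CPU cores") in a northd log.
# Prints nothing and returns non-zero if no such line exists.
numa_core_count() {
    log=${1:-/var/log/ovn/ovn-northd.log}
    line=$(grep 'ovs_numa|INFO|Discovered .* NUMA nodes and' "$log" | tail -n 1)
    [ -n "$line" ] || return 1
    printf '%s\n' "$line" | sed 's/.* and \([0-9][0-9]*\) CPU cores.*/\1/'
}

# Usage: compare the result with the number of entries given to --dummy-numa.
# numa_core_count /var/log/ovn/ovn-northd.log
```

A non-zero return here means no ovs_numa summary line was logged at all.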

Comment 9 Jianlin Shi 2022-03-04 02:18:14 UTC
Tested with the following steps:

1. add --dummy-numa=1,3,5,7 in /usr/share/ovn/scripts/ovn-ctl for start_northd.
2. systemctl start ovn-northd
3. ovn-nbctl set NB_Global . options:use_parallel_build=true
4. check log in /var/log/ovn/ovn-northd.log

result on ovn-2021-21.12.0-15.el8:

[root@dell-per740-42 ~]# grep numa /var/log/ovn/ovn-northd.log                                        
[root@dell-per740-42 ~]# rpm -qa | grep ovn-2021                                                      
ovn-2021-central-21.12.0-15.el8fdp.x86_64                                                             
ovn-2021-21.12.0-15.el8fdp.x86_64                                                                     
ovn-2021-host-21.12.0-15.el8fdp.x86_64

result on ovn-2021-21.12.0-30.el8:

[root@dell-per740-42 ~]# grep numa /var/log/ovn/ovn-northd.log                                        
2022-03-04T02:17:46.790Z|00010|ovs_numa|INFO|Discovered 1 CPU cores on NUMA node 7                    
2022-03-04T02:17:46.790Z|00011|ovs_numa|INFO|Discovered 1 CPU cores on NUMA node 1                    
2022-03-04T02:17:46.790Z|00012|ovs_numa|INFO|Discovered 1 CPU cores on NUMA node 5                    
2022-03-04T02:17:46.790Z|00013|ovs_numa|INFO|Discovered 1 CPU cores on NUMA node 3                    
2022-03-04T02:17:46.790Z|00014|ovs_numa|INFO|Discovered 4 NUMA nodes and 4 CPU cores                  
[root@dell-per740-42 ~]# rpm -qa | grep ovn-2021                                                      
ovn-2021-host-21.12.0-30.el8fdp.x86_64                                                                
ovn-2021-central-21.12.0-30.el8fdp.x86_64                                                             
ovn-2021-21.12.0-30.el8fdp.x86_64
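
The two results differ only in the package release: -15 logs no ovs_numa lines, -30 does. A small sketch of that check follows. It assumes package strings of the form printed by rpm -qa above, and that 21.12.0 builds with release 30 or later carry the fix (per the "Fixed In Version" field). The name has_numa_fix is hypothetical, and other version streams are not handled.

```shell
#!/bin/sh
# Hypothetical helper: succeed if an ovn-2021 21.12.0 package string has a
# release number of at least 30, e.g. "ovn-2021-21.12.0-30.el8fdp.x86_64".
# Only the 21.12.0 stream is handled; anything else returns 2 (unknown).
has_numa_fix() {
    case $1 in
        ovn-2021-21.12.0-*) ;;
        *) echo "unknown version stream: $1" >&2; return 2 ;;
    esac
    release=${1#ovn-2021-21.12.0-}   # e.g. "30.el8fdp.x86_64"
    release=${release%%.*}           # e.g. "30"
    case $release in
        ''|*[!0-9]*) echo "cannot parse release: $1" >&2; return 2 ;;
    esac
    [ "$release" -ge 30 ]
}

for pkg in ovn-2021-21.12.0-15.el8fdp.x86_64 ovn-2021-21.12.0-30.el8fdp.x86_64; do
    if has_numa_fix "$pkg"; then
        echo "$pkg: fix present"
    else
        echo "$pkg: fix missing"
    fi
done
```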

Comment 10 Jianlin Shi 2022-03-04 02:20:10 UTC
Also verified on ovn-2021-21.12.0-30.el9:

[root@wsfd-advnetlab18 ~]# grep numa /var/log/ovn/ovn-northd.log                                      
2022-03-04T02:19:22.903Z|00011|ovs_numa|INFO|Discovered 1 CPU cores on NUMA node 5                    
2022-03-04T02:19:22.903Z|00012|ovs_numa|INFO|Discovered 1 CPU cores on NUMA node 1                    
2022-03-04T02:19:22.903Z|00013|ovs_numa|INFO|Discovered 1 CPU cores on NUMA node 7                    
2022-03-04T02:19:22.903Z|00014|ovs_numa|INFO|Discovered 1 CPU cores on NUMA node 3                    
2022-03-04T02:19:22.903Z|00015|ovs_numa|INFO|Discovered 4 NUMA nodes and 4 CPU cores                  
[root@wsfd-advnetlab18 ~]# rpm -qa | grep ovn-2021
ovn-2021-21.12.0-30.el9fdp.x86_64
ovn-2021-central-21.12.0-30.el9fdp.x86_64                                                             
ovn-2021-host-21.12.0-30.el9fdp.x86_64

Comment 12 errata-xmlrpc 2022-03-30 16:28:33 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (ovn bug fix and enhancement update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2022:1144