Bug 2070636

Summary: [PerfCI][OVN][16.2] Browbeat create_network_nova_boot_ping scenario fails SLA as ping takes much time now
Product: Red Hat OpenStack Reporter: Yatin Karel <ykarel>
Component: python-networking-ovn Assignee: Arnau Verdaguer <averdagu>
Status: CLOSED CURRENTRELEASE QA Contact: Bharath M V <bmv>
Severity: high Docs Contact:
Priority: high    
Version: 16.2 (Train) CC: apevec, egarciar, lhh, majopela, mburns, scohen
Target Milestone: z3 Keywords: Regression, TestOnly, Triaged
Target Release: 16.2 (Train on RHEL 8.4)   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version: Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2024-03-28 13:41:05 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Bug Depends On: 2069714, 2069783    
Bug Blocks:    

Description Yatin Karel 2022-03-31 15:39:48 UTC
Description of problem:
The create_network_nova_boot_ping scenario[1] has been failing its SLA check since puddle RHOS-16.2-RHEL-8-20220201.n.1, because ping now takes much longer to succeed.
The last good puddle was RHOS-16.2-RHEL-8-20211129.n.1, so about two months of changes landed between the two puddles.

The trend change can be seen between builds 43[2] and 44[3] of job DFG-perfscale-PerfCI-OSP16.2-neutron-ovn.

Build - Action	Min (sec)	Median (sec)	90%ile (sec)	95%ile (sec)	Max (sec)	Avg (sec)	Success	Count
42 - vm.wait_for_ping (x4)	0.052	0.059	0.066	0.071	0.075	0.06	100.0%	10
43 - vm.wait_for_ping (x4)	0.046	0.056	0.061	0.072	0.082	0.057	100.0%	10
44 - vm.wait_for_ping (x4)	0.049	0.772	35.345	36.16	36.975	8.137	100.0%	10
45 - vm.wait_for_ping (x4)	0.052	2.248	59.315	60.471	61.627	18.789	100.0%	10
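
As a rough illustration of how the trend change could be located automatically, the sketch below parses the four rows above and reports the first build whose value jumps well beyond the previous build. It assumes the numeric columns follow the usual Rally report order (min, median, 90%ile, 95%ile, max, avg, in seconds). The 10x factor and the helper names are hypothetical choices for this example, not the SLA actually configured in Browbeat.

```python
# Rows copied from the report: build, min, median, 90%ile, 95%ile, max, avg.
# Column meanings are an assumption based on the usual Rally table layout.
ROWS = """\
42 0.052 0.059 0.066 0.071 0.075 0.06
43 0.046 0.056 0.061 0.072 0.082 0.057
44 0.049 0.772 35.345 36.16 36.975 8.137
45 0.052 2.248 59.315 60.471 61.627 18.789
"""

KEYS = ("min", "median", "p90", "p95", "max", "avg")


def parse_rows(text):
    """Map build number -> {metric name: seconds}."""
    stats = {}
    for line in text.strip().splitlines():
        build, *values = line.split()
        stats[int(build)] = dict(zip(KEYS, map(float, values)))
    return stats


def first_regressed_build(stats, metric="max", factor=10.0):
    """Return the first build whose metric exceeds factor x the previous build.

    The factor is an arbitrary illustration threshold, not a real SLA value.
    """
    builds = sorted(stats)
    for prev, cur in zip(builds, builds[1:]):
        if stats[cur][metric] > factor * stats[prev][metric]:
            return cur
    return None


STATS = parse_rows(ROWS)
print(first_regressed_build(STATS))            # checks the max column
print(first_regressed_build(STATS, "median"))  # checks the median column
```

Note that the minimum stays flat across all four builds, so only some of the iterations in a run are slow; a check on the minimum alone would not catch this regression.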

Job History:- https://rhos-ci-jenkins.lab.eng.tlv2.redhat.com/view/DFG/view/perfscale/view/PerfCI/job/DFG-perfscale-PerfCI-OSP16.2-neutron-ovn/


Version-Release number of selected component (if applicable):
There were many RPM changes; listing only a few related ones to start with:

Good Build:-
python3-networking-ovn-7.4.2-2.20210601204831.el8ost.13.noarch
openstack-neutron-15.3.5-2.20210608154816.el8ost.4.noarch
ovn-2021-21.09.0-20.el8fdp.x86_64
openvswitch2.15-2.15.0-38.el8fdp.x86_64
kernel-4.18.0-305.25.1.el8_4.x86_64

Bad build:-
python3-networking-ovn-7.4.2-2.20220113214852.a2eba10.el8ost.noarch
openstack-neutron-15.3.5-2.20220113150030.94e4cbb.el8ost.noarch
ovn-2021-21.12.0-11.el8fdp.x86_64
openvswitch2.15-2.15.0-57.el8fdp.x86_64
kernel-4.18.0-305.34.2.el8_4.x86_64
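
To narrow down the candidates, the two package lists above can be compared programmatically. The following is a minimal sketch that splits each name-version-release.arch string and separates packages whose upstream version changed from those where only the release changed. The helper names are hypothetical, and the result only points at where to look first; it does not identify the cause.

```python
GOOD = """\
python3-networking-ovn-7.4.2-2.20210601204831.el8ost.13.noarch
openstack-neutron-15.3.5-2.20210608154816.el8ost.4.noarch
ovn-2021-21.09.0-20.el8fdp.x86_64
openvswitch2.15-2.15.0-38.el8fdp.x86_64
kernel-4.18.0-305.25.1.el8_4.x86_64
"""

BAD = """\
python3-networking-ovn-7.4.2-2.20220113214852.a2eba10.el8ost.noarch
openstack-neutron-15.3.5-2.20220113150030.94e4cbb.el8ost.noarch
ovn-2021-21.12.0-11.el8fdp.x86_64
openvswitch2.15-2.15.0-57.el8fdp.x86_64
kernel-4.18.0-305.34.2.el8_4.x86_64
"""


def parse_nvra(package):
    """Split 'name-version-release.arch' into (name, version, release)."""
    nvr, _arch = package.rsplit(".", 1)
    name, version, release = nvr.rsplit("-", 2)
    return name, version, release


def index_packages(text):
    """Map package name -> (version, release)."""
    result = {}
    for line in text.split():
        name, version, release = parse_nvra(line)
        result[name] = (version, release)
    return result


good, bad = index_packages(GOOD), index_packages(BAD)

# Packages present in both builds whose version or release differs.
changed = {n: (good[n], bad[n]) for n in good if n in bad and good[n] != bad[n]}
# Subset where the upstream version itself moved, not just the release.
version_bumps = sorted(n for n, (g, b) in changed.items() if g[0] != b[0])

for name, (g, b) in changed.items():
    print(f"{name}: {'-'.join(g)} -> {'-'.join(b)}")
print("version bumps:", version_bumps)
```

Of the five packages listed, only ovn-2021 changed its upstream version (21.09.0 to 21.12.0); the other four changed only in release.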

How reproducible:
Always

Steps to Reproduce:
1. Always reproducible with the Perf CI job DFG-perfscale-PerfCI-OSP16.2-neutron-ovn
2. The performance difference can also be seen by manually running the browbeat scenario on 16.1 and 16.2 deployments

Actual results:
Scenario performance dropped in 16.2 starting with compose RHOS-16.2-RHEL-8-20220201.n.1, compared both to the 16.1 equivalent and to earlier 16.2 composes.

Expected results:
Performance shouldn't have dropped that much.

Additional info:

The 16.1 OVN Perf job[4] is all good, so this is a regression in 16.2.

Build - Action	Min (sec)	Median (sec)	90%ile (sec)	95%ile (sec)	Max (sec)	Avg (sec)	Success	Count
54 - vm.wait_for_ping (x4)	0.048	0.054	0.067	0.067	0.068	0.057	100.0%	10
55 - vm.wait_for_ping (x4)	0.047	0.058	0.063	0.063	0.064	0.056	100.0%	10

It's likely the same issue as [5], but I'm filing a separate bz so the two issues don't get mixed. This one is also meant to track the performance aspect and to rule out non-OVN-related changes. It can be marked as a duplicate once that is confirmed by further investigation.

[1] https://opendev.org/x/browbeat/src/branch/master/rally/rally-plugins/netcreate-boot/netcreate_nova_boot_fip_ping.py
[2] https://rhos-ci-jenkins.lab.eng.tlv2.redhat.com/job/DFG-perfscale-PerfCI-OSP16.2-neutron-ovn/43/artifact/rally-results.html#/BrowbeatPlugin.create_network_nova_boot_ping
[3] https://rhos-ci-jenkins.lab.eng.tlv2.redhat.com/job/DFG-perfscale-PerfCI-OSP16.2-neutron-ovn/44/artifact/rally-results.html#/BrowbeatPlugin.create_network_nova_boot_ping
[4] https://rhos-ci-jenkins.lab.eng.tlv2.redhat.com/view/DFG/view/perfscale/view/PerfCI/job/DFG-perfscale-PerfCI-OSP16.1-neutron-ovn/
[5] https://bugzilla.redhat.com/show_bug.cgi?id=2069668