Description of problem:
Comparing OSP 15 with OSP 13, both using the ML2/OVN backend, there seems to be a regression when exercising the list security groups API call. The Rally scenario used is create-list-security-group, which creates a new security group and then lists security groups; concurrency was set to 16 and time was set to 500. The 95th percentile API response time to list security groups was 16s in OSP 15, while it was 6s in OSP 13.

Version-Release number of selected component (if applicable):
15

How reproducible:
100%

Steps to Reproduce:
1. Deploy OSP 15
2. Run the create-list-security-group rally scenario

Actual results:
There is a performance regression

Expected results:
Similar or better results for OSP 15

Additional info:
Not sure if this could be related to https://bugzilla.redhat.com/show_bug.cgi?id=1721273
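For anyone trying to reproduce the numbers, the run described above could be expressed as a Rally task file roughly like the one generated below. This is only a sketch under assumptions: the helper name build_task is hypothetical, reading "time was set to 500" as the constant runner's "times" count is a guess, and the tenant/user counts are placeholders because the report does not state them.

```python
import json

# Rally scenario plugin that creates a security group and then lists them.
SCENARIO = "NeutronSecurityGroup.create_and_list_security_groups"


def build_task(times=500, concurrency=16, tenants=1, users_per_tenant=1):
    """Return a Rally task definition as a dict (hypothetical helper).

    times/concurrency default to the values quoted in the report;
    tenants/users_per_tenant are placeholders, not values from the report.
    """
    return {
        SCENARIO: [
            {
                "args": {"security_group_create_args": {}},
                "runner": {
                    "type": "constant",
                    "times": times,
                    "concurrency": concurrency,
                },
                "context": {
                    "users": {
                        "tenants": tenants,
                        "users_per_tenant": users_per_tenant,
                    },
                    # -1 lifts the security group quota so creates do not fail.
                    "quotas": {"neutron": {"security_group": -1}},
                },
                "sla": {"failure_rate": {"max": 0}},
            }
        ]
    }


if __name__ == "__main__":
    print(json.dumps(build_task(), indent=2))
```

The printed JSON can be saved to a file and passed to "rally task start"; stating the exact runner and context values used on both OSP 13 and OSP 15 would make the two results easier to compare.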
This looks like it might be the same issue as https://bugs.launchpad.net/bugs/1830679
We should make sure https://review.opendev.org/670075 is applied and see if that helps the response time.
I tried to reproduce this issue today and to check whether patch https://review.opendev.org/670075 helps to solve it.
The first problem was that I wasn't able to reproduce the issue. I was checking it using:

()[root@controller-0 /]# rpm -qa | grep neutron
puppet-neutron-14.4.1-0.20190531220405.ff3610d.el8ost.noarch
python3-neutron-lib-1.25.0-0.20190521130309.fc2a810.el8ost.noarch
python3-neutron-14.0.3-0.20190704180411.9f4e596.el8ost.noarch
openstack-neutron-common-14.0.3-0.20190704180411.9f4e596.el8ost.noarch
python3-neutron-dynamic-routing-14.0.1-0.20190426180400.f313f0e.1.el8ost.noarch
openstack-neutron-lbaas-14.0.1-0.20190614170521.30bdd86.el8ost.noarch
openstack-neutron-ml2-14.0.3-0.20190704180411.9f4e596.el8ost.noarch
python3-neutronclient-6.12.0-0.20190312100012.680b417.el8ost.noarch
openstack-neutron-14.0.3-0.20190704180411.9f4e596.el8ost.noarch
python3-neutron-lbaas-14.0.1-0.20190614170521.30bdd86.el8ost.noarch

and container version: 192.168.24.1:8787/rhosp15/openstack-neutron-server:20190715.1

This version doesn't have patch https://review.opendev.org/670075, but the results were much better than those mentioned in the bug description:

test scenario NeutronSecurityGroup.create_and_list_security_groups
args position 0
args values:
{
  "args": {
    "security_group_create_args": {}
  },
  "runner": {
    "times": 100,
    "concurrency": 10
  },
  "contexts": {
    "users": {
      "tenants": 3,
      "users_per_tenant": 3
    },
    "quotas": {
      "neutron": {
        "security_group": -1
      }
    }
  },
  "sla": {
    "failure_rate": {
      "max": 0
    }
  },
  "hooks": []
}

--------------------------------------------------------------------------------
Task 138c2349-be2c-4a00-b831-8908f800b525 has 0 error(s)
--------------------------------------------------------------------------------

+----------------------------------------------------------------------------------------------------------------------------------+
|                                                       Response Times (sec)                                                       |
+-------------------------------+-----------+--------------+--------------+--------------+-----------+-----------+---------+-------+
| Action                        | Min (sec) | Median (sec) | 90%ile (sec) | 95%ile (sec) | Max (sec) | Avg (sec) | Success | Count |
+-------------------------------+-----------+--------------+--------------+--------------+-----------+-----------+---------+-------+
| neutron.create_security_group | 1.159     | 1.574        | 2.49         | 2.828        | 3.247     | 1.775     | 100.0%  | 100   |
| neutron.list_security_groups  | 0.296     | 1.809        | 2.801        | 3.144        | 4.374     | 1.925     | 100.0%  | 100   |
| total                         | 2.206     | 3.552        | 4.764        | 5.039        | 5.759     | 3.699     | 100.0%  | 100   |
|  -> duration                  | 2.206     | 3.552        | 4.764        | 5.039        | 5.759     | 3.699     | 100.0%  | 100   |
|  -> idle_duration             | 0.0       | 0.0          | 0.0          | 0.0          | 0.0       | 0.0       | 100.0%  | 100   |
+-------------------------------+-----------+--------------+--------------+--------------+-----------+-----------+---------+-------+

Load duration: 38.97572
Full duration: 121.609626

=====================

As a next step, I manually applied patch https://review.opendev.org/670075 in the neutron_api containers on all 3 controller nodes of my deployment and ran the same rally scenario again.
The results with the patch applied were as follows:

test scenario NeutronSecurityGroup.create_and_list_security_groups
args position 0
args values:
{
  "args": {
    "security_group_create_args": {}
  },
  "runner": {
    "times": 100,
    "concurrency": 10
  },
  "contexts": {
    "users": {
      "tenants": 3,
      "users_per_tenant": 3
    },
    "quotas": {
      "neutron": {
        "security_group": -1
      }
    }
  },
  "sla": {
    "failure_rate": {
      "max": 0
    }
  },
  "hooks": []
}

--------------------------------------------------------------------------------
Task 56089a6e-fe99-407f-a0fb-e85077b7372f has 0 error(s)
--------------------------------------------------------------------------------

+----------------------------------------------------------------------------------------------------------------------------------+
|                                                       Response Times (sec)                                                       |
+-------------------------------+-----------+--------------+--------------+--------------+-----------+-----------+---------+-------+
| Action                        | Min (sec) | Median (sec) | 90%ile (sec) | 95%ile (sec) | Max (sec) | Avg (sec) | Success | Count |
+-------------------------------+-----------+--------------+--------------+--------------+-----------+-----------+---------+-------+
| neutron.create_security_group | 1.145     | 1.564        | 2.528        | 2.937        | 3.484     | 1.69      | 100.0%  | 100   |
| neutron.list_security_groups  | 0.141     | 0.722        | 1.159        | 1.462        | 1.856     | 0.734     | 100.0%  | 100   |
| total                         | 1.566     | 2.227        | 3.466        | 3.723        | 5.322     | 2.424     | 100.0%  | 100   |
|  -> duration                  | 1.566     | 2.227        | 3.466        | 3.723        | 5.322     | 2.424     | 100.0%  | 100   |
|  -> idle_duration             | 0.0       | 0.0          | 0.0          | 0.0          | 0.0       | 0.0       | 100.0%  | 100   |
+-------------------------------+-----------+--------------+--------------+--------------+-----------+-----------+---------+-------+

Load duration: 25.90234
Full duration: 105.568619

As you can see, there is a significant improvement when this patch is applied: the 95th percentile for neutron.list_security_groups dropped from 3.144s to 1.462s, while neutron.create_security_group stayed about the same.
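To put a number on the change, the neutron.list_security_groups rows from the two Rally runs in this comment can be compared directly. The snippet below is just a small arithmetic sketch: the values are copied from the tables, and the helper name reduction_pct is made up for illustration. Each run is a single 100-iteration sample, so the percentages should be read as indicative rather than as a rigorous benchmark.

```python
# neutron.list_security_groups response times in seconds, copied from the
# two Rally tables in this comment (without and with the patch applied).
without_patch = {"median": 1.809, "p90": 2.801, "p95": 3.144, "max": 4.374, "avg": 1.925}
with_patch = {"median": 0.722, "p90": 1.159, "p95": 1.462, "max": 1.856, "avg": 0.734}


def reduction_pct(old, new):
    """Relative reduction of new versus old, in percent (hypothetical helper)."""
    return (old - new) / old * 100.0


reductions = {
    stat: reduction_pct(without_patch[stat], with_patch[stat])
    for stat in without_patch
}

for stat, pct in reductions.items():
    print(
        f"{stat}: {without_patch[stat]:.3f}s -> {with_patch[stat]:.3f}s "
        f"({pct:.1f}% lower)"
    )
```

On these figures every statistic for the list call drops by more than half, which is consistent with the patch addressing the list path rather than the create path.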
The patch is already in the rhos-15.0-trunk-patches branch, so IMO it should be included in the next OSP-15 puddle: https://code.engineering.redhat.com/gerrit/gitweb?p=neutron.git;a=log;h=refs/heads/rhos-15.0-trunk-patches

I will close this BZ as CURRENTRELEASE for now, but feel free to reopen it if you still have the same issue with the newest OSP-15. In that case, please also provide details about the environment on which you hit this issue. If possible, access to your environment would also help us debug what is going on and why it happened.