
Bug 1805468

Summary: Improve performance of Smart-Gateway AMQP code
Product: Red Hat OpenStack
Reporter: Chris Sibbitt <csibbitt>
Component: Service Telemetry Framework
Assignee: Chris Sibbitt <csibbitt>
Status: CLOSED CURRENTRELEASE
QA Contact: Leonid Natapov <lnatapov>
Severity: medium
Docs Contact:
Priority: high
Version: 16.0 (Train)
CC: augol, ealcaniz, lmadsen, mgarciac, mmagr, mrunge, sclewis
Target Milestone: z2
Keywords: Triaged
Target Release: 16.1 (Train on RHEL 8.2)
Hardware: x86_64
OS: Unspecified
Whiteboard:
Fixed In Version: openstack-tripleo-heat-templates-11.3.2-1.20200914170155.29a02c1.el8ost
Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2020-10-29 10:51:20 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:
Bug Depends On: 1846039
Bug Blocks:

Description Chris Sibbitt 2020-02-20 20:34:56 UTC
Description of problem:

The existing implementation of the Smart Gateway shows degraded performance above ~15k metrics/s and maxes out at ~30k metrics/s.

Version-Release number of selected component (if applicable):
1.1.0

How reproducible:
100%

Steps to Reproduce:
1. Send 10k metrics per second and observe delivery rate
2. Send 20k metrics per second and observe delivery rate
3. Send 40k metrics per second and observe delivery rate
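
For illustration only, the steps above could be driven by a paced sender along the following lines. This is a minimal sketch, not the tooling used for this bug: send_at_rate, delivery_ratio and the send callable are hypothetical names, and the real send path (an AMQP client publishing towards the Smart Gateway) is assumed to be supplied by the caller.

```python
import time


def send_at_rate(send, rate_per_s, duration_s,
                 clock=time.monotonic, sleep=time.sleep):
    """Call send(i) at roughly rate_per_s for duration_s seconds.

    `send` is a caller-supplied callable (hypothetical here); it would wrap
    whatever client actually publishes one metric.
    Returns (number_sent, achieved_rate_per_s).
    """
    if rate_per_s <= 0 or duration_s <= 0:
        raise ValueError("rate_per_s and duration_s must be positive")
    interval = 1.0 / rate_per_s
    start = clock()
    deadline = start + duration_s
    sent = 0
    while True:
        now = clock()
        if now >= deadline:
            break
        # Schedule from the start time so pacing errors do not accumulate.
        next_due = start + sent * interval
        if now < next_due:
            sleep(min(next_due, deadline) - now)
            continue
        send(sent)
        sent += 1
    elapsed = clock() - start
    return sent, sent / elapsed


def delivery_ratio(delivered, sent):
    """Fraction of sent metrics that were observed as delivered."""
    return delivered / sent if sent else 0.0
```

Running it once per target rate (10k, 20k, 40k) and comparing the achieved send rate with the delivered count observed on the receiving side gives the delivery rate for each step. A single Python process may not sustain the higher rates, so several senders in parallel would likely be needed.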

Actual results:
Depending on the exact hardware specs, the delivery rate degrades at 20k metrics/s and flat-lines at 40k metrics/s.


Expected results:
The next bottleneck in our system is the QDR router, which can process ~100k metrics/s, so the goal is to perform in that range.


Additional info:
The qpid electron library used in the Smart Gateway is known to be the source of the performance bottleneck, and internal experiments have shown improvements when that code is replaced with a socket-based connection to a native qpid-proton receiver.
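
As a rough sketch of what the consuming side of such a socket-based connection might look like: the report does not say which socket type or framing is used, so the Unix datagram socket, the one-message-per-datagram framing and the name drain_datagrams are all assumptions made for illustration. The native qpid-proton receiver is assumed to run as a separate process that writes to the other end of the socket.

```python
import socket


def drain_datagrams(sock, max_messages, bufsize=65535):
    """Read up to max_messages datagrams from a non-blocking socket.

    Assumes one message per datagram (an assumption, not confirmed by the
    bug report). Stops early when no more data is queued.
    """
    messages = []
    while len(messages) < max_messages:
        try:
            data = sock.recv(bufsize)
        except BlockingIOError:
            # Nothing queued right now; let the caller decide when to retry.
            break
        messages.append(data)
    return messages


def make_local_pair():
    """Create a connected Unix datagram pair for local experiments.

    The first socket stands in for the native receiver (writer side),
    the second for the consumer (reader side, non-blocking).
    """
    writer, reader = socket.socketpair(socket.AF_UNIX, socket.SOCK_DGRAM)
    reader.setblocking(False)
    return writer, reader
```

The point of this layout is that the AMQP protocol work stays in the native receiver, while the consumer only performs plain socket reads.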

Comment 5 Scott Lewis 2020-04-19 19:07:05 UTC
Removing Target Milestone; please replan

Comment 9 Chris Sibbitt 2020-06-11 14:30:53 UTC
Adding a link to an OpenStack Gerrit review that disables presettled mode for metrics delivery. This has been shown to improve performance and reliability on both our existing code and the new, more performant Smart Gateway code.

Comment 10 Chris Sibbitt 2020-07-13 15:29:24 UTC
Upstream OpenStack patches for the change in message settlement have merged and are ready for the next import for 16.1.2.

Additional improvements are coming in an STF server-side update (non-OSP code), currently tracked by https://bugzilla.redhat.com/show_bug.cgi?id=1846039

Comment 11 Leif Madsen 2020-07-20 20:13:34 UTC
Linking this BZ as the parent for additional BZs to track release work of the artifacts. This BZ will track changes to THT that will be imported in a future release.

Comment 16 Lon Hohberger 2020-10-29 10:51:20 UTC
According to our records, this should be resolved by openstack-tripleo-heat-templates-11.3.2-1.20200914170156.el8ost. This build is available now.