Bug 2053681

Summary: [STF 1.4] Expose ringBufferSize parameter to ServiceTelemetry.clouds
Product: Service Telemetry Framework
Reporter: Leif Madsen <lmadsen>
Component: service-telemetry-operator-container
Assignee: Leif Madsen <lmadsen>
Status: CLOSED ERRATA
QA Contact: Leonid Natapov <lnatapov>
Severity: medium
Docs Contact: Joanne O'Flynn <joflynn>
Priority: medium
Version: 1.4
Keywords: Triaged, ZStream
Target Milestone: z3
Target Release: 1.4 (STF)
Hardware: Unspecified
OS: Unspecified
Whiteboard:
Fixed In Version: service-telemetry-operator-container-1.4.2-2
Doc Type: Enhancement
Doc Text:
The default value for the sg-bridge ring buffer size has been increased from 2048 to 16384 to account for larger messages that might come from Ceilometer. Before this change, larger Ceilometer messages might have been corrupted between the sg-bridge ring buffer and the socket before consumption by sg-core. This change also exposes the sg-bridge ring buffer size and count values, along with the verbose option, which can be useful for debugging. Administrators do not need to take any action to use the new values because the defaults have been updated.
Story Points: ---
Clone Of:
: 2053683
Environment:
Last Closed: 2022-05-16 12:52:13 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:
Bug Depends On: 2051615
Bug Blocks:

Description Leif Madsen 2022-02-11 18:25:54 UTC
Description of problem: The ringBufferSize parameter for the deployment of Smart Gateways is not exposed and is currently hard-coded. As a result, some large Ceilometer messages are corrupted in the ring buffer before being made available to the socket for consumption by sg-core.


Version-Release number of selected component (if applicable): STF 1.4.1


How reproducible: Always


Steps to Reproduce:
1. Deploy ServiceTelemetry
2. Connect RHOSP 16.2 to STF
3. Enable debugMessages: true in the cloud configuration on STF
4. View the logs and see that periodically some messages are corrupted, e.g.

2022-02-10 22:15:11 [DEBUG] failed handling message [error: ceilometer.OsloSchema.Request: isObjectEnd: object ended prematurely, unexpected char r, error found in #10 byte of ...|Sw��{"request": {|..., bigger context ...|ycast/ceilometer/cops04-metering.sampleSw��{"request": {"oslo.version": "2.0", "oslo.message": "|..., handler: ceilometer-metrics[socket0]]

Additional info: The issue is that the `--rbs` flag is not passed to the bridge container in the Smart Gateway Deployment. As a result, the ring buffer message size stays at its default of 2048, so some larger Ceilometer messages (events, metrics) are cut off and cannot be parsed by sg-core.
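
The failure mode described above can be illustrated with a minimal Python sketch. This is not sg-bridge or sg-core code: `pass_through_slot`, `parses_after_slot`, and the sample payload are hypothetical, and the sketch assumes the simplest possible model, in which a message larger than the slot size is simply cut off at the slot boundary.

```python
import json


def pass_through_slot(message: bytes, slot_size: int) -> bytes:
    """Model a fixed-size slot: bytes beyond slot_size are dropped (assumed behavior)."""
    return message[:slot_size]


def parses_after_slot(message: bytes, slot_size: int) -> bool:
    """Return True if the message is still valid JSON after passing through the slot."""
    try:
        json.loads(pass_through_slot(message, slot_size))
        return True
    except ValueError:  # json.JSONDecodeError is a subclass of ValueError
        return False


# Hypothetical oversized payload, loosely shaped like the message in the log above.
sample = json.dumps(
    {"request": {"oslo.version": "2.0", "oslo.message": "x" * 5000}}
).encode()

fits_old_default = parses_after_slot(sample, 2048)
fits_new_default = parses_after_slot(sample, 16384)
```

Under this model, a payload of roughly 5 KB no longer parses with a 2048-byte slot but passes through intact with a 16384-byte slot.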

The fix here is to allow the RBS value to be set, and to increase the default sizing to 16384 when deploying Smart Gateways. The maximum size is 65535 (maxBufferSize configured in sg-core as a constant).
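
As a rough sketch of the bounds stated above, a candidate ring buffer size could be checked before it is applied. The helper `check_ring_buffer_size` is hypothetical and not part of sg-bridge, sg-core, or the operator. The default and maximum come from this report, while the lower bound of 1 is an assumption.

```python
DEFAULT_RING_BUFFER_SIZE = 16384  # new default described in this report
MAX_RING_BUFFER_SIZE = 65535      # maximum stated in this report
MIN_RING_BUFFER_SIZE = 1          # assumed lower bound, not taken from the report


def check_ring_buffer_size(size: int = DEFAULT_RING_BUFFER_SIZE) -> int:
    """Return size unchanged if it is within the allowed range, else raise ValueError."""
    if not MIN_RING_BUFFER_SIZE <= size <= MAX_RING_BUFFER_SIZE:
        raise ValueError(
            f"ring buffer size {size} is outside "
            f"{MIN_RING_BUFFER_SIZE}..{MAX_RING_BUFFER_SIZE}"
        )
    return size
```

Calling `check_ring_buffer_size()` with no argument returns the new default, and a value above 65535 raises `ValueError`.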

Comment 10 errata-xmlrpc 2022-05-16 12:52:13 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Release of components for Service Telemetry Framework 1.4.3 - Container Images), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2022:2276