Bug 2053681 - [STF 1.4] Expose ringBufferSize parameter to ServiceTelemetry.clouds
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Service Telemetry Framework
Classification: Red Hat
Component: service-telemetry-operator-container
Version: 1.4
Hardware: Unspecified
OS: Unspecified
Priority: medium
Severity: medium
Target Milestone: z3
Target Release: 1.4 (STF)
Assignee: Leif Madsen
QA Contact: Leonid Natapov
Docs Contact: Joanne O'Flynn
URL:
Whiteboard:
Depends On: 2051615
Blocks:
Reported: 2022-02-11 18:25 UTC by Leif Madsen
Modified: 2022-05-16 12:52 UTC (History)

Fixed In Version: service-telemetry-operator-container-1.4.2-2
Doc Type: Enhancement
Doc Text:
The default sg-bridge ring buffer size has been increased from 2048 to 16384 to accommodate larger messages that might come from Ceilometer. Before this change, larger Ceilometer messages might have been corrupted between the sg-bridge ring buffer and the socket, before sg-core consumed them. This change also exposes the sg-bridge ring buffer size and count values along with the verbose option, which can be useful for debugging. Administrators do not need to make any changes to use the new values, because the defaults have been updated.
Clone Of:
Clones: 2053683
Environment:
Last Closed: 2022-05-16 12:52:13 UTC
Target Upstream Version:
Embargoed:


Links
System | ID | Private | Priority | Status | Summary | Last Updated
Github | infrawatch service-telemetry-operator pull 314 | 0 | None | open | Expose minimal set of bridge controls for SGs (#312) | 2022-03-10 20:10:46 UTC
Red Hat Issue Tracker | STF-977 | 0 | None | None | None | 2022-02-11 18:30:15 UTC
Red Hat Product Errata | RHBA-2022:2276 | 0 | None | None | None | 2022-05-16 12:52:20 UTC

Description Leif Madsen 2022-02-11 18:25:54 UTC
Description of problem: The ringBufferSize parameter for the deployment of Smart Gateways is not exposed and is currently hard coded. As a result, some large Ceilometer messages are corrupted in the ring buffer before they are made available to the socket and consumed by sg-core.


Version-Release number of selected component (if applicable): STF 1.4.1


How reproducible: Always


Steps to Reproduce:
1. Deploy ServiceTelemetry
2. Connect RHOSP 16.2 to STF
3. Enable debugMessages: true in the cloud configuration on STF
4. View the logs and see that periodically some messages are corrupted, e.g.

2022-02-10 22:15:11 [DEBUG] failed handling message [error: ceilometer.OsloSchema.Request: isObjectEnd: object ended prematurely, unexpected char r, error found in #10 byte of ...|Sw��{"request": {|..., bigger context ...|ycast/ceilometer/cops04-metering.sampleSw��{"request": {"oslo.version": "2.0", "oslo.message": "|..., handler: ceilometer-metrics[socket0]]

Additional info: The issue is that the `--rbs` flag is not passed to the bridge container in the Smart Gateway Deployment. As a result, the ring buffer message size falls back to its default of 2048, so some larger Ceilometer messages (events, metrics) are cut off and cannot be parsed by sg-core.
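
To illustrate why a cut-off message cannot be parsed, here is a minimal sketch. It is a simplified model, not the sg-bridge implementation: `copy_through_slot` and the sample payload are hypothetical, and the only values taken from this report are the slot sizes 2048 and 16384.

```python
import json


def copy_through_slot(message: bytes, slot_size: int) -> bytes:
    """Model a fixed-size buffer slot: bytes beyond slot_size are dropped."""
    return message[:slot_size]


def parses_as_json(payload: bytes) -> bool:
    """Return True if the payload is a complete JSON document."""
    try:
        json.loads(payload)
        return True
    except ValueError:  # json.JSONDecodeError is a subclass of ValueError
        return False


# Hypothetical Ceilometer-like payload, deliberately larger than 2048 bytes.
sample = json.dumps(
    {"request": {"oslo.version": "2.0", "oslo.message": "x" * 4000}}
).encode()

small = copy_through_slot(sample, 2048)   # old default slot size
large = copy_through_slot(sample, 16384)  # new default slot size

print(len(sample), parses_as_json(small), parses_as_json(large))
```

With the smaller slot the JSON object ends prematurely, which is the same class of failure as the `isObjectEnd: object ended prematurely` error in the log above.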

The fix is to allow the RBS value to be set, and to increase the default to 16384 when deploying Smart Gateways. The maximum size is 65535 (the maxBufferSize constant in sg-core).
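
As a rough sketch of the intended behaviour (not the operator's actual code), the value could be resolved and bounded as follows. The function names are hypothetical, and passing the value as a separate argument after `--rbs` is an assumption; only the flag name, the default of 16384, and the maximum of 65535 come from this report.

```python
DEFAULT_RING_BUFFER_SIZE = 16384  # new default proposed in this report
MAX_RING_BUFFER_SIZE = 65535      # maxBufferSize constant in sg-core


def resolve_ring_buffer_size(requested=None):
    """Return the size to use, falling back to the default when unset."""
    if requested is None:
        return DEFAULT_RING_BUFFER_SIZE
    if isinstance(requested, bool) or not isinstance(requested, int):
        raise ValueError("ringBufferSize must be an integer")
    if not 1 <= requested <= MAX_RING_BUFFER_SIZE:
        raise ValueError(
            f"ringBufferSize must be between 1 and {MAX_RING_BUFFER_SIZE}"
        )
    return requested


def bridge_args(requested=None):
    """Build the bridge arguments (assumed form: flag followed by value)."""
    return ["--rbs", str(resolve_ring_buffer_size(requested))]


print(bridge_args())       # uses the default
print(bridge_args(65535))  # largest accepted value
```

Rejecting out-of-range values, rather than silently clamping them, is a design choice made for this sketch so that a misconfiguration is visible at deploy time.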

Comment 10 errata-xmlrpc 2022-05-16 12:52:13 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Release of components for Service Telemetry Framework 1.4.3 - Container Images), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2022:2276

