This bug was initially created as a copy of Bug #1771994

I am copying this bug because:

We noticed this issue in RHOS13 in a customer environment where collectd was configured to send data to gnocchi, but collectd was facing an issue communicating with gnocchi. So it seems this issue can be seen in environments which are deployed with a write plugin (to send metrics) but where the destination endpoint is not accessible. In the customer environment, this led to an ovs-vswitchd crash.

Description of problem:
If the overcloud is deployed with collectd, but collectd is not configured with any write plugin or any destination collectd server it can send data to, a memory leak is noticed. The collectd process's resident memory grows to 20 GB within a few hours: during every collection cycle the process visibly grows in memory usage. Collectd should have some configuration to discard collected data, rather than storing it in memory, when it is not configured with a destination collectd server or write plugin.

Version-Release number of selected component (if applicable):
RHOS15z1

How reproducible:
Always

Steps to Reproduce:
1.
2.
3.

Actual results:

Expected results:

Additional info:
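For context, upstream collectd provides global options that cap the internal write queue so metrics are dropped instead of piling up in memory when no writer can consume them. A minimal collectd.conf sketch (values are illustrative, not taken from the customer environment):

~~~
# collectd.conf (global section) -- illustrative values only
# Above WriteQueueLimitHigh the daemon drops new metrics outright;
# between the Low and High watermarks it drops them probabilistically.
WriteQueueLimitHigh 100
WriteQueueLimitLow  80
~~~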
We will need to backport a fix that allows setting a maximum size for the in-memory queue which collectd holds.
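The fix in question is presumably the SendQueueLimit option for the amqp1 write plugin referenced below. A hedged sketch of what the resulting plugin configuration might look like (host, port, transport/instance names and values are placeholders, and the exact placement of the option may differ between collectd versions):

~~~
<Plugin amqp1>
  <Transport "metrics">
    Host "qdr.example.com"   # placeholder AMQP 1.0 router
    Port "5666"
    Address "collectd"
    # Cap the number of messages queued for sending so an unreachable
    # endpoint does not cause unbounded memory growth.
    SendQueueLimit 50
    <Instance "telemetry">
      Format JSON
    </Instance>
  </Transport>
</Plugin>
~~~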
There are two bzs for osp13 to address a memory issue with amqp1; these are targeted for the osp13z13 release.

- https://bugzilla.redhat.com/show_bug.cgi?id=1817124 (fix a memory issue with amqp1)
- https://bugzilla.redhat.com/show_bug.cgi?id=1861716 (puppet-collectd change to add the SendQueueLimit)

In general, there is https://access.redhat.com/solutions/4855731 which mentions

~~~
collectd::write_queue_limit_high: 1000000
collectd::write_queue_limit_low: 800000
~~~

These settings are intended to limit the write queue length in collectd (the values should probably be much lower, e.g. 80 - 100). Unfortunately, that setting does not affect either the python plugin (used for writing to gnocchi) or the amqp1 plugin, which is what the above-mentioned bugs address.
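For OSP/TripleO deployments, one way to apply the KCS settings with the lower values suggested above is through hieradata in an environment file. This is a minimal sketch under the assumption that ExtraConfig is used to pass the puppet-collectd parameters (file name and values are illustrative):

~~~
# collectd-queue-limits.yaml -- hypothetical environment file
parameter_defaults:
  ExtraConfig:
    collectd::write_queue_limit_high: 100
    collectd::write_queue_limit_low: 80
~~~

Such a file would be included at deploy time with the usual `openstack overcloud deploy ... -e collectd-queue-limits.yaml`.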