Bug 726379 - Qpidd possible memory leaks
Status: CLOSED CURRENTRELEASE
Alias: None
Product: Red Hat Enterprise MRG
Classification: Red Hat
Component: qpid-cpp
Version: 1.3
Hardware: Unspecified
OS: Unspecified
Priority: urgent
Severity: urgent
Target Milestone: 2.1
Assignee: Alan Conway
QA Contact: Leonid Zhaldybin
Blocks: 736160
 
Reported: 2011-07-28 12:38 UTC by ppecka
Modified: 2014-11-09 22:38 UTC
CC: 7 users

Fixed In Version: 0.14
Doc Type: Bug Fix
Doc Text:
Clone Of: 703590
Clones: 736160
Environment:
Last Closed: 2012-03-29 20:03:50 UTC
Target Upstream Version:


Attachments
Charts of memory use over time (8.17 KB, application/x-bzip-compressed-tar)
2011-08-03 20:59 UTC, Alan Conway


Links
Red Hat Bugzilla 703590 (Priority: high, Status: NEW): OOM on clustered qpidd node leads to cluster abort (last updated 2021-03-03 23:06:36 UTC)

Internal Links: 703590 723809

Comment 3 Alan Conway 2011-07-28 13:23:47 UTC
Which nodes were the perftest and txtest clients running on?

Comment 6 Alan Conway 2011-08-03 20:59:42 UTC
Created attachment 516579 [details]
Charts of memory use over time

I don't think there is a leak. The charts in comment 5 agree with my own observations over 1.5 days (see attached charts): qpidd goes through a growth phase where memory use increases in jumps, but then settles into a steady state. In one case the last jump came late, but I don't think that indicates long-term growth.

Note that RSS stays low and steady throughout, so we're not using a growing amount of physical memory.
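(For reference, here is a minimal way to watch both numbers over time; this is a sketch of the kind of sampling behind the attached charts, not the actual script used:)

    # Log VSZ and RSS (in KB) of the running broker once a minute.
    # Assumes a single local qpidd process; pidof and ps are standard on RHEL.
    while true; do
        echo "$(date +%s) $(ps -o vsz=,rss= -p "$(pidof qpidd)")"
        sleep 60
    done >> qpidd-mem.log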

Running the valgrind profiling tools massif and exp-dhat on a standalone broker shows that, while the vsize goes up to 400M, only about 30MB is ever allocated on the heap at any one time.
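(For anyone reproducing this, the runs were along these lines. This is a sketch: the exact broker options weren't recorded here, so --auth no and the scratch data dir are assumptions:)

    # Heap snapshot profile; massif writes massif.out.<pid>, summarized by ms_print.
    valgrind --tool=massif qpidd --auth no --data-dir /tmp/qpidd-prof
    ms_print massif.out.<pid>

    # Allocation-lifetime statistics; exp-dhat prints its report when the broker exits.
    valgrind --tool=exp-dhat qpidd --auth no --data-dir /tmp/qpidd-prof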

Note that the total size of the libraries linked with qpidd is about 400M:
(ldd ~/install/sbin/qpidd | awk '/=> \//{print $3}' | xargs wc -c)   # print each resolved library path, then total their sizes in bytes

The standalone broker looks healthy: it only goes up to 400M, which is presumably mostly code loaded from shared libs. The cluster and durable cases are using more memory, but they don't seem to be growing in the long run.

I think what we are seeing is memory fragmentation. The size initially grows because the allocator can't find enough contiguous free memory for new objects and has to extend the address space, and it stays high because, even though qpid is not using more than 30M, the virtual address space is full of "holes" of unused memory.
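(One way to see those holes directly, a sketch and not part of the original analysis, is to dump the broker's memory map and look for large mappings with very little resident memory behind them:)

    # Show mappings larger than 10MB; in pmap's extended format, column 2 is the
    # mapping size in KB and column 3 its RSS. Big anonymous mappings with tiny
    # RSS are the fragmentation "holes".
    pmap -x "$(pidof qpidd)" | awk '$2+0 > 10240'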

The memory fragmentation is worth investigating for possible performance improvements, but I don't think this is a critical bug that should hold up the release. 

The tests are still running so I'll check in to see if running longer does start to raise the memory use.

Comment 7 Alan Conway 2011-08-04 12:48:28 UTC
Tests from comment 6 are still running with no further memory increases after 48 hours. The clustered brokers are using 500M and 650M, the standalone durable broker 750M, and the plain standalone broker 400M.
 
I think the memory use should be investigated as it seems very inefficient and is probably fragmented (it looks like we only use about 10% of the virtual memory allocated), but I don't think the broker is growing without limit.

Comment 9 Alan Conway 2011-08-05 20:54:18 UTC
Fix committed to the release repo on branch aconway-bz726379, off the end of the mrg_1.3.0.x branch.

Ported to trunk as r1154377

Comment 12 Ted Ross 2012-03-29 20:03:50 UTC
This was fixed well prior to the 0.14 rebase.

Comment 13 Frantisek Reznicek 2012-03-30 09:48:56 UTC
CLOSED/CURRENTRELEASE -> ASSIGNED -> ON_QA
The defect has to go through QA process.

Comment 14 Leonid Zhaldybin 2012-04-12 14:25:00 UTC
The test was run on the latest available 0.14 build. The cluster under test consisted of four nodes, each running on a RHEL5.8 virtual machine. Two of the nodes were i386, the other two x86_64.
After running the test scripts for over 8 days, the memory consumption had not changed significantly on three of the nodes. Log files show that memory consumption on these nodes remained almost constant during the test run, fluctuating by ~10MB around the initial value. The fourth node (the one on which the testing script was running) showed a noticeable increase in memory usage, from an initial value of 130MB to almost 150MB (~20MB). This, however, is still a much better result than the one initially reported.
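(For the record, per-node numbers like these can be pulled out of a sampling log with a one-liner such as the following. The "timestamp vsz rss" log format matches the monitoring sketch in comment 6 and is an assumption, not the actual QA tooling:)

    # Report min/max/spread of the RSS column (KB); a spread of ~10MB matches
    # the three stable nodes, ~20MB of growth the fourth.
    awk 'NR==1 || $3<min {min=$3} $3>max {max=$3}
         END {printf "min=%dKB max=%dKB spread=%dKB\n", min, max, max-min}' qpidd-mem.log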
Resolution: this issue has been fixed.

Packages used for testing:
qpid-cpp-client-0.14-14.el5
qpid-cpp-client-devel-0.14-14.el5
qpid-cpp-client-devel-docs-0.14-14.el5
qpid-cpp-client-rdma-0.14-14.el5
qpid-cpp-client-ssl-0.14-14.el5
qpid-cpp-mrg-debuginfo-0.14-14.el5
qpid-cpp-server-0.14-14.el5
qpid-cpp-server-cluster-0.14-14.el5
qpid-cpp-server-devel-0.14-14.el5
qpid-cpp-server-rdma-0.14-14.el5
qpid-cpp-server-ssl-0.14-14.el5
qpid-cpp-server-store-0.14-14.el5
qpid-cpp-server-xml-0.14-14.el5
rh-qpid-cpp-tests-0.14-14.el5

-> VERIFIED

