Bug 1350023

Summary: Memory leak in primary broker when creating&unsubscribing from an autoDel queue in a loop
Product: Red Hat Enterprise MRG Reporter: Pavel Moravec <pmoravec>
Component: qpid-cpp Assignee: Alan Conway <aconway>
Status: CLOSED ERRATA QA Contact: Messaging QE <messaging-qe-bugs>
Severity: high Docs Contact:
Priority: high    
Version: 3.2 CC: aconway, jross, mcressma, tkratky, zkraus
Target Milestone: 3.2.2   
Target Release: ---   
Hardware: x86_64   
OS: Linux   
Whiteboard:
Fixed In Version: qpid-cpp-0.34-16 Doc Type: Bug Fix
Doc Text:
Cause: when creating then unsubscribing from an autodelete queue, unnecessary information was being stored in the primary cluster node and never released.
Consequence: memory use would increase over time.
Fix: removed the code that stored the unneeded information.
Result: memory now does not accumulate.
Story Points: ---
Clone Of: Environment:
Last Closed: 2016-10-11 07:36:53 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---

Description Pavel Moravec 2016-06-24 20:49:16 UTC
Description of problem:
With a consumer that, in a loop:
- creates an autoDelete queue
- subscribes to it
- unsubscribes
against an HA cluster, the primary broker's memory consumption grows over time.


Version-Release number of selected component (if applicable):
0.34-6
0.34-15


How reproducible:
100%


Steps to Reproduce:
1. Start 3 brokers in an HA cluster (my reproducer uses these options:

qpidd --port=5672 --store-dir=_5672 --log-to-file=qpidd.5672.log --data-dir=_5672 --auth=no --log-to-stderr=no --ha-cluster=yes --ha-brokers-url=localhost:5672,localhost:5673,localhost:5674,localhost:5675,localhost:5676 --ha-replicate=all --acl-file=/root/qpidd.acl --link-maintenance-interval=5
)

2. Run simple qpid-receive in a loop:
while true; do
  qpid-receive -a "autoDelQueue;  {create:always, node:{ x-declare:{auto-delete:True}}}"
  sleep 0.1
done

3. Monitor memory usage of primary broker.
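
For step 3, a minimal sketch of one way to sample the broker's resident set size on Linux by reading /proc. The function names rss_kb and sample_rss, the log file name, and the 10-second default interval are assumptions for illustration and not part of the original reproducer; the PID of the primary qpidd process has to be supplied by hand.

```shell
# Print the resident set size (VmRSS, in KB) of the process with PID $1.
rss_kb() {
    while read -r key val _; do
        if [ "$key" = "VmRSS:" ]; then
            echo "$val"
        fi
    done < "/proc/$1/status"
}

# Append "<epoch seconds> <rss KB>" for PID $1 to file $2 every $3 seconds
# (default 10) for as long as the process exists.
sample_rss() {
    pid=$1; out=$2; interval=${3:-10}
    while [ -d "/proc/$pid" ]; do
        echo "$(date +%s) $(rss_kb "$pid")" >> "$out"
        sleep "$interval"
    done
}

# Usage (hypothetical PID and file name):
#   sample_rss 12345 rss.5672.log 10
```

Running one sampler per broker makes it easy to see that only the primary grows while the backups stay flat.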


Actual results:
- memory usage grows over time (only on the primary broker; the backups are OK)


Expected results:
- no memory usage growth


Additional info:
- I *think* an HA cluster is a necessary condition (i.e. the bug does not occur on a standalone broker), but I will test it

Comment 1 Pavel Moravec 2016-06-25 20:59:56 UTC
Additional testing revealed that:

- a standalone broker does not exhibit the memory leak

- even a single broker running in an HA cluster on its own does not; backup brokers are required for the leak

- replicator bridge queues on the primary are almost always empty; no bursts of messages occur there

- only the primary broker is affected; the backups are OK

- the amount of leaked memory does not depend on the number of backups (very similar memory usage with 1, 2 or 4 backups)

- valgrind does not show any leaked or excessive "still reachable" memory, even after a 1-hour test during which memory consumption under valgrind grew visibly

- curiously, every run of "qpid-stat -q" causes _additional_ memory to be leaked, maybe because it uses an auxiliary autoDel queue as well (but just one, while the leak is much bigger)?

- bug is present in versions:
0.34-15
0.34-6
0.34-5
(today's upstream r1750209)

- the leak is _not_ present in 0.30-8, so it is a regression since then

- --ha-replicate=all is crucial; even --ha-replicate=configuration does _not_ trigger the leak

- adding --enable-qmf2=no prevents most of the leak: memory still grows (sic!) but noticeably slower
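
To compare configurations like the ones above by number rather than by eye (for example with and without --enable-qmf2=no), the growth rate can be estimated from periodic memory samples. A minimal sketch, assuming a hypothetical log file with one "<epoch seconds> <RSS in KB>" pair per line; the function name growth_kb_per_min and the log format are illustrative and were not used in this report.

```shell
# Estimate average RSS growth in KB per minute from the first and last
# samples in a log of "<epoch seconds> <rss KB>" lines ($1 is the log file).
growth_kb_per_min() {
    awk 'NR == 1 { t0 = $1; r0 = $2 }
         { t1 = $1; r1 = $2 }
         END {
             # Need at least two samples with different timestamps.
             if (NR < 2 || t1 == t0) { print "n/a"; exit }
             printf "%.1f\n", (r1 - r0) * 60 / (t1 - t0)
         }' "$1"
}

# Usage (hypothetical file name):
#   growth_kb_per_min rss.5672.log
```

Using only the first and last samples keeps the estimate simple; it assumes the run is long enough that startup allocation is small compared with the leak.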

Comment 2 Alan Conway 2016-06-27 21:46:12 UTC
Fixed on branch:

http://git.app.eng.bos.redhat.com/git/rh-qpid.git/commit/?h=0.34-mrg-aconway-bz1350023&id=a499d3f470604ec598e697e0d0ca8257553444f9

Bug 1350023 - Memory leak in primary broker when creating&unsubscribing from an autoDel queue in a loop
(branch 0.34-mrg-aconway-bz1350023)
Removed redundant code that was keeping replicated queues in a map,
left over from earlier code that was not completely removed.

Comment 7 errata-xmlrpc 2016-10-11 07:36:53 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://rhn.redhat.com/errata/RHBA-2016-2049.html