Bug 1136294 - Memory leak when purging priority ring queue
Summary: Memory leak when purging priority ring queue
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Enterprise MRG
Classification: Red Hat
Component: qpid-cpp
Version: 2.5
Hardware: All
OS: All
high
high
Target Milestone: 2.5.5
: ---
Assignee: Pavel Moravec
QA Contact: Zdenek Kraus
URL:
Whiteboard:
: 1136029 (view as bug list)
Depends On:
Blocks: 1140368 1140369
TreeView+ depends on / blocked
 
Reported: 2014-09-02 09:46 UTC by Pavel Moravec
Modified: 2018-12-09 18:29 UTC (History)
7 users (show)

Fixed In Version: qpid-cpp-0.18-31
Doc Type: Bug Fix
Doc Text:
It was discovered that when messages were purged from a priority ring queue, the message counters were not properly maintained. This caused messages to be deleted incorrectly, which caused memory to be consumed unnecessarily. The message counters are now properly maintained for this particular scenario. Messages are purged properly, and memory no longer accumulates during repeated purging operations.
Clone Of:
Environment:
Last Closed: 2014-10-22 16:34:32 UTC
Target Upstream Version:
Embargoed:


Attachments (Terms of Use)
Patch proposal (880 bytes, patch)
2014-09-03 14:15 UTC, Pavel Moravec
no flags Details | Diff


Links
System ID Private Priority Status Summary Last Updated
Red Hat Product Errata RHBA-2014:1682 0 normal SHIPPED_LIVE Red Hat Enterprise MRG Messaging 2 update 2014-10-22 20:34:01 UTC

Description Pavel Moravec 2014-09-02 09:46:53 UTC
Description of problem:
See Summary and reproducer. Valgrind does not show any leak. I suspect RingQueuePolicy accumulates the memory as gdb shows:

(gdb) p (*(qpid::broker::RingQueuePolicy *)(policy))
$11 = {<qpid::broker::QueuePolicy> = {_vptr.QueuePolicy = 0x7f62d2ebaef0, static defaultMaxSize = 104857600, maxCount = 5000, 
    maxSize = 104857600, type = "ring", count = 31409, size = 157045000, policyExceeded = true, queue = 0x7f62b405c640, 
..

See maxCount being smaller than count, and "queue" has some members with size_ in thousands.


Version-Release number of selected component (if applicable):
qpid-cpp-server 0.18-30


How reproducible:
100%


Steps to Reproduce:
(see https://access.redhat.com/site/solutions/70456 for purge_queue)

rm -rf /var/lib/qpidd/*
service qpidd restart

qpid-config add queue PrioRing --argument x-qpid-priorities=10 --max-queue-count=1000 --limit-policy=ring

while true; do
        qpid-send -a PrioRing -m 700 --content-size=50 #--ttl=10
        ./purge_queue PrioRing 0 > /dev/null 2>&1
        ps aux | grep qpidd | grep daemon
done


Actual results:
RSS and also CPU usage of qpidd process grows in every iteration.


Expected results:
RSS stabilises after a while, CPU usage as well (though that is rather a side effect)


Additional info:
- priority queue is essential for reproducer. Ring queue as well but just due to bz1136029
- it is probably a regression caused by bz860691
- another reproducer (worth to verify as well) relies on purging expired messages:

echo "queue-purge-interval=1" > /etc/qpidd.conf
rm -rf /var/lib/qpidd/*
service qpidd restart

qpid-config add queue PrioRing --argument x-qpid-priorities=10 --max-queue-count=10000 --limit-policy=ring

while true; do
        qpid-send -a PrioRing -m 7000 --content-size=50 --ttl=10
	sleep 1.3
	qpid-receive -a "PrioRing; {mode:browse}" -m 100000 | sed "s/X//g" | nl # should be empty output, just to confirm the queue is really empty
        ps aux | grep qpidd | grep daemon
done

Comment 1 Pavel Moravec 2014-09-03 12:28:02 UTC
Yet another reproducer, again relying on Queue::purge method: use QMF method reroute. Reproducer:

qpid-config add exchange fanout AltForQueue1
qpid-config add exchange fanout AltForQueue2
for q in Queue1 Queue2; do
	qpid-config add queue $q --argument x-qpid-priorities=10 --max-queue-count=5000 --limit-policy=ring --alternate-exchange=AltFor${q}
done
qpid-config bind AltForQueue2 Queue1
qpid-config bind AltForQueue1 Queue2

qpid-send -a Queue1 -m 3000 --content-size=100

while true; do
	for q in Queue1 Queue2; do
	        ps aux | grep qpid | grep daemon
		./reroute_queue $q AltFor${q} 0 false > /dev/null 2>&1
	done
done

Comment 2 Pavel Moravec 2014-09-03 12:59:20 UTC
Root cause in source code found.

Queue::purge method executes:

1) messages->removeIf. Here, PriorityQueue::removeIf calls "fifo.clean()".

2) dequeue(0, *qmsg); is called for all dequeued messages. Queue::dequeue executes:

            deleted = removed || messages->deleted(msg);
            if (deleted) {
              if (policy.get()) policy->dequeued(msg);

but PriorityQueue::deleted checks it has empty fifo so it returns false. And queue policy is _not_ updated, keeping itself reference to the message and not updating QMF counters.

resolving this fixes also bz1136029.

Comment 3 Pavel Moravec 2014-09-03 13:39:44 UTC
Quite probable patch (just building test packages):

diff -rup _qpid-cpp-0.18-30_orig/cpp/src/qpid/broker/PriorityQueue.cpp _qpid-cpp-0.18-30_new/cpp/src/qpid/broker/PriorityQueue.cpp
--- _qpid-cpp-0.18-30_orig/cpp/src/qpid/broker/PriorityQueue.cpp	2014-09-02 10:44:20.398153124 +0200
+++ _qpid-cpp-0.18-30_new/cpp/src/qpid/broker/PriorityQueue.cpp	2014-09-03 15:37:47.693868410 +0200
@@ -140,7 +140,7 @@ void PriorityQueue::removeIf(Predicate p
     for (int priority = 0; priority < levels; ++priority) {
         for (Deque::iterator i = messages[priority].begin(); i != messages[priority].end();) {
             if ((*i)->status == QueuedMessage::AVAILABLE && p(**i)) {
-                (*i)->status = QueuedMessage::DELETED; // Updates fifo index
+                (*i)->status = QueuedMessage::REMOVED; // Updates fifo index
                 i = messages[priority].erase(i);
                 clearCache();
             } else {

Comment 4 Pavel Moravec 2014-09-03 14:15:01 UTC
Created attachment 934111 [details]
Patch proposal

PriorityQueue::removeIf method has to mark removed messages as REMOVED and not DELETED (cf MessageDequeue::removeIf method), otherwise the messages are removed immediatelly and queue policy is not updated.

Comment 5 Alan Conway 2014-09-03 17:13:42 UTC
(In reply to Pavel Moravec from comment #4)
> Created attachment 934111 [details]
> Patch proposal
> 
> PriorityQueue::removeIf method has to mark removed messages as REMOVED and
> not DELETED (cf MessageDequeue::removeIf method), otherwise the messages are
> removed immediatelly and queue policy is not updated.

This patch looks good to me, provided that the testing bears it out.

Comment 6 Pavel Moravec 2014-09-03 17:50:34 UTC
*** Bug 1136029 has been marked as a duplicate of this bug. ***

Comment 8 Zdenek Kraus 2014-10-02 14:01:42 UTC
This was tested on RHEL5, RHEL6 && i686, x86_64 with following packages:

python-qpid-0.18-13
python-qpid-qmf-0.18-27
qpid-cpp-client-0.18-32
qpid-cpp-client-devel-0.18-32
qpid-cpp-client-devel-docs-0.18-32
qpid-cpp-client-ssl-0.18-32
qpid-cpp-server-0.18-32
qpid-cpp-server-devel-0.18-32
qpid-cpp-server-ssl-0.18-32
qpid-cpp-server-store-0.18-32
qpid-cpp-server-xml-0.18-32
qpid-java-client-0.18-8
qpid-java-common-0.18-8
qpid-java-example-0.18-8
qpid-jca-0.18-8
qpid-jca-xarecovery-0.18-8
qpid-qmf-0.18-27
qpid-tools-0.18-10


->VERIFIED

Comment 11 errata-xmlrpc 2014-10-22 16:34:32 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://rhn.redhat.com/errata/RHBA-2014-1682.html


Note You need to log in before you can comment on or make changes to this bug.