Bug 1136294
| Summary: | Memory leak when purging priority ring queue | ||||||
|---|---|---|---|---|---|---|---|
| Product: | Red Hat Enterprise MRG | Reporter: | Pavel Moravec <pmoravec> | ||||
| Component: | qpid-cpp | Assignee: | Pavel Moravec <pmoravec> | ||||
| Status: | CLOSED ERRATA | QA Contact: | Zdenek Kraus <zkraus> | ||||
| Severity: | high | Docs Contact: | |||||
| Priority: | high | ||||||
| Version: | 2.5 | CC: | aconway, iboverma, jross, lzhaldyb, mcressma, pematous, zkraus | ||||
| Target Milestone: | 2.5.5 | ||||||
| Target Release: | --- | ||||||
| Hardware: | All | ||||||
| OS: | All | ||||||
| Whiteboard: | |||||||
| Fixed In Version: | qpid-cpp-0.18-31 | Doc Type: | Bug Fix | ||||
| Doc Text: |
It was discovered that when messages were purged from a priority ring queue, the message counters were not properly maintained. This caused messages to be deleted incorrectly, which caused memory to be consumed unnecessarily. The message counters are now properly maintained for this particular scenario. Messages are purged properly, and memory no longer accumulates during repeated purging operations.
|
Story Points: | --- | ||||
| Clone Of: | Environment: | ||||||
| Last Closed: | 2014-10-22 16:34:32 UTC | Type: | Bug | ||||
| Regression: | --- | Mount Type: | --- | ||||
| Documentation: | --- | CRM: | |||||
| Verified Versions: | Category: | --- | |||||
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |||||
| Cloudforms Team: | --- | Target Upstream Version: | |||||
| Embargoed: | |||||||
| Bug Depends On: | |||||||
| Bug Blocks: | 1140368, 1140369 | ||||||
| Attachments: |
|
||||||
Yet another reproducer, again relying on Queue::purge method: use QMF method reroute. Reproducer:
qpid-config add exchange fanout AltForQueue1
qpid-config add exchange fanout AltForQueue2
for q in Queue1 Queue2; do
qpid-config add queue $q --argument x-qpid-priorities=10 --max-queue-count=5000 --limit-policy=ring --alternate-exchange=AltFor${q}
done
qpid-config bind AltForQueue2 Queue1
qpid-config bind AltForQueue1 Queue2
qpid-send -a Queue1 -m 3000 --content-size=100
while true; do
for q in Queue1 Queue2; do
ps aux | grep qpid | grep daemon
./reroute_queue $q AltFor${q} 0 false > /dev/null 2>&1
done
done
Root cause in source code found.
Queue::purge method executes:
1) messages->removeIf. Here, PriorityQueue::removeIf calls "fifo.clean()".
2) dequeue(0, *qmsg); is called for all dequeued messages. Queue::dequeue executes:
deleted = removed || messages->deleted(msg);
if (deleted) {
if (policy.get()) policy->dequeued(msg);
but PriorityQueue::deleted checks it has empty fifo so it returns false. And queue policy is _not_ updated, keeping itself reference to the message and not updating QMF counters.
resolving this fixes also bz1136029.
Quite probable patch (just building test packages):
diff -rup _qpid-cpp-0.18-30_orig/cpp/src/qpid/broker/PriorityQueue.cpp _qpid-cpp-0.18-30_new/cpp/src/qpid/broker/PriorityQueue.cpp
--- _qpid-cpp-0.18-30_orig/cpp/src/qpid/broker/PriorityQueue.cpp 2014-09-02 10:44:20.398153124 +0200
+++ _qpid-cpp-0.18-30_new/cpp/src/qpid/broker/PriorityQueue.cpp 2014-09-03 15:37:47.693868410 +0200
@@ -140,7 +140,7 @@ void PriorityQueue::removeIf(Predicate p
for (int priority = 0; priority < levels; ++priority) {
for (Deque::iterator i = messages[priority].begin(); i != messages[priority].end();) {
if ((*i)->status == QueuedMessage::AVAILABLE && p(**i)) {
- (*i)->status = QueuedMessage::DELETED; // Updates fifo index
+ (*i)->status = QueuedMessage::REMOVED; // Updates fifo index
i = messages[priority].erase(i);
clearCache();
} else {
Created attachment 934111 [details]
Patch proposal
PriorityQueue::removeIf method has to mark removed messages as REMOVED and not DELETED (cf MessageDequeue::removeIf method), otherwise the messages are removed immediatelly and queue policy is not updated.
(In reply to Pavel Moravec from comment #4) > Created attachment 934111 [details] > Patch proposal > > PriorityQueue::removeIf method has to mark removed messages as REMOVED and > not DELETED (cf MessageDequeue::removeIf method), otherwise the messages are > removed immediatelly and queue policy is not updated. This patch looks good to me, provided that the testing bears it out. *** Bug 1136029 has been marked as a duplicate of this bug. *** This was tested on RHEL5, RHEL6 && i686, x86_64 with following packages: python-qpid-0.18-13 python-qpid-qmf-0.18-27 qpid-cpp-client-0.18-32 qpid-cpp-client-devel-0.18-32 qpid-cpp-client-devel-docs-0.18-32 qpid-cpp-client-ssl-0.18-32 qpid-cpp-server-0.18-32 qpid-cpp-server-devel-0.18-32 qpid-cpp-server-ssl-0.18-32 qpid-cpp-server-store-0.18-32 qpid-cpp-server-xml-0.18-32 qpid-java-client-0.18-8 qpid-java-common-0.18-8 qpid-java-example-0.18-8 qpid-jca-0.18-8 qpid-jca-xarecovery-0.18-8 qpid-qmf-0.18-27 qpid-tools-0.18-10 ->VERIFIED Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://rhn.redhat.com/errata/RHBA-2014-1682.html |
Description of problem: See Summary and reproducer. Valgrind does not show any leak. I suspect RingQueuePolicy accumulates the memory as gdb shows: (gdb) p (*(qpid::broker::RingQueuePolicy *)(policy)) $11 = {<qpid::broker::QueuePolicy> = {_vptr.QueuePolicy = 0x7f62d2ebaef0, static defaultMaxSize = 104857600, maxCount = 5000, maxSize = 104857600, type = "ring", count = 31409, size = 157045000, policyExceeded = true, queue = 0x7f62b405c640, .. See maxCount being smaller than count, and "queue" has some members with size_ in thousands. Version-Release number of selected component (if applicable): qpid-cpp-server 0.18-30 How reproducible: 100% Steps to Reproduce: (see https://access.redhat.com/site/solutions/70456 for purge_queue) rm -rf /var/lib/qpidd/* service qpidd restart qpid-config add queue PrioRing --argument x-qpid-priorities=10 --max-queue-count=1000 --limit-policy=ring while true; do qpid-send -a PrioRing -m 700 --content-size=50 #--ttl=10 ./purge_queue PrioRing 0 > /dev/null 2>&1 ps aux | grep qpidd | grep daemon done Actual results: RSS and also CPU usage of qpidd process grows in every iteration. Expected results: RSS stabilises after a while, CPU usage as well (though that is rather a side effect) Additional info: - priority queue is essential for reproducer. Ring queue as well but just due to bz1136029 - it is probably a regression caused by bz860691 - another reproducer (worth to verify as well) relies on purging expired messages: echo "queue-purge-interval=1" > /etc/qpidd.conf rm -rf /var/lib/qpidd/* service qpidd restart qpid-config add queue PrioRing --argument x-qpid-priorities=10 --max-queue-count=10000 --limit-policy=ring while true; do qpid-send -a PrioRing -m 7000 --content-size=50 --ttl=10 sleep 1.3 qpid-receive -a "PrioRing; {mode:browse}" -m 100000 | sed "s/X//g" | nl # should be empty output, just to confirm the queue is really empty ps aux | grep qpidd | grep daemon done