Note: This bug is displayed in read-only format because the product is no longer active in Red Hat Bugzilla.

Bug 1609227

Summary: memory leak on 0.18 on overwriting priority messages in ring prio queue
Product: Red Hat Enterprise MRG
Component: qpid-cpp
Version: 2.5
Status: CLOSED UPSTREAM
Severity: high
Priority: high
Hardware: x86_64
OS: Linux
Reporter: Pavel Moravec <pmoravec>
Assignee: messaging-bugs <messaging-bugs>
QA Contact: Messaging QE <messaging-qe-bugs>
CC: gsim, jross
Last Closed: 2025-02-10 03:59:24 UTC
Type: Bug

Description Pavel Moravec 2018-07-27 10:21:31 UTC
Description of problem:
When messages in a priority ring queue are overwritten (due to the combination of message priority and the ring policy), a memory leak occurs. It is due to:

/usr/src/debug/qpid-0.18/cpp/src/qpid/broker/PriorityQueue.h :

    /** FIFO index of all messages (including acquired messages) for fast browsing and indexing */
    MessageDeque fifo;

where this deque is accumulating more and more messages.

I *think* the reason is that PriorityQueue::erase purges "messages" (the per-priority collection of deques, which always looked correct during the reproducer) but not "fifo" (the deque that grew over time, accumulating many DELETED or REMOVED messages).

This leak is present on primary and also backup broker in HA cluster, as well as on a standalone broker.

This leak is present in 0.18-49 but fixed in 1.36 (which, however, has a different but similar leak; a separate bug will be filed soon).


Version-Release number of selected component (if applicable):
qpid-cpp 0.18-49
NOT present in 1.36


How reproducible:
100%


Steps to Reproduce:
- create ring priority queue, send messages with priority 9 and then in a loop with prio 7 (or any other lower prio)

queue=Ring_1
qpid-receive -a "${queue}; {create:always, node:{ x-declare:{arguments:{'qpid.max_count':1000, 'qpid.policy_type':'ring', 'x-qpid-priorities':10}}}}" 

qpid-send --priority=9 -m 1000 -a "${queue}"

while true; do date; qpid-send --priority=7 -m 1000 -a "${queue}"; done

- observe memory footprint
- optionally, in gdb, check "fifo" deque size of that queue


Actual results:
- memory grows over time
- "fifo" deque grows over time


Expected results:
- no memory leak


Additional info:
in gdb:

(gdb) thread 6
[Switching to thread 6 (Thread 0x7f853d2fa7a0 (LWP 28590))]#0  0x00007f853addc1c3 in epoll_wait () from /lib64/libc.so.6
(gdb) frame 4
#4  0x000000000040fbdb in qpid::broker::QpiddDaemon::child (this=<value optimized out>) at posix/QpiddBroker.cpp:149
149	        brokerPtr->run();
(gdb) p brokerPtr->px->queues
$1 = {queues = std::map with 26 elements = {.... ["Ring_1"] = {px = 0x7f84f01419a0, pn = {pi_ = 0x7f84f012e690}}}, 
  lock = {<boost::noncopyable_::noncopyable> = {<No data fields>}, rwlock = {__data = {__lock = 0, __nr_readers = 0, __readers_wakeup = 12, __writer_wakeup = 0, __nr_readers_queued = 0, 
        __nr_writers_queued = 0, __writer = 0, __shared = 0, __pad1 = 0, __pad2 = 0, __flags = 0}, __size = "\000\000\000\000\000\000\000\000\f", '\000' <repeats 46 times>, __align = 0}}, 
  counter = 1, store = 0x197a840, events = 0x0, parent = 0x197d6a0, lastNode = false, broker = 0x197ab10}
(gdb) p *((qpid::broker::PriorityQueue *) ((qpid::broker::Queue*) 0x7f84f01419a0)->messages->_M_ptr)
$2 = {<qpid::broker::Messages> = {_vptr.Messages = 0x7f853d0d57b0}, levels = 10, messages = std::vector of length 10, capacity 10 = {std::deque with 0 elements, std::deque with 0 elements, 
    std::deque with 0 elements, std::deque with 0 elements, std::deque with 0 elements, std::deque with 0 elements, std::deque with 0 elements, std::deque with 86097 elements = {
      0x7f83c4779948, .....}, std::deque with 13903 elements = {0x7f84f9a89bc8, .....}, std::deque with 0 elements}, 
  fifo = {<qpid::broker::Messages> = {_vptr.Messages = 0x7f853d0d52f0}, messages = std::deque with 110111542 elements = {{payload = {px = 0x7f84feacf010}, 


Note that "fifo" is a "std::deque with 110111542 elements", while "messages" holds only ~100k messages in two priority buckets.

Comment 1 Alan Conway 2018-08-02 20:03:02 UTC
This problem is not trivial to fix.

PriorityQueue uses a MessageDeque to track the non-prioritized FIFO order of messages for browsing.

MessageDeque is not suitable for a sparse queue: if some old messages stay on the queue while new ones arrive, the MessageDeque store is "padded" with empty entries from the oldest message to the newest sequence number. Cleaning up empty entries at the ends of the queue doesn't help; as long as the old message stays in place, every new message forces the store to be re-padded up to the latest sequence number, so process memory still grows.

To fix this, PriorityQueue needs a data structure that can efficiently represent a sparse queue. That is not very hard to implement, but qpidd doesn't have such a structure now, AFAIK. This is a risky fix for all users of priority queues.

Comment 2 Gordon Sim 2018-08-02 20:25:53 UTC
I don't understand why the reproducer from the description would cause a sparse queue. The priority 9 messages should be delivered to the receiver. Then there is just a steady stream of priority 7 messages. Assuming the ingress exceeds the egress, the ring policy will kick in and it *should* remove the oldest priority 7 messages. What am I missing?

Comment 3 Alan Conway 2018-08-07 15:14:07 UTC
There is no receiver. The problem is that a queue with max-count grows in memory without limit under continuous sending if nobody receives the messages. The priority 7 messages are being deleted correctly, but the priority 9 messages stay there forever, so the MessageDeque grows without limit.

Additional cleaning code doesn't help; the deque just gets re-allocated to span the message ids from the old message to the latest.

Comment 4 Gordon Sim 2018-08-07 16:27:34 UTC
Ah, ok. I was thrown off track by the qpid-receive in the reproducer. I guess that is exiting immediately though and is just being used to create the queue.

I note also in the description that it says this bug is not present in 1.36 and indeed I can't seem to reproduce sustained memory growth with the reproducer as described against latest master.

Comment 5 Alan Conway 2018-08-07 19:24:48 UTC
(In reply to Gordon Sim from comment #4)
> Ah, ok. I was thrown off track by the qpid-receive in the reproducer. I
> guess that is exiting immediately though and is just being used to create
> the queue.
> 
> I note also in the description that it says this bug is not present in 1.36
> and indeed I can't seem to reproduce sustained memory growth with the
> reproducer as described against latest master.

+1 - my feeling is this bug isn't present in 1.36 and Bug 1609230 is not the same issue.

Comment 6 Red Hat Bugzilla 2025-02-10 03:59:24 UTC
This product has been discontinued or is no longer tracked in Red Hat Bugzilla.