Note: This bug is displayed in read-only format because the product is no longer active in Red Hat Bugzilla.

Bug 1132395

Summary: Memory leak when using federation with unreliable link
Product: Red Hat Enterprise MRG
Reporter: Pavel Moravec <pmoravec>
Component: qpid-cpp
Assignee: messaging-bugs <messaging-bugs>
Status: CLOSED ERRATA
QA Contact: Zdenek Kraus <zkraus>
Severity: high
Priority: high
Version: 2.5
CC: esammons, gsim, iboverma, jross, lzhaldyb, mcressma, pematous, zkraus
Target Milestone: 2.5.5
Target Release: ---
Hardware: All
OS: All
Fixed In Version: qpid-cpp-0.18-31
Doc Type: Bug Fix
Doc Text:
It was discovered that when sending messages over a federation link with no acking, the message delivery records were not properly marked when the message was delivered. This caused the broker memory footprint to increase unnecessarily with each message. The fix ensures the DeliveryRecord is correctly marked when the message is delivered, allowing the memory for the message to be released. The memory footprint of the broker can now decrease when messages are delivered.
Last Closed: 2014-10-22 16:34:28 UTC
Type: Bug
Bug Blocks: 1140368, 1140369    

Description Pavel Moravec 2014-08-21 09:36:41 UTC
Description of problem:
With a federation link that has no --ack set (i.e. an unreliable link), the source broker leaks memory for every message passed through the link. The memory increase corresponds to the size of the message passed.


Version-Release number of selected component (if applicable):
0.18-29


How reproducible:
100%


Steps to Reproduce:
1) Start 2 brokers on nodes mrg01 and mrg02.
2) Set up a trivial federation link:
qpid-config add queue fromQueue -b mrg01
qpid-route queue add mrg02 mrg01 amq.fanout fromQueue
3) Send messages to the queue and monitor memory utilization:

while true; do
  qpid-send -a fromQueue -b mrg01 -m 1 --content-size=16777216
  ssh mrg01 "ps aux | grep qpidd | grep -v grep"
  qpid-stat -q -b mrg01
  sleep 1
done
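
The raw "ps aux" output from the loop above can be hard to compare between iterations, so here is a rough sketch of helpers that sample the broker's resident set size and print the per-iteration delta. rss_kb and report_growth are hypothetical names, not part of the qpid tools, and the sketch assumes a Linux procps "ps" (supporting -C and -o rss=) with a single qpidd process, run on the broker host itself.

```shell
#!/bin/sh
# Hypothetical helpers (not part of the qpid tools).

# Print the resident set size in kB of the first qpidd process, or 0 if none
# is found. Assumes procps ps: -C selects by command name, "rss=" drops the header.
rss_kb() {
  ps -C qpidd -o rss= 2>/dev/null |
    awk 'NR==1 {print $1; found=1} END {if (!found) print 0}'
}

# Print one summary line: iteration number, current RSS and growth since the
# previous sample. Arguments: iteration, previous RSS (kB), current RSS (kB).
report_growth() {
  iter=$1 prev=$2 cur=$3
  echo "iteration $iter: rss=$cur kB delta=$((cur - prev)) kB"
}

# Example use on the broker host (left commented out so the helpers can be
# sourced without starting the loop):
# prev=$(rss_kb); i=0
# while true; do
#   qpid-send -a fromQueue -b mrg01 -m 1 --content-size=16777216
#   cur=$(rss_kb); i=$((i + 1))
#   report_growth "$i" "$prev" "$cur"; prev=$cur
#   sleep 1
# done
```

With the 16777216-byte messages used above, a leak of one message per iteration would show up as a delta of roughly 16384 kB per line.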


Actual results:
Memory grows in every iteration by approximately the message size (16 MiB in the loop above).


Expected results:
No memory growth.


Additional info:
- The growth is also visible with small messages; it just takes much longer to confirm the leak.
- With a reliable federation link no leak occurs, so adding "--ack=5" to the qpid-route command is a valid workaround.
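
For illustration, the workaround from the last point might look like the sketch below. route_cmd is a hypothetical wrapper that only prints the command line rather than running it; placing --ack before the "queue add" subcommand is an assumption about qpid-route's option ordering, and the host, exchange and queue names are the ones from the reproducer.

```shell
#!/bin/sh
# Hypothetical wrapper (not part of the qpid tools): print the qpid-route
# command for a queue route, optionally with acknowledgement batching.
# Arguments: dest-broker src-broker exchange queue [ack-batch-size]
route_cmd() {
  dest=$1 src=$2 exchange=$3 queue=$4 ack=${5:-}
  if [ -n "$ack" ]; then
    # Reliable link: transfers are acknowledged in batches of $ack.
    echo "qpid-route --ack=$ack queue add $dest $src $exchange $queue"
  else
    # Unreliable link, as in the original reproducer.
    echo "qpid-route queue add $dest $src $exchange $queue"
  fi
}

# Print the reliable variant of the route used in the reproducer.
route_cmd mrg02 mrg01 amq.fanout fromQueue 5
```

The printed command can be reviewed and then run by hand in place of the unreliable "qpid-route queue add" call from step 2.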

Comment 1 Pavel Moravec 2014-08-21 09:54:00 UTC
An even simpler reproducer without federation: use an unreliable consumer:

qpid-receive -a "amq.fanout; { link:{reliability:unreliable} }" --print-content=no -f &

while true; do
  qpid-send -a amq.fanout -m 1 --content-size=16777216
  ps aux | grep qpidd | grep -v grep
  qpid-stat -q
  sleep 1
done

Comment 2 Pavel Moravec 2014-08-21 10:08:59 UTC
FYI, valgrind does not show any leaked memory, and once I stop the unreliable consumer, memory utilization drops back down.

This is still not the intended behaviour. The consumer can stay connected for weeks, and the memory increase (though limited to the consumer's life-cycle) can then cause the OOM killer to stop the broker.

Checked against upstream qpid and there is no such behaviour.

Comment 3 Pavel Moravec 2014-08-21 11:23:37 UTC
The DeliveryRecords member qpid::broker::SemanticState::unacked keeps track of the messages.

For example, after sending 13 messages via the reproducer in comment #1: unacked = std::deque with 13 elements = {{msg = {payload = {px = 0x7f1a8c002250}, ..

Comment 4 Gordon Sim 2014-08-21 20:52:56 UTC
The bug appears to have been originally introduced quite far back:

http://git.app.eng.bos.redhat.com/git/rh-qpid.git/commit/qpid/cpp/src/qpid/broker/SemanticState.cpp?h=0.18-mrg&id=6010991600d605d2a82a0c64a105a7ceabecffae

I can observe large memory growth when sending large messages over a fed link with no acking even for mrg2.3-checkpoint10. 

I've created a branch with a fix for this:

http://git.app.eng.bos.redhat.com/git/rh-qpid.git/log/?h=0.18-mrg_BZ1132395

Comment 5 Pavel Moravec 2014-08-22 10:00:19 UTC
(In reply to Gordon Sim from comment #4)

Both reproducers (from comment #0 and comment #1) are fixed by that branch: no memory growth during the repro execution.

Good work, Gordon!

Comment 11 Zdenek Kraus 2014-10-16 16:54:53 UTC
Tested on RHEL 5, 6, 7 (i686, x86_64) with the following packages:
python-qpid-0.18-13.el6
python-qpid-qmf-0.18-28.el6
qpid-cpp-client-0.18-35.el6
qpid-cpp-client-devel-0.18-35.el6
qpid-cpp-client-devel-docs-0.18-35.el6
qpid-cpp-client-ssl-0.18-35.el6
qpid-cpp-server-0.18-35.el6
qpid-cpp-server-devel-0.18-35.el6
qpid-cpp-server-ssl-0.18-35.el6
qpid-cpp-server-store-0.18-35.el6
qpid-cpp-server-xml-0.18-35.el6
qpid-java-client-0.18-8.el6_4
qpid-java-common-0.18-8.el6_4
qpid-java-example-0.18-8.el6_4
qpid-jca-0.18-8.el6
qpid-jca-xarecovery-0.18-8.el6
qpid-qmf-0.18-28.el6
qpid-tools-0.18-10.el6_4


->VERIFIED

Comment 13 errata-xmlrpc 2014-10-22 16:34:28 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://rhn.redhat.com/errata/RHBA-2014-1682.html