Bug 1132395
| Summary: | Memory leak when using federation with unreliable link | ||
|---|---|---|---|
| Product: | Red Hat Enterprise MRG | Reporter: | Pavel Moravec <pmoravec> |
| Component: | qpid-cpp | Assignee: | messaging-bugs <messaging-bugs> |
| Status: | CLOSED ERRATA | QA Contact: | Zdenek Kraus <zkraus> |
| Severity: | high | Docs Contact: | |
| Priority: | high | ||
| Version: | 2.5 | CC: | esammons, gsim, iboverma, jross, lzhaldyb, mcressma, pematous, zkraus |
| Target Milestone: | 2.5.5 | ||
| Target Release: | --- | ||
| Hardware: | All | ||
| OS: | All | ||
| Whiteboard: | |||
| Fixed In Version: | qpid-cpp-0.18-31 | Doc Type: | Bug Fix |
| Doc Text: |
It was discovered that when sending messages over a federation link with no acking, the message delivery records were not properly marked when the message was delivered. This caused the broker memory footprint to increase unnecessarily with each message. The fix ensures the DeliveryRecord is correctly marked when the message is delivered, allowing the memory for the message to be released. The memory footprint of the broker can now decrease when messages are delivered.
|
Story Points: | --- |
| Clone Of: | Environment: | ||
| Last Closed: | 2014-10-22 16:34:28 UTC | Type: | Bug |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | Category: | --- | |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | |||
| Bug Depends On: | |||
| Bug Blocks: | 1140368, 1140369 | ||
|
Description
Pavel Moravec
2014-08-21 09:36:41 UTC
An even simpler reproducer without federation, using an unreliable consumer:
qpid-receive -a "amq.fanout; { link:{reliability:unreliable} }" --print-content=no -f &
while true; do
qpid-send -a amq.fanout -m 1 --content-size=16777216
ps aux | grep qpidd | grep -v grep
qpid-stat -q
sleep 1
done
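To quantify the growth instead of reading it off the `ps aux` output, the loop above could log the broker's resident set size once per iteration. The following is only a minimal sketch under stated assumptions: it assumes a Linux host with `/proc` mounted, and the helper names `rss_kb` and `sample_rss` are hypothetical, not part of the qpid tools.

```shell
#!/bin/sh
# Hypothetical helper: print the resident set size (in kB) of the process
# with the given PID, read from the VmRSS line of /proc/<pid>/status.
rss_kb() {
    awk '/^VmRSS:/ { print $2 }' "/proc/$1/status"
}

# Hypothetical helper: print one sample as "<epoch seconds> <rss kB>".
sample_rss() {
    printf '%s %s\n' "$(date +%s)" "$(rss_kb "$1")"
}
```

Inside the reproducer loop, a call such as `sample_rss "$(pgrep -o -x qpidd)" >> qpidd-rss.log` (assuming a single broker process and an arbitrary log file name) would give a time series that should keep rising while the unreliable consumer is attached and fall once it is stopped.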
FYI, valgrind does not show any leaked memory, and once I stop the unreliable consumer, memory utilization drops again. Still, this is not intended behaviour: a consumer can stay connected for weeks, and the memory growth (though limited to the consumer's life-cycle) can lead to the OOM killer stopping the broker.

Checked against upstream qpid and there is no such behaviour.

The DeliveryRecords in qpid::broker::SemanticState::unacked keep track of the messages. I.e. after sending 13 messages via the reproducer in comment #1:
unacked = std::deque with 13 elements = {{msg = {payload = {px = 0x7f1a8c002250}, ..

The bug appears to have been originally introduced quite far back:
http://git.app.eng.bos.redhat.com/git/rh-qpid.git/commit/qpid/cpp/src/qpid/broker/SemanticState.cpp?h=0.18-mrg&id=6010991600d605d2a82a0c64a105a7ceabecffae

I can observe large memory growth when sending large messages over a fed link with no acking even for mrg2.3-checkpoint10.

I've created a branch with a fix for this:
http://git.app.eng.bos.redhat.com/git/rh-qpid.git/log/?h=0.18-mrg_BZ1132395

(In reply to Gordon Sim from comment #4)
> The bug appears to have been originally introduced quite far back:
>
> http://git.app.eng.bos.redhat.com/git/rh-qpid.git/commit/qpid/cpp/src/qpid/broker/SemanticState.cpp?h=0.18-mrg&id=6010991600d605d2a82a0c64a105a7ceabecffae
>
> I can observe large memory growth when sending large messages over a fed
> link with no acking even for mrg2.3-checkpoint10.
>
> I've created a branch with a fix for this:
>
> http://git.app.eng.bos.redhat.com/git/rh-qpid.git/log/?h=0.18-mrg_BZ1132395

Both reproducers from comment #0 and comment #3 are fixed by that: no memory growth during the repro execution. Good work, Gordon!

Tested on RHEL 5, 6, 7 and i686, x86_64, with the following packages:
python-qpid-0.18-13.el6
python-qpid-qmf-0.18-28.el6
qpid-cpp-client-0.18-35.el6
qpid-cpp-client-devel-0.18-35.el6
qpid-cpp-client-devel-docs-0.18-35.el6
qpid-cpp-client-ssl-0.18-35.el6
qpid-cpp-server-0.18-35.el6
qpid-cpp-server-devel-0.18-35.el6
qpid-cpp-server-ssl-0.18-35.el6
qpid-cpp-server-store-0.18-35.el6
qpid-cpp-server-xml-0.18-35.el6
qpid-java-client-0.18-8.el6_4
qpid-java-common-0.18-8.el6_4
qpid-java-example-0.18-8.el6_4
qpid-jca-0.18-8.el6
qpid-jca-xarecovery-0.18-8.el6
qpid-qmf-0.18-28.el6
qpid-tools-0.18-10.el6_4

-> VERIFIED

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.
https://rhn.redhat.com/errata/RHBA-2014-1682.html