Bug 1096744 - Increased memory requirements of MRG-M 3.0, as compared to 2.3
Summary: Increased memory requirements of MRG-M 3.0, as compared to 2.3
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Enterprise MRG
Classification: Red Hat
Component: qpid-cpp
Version: 3.0
Hardware: All
OS: Linux
Priority: high
Severity: high
Target Milestone: 3.1
Assignee: Gordon Sim
QA Contact: Zdenek Kraus
URL:
Whiteboard:
Depends On:
Blocks:
 
Reported: 2014-05-12 11:18 UTC by Pavel Moravec
Modified: 2018-12-05 18:32 UTC
CC List: 8 users

Fixed In Version: qpid-cpp-0.30-2
Doc Type: Bug Fix
Doc Text:
Previously, message state was shared between all the queues on which a message was enqueued. This behavior was incorrect, because certain elements of that state track information specific to the message on a particular queue. Correcting this required duplicating state per queue, which significantly increased memory consumption. The amount of duplicated (per-queue) state tracked for each message is now optimised to reduce the amount of memory required. Though a given scenario may still (unavoidably) require a little more memory than before, memory consumption is much lower than with the unoptimised per-queue state.
Clone Of:
Environment:
Last Closed: 2015-04-14 13:48:03 UTC
Target Upstream Version:


Attachments


Links
System | ID | Priority | Status | Summary | Last Updated
Apache JIRA | QPID-5783 | None | None | None | Never
Red Hat Bugzilla | 867826 | None | None | None | Never
Red Hat Product Errata | RHEA-2015:0805 | normal | SHIPPED_LIVE | Red Hat Enterprise MRG Messaging 3.1 Release | 2015-04-14 17:45:54 UTC

Internal Links: 867826

Description Pavel Moravec 2014-05-12 11:18:10 UTC
Description of problem:
Comparing memory requirements of MRG-M 3.0 against latest MRG 2 broker (0.18-20), I see 3.0 consumes up to 4 times more (!) memory in one scenario.

Basic use case: have X (e.g. 100) queues bound to a fanout exchange. Send Y (a few thousand) messages to the exchange and compare the memory consumed by qpidd.

The more queues you use, the worse 3.0 behaves. The more messages you send (for considerably high X), the worse 3.0 behaves.


Version-Release number of selected component (if applicable):
0.22-38


How reproducible:
100%


Steps to Reproduce:
1. Have a script like:
#!/bin/sh
# Arguments: <queues> <durable> <messages> <size>
service qpidd restart
queues=$1
durable=$2
messages=$3
size=$4

# Create the queues and bind each of them to amq.fanout.
for i in $(seq 1 $queues); do
        qpid-receive -a "queue_${i}; {create:always, node:{type:queue, durable:${durable}, x-bindings:[{ exchange:'amq.fanout', queue:'queue_${i}' }], x-declare:{ arguments:{ 'qpid.file_size':640 }}}}" &
done
wait

# Fan the messages out to all the queues (messages are always sent durable;
# $durable only controls queue durability).
qpid-send -a "amq.fanout" -m ${messages} --durable=yes --content-size=${size}

# Check the broker's memory usage (RSS column).
ps aux | grep qpid | grep daemon


2. Run the script against the 0.18-20 and 0.22-* brokers with parameters like the following (an example invocation is shown below):
100 false 1000 0
100 false 100000 0
500 false 1000 0
500 false 10000 0
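
For example, assuming the script above is saved as repro.sh (the filename is only illustrative), the first parameter set is run as:

# 100 non-durable queues, 1000 messages with empty content
./repro.sh 100 false 1000 0

# The broker's resident set size (RSS, in kB) can also be read directly
# instead of scanning the full "ps aux" output:
ps -C qpidd -o pid=,rss=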


Actual results:
The 0.22-* broker shows much higher memory requirements in the "ps" output (RSS column).


Expected results:
Similar memory consumption for the 3.0 broker.


Additional info:
A few observations from the various tests I ran:
- when having just 1 queue, the memory requirements of 0.22 are usually a little bit _lower_ than those of 0.18. So in general, 0.22 manages memory better.
- The more queues you use, the worse 3.0 behaves. The more messages you send (for considerably high X), the worse 3.0 behaves.
- message content length has no visible impact on the issue (len=0 and len=1024 tested)
- message durability has no visible impact either


Raw data from my tests (figures are RSS in kB):
#queues durable msgs msg_len	0.18-20	0.22-38	0.22/0.18	
1 false 1000 0			9604	9900	103.1%	
1 false 10000 0			22084	19204	87.0%	
1 false 100000 0		126200	111400	88.3%	
1 false 100000 1024		235852	218260	92.5%	
1 true 100000 0			138696	122652	88.4%	
1 true 100000 1024		245504	233564	95.1%	
100 false 100000 0		369188	1333984	361.3%	
100 false 100000 1024		477516	1443052	302.2%	
100 false 1000 0		15512	25732	165.9%	
100 false 1000 1024		16732	24800	148.2%	
100 true 1000 0			112052	180872	161.4%	
100 true 1000 1024		182912	230932	126.3%	
500 false 1000 0		28676	85124	296.8%	
500 false 1000 1024		29760	85832	288.4%	
500 false 1000 0		28676	85124	296.8%	copy
500 false 2000 0		44240	147404	333.2%	
500 false 3000 0		57840	208868	361.1%	
500 false 4000 0		78864	271024	343.7%	
500 false 5000 0		93520	345960	369.9%	
500 false 6000 0		101724	399152	392.4%	
500 false 7000 0		123748	475364	384.1%	
500 false 8000 0		134720	533952	396.3%	
500 false 9000 0		146372	593464	405.4%	
500 false 10000 0		155136	651360	419.9%	
500 false 10000 1024		161764	661336	408.8%	
100 false 1000 0		15512	25732	165.9%	copy
200 false 1000 0		19768	42584	215.4%	
300 false 1000 0		23960	59692	249.1%	
400 false 1000 0		27964	70836	253.3%	
500 false 1000 0		28676	85124	296.8%	copy

Comment 2 Pavel Moravec 2014-05-14 07:58:37 UTC
This is somewhat related to https://bugzilla.redhat.com/show_bug.cgi?id=867826. Comparing memory requirements in this use case before and after a broker restart, between 0.18-20 and 0.22-38, I get interesting results (columns: scenario, 0.18-20 RSS, 0.22-38 RSS, in kB):


10queues, 100000msgs each, before restart		275672	339944
10queues, 100000msgs each, after restart		1324916	1160520
20queues, 100000msgs each, before restart		444836	582364
20queues, 100000msgs each, after restart		2642236	2308496
100queues, 10000msgs each, before restart		342388	442552
100queues, 10000msgs each, after restart		1424352	1260080
100queues, 10000msgs(1024b) each, before restart	377520	469464
100queues, 10000msgs(1024b) each, after restart		2500696	2331000

So, while memory consumption of 3.0 before broker restart is higher, it is lower after a restart.

Message size does not affect that.

All tests were run with legacy store.
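
(The before/after-restart figures were gathered roughly along these lines; a sketch assuming the reproduction script from comment #0, saved here as repro.sh for illustration, run with durable queues so the messages survive the restart; the exact commands may have differed:)

# Populate 10 durable queues with 100000 durable messages each and
# note the broker RSS (kB) before the restart.
./repro.sh 10 true 100000 0
ps -C qpidd -o rss=

# Restart the broker so the messages are recovered from the store,
# wait for recovery to finish, then note the RSS again.
service qpidd restart
ps -C qpidd -o rss=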

This observation somewhat lowers the urgency of fixing this BZ. Anyway, I would like to get an explanation for the phenomenon (the higher RAM usage before a restart).

Comment 3 Gordon Sim 2014-05-14 09:06:00 UTC
The issue is in the amount of message state that is shared between the queues. Before a restart, the 0.18 based code shares everything. This is in fact a bug. Certain changes are supposed to be specific to the message on a specific queue and do not apply to the same message on other queues.

This is fixed in 0.22-based builds, which is why the memory usage is higher. However, while we cannot eliminate the extra memory entirely (even a single 64-bit int matters when present in 500 copies of 10000 different messages), I think we can improve on the amount of extra memory used.

On recovering from disk, messages on queues share no state even if they originally represented the same message.

Comment 4 Pavel Moravec 2014-05-15 07:10:02 UTC
(In reply to Gordon Sim from comment #3)
> The issue is in the amount of message state that is shared between the
> queues. Before a restart, the 0.18 based code shares everything. This is in
> fact a bug. Certain changes are supposed to be specific to the message on a
> specific queue and do not apply to the same message on other queues.
> 
> This is fixed in 0.22-based builds, which is why the memory usage is higher.
> However, while we cannot eliminate the extra memory entirely (even a single
> 64-bit int matters when present in 500 copies of 10000 different messages),
> I think we can improve on the amount of extra memory used.
> 
> On recovering from disk, messages on queues share no state even if they
> originally represented the same message.

Thanks a lot for this sound explanation, really appreciated.

Due to Comment #2 (https://bugzilla.redhat.com/show_bug.cgi?id=1096744#c2), this BZ turns rather into a request for an explanation. With one issue explained, it remains to answer:

Why does the ratio "0.22_memory_utilization / 0.18_memory_utilization" get worse when enqueueing more messages?

For example:
- for 500 queues and 1000 messages sent, 0.18 has 28676 kB RSS and 0.22 has 85124 kB, i.e. 3 times more;
- with the same setup but 10000 messages sent, 0.18 has 155136 kB RSS and 0.22 has 651360 kB, i.e. 4 times more
  - why not "only" 3 times more, as for 1000 msgs?
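
A rough way to see why, assuming broker RSS grows linearly with the number of messages at a fixed queue count (the coefficients below are back-of-envelope fits to the 500-queue rows in comment #0 with n = messages/1000; they are illustrative, not measured constants):

\[
  M_{0.18}(n) \approx 14\,600 + 14\,050\,n \ \text{kB}, \qquad
  M_{0.22}(n) \approx 22\,200 + 62\,900\,n \ \text{kB}
\]
\[
  \frac{M_{0.22}(n)}{M_{0.18}(n)} \approx 3.0 \ \text{at} \ n = 1,
  \ \text{rising towards} \ \frac{62\,900}{14\,050} \approx 4.5 \ \text{as} \ n \to \infty
\]

i.e. once enough messages are enqueued, the larger per-message overhead of 0.22 dominates its relatively small fixed overhead, so the ratio keeps creeping up.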

Comment 5 Gordon Sim 2014-05-23 16:20:12 UTC
Some improvements checked in upstream: https://svn.apache.org/r1597121

Comment 6 Justin Ross 2014-06-17 17:38:33 UTC
Pavel, would you rerun your tests to find out what difference Gordon's changes made?

(In reply to Gordon Sim from comment #5)
> Some improvements checked in upstream: https://svn.apache.org/r1597121

Comment 7 Pavel Moravec 2014-06-17 18:26:22 UTC
(In reply to Justin Ross from comment #6)
> Pavel, would you rerun your tests to find out what difference Gordon's
> changes made?
> 
> (In reply to Gordon Sim from comment #5)
> > Some improvements checked in upstream: https://svn.apache.org/r1597121

I already did such a comparison. In general, for use cases with a few hundred queues bound to a fanout exchange, Gordon's improvement roughly halves the memory usage. See the CSV data (figures are RSS in kB):

#queues durable msgs msg_len;0.18-20;0.22-38;0.22/0.18;Upstream (r1601656);Upstream/0.18;Upstream/0.22
1 false 1000 0;9604;9900;103.1%;;;
1 false 10000 0;22084;19204;87.0%;;;
1 false 100000 0;126200;111400;88.3%;107968;85.6%;96.9%
1 false 100000 1024;235852;218260;92.5%;;;
1 true 100000 0;138696;122652;88.4%;;;
1 true 100000 1024;245504;233564;95.1%;;;
100 false 100000 0;369188;1333984;361.3%;602052;163.1%;45.1%
100 false 100000 1024;477516;1443052;302.2%;681876;142.8%;47.3%
100 false 1000 0;15512;25732;165.9%;17752;114.4%;69.0%
100 false 1000 1024;16732;24800;148.2%;;;
100 true 1000 0;112052;180872;161.4%;;;
100 true 1000 1024;182912;230932;126.3%;;;
500 false 1000 0;28676;85124;296.8%;48124;167.8%;56.5%
500 false 1000 1024;29760;85832;288.4%;51376;172.6%;59.9%
500 false 1000 0;28676;85124;296.8%;48124;167.8%;56.5%
500 false 2000 0;44240;147404;333.2%;75800;171.3%;51.4%
500 false 3000 0;57840;208868;361.1%;100572;173.9%;48.2%
500 false 4000 0;78864;271024;343.7%;125528;159.2%;46.3%
500 false 5000 0;93520;345960;369.9%;155612;166.4%;45.0%
500 false 6000 0;101724;399152;392.4%;181108;178.0%;45.4%
500 false 7000 0;123748;475364;384.1%;207596;167.8%;43.7%
500 false 8000 0;134720;533952;396.3%;231176;171.6%;43.3%
500 false 9000 0;146372;593464;405.4%;256724;175.4%;43.3%
500 false 10000 0;155136;651360;419.9%;281784;181.6%;43.3%
500 false 10000 1024;161764;661336;408.8%;293632;181.5%;44.4%
100 false 1000 0;15512;25732;165.9%;17752;114.4%;69.0%
200 false 1000 0;19768;42584;215.4%;23592;119.3%;55.4%
300 false 1000 0;23960;59692;249.1%;37968;158.5%;63.6%
400 false 1000 0;27964;70836;253.3%;45012;161.0%;63.5%
500 false 1000 0;28676;85124;296.8%;48124;167.8%;56.5%

Let me know if you are also interested in memory utilization after broker restart (though that is rather a topic of bz867826 ([RFE] QPid memory usage is not consistent across restart)).

Comment 8 Pavel Moravec 2014-06-17 18:27:27 UTC
Judging by the figures above, kudos to Gordon for decreasing memory utilization by 50% (in the relevant use cases)!

Comment 9 Justin Ross 2014-06-17 18:38:28 UTC
Thanks, Pavel and Gordon. Given comment 3, the memory increases that still exist fall within the expected range. -> POST

Comment 12 Zdenek Kraus 2015-01-26 19:23:44 UTC
Here are the summarized results. Qcnt, Mcnt and Msize are the queue count, message count and message content size; the two build columns show broker RSS in kB, and "result" is the 0.30/0.18 ratio:
:: x86_64,c++
---------------- + ---------------- + ---------------- + ----------------
Qcnt,Mcnt,Msize  | 18-36.el6.x86_64 | 30-5.el6.x86_64  | result
---------------- + ---------------- + ---------------- + ----------------
1,10000,1024     | 32636            | 30952            | 0.948400539282  
1,100000,0       | 125668           | 110016           | 0.875449597352  
10,100000,0      | 147856           | 154664           | 1.04604480035   
100,10000,0      | 47592            | 72200            | 1.51706169104   
1,10000,0        | 19428            | 19664            | 1.0121474161    
10,1000,2048     | 13152            | 14108            | 1.07268856448   
1,10000,2048     | 40604            | 45056            | 1.10964437001   
100,100000,0     | 371368           | 600272           | 1.61638051744   
10,1000,1024     | 13744            | 14772            | 1.07479627474   
10,10000,1024    | 32828            | 35764            | 1.08943584745   
500,1000,2048    | 42276            | 49236            | 1.16463241555   
100,10000,2048   | 68968            | 93504            | 1.35575919267   
100,10000,1024   | 60844            | 83468            | 1.37183617119   
100,1000,1024    | 22232            | 22552            | 1.01439366679   
500,1000,1024    | 42188            | 50780            | 1.20365980848   
10,10000,0       | 22244            | 24460            | 1.09962237008   
500,10000,0      | 150160           | 276788           | 1.84328716036   
500,10000,2048   | 175308           | 304136           | 1.73486663472   
500,10000,1024   | 161340           | 288024           | 1.78519895872   
10,10000,2048    | 42940            | 45720            | 1.06474149977   
1,1000,2048      | 11920            | 12912            | 1.08322147651   
1,1000,1024      | 15192            | 11984            | 0.788836229595  
100,1000,2048    | 25832            | 23664            | 0.916073087643  
---------------- + ---------------- + ---------------- + ----------------

:: i686,c++
---------------- + ---------------- + ---------------- + ----------------
Qcnt,Mcnt,Msize  | 18-36.el6.i686   | 30-5.el6.i686    | result
---------------- + ---------------- + ---------------- + ----------------
1,10000,1024     | 25620            | 25600            | 0.999219359875  
1,100000,0       | 82352            | 73152            | 0.888284437536  
10,100000,0      | 97324            | 105840           | 1.08750154124   
100,10000,0      | 34264            | 52728            | 1.53887462059   
1,10000,0        | 14072            | 15000            | 1.06594656055   
10,1000,2048     | 10908            | 11988            | 1.09900990099   
1,10000,2048     | 35672            | 35624            | 0.998654406818  
100,100000,0     | 244564           | 433184           | 1.77125006133   
10,1000,1024     | 9588             | 11000            | 1.14726741761   
10,10000,1024    | 26860            | 29140            | 1.08488458675   
500,1000,2048    | 32396            | 39296            | 1.21298925793   
100,10000,2048   | 53156            | 73672            | 1.38595831139   
100,10000,1024   | 43796            | 63524            | 1.45045209608   
100,1000,1024    | 14544            | 17312            | 1.1903190319    
500,1000,1024    | 34828            | 36616            | 1.0513380039    
10,10000,0       | 15960            | 18420            | 1.15413533835   
500,10000,0      | 108108           | 202148           | 1.86987086987   
500,10000,2048   | 124044           | 223112           | 1.79865209119   
500,10000,1024   | 118324           | 214132           | 1.80970893479   
10,10000,2048    | 36800            | 39144            | 1.06369565217   
1,1000,2048      | 10408            | 11456            | 1.10069177556   
1,1000,1024      | 8936             | 10436            | 1.1678603402    
100,1000,2048    | 15652            | 18152            | 1.15972399693   
---------------- + ---------------- + ---------------- + ----------------


So basically, between 0.18 and 0.30 the memory increase factor lies in the 0.9~2.0 interval, which is acceptable in my opinion.

Gordon and Pavel, could you please check?

Comment 13 Gordon Sim 2015-01-26 19:37:11 UTC
Also, the cases where there is between 50% and 100% increase are those with 100 or more queues and at least 10000 messages, i.e. 1 million message 'instances'. Ultimately what is acceptable will depend on use cases I guess, but I think this is a 'reasonable' situation.

Comment 14 Pavel Moravec 2015-01-26 19:57:39 UTC
(In reply to Gordon Sim from comment #13)
> Also, the cases where there is between 50% and 100% increase are those with
> 100 or more queues and at least 10000 messages, i.e. 1 million message
> 'instances'. Ultimately what is acceptable will depend on use cases I guess,
> but I think this is a 'reasonable' situation.

I agree. There is definitely a significant improvement over the "Raw data from my tests:" table in comment #0 (300% or 400% memory utilization there). Some increase is an acceptable drawback of the way the broker deals with messages on multiple queues.

Comment 15 Zdenek Kraus 2015-03-19 13:35:52 UTC
Reran with the latest qpid-cpp-server-0.30-7 and the results are still very satisfactory.

Marking this as VERIFIED.

Comment 17 errata-xmlrpc 2015-04-14 13:48:03 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://rhn.redhat.com/errata/RHEA-2015-0805.html

