Description of problem: With paging turned on, a large number of PagePositionImpl instances accumulate in memory and do not go away until garbage collection kicks in.
Fix and test using Byteman. The test replicates the issue 100% of the time: https://github.com/clebertsuconic/hornetq/commit/dc9f121cd0b23a062bb362b4f12bef916627c1e6
I read the customer ticket, and it seems that the customer can still see many PagePositionImpl instances in memory. I wrote a test simulating the customer issue and watched the memory using VisualVM. I also created a few heap dumps. The PagePositionImpl instances keep growing and are never garbage collected.

To reproduce the problem, follow these steps:

1. Clone our testsuite from git:
     git://git.app.eng.bos.redhat.com/jbossqe/eap-tests-hornetq.git
2. Go to eap-tests-hornetq/scripts and run the Groovy script PrepareServers.groovy with the -DEAP_VERSION=6.4.0.DR13 parameter (Groovy 2.2.1 or later must be used):
     groovy -DEAP_VERSION=6.4.0.DR13 PrepareServers.groovy
   (The script will prepare 4 servers, server1..4, in the current directory.)
3. Export these paths to the server directories and the mcast address:
     export JBOSS_HOME_1=$PWD/server1/jboss-eap
     export JBOSS_HOME_2=$PWD/server2/jboss-eap
     export JBOSS_HOME_3=$PWD/server3/jboss-eap
     export JBOSS_HOME_4=$PWD/server4/jboss-eap
     export MCAST_ADDR=235.3.4.5
4. Go to jboss-hornetq-testsuite/ and switch to the "pageLeak" branch:
     git checkout pageLeak
5. Run this test, which will eventually hit OOM:
     mvn clean install -Dtest=PageLeakSoakTestCase#testPageLeaking | tee log
6. Measure the memory of the EAP 6.4.0.DR13 server and make some heap dumps. See that memory consumption keeps growing.
Adding the test scenario of the PageLeakSoakTestCase#testPageLeaking test:
1. Start EAP 6.3.0.DR13 (HQ 2.3.24.Final) with InTopic deployed
2. Start 2 subscribers:
   - slow subscriber, which reads 1 msg/s (Thread.sleep(1000); after each receive)
   - fast subscriber, which reads at maximum speed
3. Start a publisher which sends 10 000 000 messages to InTopic
I made a typo in step 1: the version is 6.4.0.DR13.
I made some observations. The number of PagePositionImpl instances goes down only if the number of messages for the slow subscriber goes down. This happens if the slow subscriber reads messages faster than the publisher sends them. It seems that the number of PagePositionImpl instances is equal to the number of files in the messagingpaging directory.
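To compare the instance count from a heap dump with the number of page files, the files can be counted with a small helper like the one below. This is only a sketch: the class name PageFileCounter is made up for illustration, and it assumes that page files carry a ".page" extension and live somewhere under the paging directory (the default argument "messagingpaging" is a relative path and must be adjusted to the server's data directory).

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.stream.Stream;

// Hypothetical helper, not part of HornetQ or the testsuite.
public class PageFileCounter {

    // Recursively counts regular files whose name ends in ".page".
    // The ".page" suffix is an assumption about how page files are named.
    static long countPageFiles(Path pagingDir) throws IOException {
        try (Stream<Path> files = Files.walk(pagingDir)) {
            return files.filter(Files::isRegularFile)
                        .filter(p -> p.getFileName().toString().endsWith(".page"))
                        .count();
        }
    }

    public static void main(String[] args) throws IOException {
        // Pass the paging directory as the first argument.
        Path dir = Paths.get(args.length > 0 ? args[0] : "messagingpaging");
        System.out.println(countPageFiles(dir) + " page files under " + dir);
    }
}
```

Running it periodically while the test is in progress gives a number to compare against the PagePositionImpl count shown in VisualVM.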
I don't think the user is still hitting an issue. AFAIK the issue is fixed... With my fix, the instances should go away once GC kicks in.
I'm not sure whether all problems were resolved. A PagePositionImpl instance appears to be created for every page file. This can eventually lead to OOM if the number of page files reaches the tens of thousands. It appears that those instances are GC-ed only if all messages in the page file are read.
Otherwise they're held in memory. Is this a limitation of paging?
The PagePosition is kept in memory in a soft cache. Whenever the JVM needs memory, the soft cache is released. The cache has a limited size as well: whenever you move to reading a different page, the PagePosition should be released. Any GC operation should release the object... Now, if you are seeing infinite growth in any test, then I would like to see it, and it would be an issue. Otherwise it's working as it is supposed to.
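For readers unfamiliar with the idea, a size-limited soft cache can be sketched roughly as follows. This is not HornetQ's actual cache class: SoftBoundedCache and its methods are hypothetical names, and the sketch only illustrates the two release paths described above (eviction when the size limit is exceeded, and clearing of softly referenced values by the garbage collector under memory pressure).

```java
import java.lang.ref.SoftReference;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical illustration of a bounded soft cache; not HornetQ code.
public class SoftBoundedCache<K, V> {

    private final int maxEntries;
    private final LinkedHashMap<K, SoftReference<V>> map;

    public SoftBoundedCache(final int maxEntries) {
        this.maxEntries = maxEntries;
        // accessOrder=true keeps entries in least-recently-used order.
        this.map = new LinkedHashMap<K, SoftReference<V>>(16, 0.75f, true) {
            @Override
            protected boolean removeEldestEntry(Map.Entry<K, SoftReference<V>> eldest) {
                // Size limit: drop the least recently used entry once the cache is full.
                return size() > SoftBoundedCache.this.maxEntries;
            }
        };
    }

    public synchronized void put(K key, V value) {
        // Values are only softly reachable through the cache, so the GC may clear them.
        map.put(key, new SoftReference<>(value));
    }

    public synchronized V get(K key) {
        SoftReference<V> ref = map.get(key);
        if (ref == null) {
            return null;
        }
        V value = ref.get();
        if (value == null) {
            // The referent was cleared by the GC; remove the stale entry.
            map.remove(key);
        }
        return value;
    }

    public synchronized int size() {
        return map.size();
    }
}
```

Note that anything holding a strong reference to a value outside such a cache keeps it alive regardless of GC, which is consistent with one position per page file staying in memory until the file is removed.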
>> It seems that the number of PagePositionImpl instances is equal to the number of files in the messagingpaging directory. << That's expected until you release the entire page file. When you have a consumer that is behind, I will remove all the page-positions and keep one per file until the file is removed. It's what I call the page-completion... so this is as engineered.... Paging is not a database and it should not be used as one... If there is a large number of files, my recommendation is to use a larger page-size. So, I don't see any issues so far. If you find a situation where the number of pagePositions grows indefinitely, then it's an issue.
Thanks for the feedback. So those instances are there because of page-completion, and it's by design. As you mentioned, increasing page-size-bytes reduces the number of page files and the number of PagePositionImpl instances in memory. Setting this bz as verified then.
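The effect of page-size-bytes on the number of page files can be estimated with simple arithmetic. The sketch below is a rough approximation only: PageFileEstimate is a hypothetical name, the backlog and page sizes are made-up values, and it ignores that page files roll over on message boundaries, so real counts will differ somewhat. It assumes, as described in this thread, that roughly one PagePositionImpl is retained per page file.

```java
// Hypothetical back-of-the-envelope estimate; not HornetQ code.
public class PageFileEstimate {

    // Returns ceil(backlogBytes / pageSizeBytes) using integer arithmetic.
    static long estimatePageFiles(long backlogBytes, long pageSizeBytes) {
        if (backlogBytes < 0 || pageSizeBytes <= 0) {
            throw new IllegalArgumentException("backlog must be >= 0 and page size > 0");
        }
        return (backlogBytes + pageSizeBytes - 1) / pageSizeBytes;
    }

    public static void main(String[] args) {
        long backlog = 20L * 1024 * 1024 * 1024; // made-up paged backlog: 20 GiB
        long smallPage = 2L * 1024 * 1024;       // made-up page size: 2 MiB
        long largePage = 20L * 1024 * 1024;      // made-up page size: 20 MiB

        System.out.println("2 MiB pages:  " + estimatePageFiles(backlog, smallPage) + " files");
        System.out.println("20 MiB pages: " + estimatePageFiles(backlog, largePage) + " files");
    }
}
```

With these made-up numbers, a tenfold larger page size gives a tenfold smaller file count (10240 versus 1024), and with it a correspondingly smaller number of retained page positions.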