Bug 745921 (EDG-5) - HotRod client/server memory leak suspected
Summary: HotRod client/server memory leak suspected
Keywords:
Status: CLOSED NEXTRELEASE
Alias: EDG-5
Product: JBoss Data Grid 6
Classification: JBoss
Component: Infinispan
Version: 6.0.0
Hardware: Unspecified
OS: Unspecified
Priority: high
Severity: high
Target Milestone: ---
Target Release: 6.0.0
Assignee: Tristan Tarrant
QA Contact:
URL: http://jira.jboss.org/jira/browse/EDG-5
Whiteboard:
Depends On:
Blocks:
 
Reported: 2011-08-31 14:29 UTC by Martin Gencur
Modified: 2012-08-15 16:47 UTC
CC List: 8 users

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2011-11-25 14:13:04 UTC
Type: Bug


Attachments: none


Links
  System ID: Red Hat Issue Tracker EDG-5
  Private: 0
  Priority: None
  Status: Closed
  Summary: HotRod client/server memory leak suspected
  Last Updated: 2011-11-25 14:12:33 UTC

Description Martin Gencur 2011-08-31 14:29:39 UTC
project_key: EDG

After running 8-hour and 48-hour soak tests with the HotRod client, we found that throughput gradually decreases during the test.

The following runs show (in their artifacts) various statistics, one of which is throughput (operations/second):
https://hudson.mw.lab.eng.bos.redhat.com/hudson/view/EDG6/job/edg-60-soak-hotrod-size4/8/
https://hudson.mw.lab.eng.bos.redhat.com/hudson/view/EDG6/job/edg-60-soak-hotrod-size4/9/

This suggests a potential memory leak.
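
For context, the soak test exercises the server through the Java HotRod client. A rough sketch of that kind of client workload (illustrative only, not the actual test driver; class name, key/value shapes and key-space size are assumptions) could look like this:

    import org.infinispan.client.hotrod.RemoteCache;
    import org.infinispan.client.hotrod.RemoteCacheManager;

    public class HotRodSoakSketch {
        public static void main(String[] args) {
            // Picks up hotrod-client.properties or defaults to 127.0.0.1:11222.
            RemoteCacheManager rcm = new RemoteCacheManager();
            RemoteCache<String, byte[]> cache = rcm.getCache();
            byte[] value = new byte[1024];                            // illustrative 1 KB entries
            long end = System.currentTimeMillis() + 8L * 60 * 60 * 1000; // 8-hour run
            long ops = 0;
            while (System.currentTimeMillis() < end) {
                String key = "key-" + (ops % 10000);                  // bounded, reused key space
                cache.put(key, value);
                cache.get(key);
                ops += 2;
            }
            System.out.println("operations performed: " + ops);
            rcm.stop();
        }
    }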

Comment 1 Galder Zamarreño 2011-09-01 08:34:30 UTC
In which diagram/artifact did you see that decrease exactly? The diagram in https://hudson.mw.lab.eng.bos.redhat.com/hudson/view/EDG6/job/edg-60-soak-hotrod-size4/9/artifact/report/chart-cluster-throughput.png doesn't show much.

If you suspect a memory leak, I'd suggest getting a heap dump when the test finishes so that we can inspect it.
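
For reference, a heap dump can be taken at the end of a run with jmap -dump:live,format=b,file=heap.hprof <pid>, or programmatically via the HotSpot diagnostic MBean. A minimal sketch, assuming a HotSpot JVM and an illustrative output path:

    import com.sun.management.HotSpotDiagnosticMXBean;
    import java.lang.management.ManagementFactory;

    public class HeapDumper {
        public static void dump(String path) throws Exception {
            HotSpotDiagnosticMXBean bean = ManagementFactory.newPlatformMXBeanProxy(
                    ManagementFactory.getPlatformMBeanServer(),
                    "com.sun.management:type=HotSpotDiagnostic",
                    HotSpotDiagnosticMXBean.class);
            // 'true' dumps only live (reachable) objects, which is what matters for leak analysis.
            bean.dumpHeap(path, true);
        }
    }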

Comment 3 Tristan Tarrant 2011-09-05 08:39:19 UTC
Looking at the memory charts on that run I see nothing that would indicate a memory leak (see https://hudson.mw.lab.eng.bos.redhat.com/hudson/view/EDG6/job/edg-60-soak-hotrod-size4/9/artifact/report/chart-cluster-heap.png). Network traffic decreases, but that is expected with decreasing throughput. CPU usage decreases as well, so it's not as if the server is being stressed there. Is there anything we can do to see whether too much time is spent in lock contention/acquisition?
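
One low-overhead way to get at that, using only the standard JMX thread APIs rather than anything Infinispan-specific, is to enable thread contention monitoring and report cumulative monitor-blocked times. A sketch:

    import java.lang.management.ManagementFactory;
    import java.lang.management.ThreadInfo;
    import java.lang.management.ThreadMXBean;

    public class ContentionSampler {
        private static final ThreadMXBean THREADS = ManagementFactory.getThreadMXBean();

        // Call once at startup; blocked times only accumulate while monitoring is enabled.
        public static void enable() {
            if (THREADS.isThreadContentionMonitoringSupported()) {
                THREADS.setThreadContentionMonitoringEnabled(true);
            }
        }

        // Call periodically or at the end of the run to report time spent waiting on monitors.
        public static void report() {
            for (ThreadInfo info : THREADS.dumpAllThreads(false, false)) {
                if (info != null && info.getBlockedTime() > 0) {
                    System.out.printf("%s blocked %d ms in %d contentions%n",
                            info.getThreadName(), info.getBlockedTime(), info.getBlockedCount());
                }
            }
        }
    }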

Comment 4 Ondrej Nevelik 2011-09-15 08:14:44 UTC
I ran this test with the same configuration yesterday. The build was successful, but the tests did not finish: they failed in iteration 32 (so after 32 minutes instead of 8 hours) because the mean response time was above the limit (7 sec): https://hudson.qa.jboss.com/hudson/view/EDG6/job/edg-60-soak-hotrod-size4/13/

Comment 5 Anne-Louise Tangring 2011-09-26 19:14:45 UTC
Docs QE Status: Removed: NEW 


Comment 6 Martin Gencur 2011-10-05 06:59:26 UTC
I configured SmartFrog to also collect heap statistics on the driver nodes (the HotRod clients run on them) and ran a 2-hour soak test with 11000 clients. The result is the following: https://hudson.qa.jboss.com/hudson/view/EDG6/job/edg-60-soak-hotrod-size4/23/artifact/report/chart-driver-heap.png . So it seems there really is a memory leak in the HotRod clients, but I will need to run an 8-hour soak test to confirm this. I'll do that overnight because our perf lab is occupied during the day.
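
The heap numbers in that chart amount to periodic samples of the client JVM's heap usage. A minimal sketch of that kind of sampling, using only the standard MemoryMXBean (not the actual SmartFrog collector, and with an assumed 10-second interval):

    import java.lang.management.ManagementFactory;
    import java.lang.management.MemoryUsage;

    public class HeapSampler {
        public static void main(String[] args) throws InterruptedException {
            while (true) {
                MemoryUsage heap = ManagementFactory.getMemoryMXBean().getHeapMemoryUsage();
                // Print used/committed/max heap in megabytes.
                System.out.printf("used=%d MB committed=%d MB max=%d MB%n",
                        heap.getUsed() >> 20, heap.getCommitted() >> 20, heap.getMax() >> 20);
                Thread.sleep(10000); // sample every 10 seconds
            }
        }
    }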

Comment 8 Galder Zamarreño 2011-10-10 10:44:32 UTC
ISPN-1383 could be having an effect. Netty has some inbound buffers on the decoder side that are never pruned: their capacity is only ever increased, never decreased. So, if the size of the stored data varies over time, this capacity growth could result in higher memory consumption.
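
To make the mechanism concrete: a growth-only buffer keeps its peak capacity even after large values stop arriving, so per-connection memory tracks the largest entry ever decoded rather than the current workload. A minimal illustration of that dynamic (not Netty's actual implementation):

    import java.util.Arrays;

    // Grows in powers of two like a typical dynamic decode buffer, but never shrinks,
    // so memory per connection stays at the high-water mark of the values seen.
    public class GrowOnlyBuffer {
        private byte[] data = new byte[256];
        private int size;

        public void write(byte[] chunk) {
            int needed = size + chunk.length;
            if (needed > data.length) {
                int capacity = data.length;
                while (capacity < needed) {
                    capacity <<= 1;          // capacity only ever increases
                }
                data = Arrays.copyOf(data, capacity);
            }
            System.arraycopy(chunk, 0, data, size, chunk.length);
            size += chunk.length;
        }

        public void clear() {
            size = 0;                        // contents are discarded, capacity is retained
        }

        public int capacity() {
            return data.length;
        }
    }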

Comment 9 Martin Gencur 2011-10-10 11:00:32 UTC
According to my last comment, the memory leak no longer seems to be there, neither in the HotRod server nor in the HotRod clients. Here's the graph for the servers: https://hudson.qa.jboss.com/hudson/view/EDG6/job/edg-60-soak-hotrod-size4/24/artifact/report/chart-cluster-heap.png. And here's the graph for the HotRod clients: https://hudson.qa.jboss.com/hudson/view/EDG6/job/edg-60-soak-hotrod-size4/24/artifact/report/chart-driver-heap.png. The throughput is more or less constant over the 8-hour run. This was not true with EDG ALPHA1, where the throughput was decreasing.

