Bug 1122631
| Field | Value |
|---|---|
| Summary | Preloading via JDBC uses an exorbitant amount of memory. |
| Product | [JBoss] JBoss Data Grid 6 |
| Component | EAP |
| Version | 6.2.1 |
| Status | CLOSED CURRENTRELEASE |
| Severity | unspecified |
| Priority | unspecified |
| Reporter | John Osborne <josborne> |
| Assignee | Tristan Tarrant <ttarrant> |
| QA Contact | Martin Gencur <mgencur> |
| CC | afield, dberinde, dmehra, jdg-bugs, slaskawi, wfink |
| Target Milestone | ER3 |
| Target Release | 6.4.0 |
| Hardware | Unspecified |
| OS | Unspecified |
| Doc Type | Bug Fix |
| Type | Bug |
| Attachments | mvn project and server.log (attachment 920274) |
John, I'm not sure I understand the issue. Let me ask a few questions:

* How did you store the data in the relational database in the first place? Manually, via INSERT operations on the DB, or via a cache and a JDBC cache store?
* When does the problem occur exactly? When you restart the app and try to preload the data into the cache via a JDBC cache store? I've taken a look at the app, and it looks like by "preload" you mean executing a SELECT query and then manually constructing TIN objects and storing them in the cache. I could imagine that such an operation takes much more memory than what is actually in the DB, because you're constructing new objects; there will be some overhead. So what do you mean by "preload"?

Thanks, Martin

Martin, I think this is related to ISPN-3937. It seems the PostgreSQL driver also loads the entire result set into memory by default, and you need to set up the statement "just right" to avoid that: http://stackoverflow.com/questions/827110/large-resultset-on-postgresql-query

The attached application uses ResultSet.TYPE_SCROLL_INSENSITIVE to fetch the rows to preload, doesn't disable autocommit, and doesn't call setFetchSize(), so the cause of the OOM is almost certainly the PostgreSQL driver. I don't think we have a test for memory usage during preload with PostgreSQL, so we might have the same issue.

> How did you store the data in the relational database in the first place? Manually via INSERT operations on the DB or via a cache and a JDBC cache store?

I loaded it through JDG via "put" operations.

> When does the problem occur exactly? When you restart the app and try to preload the data into the cache via a JDBC cache store? I've taken a look at the app and it looks like by "preload" you mean executing a SELECT query and then manually constructing TIN objects and storing them in the cache. I could imagine that such an operation could take much more memory than what is actually in the DB because you're constructing new objects.
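The statement-setup problem Dan describes has a well-known remedy in plain JDBC: the PostgreSQL driver only streams rows with a cursor when autocommit is off, the statement is forward-only, and a fetch size is set. A minimal sketch of the pattern (the method name, table handling, and fetch size are illustrative, not taken from the attached application):

```java
import java.sql.Connection;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.sql.Statement;

public class StreamingPreload {
    // Iterate a large table without materializing the whole ResultSet.
    // The PostgreSQL driver only fetches rows in batches (of fetchSize)
    // when autocommit is disabled and the statement is forward-only;
    // otherwise it buffers the entire result set in memory.
    static long countRows(Connection conn, String table) throws SQLException {
        conn.setAutoCommit(false);               // required for cursor-based fetching
        long rows = 0;
        try (Statement st = conn.createStatement(
                ResultSet.TYPE_FORWARD_ONLY,     // not TYPE_SCROLL_INSENSITIVE
                ResultSet.CONCUR_READ_ONLY)) {
            st.setFetchSize(1000);               // rows per round trip, not whole result
            // Note: `table` must be a trusted identifier here (no user input).
            try (ResultSet rs = st.executeQuery("SELECT * FROM " + table)) {
                while (rs.next()) {
                    rows++;                      // process one row at a time
                }
            }
        }
        conn.commit();
        return rows;
    }
}
```

With TYPE_SCROLL_INSENSITIVE, as in the attached application, the driver has to keep every row around to support scrolling, which is exactly the buffering behavior suspected above.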
> There will be some overhead. So what do you mean by "preload"?

When I preload the data into the cache via the JDBC store. The app has a separate method that goes through a different DB via JDBC; you can ignore that, it is not being called. In this scenario I am only using the preload JDBC cache store in the XML configuration. The amount of memory still does not add up: the DB being loaded has a total size of 13.5 GB, yet when it is preloaded, the JVM heap has to be set to 44-48 GB for the preload to finish without running out of memory.

John, could you test without actually inserting data into JDG, just querying the DB and iterating over the ResultSet, and see how much memory you need for that?

I have tested that, and in that case it does not use any extra memory. Only when the data is preloaded via JDG does it run out of memory. This happens in both server mode and library mode. I have code and VMs set up within the RH network that I can give you access to in order to see the problem, if that helps.

John, I'm going to use your app to reproduce it locally. From your response it does not seem the problem was caused by the JDBC driver.

John, can you describe which operations to call on your application in order to reproduce the issue? I'm just making sure I'm doing the same as you. Thanks

I performed another round of tests where I used TIN classes (defined in John's application). This TIN class has 4 string fields: id, name, surname, and misc text. I created 90000 of these objects and stored them in PostgreSQL. The DB table created by StringBasedJDBCCacheStore occupied 90 MB.
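For orientation, the TIN value class used in the measurements can be sketched roughly as follows. This is a hedged reconstruction from the comment's description ("4 string fields: id, name, surname, and misc text"); the real class in John's application may differ in names, accessors, and serialization mechanism:

```java
import java.io.Serializable;

// Sketch of the TIN value class stored 90000 times in the tests.
// Field names follow the comment's description; everything else is assumed.
public class Tin implements Serializable {
    private static final long serialVersionUID = 1L;

    private final String id;
    private final String name;
    private final String surname;
    private final String miscText;

    public Tin(String id, String name, String surname, String miscText) {
        this.id = id;
        this.name = name;
        this.surname = surname;
        this.miscText = miscText;
    }

    public String getId() { return id; }
    public String getName() { return name; }
    public String getSurname() { return surname; }
    public String getMiscText() { return miscText; }
}
```

Each such object carries per-object JVM overhead (headers, references, per-String char arrays) on top of its payload, which is why in-memory size can exceed the on-disk table size.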
The memory consumption (always measured after a manual full GC) was as follows:

* Clean EAP server after start: 19404K
* EAP including John's application, no data stored yet: 155062K
* EAP after storing 90 MB of data in JDG: 445844K
* EAP after a further restart, where the application was started automatically and JDG preloaded the data from the DB through the cache store: 325486K

So the numbers go 19404K -> 155062K -> 445844K -> 325486K. The interim conclusion is that JDG needs a heap roughly 3 times larger than the data stored in the DB, for this particular use case with TIN objects. I'm going to take a heap dump and analyze where the memory is spent.

Note to my previous comment: I also tried MySQL and got similar results, so I don't think this is a problem of the JDBC driver for Postgres.

OK, when using storeAsBinary in this test (`<storeAsBinary enabled="true" storeKeysAsBinary="true" storeValuesAsBinary="true"/>`), the results are as follows:

* EAP after storing 90 MB of data in JDG: 424101K
* EAP after a further restart: 254418K

While it doesn't help much when loading the data into JDG the first time, it is much better after an EAP restart (after preload). Taking into account that EAP plus the application without data takes 155062K, the final memory usage of 254418K means that the actual data takes just a bit more (~99 MB) than its size in the DB (91 MB).

It looks like the value of maxEntries in the eviction element significantly affects memory consumption. BoundedConcurrentHashMap has segments, and each segment allocates a table (an array of HashEntry whose size is allocated in advance according to maxEntries). Following Dan's suggestion, I verified that setting maxEntries to 90000 and storing 90000 entries results in this memory consumption (using storeAsBinary):

* Clean EAP with John's application: 30659K
* After loading data: 237497K
* After restart: 120710K

All these values are significantly lower.
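Taken together, the two mitigations measured above correspond to a library-mode cache definition along these lines. This is a sketch against the Infinispan 6.x-era XML schema used by JDG 6; the cache name and eviction strategy are illustrative, and only the `storeAsBinary` element is quoted verbatim from the comment:

```xml
<namedCache name="tinCache">
    <!-- Size the bounded data container to the expected entry count:
         an oversized maxEntries pre-allocates large per-segment
         HashEntry tables in BoundedConcurrentHashMap. -->
    <eviction strategy="LRU" maxEntries="90000"/>
    <!-- Keep keys and values as marshalled byte streams to reduce
         per-object overhead, especially after a preload. -->
    <storeAsBinary enabled="true" storeKeysAsBinary="true" storeValuesAsBinary="true"/>
</namedCache>
```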
There's still a difference between the moment the data is loaded and after the server is restarted: 237497K vs. 120710K. According to the heap dump, this is caused by the fact that JDG marshalls the values and stores them in an InternalCacheEntry as ExpandableMarshalledValueByteStream, while after a restart and cache store preload, an ImmutableMarshalledValueByteStream is used to hold the data. The expandable one has additional bytes allocated for future writes, and those bytes occupy a lot of space even though they are never used. I'll try to prepare a fix.

I came to the following conclusions:

1) This issue was not caused by a JDBC driver.
2) A lot of memory is consumed before the data is even stored if a very high maxEntries parameter is set for eviction. This can only be fixed by changing the implementation of BoundedConcurrentHashMap.
3) storeAsBinary saves memory in this particular use case, but the actual effect needs to be measured with real data.

I will backport the pull request from ISPN-4678 to the product.

> storeAsBinary saves memory in this particular use case but the actual effect needs to be measured with real data. I will backport the pull request from ISPN-4678 to the product.
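The expandable-versus-immutable byte stream difference described above is the usual growable-buffer trade-off. A rough illustration using the JDK's own growable buffer, not Infinispan's actual MarshalledValueByteStream classes:

```java
import java.io.ByteArrayOutputStream;

public class BufferOverhead {
    public static void main(String[] args) {
        // A growable buffer keeps a backing array that is usually larger
        // than the bytes actually written, because it grows in chunks
        // (typically doubling) to amortize copying. An immutable snapshot
        // trims the data down to its exact size.
        ByteArrayOutputStream expandable = new ByteArrayOutputStream(32);
        for (int i = 0; i < 90_000; i++) {
            expandable.write(0);                      // incremental writes trigger growth
        }
        byte[] immutable = expandable.toByteArray();  // exact-size copy

        System.out.println("bytes written:  " + expandable.size());   // 90000
        System.out.println("trimmed length: " + immutable.length);    // 90000
        // The private backing array inside `expandable` may be much larger
        // than 90000 because doubling overshoots. Holding one expandable
        // stream per cache entry multiplies that slack across all entries,
        // which is the unused space Martin saw in the heap dump.
    }
}
```

After a restart the preloaded values arrive at their final size, so an exact-size immutable stream works and the slack disappears, matching the 237497K-vs-120710K gap.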
Can we measure the savings on ISPN 7 before we backport to JDG 6.4?
Savings can be found in the PR for the related ISPN JIRA. Clarification: the pull request fixed the use case with storeAsBinary, where it helps save memory. The other cases, where storeAsBinary is NOT set, still have the original behavior. We might need a separate issue for those cases.
Created attachment 920274 [details] mvn project and server.log

I am preloading 13.5 GB of data (confirmed in PostgreSQL Admin), yet JDG runs out of memory unless I set the Java -Xmx heap parameter to 44-48 GB. This happens when running in DIST or REPL mode. I have confirmed this issue in JDG 6.2.1 library mode and JDG 6.2.1 client/server mode. I am attaching the code, configs, and log files. I have this running on a VM inside the RH network, so please ask to see a demo of this issue. The log files show a failure with -Xmx=32GB.
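As a sanity check, the heap-to-data ratio implied by these original numbers already matches the roughly 3x overhead Martin later measured with the 90 MB test set (all figures are taken from this report):

```java
public class HeapRatio {
    public static void main(String[] args) {
        double dbSizeGb = 13.5;      // data size confirmed in PostgreSQL Admin
        double heapLowGb = 44.0;     // smallest heap that survived the preload
        double heapHighGb = 48.0;    // upper bound reported

        // Ratio of required heap to on-disk data size.
        System.out.printf("heap/data ratio: %.1fx-%.1fx%n",
                heapLowGb / dbSizeGb, heapHighGb / dbSizeGb);
        // prints: heap/data ratio: 3.3x-3.6x

        // The reported failure at -Xmx=32GB sits below that range:
        System.out.printf("failed at: %.1fx%n", 32.0 / dbSizeGb);
        // prints: failed at: 2.4x
    }
}
```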