Bug 1122631
| Field | Value |
|---|---|
| Summary | Preloading via JDBC uses an exorbitant amount of memory. |
| Product | [JBoss] JBoss Data Grid 6 |
| Component | EAP |
| Version | 6.2.1 |
| Status | CLOSED CURRENTRELEASE |
| Severity | unspecified |
| Priority | unspecified |
| Reporter | John Osborne <josborne> |
| Assignee | Tristan Tarrant <ttarrant> |
| QA Contact | Martin Gencur <mgencur> |
| CC | afield, dberinde, dmehra, jdg-bugs, slaskawi, wfink |
| Target Milestone | ER3 |
| Target Release | 6.4.0 |
| Hardware | Unspecified |
| OS | Unspecified |
| Doc Type | Bug Fix |
| Type | Bug |
| Attachments | mvn project and server.log (attachment 920274) |
John, I'm not sure I understand the issue. Let me ask a few questions:

* How did you store the data in the relational database in the first place? Manually, via INSERT operations on the DB, or via a cache and a JDBC cache store?
* When does the problem occur exactly? When you restart the app and try to preload the data into the cache via a JDBC cache store? I've taken a look at the app, and it looks like by "preload" you mean executing a SELECT query and then manually constructing TIN objects and storing them in the cache. I could imagine that such an operation takes much more memory than what is actually in the DB, because you're constructing new objects; there will be some overhead. So what do you mean by "preload"?

Thanks, Martin

Martin, I think this is related to ISPN-3937. It seems the PostgreSQL driver also loads the entire result set into memory by default, and you need to set up the statement "just right" to avoid that: http://stackoverflow.com/questions/827110/large-resultset-on-postgresql-query

The attached application uses ResultSet.TYPE_SCROLL_INSENSITIVE to fetch the rows to preload, doesn't disable autocommit, and doesn't call setFetchSize(), so the cause of the OOM is almost certainly the PostgreSQL driver. I don't think we have a test for memory usage during preload with PostgreSQL, so we might have the same issue.

> How did you store the data in the relational database in the first place? Manually via INSERT operations on the DB or via a cache and a JDBC cache store?

I loaded it through JDG via "put" operations.

> When does the problem occur exactly? When you restart the app and try to preload the data into the cache via a JDBC cache store? I've taken a look at the app and it looks like by "preload" you mean executing a SELECT query and then manually constructing TIN objects and storing them in the cache. I could imagine that such an operation could take much more memory than what is actually in the DB because you're constructing new objects.
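The statement-setup problem Dan describes has a well-known remedy in plain JDBC: the PostgreSQL driver only streams rows with a cursor when autocommit is off, the statement is forward-only, and a fetch size is set. A minimal sketch of the pattern (the method name, table handling, and fetch size are illustrative, not taken from the attached application):

```java
import java.sql.Connection;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.sql.Statement;

public class StreamingPreload {
    // Iterate a large table without materializing the whole ResultSet.
    // The PostgreSQL driver only fetches rows in batches (of fetchSize)
    // when autocommit is disabled and the statement is forward-only;
    // otherwise it buffers the entire result set in memory.
    static long countRows(Connection conn, String table) throws SQLException {
        conn.setAutoCommit(false);               // required for cursor-based fetching
        long rows = 0;
        try (Statement st = conn.createStatement(
                ResultSet.TYPE_FORWARD_ONLY,     // not TYPE_SCROLL_INSENSITIVE
                ResultSet.CONCUR_READ_ONLY)) {
            st.setFetchSize(1000);               // rows per round trip, not whole result
            // Note: `table` must be a trusted identifier here (no user input).
            try (ResultSet rs = st.executeQuery("SELECT * FROM " + table)) {
                while (rs.next()) {
                    rows++;                      // process one row at a time
                }
            }
        }
        conn.commit();
        return rows;
    }
}
```

With TYPE_SCROLL_INSENSITIVE, as in the attached application, the driver has to keep every row around to support scrolling, which is exactly the buffering behavior suspected above.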
> There will be some overhead. So what do you mean by "preload"?

When I preload the data into the cache via the JDBC store. The app has a separate method that goes through a different DB via JDBC; you can ignore that, it is not being called. In this scenario I am only using the preload JDBC cache store in the XML configuration. The amount of memory still does not add up: the DB being loaded has a total size of 13.5 GB, yet when it is preloaded, the JVM heap has to be set to 44-48 GB for the preload to finish without running out of memory.

John, could you test without actually inserting data into JDG, just querying the DB and iterating over the ResultSet, and see how much memory you need for that?

I have tested that, and in that case it does not use any extra memory. Only when the data is preloaded via JDG does it run out of memory. This happens in both server mode and library mode. I have code and VMs set up within the RH network that I can give you access to in order to see the problem, if that helps.

John, I'm going to use your app to reproduce it locally. From your response it does not seem the problem was caused by the JDBC driver.

John, can you describe which operations to call on your application in order to reproduce the issue? I'm just making sure I'm doing the same as you. Thanks

I performed another round of tests where I used TIN classes (defined in John's application). This TIN class has 4 string fields: id, name, surname, and misc text. I created 90000 of these objects and stored them in PostgreSQL. The DB table created by StringBasedJDBCCacheStore occupied 90 MB.
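For orientation, the TIN value class used in the measurements can be sketched roughly as follows. This is a hedged reconstruction from the comment's description ("4 string fields: id, name, surname, and misc text"); the real class in John's application may differ in names, accessors, and serialization mechanism:

```java
import java.io.Serializable;

// Sketch of the TIN value class stored 90000 times in the tests.
// Field names follow the comment's description; everything else is assumed.
public class Tin implements Serializable {
    private static final long serialVersionUID = 1L;

    private final String id;
    private final String name;
    private final String surname;
    private final String miscText;

    public Tin(String id, String name, String surname, String miscText) {
        this.id = id;
        this.name = name;
        this.surname = surname;
        this.miscText = miscText;
    }

    public String getId() { return id; }
    public String getName() { return name; }
    public String getSurname() { return surname; }
    public String getMiscText() { return miscText; }
}
```

Each such object carries per-object JVM overhead (headers, references, per-String char arrays) on top of its payload, which is why in-memory size can exceed the on-disk table size.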
The memory consumption (always measured after a manual full GC) was as follows:

* Clean EAP server after start: 19404K
* EAP including John's application, no data stored yet: 155062K
* EAP after storing 90 MB of data in JDG: 445844K
* EAP after a further restart, where the application was started automatically and JDG preloaded the data from the DB through the cache store: 325486K

So the numbers go 19404K -> 155062K -> 445844K -> 325486K. The interim conclusion is that JDG needs a heap roughly 3 times larger than the data stored in the DB, for this particular use case with TIN objects. I'm going to take a heap dump and analyze where the memory is spent.

Note to my previous comment: I also tried MySQL and got similar results, so I don't think this is a problem of the JDBC driver for Postgres.

OK, when using storeAsBinary in this test (`<storeAsBinary enabled="true" storeKeysAsBinary="true" storeValuesAsBinary="true"/>`), the results are as follows:

* EAP after storing 90 MB of data in JDG: 424101K
* EAP after a further restart: 254418K

While it doesn't help much when loading the data into JDG the first time, it is much better after an EAP restart (after preload). Taking into account that EAP plus the application without data takes 155062K, the final memory usage of 254418K means that the actual data takes just a bit more (~99 MB) than its size in the DB (91 MB).

It looks like the value of maxEntries in the eviction element significantly affects memory consumption. BoundedConcurrentHashMap has segments, and each segment allocates a table (an array of HashEntry whose size is allocated in advance according to maxEntries). Following Dan's suggestion, I verified that setting maxEntries to 90000 and storing 90000 entries results in this memory consumption (using storeAsBinary):

* Clean EAP with John's application: 30659K
* After loading data: 237497K
* After restart: 120710K

All these values are significantly lower.
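Taken together, the two mitigations measured above correspond to a library-mode cache definition along these lines. This is a sketch against the Infinispan 6.x-era XML schema used by JDG 6; the cache name and eviction strategy are illustrative, and only the `storeAsBinary` element is quoted verbatim from the comment:

```xml
<namedCache name="tinCache">
    <!-- Size the bounded data container to the expected entry count:
         an oversized maxEntries pre-allocates large per-segment
         HashEntry tables in BoundedConcurrentHashMap. -->
    <eviction strategy="LRU" maxEntries="90000"/>
    <!-- Keep keys and values as marshalled byte streams to reduce
         per-object overhead, especially after a preload. -->
    <storeAsBinary enabled="true" storeKeysAsBinary="true" storeValuesAsBinary="true"/>
</namedCache>
```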
There's still a difference between the moment the data is loaded and after the server is restarted: 237497K vs. 120710K. According to the heap dump, this is caused by the fact that JDG marshalls the values and stores them in an InternalCacheEntry as ExpandableMarshalledValueByteStream, while after a restart and cache store preload, an ImmutableMarshalledValueByteStream is used to hold the data. The expandable one has additional bytes allocated for future writes, and those bytes occupy a lot of space even though they are never used. I'll try to prepare a fix.

I came to the following conclusions:

1) This issue was not caused by a JDBC driver.
2) A lot of memory is consumed before the data is even stored if a very high maxEntries parameter is set for eviction. This can only be fixed by changing the implementation of BoundedConcurrentHashMap.
3) storeAsBinary saves memory in this particular use case, but the actual effect needs to be measured with real data.

I will backport the pull request from ISPN-4678 to the product.

> storeAsBinary saves memory in this particular use case but the actual effect needs to be measured with real data. I will backport the pull request from ISPN-4678 to the product.
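The expandable-versus-immutable byte stream difference described above is the usual growable-buffer trade-off. A rough illustration using the JDK's own growable buffer, not Infinispan's actual MarshalledValueByteStream classes:

```java
import java.io.ByteArrayOutputStream;

public class BufferOverhead {
    public static void main(String[] args) {
        // A growable buffer keeps a backing array that is usually larger
        // than the bytes actually written, because it grows in chunks
        // (typically doubling) to amortize copying. An immutable snapshot
        // trims the data down to its exact size.
        ByteArrayOutputStream expandable = new ByteArrayOutputStream(32);
        for (int i = 0; i < 90_000; i++) {
            expandable.write(0);                      // incremental writes trigger growth
        }
        byte[] immutable = expandable.toByteArray();  // exact-size copy

        System.out.println("bytes written:  " + expandable.size());   // 90000
        System.out.println("trimmed length: " + immutable.length);    // 90000
        // The private backing array inside `expandable` may be much larger
        // than 90000 because doubling overshoots. Holding one expandable
        // stream per cache entry multiplies that slack across all entries,
        // which is the unused space Martin saw in the heap dump.
    }
}
```

After a restart the preloaded values arrive at their final size, so an exact-size immutable stream works and the slack disappears, matching the 237497K-vs-120710K gap.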
Can we measure the savings on ISPN 7 before we backport to JDG 6.4?
Savings can be found in the PR for the related ISPN JIRA. Clarification: the pull request fixed the use case with storeAsBinary, where it helps save memory. The other cases, where storeAsBinary is NOT set, still have the original behavior. We might need a separate issue for those cases.
Created attachment 920274 [details] mvn project and server.log

I am preloading 13.5 GB of data (confirmed in PostgreSQL Admin), yet JDG runs out of memory unless I set the Java -Xmx heap parameter to 44-48 GB. This happens when running in DIST or REPL mode. I have confirmed this issue in JDG 6.2.1 library mode and JDG 6.2.1 client/server mode. I am attaching the code, configs, and log files. I have this running on a VM inside the RH network, so please ask to see a demo of this issue. The log files show a failure with -Xmx=32GB.
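As a sanity check, the heap-to-data ratio implied by these original numbers already matches the roughly 3x overhead Martin later measured with the 90 MB test set (all figures are taken from this report):

```java
public class HeapRatio {
    public static void main(String[] args) {
        double dbSizeGb = 13.5;      // data size confirmed in PostgreSQL Admin
        double heapLowGb = 44.0;     // smallest heap that survived the preload
        double heapHighGb = 48.0;    // upper bound reported

        // Ratio of required heap to on-disk data size.
        System.out.printf("heap/data ratio: %.1fx-%.1fx%n",
                heapLowGb / dbSizeGb, heapHighGb / dbSizeGb);
        // prints: heap/data ratio: 3.3x-3.6x

        // The reported failure at -Xmx=32GB sits below that range:
        System.out.printf("failed at: %.1fx%n", 32.0 / dbSizeGb);
        // prints: failed at: 2.4x
    }
}
```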