Description of problem:
With LevelDB configured as a cache store for a distributed cache and passivation set to false, loading the cache with large amounts of data causes a lock timeout for the Hot Rod client.

Version-Release number of selected component (if applicable):
6.2.1

How reproducible:
Consistent

Steps to Reproduce:
1. Configure a distributed cache with a leveldb-store as follows:

<distributed-cache name="prices" mode="ASYNC" segments="20" owners="2" remote-timeout="30000" start="EAGER" l1-lifespan="0">
    <compatibility enabled="true"/>
    <locking isolation="READ_COMMITTED" acquire-timeout="30000" concurrency-level="1000" striping="true"/>
    <transaction mode="NONE"/>
    <leveldb-store passivation="false"/>
</distributed-cache>

2. The environment typically consists of a cluster of three nodes, but starting a single node suffices to reproduce the issue.
3. Create a large map of entries (e.g., 179180 entries) and use Hot Rod to store all the data in the cache through a putAll call.

Actual results:
01:23:29,807 ERROR [org.infinispan.interceptors.InvocationContextInterceptor] (HotRodServerWorker-67) ISPN000136: Execution error: org.infinispan.util.concurrent.TimeoutException: Unable to acquire lock after [30 seconds] on key [ad6ff7e2-1e26-4962-a92a-210769a043f4_1404238912585] for requestor [Thread[HotRodServerWorker-67,5,main]]!
Lock held by [Thread[HotRodServerWorker-65,5,main]]
at org.infinispan.util.concurrent.locks.LockManagerImpl.lock(LockManagerImpl.java:198) [infinispan-core-6.0.3.Final-redhat-3.jar:6.0.3.Final-redhat-3]
at org.infinispan.util.concurrent.locks.LockManagerImpl.acquireLockNoCheck(LockManagerImpl.java:181) [infinispan-core-6.0.3.Final-redhat-3.jar:6.0.3.Final-redhat-3]
at org.infinispan.interceptors.locking.AbstractLockingInterceptor.lockKey(AbstractLockingInterceptor.java:149) [infinispan-core-6.0.3.Final-redhat-3.jar:6.0.3.Final-redhat-3]
at org.infinispan.interceptors.locking.AbstractLockingInterceptor.lockKey(AbstractLockingInterceptor.java:145) [infinispan-core-6.0.3.Final-redhat-3.jar:6.0.3.Final-redhat-3]
at org.infinispan.interceptors.locking.AbstractLockingInterceptor.visitPutKeyValueCommand(AbstractLockingInterceptor.java:69) [infinispan-core-6.0.3.Final-redhat-3.jar:6.0.3.Final-redhat-3]
at org.infinispan.commands.write.PutKeyValueCommand.acceptVisitor(PutKeyValueCommand.java:71) [infinispan-core-6.0.3.Final-redhat-3.jar:6.0.3.Final-redhat-3]
at org.infinispan.interceptors.base.CommandInterceptor.invokeNextInterceptor(CommandInterceptor.java:98) [infinispan-core-6.0.3.Final-redhat-3.jar:6.0.3.Final-redhat-3]
at org.infinispan.interceptors.base.CommandInterceptor.handleDefault(CommandInterceptor.java:112) [infinispan-core-6.0.3.Final-redhat-3.jar:6.0.3.Final-redhat-3]
at org.infinispan.commands.AbstractVisitor.visitPutKeyValueCommand(AbstractVisitor.java:32) [infinispan-core-6.0.3.Final-redhat-3.jar:6.0.3.Final-redhat-3]
at org.infinispan.commands.write.PutKeyValueCommand.acceptVisitor(PutKeyValueCommand.java:71) [infinispan-core-6.0.3.Final-redhat-3.jar:6.0.3.Final-redhat-3]
at org.infinispan.interceptors.base.CommandInterceptor.invokeNextInterceptor(CommandInterceptor.java:98) [infinispan-core-6.0.3.Final-redhat-3.jar:6.0.3.Final-redhat-3]
at org.infinispan.interceptors.base.CommandInterceptor.handleDefault(CommandInterceptor.java:112) [infinispan-core-6.0.3.Final-redhat-3.jar:6.0.3.Final-redhat-3]
at org.infinispan.commands.AbstractVisitor.visitPutKeyValueCommand(AbstractVisitor.java:32) [infinispan-core-6.0.3.Final-redhat-3.jar:6.0.3.Final-redhat-3]
at org.infinispan.commands.write.PutKeyValueCommand.acceptVisitor(PutKeyValueCommand.java:71) [infinispan-core-6.0.3.Final-redhat-3.jar:6.0.3.Final-redhat-3]
at org.infinispan.interceptors.base.CommandInterceptor.invokeNextInterceptor(CommandInterceptor.java:98) [infinispan-core-6.0.3.Final-redhat-3.jar:6.0.3.Final-redhat-3]
at org.infinispan.statetransfer.StateTransferInterceptor.handleNonTxWriteCommand(StateTransferInterceptor.java:219) [infinispan-core-6.0.3.Final-redhat-3.jar:6.0.3.Final-redhat-3]
at org.infinispan.statetransfer.StateTransferInterceptor.visitPutKeyValueCommand(StateTransferInterceptor.java:141) [infinispan-core-6.0.3.Final-redhat-3.jar:6.0.3.Final-redhat-3]
at org.infinispan.commands.write.PutKeyValueCommand.acceptVisitor(PutKeyValueCommand.java:71) [infinispan-core-6.0.3.Final-redhat-3.jar:6.0.3.Final-redhat-3]
at org.infinispan.interceptors.base.CommandInterceptor.invokeNextInterceptor(CommandInterceptor.java:98) [infinispan-core-6.0.3.Final-redhat-3.jar:6.0.3.Final-redhat-3]
at org.infinispan.interceptors.CacheMgmtInterceptor.updateStoreStatistics(CacheMgmtInterceptor.java:148) [infinispan-core-6.0.3.Final-redhat-3.jar:6.0.3.Final-redhat-3]
at org.infinispan.interceptors.CacheMgmtInterceptor.visitPutKeyValueCommand(CacheMgmtInterceptor.java:134) [infinispan-core-6.0.3.Final-redhat-3.jar:6.0.3.Final-redhat-3]
at org.infinispan.commands.write.PutKeyValueCommand.acceptVisitor(PutKeyValueCommand.java:71) [infinispan-core-6.0.3.Final-redhat-3.jar:6.0.3.Final-redhat-3]
at org.infinispan.interceptors.base.CommandInterceptor.invokeNextInterceptor(CommandInterceptor.java:98) [infinispan-core-6.0.3.Final-redhat-3.jar:6.0.3.Final-redhat-3]
at org.infinispan.interceptors.compat.BaseTypeConverterInterceptor.visitPutKeyValueCommand(BaseTypeConverterInterceptor.java:63) [infinispan-core-6.0.3.Final-redhat-3.jar:6.0.3.Final-redhat-3]
at org.infinispan.commands.write.PutKeyValueCommand.acceptVisitor(PutKeyValueCommand.java:71) [infinispan-core-6.0.3.Final-redhat-3.jar:6.0.3.Final-redhat-3]
at org.infinispan.interceptors.base.CommandInterceptor.invokeNextInterceptor(CommandInterceptor.java:98) [infinispan-core-6.0.3.Final-redhat-3.jar:6.0.3.Final-redhat-3]
at org.infinispan.interceptors.InvocationContextInterceptor.handleAll(InvocationContextInterceptor.java:110) [infinispan-core-6.0.3.Final-redhat-3.jar:6.0.3.Final-redhat-3]
at org.infinispan.interceptors.InvocationContextInterceptor.handleDefault(InvocationContextInterceptor.java:73) [infinispan-core-6.0.3.Final-redhat-3.jar:6.0.3.Final-redhat-3]
at org.infinispan.commands.AbstractVisitor.visitPutKeyValueCommand(AbstractVisitor.java:32) [infinispan-core-6.0.3.Final-redhat-3.jar:6.0.3.Final-redhat-3]
at org.infinispan.commands.write.PutKeyValueCommand.acceptVisitor(PutKeyValueCommand.java:71) [infinispan-core-6.0.3.Final-redhat-3.jar:6.0.3.Final-redhat-3]
at org.infinispan.interceptors.InterceptorChain.invoke(InterceptorChain.java:333) [infinispan-core-6.0.3.Final-redhat-3.jar:6.0.3.Final-redhat-3]
at org.infinispan.CacheImpl.executeCommandAndCommitIfNeeded(CacheImpl.java:1404) [infinispan-core-6.0.3.Final-redhat-3.jar:6.0.3.Final-redhat-3]
at org.infinispan.CacheImpl.putInternal(CacheImpl.java:882) [infinispan-core-6.0.3.Final-redhat-3.jar:6.0.3.Final-redhat-3]
at org.infinispan.CacheImpl.put(CacheImpl.java:874) [infinispan-core-6.0.3.Final-redhat-3.jar:6.0.3.Final-redhat-3]
at org.infinispan.DecoratedCache.put(DecoratedCache.java:472) [infinispan-core-6.0.3.Final-redhat-3.jar:6.0.3.Final-redhat-3]
at org.infinispan.AbstractDelegatingAdvancedCache.put(AbstractDelegatingAdvancedCache.java:182) [infinispan-core-6.0.3.Final-redhat-3.jar:6.0.3.Final-redhat-3]
at org.infinispan.server.core.AbstractProtocolDecoder.put(AbstractProtocolDecoder.scala:183) [infinispan.jar:6.0.3.Final-redhat-3]
at org.infinispan.server.core.AbstractProtocolDecoder.decodeValue(AbstractProtocolDecoder.scala:140) [infinispan.jar:6.0.3.Final-redhat-3]
at org.infinispan.server.core.AbstractProtocolDecoder.decode(AbstractProtocolDecoder.scala:55) [infinispan.jar:6.0.3.Final-redhat-3]
at org.infinispan.server.core.AbstractProtocolDecoder.decode(AbstractProtocolDecoder.scala:29) [infinispan.jar:6.0.3.Final-redhat-3]
at org.jboss.netty.handler.codec.replay.ReplayingDecoder.callDecode(ReplayingDecoder.java:500) [netty-3.6.6.Final-redhat-1.jar:3.6.6.Final-redhat-1]
at org.jboss.netty.handler.codec.replay.ReplayingDecoder.messageReceived(ReplayingDecoder.java:435) [netty-3.6.6.Final-redhat-1.jar:3.6.6.Final-redhat-1]
at org.infinispan.server.core.AbstractProtocolDecoder.messageReceived(AbstractProtocolDecoder.scala:377) [infinispan.jar:6.0.3.Final-redhat-3]
at org.jboss.netty.channel.SimpleChannelUpstreamHandler.handleUpstream(SimpleChannelUpstreamHandler.java:70) [netty-3.6.6.Final-redhat-1.jar:3.6.6.Final-redhat-1]
at org.jboss.netty.channel.DefaultChannelPipeline.sendUpstream(DefaultChannelPipeline.java:564) [netty-3.6.6.Final-redhat-1.jar:3.6.6.Final-redhat-1]
at org.jboss.netty.channel.DefaultChannelPipeline.sendUpstream(DefaultChannelPipeline.java:559) [netty-3.6.6.Final-redhat-1.jar:3.6.6.Final-redhat-1]
at org.jboss.netty.channel.Channels.fireMessageReceived(Channels.java:268) [netty-3.6.6.Final-redhat-1.jar:3.6.6.Final-redhat-1]
at org.jboss.netty.channel.Channels.fireMessageReceived(Channels.java:255) [netty-3.6.6.Final-redhat-1.jar:3.6.6.Final-redhat-1]
at org.jboss.netty.channel.socket.nio.NioWorker.read(NioWorker.java:88) [netty-3.6.6.Final-redhat-1.jar:3.6.6.Final-redhat-1]
at org.jboss.netty.channel.socket.nio.AbstractNioWorker.process(AbstractNioWorker.java:109) [netty-3.6.6.Final-redhat-1.jar:3.6.6.Final-redhat-1]
at org.jboss.netty.channel.socket.nio.AbstractNioSelector.run(AbstractNioSelector.java:312) [netty-3.6.6.Final-redhat-1.jar:3.6.6.Final-redhat-1]
at org.jboss.netty.channel.socket.nio.AbstractNioWorker.run(AbstractNioWorker.java:90) [netty-3.6.6.Final-redhat-1.jar:3.6.6.Final-redhat-1]
at org.jboss.netty.channel.socket.nio.NioWorker.run(NioWorker.java:178) [netty-3.6.6.Final-redhat-1.jar:3.6.6.Final-redhat-1]
at org.jboss.netty.util.ThreadRenamingRunnable.run(ThreadRenamingRunnable.java:108) [netty-3.6.6.Final-redhat-1.jar:3.6.6.Final-redhat-1]
at org.jboss.netty.util.internal.DeadLockProofWorker$1.run(DeadLockProofWorker.java:42) [netty-3.6.6.Final-redhat-1.jar:3.6.6.Final-redhat-1]
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) [rt.jar:1.7.0_40]
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) [rt.jar:1.7.0_40]
at java.lang.Thread.run(Thread.java:724) [rt.jar:1.7.0_40]

Expected results:
The bulk load completes without lock timeouts.

Additional info:
The default setup per the documentation is JNI, which I assume is the only supported mode, but our server schemas also support the JAVA implementation, while our documentation does not describe it and the code for it is missing. I'm not sure whether the default cache writing behavior is synchronous, or whether changing it to asynchronous (write-behind) would resolve the concurrency issue cited above, but our documentation on LevelDB lacks any information on the subject, and I have not been able to configure it with write-behind the way other cache stores would be configured (and our schema implies).
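As a side note on step 3 of the reproducer: the load pushes all 179180 entries through a single putAll. A purely hypothetical client-side mitigation (not the reporter's actual client code, and it does not address the underlying store behavior) would be to split the map into smaller batches so each Hot Rod putAll round-trip carries fewer keys. A minimal sketch of such a batching helper, in plain Java:

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class BatchLoader {

    // Split a large source map into fixed-size batches, preserving
    // iteration order. Each batch could then be passed to a separate
    // RemoteCache.putAll call instead of one giant call.
    public static <K, V> List<Map<K, V>> chunk(Map<K, V> source, int batchSize) {
        List<Map<K, V>> batches = new ArrayList<>();
        Map<K, V> current = new LinkedHashMap<>();
        for (Map.Entry<K, V> e : source.entrySet()) {
            current.put(e.getKey(), e.getValue());
            if (current.size() == batchSize) {
                batches.add(current);
                current = new LinkedHashMap<>();
            }
        }
        if (!current.isEmpty()) {
            batches.add(current); // last, possibly partial, batch
        }
        return batches;
    }
}
```

The batch size here is an assumption to tune experimentally; whether smaller batches actually avoid the timeout depends on how fast the store drains its internal queue.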
Can you please try without lock striping, i.e. set striping to "false"? If it still fails, then we should run a further analysis.
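For reference, the suggestion amounts to flipping one attribute on the locking element already quoted in the description (all other attributes unchanged):

```xml
<locking isolation="READ_COMMITTED" acquire-timeout="30000" concurrency-level="1000" striping="false"/>
```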
Hi Mircea, it still fails and the stack looks the same.
Can you attach the test code, please? So far, I'm unable to reproduce it :/
Also, is the choice of cache store actually relevant? Does this happen with the singlefilestore? Without a store altogether?
(In reply to Tristan Tarrant from comment #5)
> Also, is the choice of cache store actually relevant ? Does this happen with
> the singlefilestore ? Without a store altogether ?

Tristan: No, I tried a simple file-store setup instead of leveldb-store and stored my sample data successfully 3 times in a row, while the leveldb-store fails every time:

<file-store name="myfilestore" passivation="false"/>
Created attachment 914591 [details]
Assets to reproduce leveldb concurrency issue

The tar file includes a required module to set up on JDG and the modified infinispan module to declare a dependency on it. It also includes the updated clustered.xml. Once the modules and configuration file have been updated, start a single node (to make it easy) in clustered mode.

Also included in the tar.gz is a JAR file to load the cache and reproduce the error. Alongside GasShopperClient.jar, create a directory called GasShopperClient_lib and copy the following into it:

* jboss-datagrid-6.2.1-server/client/hotrod/java/*
* jdg/modules/system/layers/base/com/redhat/refarch/jdg/gasshopper/main/gasshopper.jar (from the tar file)
* opencsv-2.3.jar from http://sourceforge.net/projects/opencsv/files/opencsv/2.3/

Also download http://www.essostations.com/poi/garmin_loader.zip and place the CSV files alongside GasShopperClient.jar.

Now use the JAR file to populate the cache with sample data, pointing to your JDG server IP address and the CSV files you've downloaded:

java -Djdg.host=10.19.139.101 -jar GasShopperClient.jar load ./esso.csv,./exxon_usa.csv,./mobil_usa.csv
(In reply to Pedro Ruivo from comment #4)
> can you attach test code? (please)
>
> So far, I'm unable to reproduce it :/

Pedro, I tried very hard to reproduce the issue using a simple test case with OOTB Java types but could not. However, it reproduces consistently in my reference architecture environment. I tried my best to attach all the assets and instructions to help you recreate the environment, but let me know if you find anything missing.
Thanks Babak. I'm able to reproduce the problem. I'll start investigating it.
Hi all,

I've found the problem. In LevelDBStore, a bounded blocking queue fills up (https://github.com/infinispan/infinispan/blob/master/persistence/leveldb/src/main/java/org/infinispan/persistence/leveldb/LevelDBStore.java#L467) and it is only emptied when the eviction thread is triggered.

I don't have a solution for it yet, but you can try increasing the queue's maximum size (the default value is 10000). Unfortunately, I don't see any way to configure it in the server parser :/

I think the only workaround for now is to trigger the eviction thread more frequently (the default is 1 minute). You can do this with <expiration interval=<time in milliseconds> /> (I think).
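The failure mode described above can be illustrated with a small stand-alone sketch using plain java.util.concurrent (this is not Infinispan code; the capacity of 3 is artificial, the real default being 10000). Writers enqueue into a bounded queue on every put, and only a periodic purge drains it, so a fast bulk load fills the queue long before the next purge runs:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

public class ExpiryQueueDemo {

    // Stand-in for the store's bounded internal queue;
    // capacity 3 is artificial, chosen only to show the effect.
    private final BlockingQueue<String> expiryQueue;

    public ExpiryQueueDemo(int capacity) {
        this.expiryQueue = new ArrayBlockingQueue<>(capacity);
    }

    // Write path: enqueue per-entry metadata. Returns false when the
    // queue is full, i.e. the point at which a writer would stall.
    public boolean onWrite(String key) {
        return expiryQueue.offer(key);
    }

    // Periodic purge (the eviction thread): drains everything,
    // after which writers can make progress again.
    public int purge() {
        List<String> drained = new ArrayList<>();
        expiryQueue.drainTo(drained);
        return drained.size();
    }
}
```

With capacity 3, the first three writes succeed, subsequent ones are refused until purge() runs, and writes succeed again afterwards, which mirrors why shortening the purge interval (or enlarging the queue) relieves the stall.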
Pedro, is this issue only reproducible with LevelDB cache store, or can it potentially be reproduced with another cache store ?
I don't think so, Divya (Babak already tried with file-store and it works fine there). It is a LevelDB implementation issue.
Pedro, is that not the expiration queue? I tried this but it didn't help:

<expiration queue-size="1000000" />

I don't believe the leveldb-store has an expiration interval attribute to let me change the eviction frequency.
No, you don't have access to the LevelDB queue, and that is one problem... The expiration interval I was talking about is not in the LevelDB store; it goes on the cache itself. Example:

<distributed-cache name="prices" mode="ASYNC" segments="20" owners="2" remote-timeout="30000" start="EAGER" l1-lifespan="0">
    <compatibility enabled="true"/>
    <locking isolation="READ_COMMITTED" acquire-timeout="30000" concurrency-level="1000" striping="true"/>
    <transaction mode="NONE"/>
    <leveldb-store passivation="false"/>
    <expiration interval=<time in milliseconds> />
</distributed-cache>
Thanks, Pedro, I was expecting expiration to be set as a child of the leveldb-store element but now I see that it's under the cache. I tried setting it to 1 second and when that didn't work, to 100ms. I still get the same error.
Babak, it really needs a fix. I'm working on it (and I will fix the parser issue)
Just to be clear, the bug is in the parser, which is unable to parse the queue-size correctly (I'm trying to find the cause).
Pull request done. It is now possible to properly set the queue size.
(In reply to Babak Mozaffari from comment #15)
> Thanks, Pedro, I was expecting expiration to be set as a child of the
> leveldb-store element but now I see that it's under the cache.
>
> I tried setting it to 1 second and when that didn't work, to 100ms. I still
> get the same error.

Hi Babak,

I found out why setting the interval didn't work. It is a bug in 6.2, and it is fixed in 6.3, where you should be able to configure both the queue-size and the expiration interval.

Cheers.