Bug 1309181 - Lock timeout when registering a queried class to the registry cache
Lock timeout when registering a queried class to the registry cache
Status: VERIFIED
Product: JBoss Data Grid 6
Classification: JBoss
Component: Infinispan (Show other bugs)
6.6.0
Unspecified Unspecified
unspecified Severity unspecified
: ER1
: 6.6.1
Assigned To: Tristan Tarrant
Martin Gencur
:
Depends On:
Blocks: 1309749
  Show dependency treegraph
 
Reported: 2016-02-17 00:38 EST by Osamu Nagano
Modified: 2016-08-16 04:32 EDT (History)
5 users (show)

See Also:
Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed:
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---


Attachments (Terms of Use)
serverlogs.zip (78.34 KB, application/zip)
2016-02-17 00:38 EST, Osamu Nagano
no flags Details
eap640.zip (16.84 KB, application/zip)
2016-02-17 00:40 EST, Osamu Nagano
no flags Details
restcache.zip (10.71 KB, application/zip)
2016-02-17 00:40 EST, Osamu Nagano
no flags Details


External Trackers
Tracker ID Priority Status Summary Last Updated
JBoss Issue Tracker ISPN-4696 Blocker Resolved SuspectException kills operation on ClusterRegistryImpl 2017-07-06 00:57 EDT
Red Hat Knowledge Base (Solution) 2155791 None None None 2016-02-17 00:45 EST

  None (edit)
Description Osamu Nagano 2016-02-17 00:38:58 EST
Created attachment 1127846 [details]
serverlogs.zip

Description of problem:
GSS guided a customer to start one node first to be the coordinator, then start the rest of nodes to avoid confusion.  Though a lock timeout still happens when starting the rest of servers at once.  Specifically, registering a queried class to cache "__cluster_registry_cache__" throws TimeoutException.  It happens about once in 20 times startup with 1+4 nodes environment and more frequent in the 16 nodes customer's environment.

Sample stacktrace from server203.log in attachment "serverlogs.zip".
~~~
16:18:59,237 ERROR [org.infinispan.interceptors.InvocationContextInterceptor] (remote-thread-0,ontpx230:server203) ISPN000136: Error executing command PrepareCommand, writing keys []: org.infinispan.util.concurren
t.TimeoutException: Unable to acquire lock after [10 seconds] on key [ScopedKey{scope = QueryKnownClasses#DUMMY, key = class com.example.restcache.model.MyDataOne}] for requestor [GlobalTransaction:<ontpx230:serve
r204-12447>:8:remote]! Lock held by [GlobalTransaction:<ontpx230:server203-54882>:8:local]
        at org.infinispan.util.concurrent.locks.LockManagerImpl.lock(LockManagerImpl.java:198) [infinispan-embedded-6.4.0.Final-redhat-4.jar:6.4.0.Final-redhat-4]
        at org.infinispan.util.concurrent.locks.LockManagerImpl.acquireLock(LockManagerImpl.java:171) [infinispan-embedded-6.4.0.Final-redhat-4.jar:6.4.0.Final-redhat-4]
        at org.infinispan.interceptors.locking.AbstractTxLockingInterceptor.lockKeyAndCheckOwnership(AbstractTxLockingInterceptor.java:185) [infinispan-embedded-6.4.0.Final-redhat-4.jar:6.4.0.Final-redhat-4]
        at org.infinispan.interceptors.locking.AbstractTxLockingInterceptor.lockAndRegisterBackupLock(AbstractTxLockingInterceptor.java:116) [infinispan-embedded-6.4.0.Final-redhat-4.jar:6.4.0.Final-redhat-4]
        at org.infinispan.interceptors.locking.OptimisticLockingInterceptor$LockAcquisitionVisitor.lockAndRecord(OptimisticLockingInterceptor.java:228) [infinispan-embedded-6.4.0.Final-redhat-4.jar:6.4.0.Final-red
hat-4]
        at org.infinispan.interceptors.locking.OptimisticLockingInterceptor$LockAcquisitionVisitor.visitSingleKeyCommand(OptimisticLockingInterceptor.java:223) [infinispan-embedded-6.4.0.Final-redhat-4.jar:6.4.0.F
inal-redhat-4]
        at org.infinispan.interceptors.locking.OptimisticLockingInterceptor$LockAcquisitionVisitor.visitPutKeyValueCommand(OptimisticLockingInterceptor.java:216) [infinispan-embedded-6.4.0.Final-redhat-4.jar:6.4.0
.Final-redhat-4]
        at org.infinispan.commands.write.PutKeyValueCommand.acceptVisitor(PutKeyValueCommand.java:71) [infinispan-embedded-6.4.0.Final-redhat-4.jar:6.4.0.Final-redhat-4]
        at org.infinispan.interceptors.locking.OptimisticLockingInterceptor.acquireLocksVisitingCommands(OptimisticLockingInterceptor.java:287) [infinispan-embedded-6.4.0.Final-redhat-4.jar:6.4.0.Final-redhat-4]
        at org.infinispan.interceptors.locking.OptimisticLockingInterceptor.visitPrepareCommand(OptimisticLockingInterceptor.java:81) [infinispan-embedded-6.4.0.Final-redhat-4.jar:6.4.0.Final-redhat-4]
        at org.infinispan.commands.tx.PrepareCommand.acceptVisitor(PrepareCommand.java:127) [infinispan-embedded-6.4.0.Final-redhat-4.jar:6.4.0.Final-redhat-4]
        at org.infinispan.interceptors.base.CommandInterceptor.invokeNextInterceptor(CommandInterceptor.java:98) [infinispan-embedded-6.4.0.Final-redhat-4.jar:6.4.0.Final-redhat-4]
        at org.infinispan.interceptors.TxInterceptor.invokeNextInterceptorAndVerifyTransaction(TxInterceptor.java:134) [infinispan-embedded-6.4.0.Final-redhat-4.jar:6.4.0.Final-redhat-4]
        at org.infinispan.interceptors.TxInterceptor.visitPrepareCommand(TxInterceptor.java:121) [infinispan-embedded-6.4.0.Final-redhat-4.jar:6.4.0.Final-redhat-4]
        at org.infinispan.commands.tx.PrepareCommand.acceptVisitor(PrepareCommand.java:127) [infinispan-embedded-6.4.0.Final-redhat-4.jar:6.4.0.Final-redhat-4]
        at org.infinispan.interceptors.base.CommandInterceptor.invokeNextInterceptor(CommandInterceptor.java:98) [infinispan-embedded-6.4.0.Final-redhat-4.jar:6.4.0.Final-redhat-4]
        at org.infinispan.interceptors.base.CommandInterceptor.handleDefault(CommandInterceptor.java:112) [infinispan-embedded-6.4.0.Final-redhat-4.jar:6.4.0.Final-redhat-4]
        at org.infinispan.commands.AbstractVisitor.visitPrepareCommand(AbstractVisitor.java:117) [infinispan-embedded-6.4.0.Final-redhat-4.jar:6.4.0.Final-redhat-4]
        at org.infinispan.statetransfer.TransactionSynchronizerInterceptor.visitPrepareCommand(TransactionSynchronizerInterceptor.java:39) [infinispan-embedded-6.4.0.Final-redhat-4.jar:6.4.0.Final-redhat-4]
        at org.infinispan.commands.tx.PrepareCommand.acceptVisitor(PrepareCommand.java:127) [infinispan-embedded-6.4.0.Final-redhat-4.jar:6.4.0.Final-redhat-4]
        at org.infinispan.interceptors.base.CommandInterceptor.invokeNextInterceptor(CommandInterceptor.java:98) [infinispan-embedded-6.4.0.Final-redhat-4.jar:6.4.0.Final-redhat-4]
        at org.infinispan.statetransfer.StateTransferInterceptor.handleTxCommand(StateTransferInterceptor.java:215) [infinispan-embedded-6.4.0.Final-redhat-4.jar:6.4.0.Final-redhat-4]
        at org.infinispan.statetransfer.StateTransferInterceptor.visitPrepareCommand(StateTransferInterceptor.java:103) [infinispan-embedded-6.4.0.Final-redhat-4.jar:6.4.0.Final-redhat-4]
        at org.infinispan.commands.tx.PrepareCommand.acceptVisitor(PrepareCommand.java:127) [infinispan-embedded-6.4.0.Final-redhat-4.jar:6.4.0.Final-redhat-4]
        at org.infinispan.interceptors.base.CommandInterceptor.invokeNextInterceptor(CommandInterceptor.java:98) [infinispan-embedded-6.4.0.Final-redhat-4.jar:6.4.0.Final-redhat-4]
        at org.infinispan.interceptors.base.CommandInterceptor.handleDefault(CommandInterceptor.java:112) [infinispan-embedded-6.4.0.Final-redhat-4.jar:6.4.0.Final-redhat-4]
        at org.infinispan.commands.AbstractVisitor.visitPrepareCommand(AbstractVisitor.java:117) [infinispan-embedded-6.4.0.Final-redhat-4.jar:6.4.0.Final-redhat-4]
        at org.infinispan.commands.tx.PrepareCommand.acceptVisitor(PrepareCommand.java:127) [infinispan-embedded-6.4.0.Final-redhat-4.jar:6.4.0.Final-redhat-4]
        at org.infinispan.interceptors.base.CommandInterceptor.invokeNextInterceptor(CommandInterceptor.java:98) [infinispan-embedded-6.4.0.Final-redhat-4.jar:6.4.0.Final-redhat-4]
        at org.infinispan.interceptors.InvocationContextInterceptor.handleAll(InvocationContextInterceptor.java:105) [infinispan-embedded-6.4.0.Final-redhat-4.jar:6.4.0.Final-redhat-4]
        at org.infinispan.interceptors.InvocationContextInterceptor.handleDefault(InvocationContextInterceptor.java:74) [infinispan-embedded-6.4.0.Final-redhat-4.jar:6.4.0.Final-redhat-4]
        at org.infinispan.commands.AbstractVisitor.visitPrepareCommand(AbstractVisitor.java:117) [infinispan-embedded-6.4.0.Final-redhat-4.jar:6.4.0.Final-redhat-4]
        at org.infinispan.commands.tx.PrepareCommand.acceptVisitor(PrepareCommand.java:127) [infinispan-embedded-6.4.0.Final-redhat-4.jar:6.4.0.Final-redhat-4]
        at org.infinispan.interceptors.base.CommandInterceptor.invokeNextInterceptor(CommandInterceptor.java:98) [infinispan-embedded-6.4.0.Final-redhat-4.jar:6.4.0.Final-redhat-4]
        at org.infinispan.interceptors.BatchingInterceptor.handleDefault(BatchingInterceptor.java:58) [infinispan-embedded-6.4.0.Final-redhat-4.jar:6.4.0.Final-redhat-4]
        at org.infinispan.commands.AbstractVisitor.visitPrepareCommand(AbstractVisitor.java:117) [infinispan-embedded-6.4.0.Final-redhat-4.jar:6.4.0.Final-redhat-4]
        at org.infinispan.commands.tx.PrepareCommand.acceptVisitor(PrepareCommand.java:127) [infinispan-embedded-6.4.0.Final-redhat-4.jar:6.4.0.Final-redhat-4]
        at org.infinispan.interceptors.InterceptorChain.invoke(InterceptorChain.java:336) [infinispan-embedded-6.4.0.Final-redhat-4.jar:6.4.0.Final-redhat-4]
        at org.infinispan.commands.tx.PrepareCommand.perform(PrepareCommand.java:117) [infinispan-embedded-6.4.0.Final-redhat-4.jar:6.4.0.Final-redhat-4]
        at org.infinispan.remoting.InboundInvocationHandlerImpl.handleInternal(InboundInvocationHandlerImpl.java:92) [infinispan-embedded-6.4.0.Final-redhat-4.jar:6.4.0.Final-redhat-4]
        at org.infinispan.remoting.InboundInvocationHandlerImpl.access$000(InboundInvocationHandlerImpl.java:47) [infinispan-embedded-6.4.0.Final-redhat-4.jar:6.4.0.Final-redhat-4]
        at org.infinispan.remoting.InboundInvocationHandlerImpl$2.run(InboundInvocationHandlerImpl.java:185) [infinispan-embedded-6.4.0.Final-redhat-4.jar:6.4.0.Final-redhat-4]
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) [rt.jar:1.8.0_65]
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) [rt.jar:1.8.0_65]
        at java.lang.Thread.run(Thread.java:745) [rt.jar:1.8.0_65]
~~~


Version-Release number of selected component (if applicable):
JDG 6.6.0 library mode


How reproducible:
1/20 in 1+4 nodes environment.


Steps to Reproduce:
Sample app, restcache.zip, and EAP 6.4.0 domain are attached.  The sample app starts a cache named "DUMMY" and registers some @Indexed classes when it is deployed.

1. Start master host-controller which has one server to be the coordinator.
unzip eap640.zip
cd eap640
$JBOSS_HOME/bin/domain.sh -Djboss.domain.base.dir=./master --host-config=host-master.xml&

2. Start slave host-controller which has 4 servers.
(in the same "eap640" directory)
sudo ip addr add 127.0.1.1/8 dev lo label lo:1
$JBOSS_HOME/bin/domain.sh -Djboss.domain.base.dir=./slave --host-config=host-slave.xml --master-address=127.0.0.1 -b 127.0.1.1 -bmanagement=127.0.1.1&

3. Build and deploy the sample app.
unzip restcache.zip
cd restcache
mvn package
$JBOSS_HOME/bin/jboss-cli.sh -c "deploy /path/to/restcache.war --all-server-groups"

4. Repeat restarting the slave host-controller until reproducing the stacktrace.
To stop, $JBOSS_HOME/bin/jboss-cli.sh -c 'shutdown --host=<SLAVE_HC>'.
To start, the same as in step 2.
Comment 1 Osamu Nagano 2016-02-17 00:40 EST
Created attachment 1127847 [details]
eap640.zip
Comment 3 Osamu Nagano 2016-02-17 00:40 EST
Created attachment 1127848 [details]
restcache.zip
Comment 5 Tristan Tarrant 2016-02-17 08:28:37 EST
A possible fix would be to allow passing a custom lockAcquisitionTimeout to the ClusterRegistry via some system property.
We won't be using the ClusterRegistry in JDG 7 so this only applies to JDG 6.6.
Comment 13 Anna Manukyan 2016-08-16 04:32:10 EDT
Verified, thanks!

Note You need to log in before you can comment on or make changes to this bug.