Bug 1134476 - InfinispanIndexManager locks held when primary node change
Summary: InfinispanIndexManager locks held when primary node change
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: JBoss Data Grid 6
Classification: JBoss
Component: Performance
Version: 6.3.0
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: unspecified
Target Milestone: CR1
Target Release: 6.3.1
Assignee: Sanne Grinovero
QA Contact: Martin Gencur
URL:
Whiteboard:
Depends On:
Blocks:
 
Reported: 2014-08-27 14:34 UTC by Alan Field
Modified: 2015-01-26 14:03 UTC
CC List: 4 users

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2015-01-26 14:03:28 UTC
Type: Bug
Embargoed:




Links
System ID Private Priority Status Summary Last Updated
Red Hat Issue Tracker ISPN-4599 0 Critical Resolved InfinispanIndexManager locks held when primary node change 2018-11-19 01:27:48 UTC

Description Alan Field 2014-08-27 14:34:26 UTC
Description of problem:

Tracking bug for ISPN-4599


Comment 2 Radim Vansa 2014-09-08 16:01:01 UTC
I am still getting the exceptions described in ISPN-4599 from the test.

According to Sanne, this is related to https://hibernate.atlassian.net/browse/HSEARCH-1665

The workaround is to start one node, initialize the index (by writing a single entry into the cache), and then continue starting the other nodes. This helped, but I got tons of these exceptions afterwards:

11:39:49,633 ERROR [org.hibernate.search.exception.impl.LogErrorHandler] (Hibernate Search: Index updates queue processor for index org.radargun.query.NumberObject-1) HSEARCH000058: HSEARCH000117: IOException on the IndexWriter
java.io.FileNotFoundException: No sub-file with id .tis found (fileName=_11l.cfs files: [.tii, .fnm, .frq, .fdt, .nrm, .prx, .fdx])
	at org.apache.lucene.index.CompoundFileReader.openInput(CompoundFileReader.java:157)
	at org.apache.lucene.index.TermInfosReader.<init>(TermInfosReader.java:104)
	at org.apache.lucene.index.SegmentCoreReaders.<init>(SegmentCoreReaders.java:83)
	at org.apache.lucene.index.SegmentReader.get(SegmentReader.java:116)
	at org.apache.lucene.index.IndexWriter$ReaderPool.get(IndexWriter.java:696)
	at org.apache.lucene.index.IndexWriter$ReaderPool.get(IndexWriter.java:671)
	at org.apache.lucene.index.BufferedDeletesStream.applyDeletes(BufferedDeletesStream.java:200)
	at org.apache.lucene.index.IndexWriter.doFlush(IndexWriter.java:3615)
	at org.apache.lucene.index.IndexWriter.prepareCommit(IndexWriter.java:3376)
	at org.apache.lucene.index.IndexWriter.commitInternal(IndexWriter.java:3485)
	at org.apache.lucene.index.IndexWriter.commit(IndexWriter.java:3467)
	at org.apache.lucene.index.IndexWriter.commit(IndexWriter.java:3451)
	at org.hibernate.search.backend.impl.lucene.IndexWriterHolder.commitIndexWriter(IndexWriterHolder.java:158)
	at org.hibernate.search.backend.impl.lucene.IndexWriterHolder.commitIndexWriter(IndexWriterHolder.java:171)
	at org.hibernate.search.backend.impl.lucene.ExclusiveIndexWorkspaceImpl.afterTransactionApplied(ExclusiveIndexWorkspaceImpl.java:45)
	at org.hibernate.search.backend.impl.lucene.LuceneBackendQueueTask.applyUpdates(LuceneBackendQueueTask.java:124)
	at org.hibernate.search.backend.impl.lucene.LuceneBackendQueueTask.run(LuceneBackendQueueTask.java:67)
	at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
	at java.util.concurrent.FutureTask.run(FutureTask.java:262)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
	at java.lang.Thread.run(Thread.java:745)
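
The workaround above (start one node alone, write one entry so the index gets created, then start the remaining nodes) could be sketched roughly as below. This is a hedged illustration, not the actual RadarGun test code: the configuration file name, the cache name "indexedCache", and the NumberObject entity (taken from the log's index name) are all assumptions.

```java
// Sketch of the startup-ordering workaround; assumes an indexed cache
// named "indexedCache" is defined in infinispan.xml, and that NumberObject
// is the indexed entity class seen in the logs above.
import org.infinispan.Cache;
import org.infinispan.manager.DefaultCacheManager;
import org.infinispan.manager.EmbeddedCacheManager;

public class IndexWarmup {
    public static void main(String[] args) throws Exception {
        // 1. Start the first node by itself.
        EmbeddedCacheManager firstNode = new DefaultCacheManager("infinispan.xml");
        Cache<String, Object> cache = firstNode.getCache("indexedCache");

        // 2. Write a single entry so Hibernate Search initializes the
        //    index while this node is still the only member.
        cache.put("warmup", new NumberObject(0));

        // 3. Only after this completes, start the other nodes
        //    (in their own JVMs/hosts, as in the original test).
    }
}
```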

Comment 3 Sanne Grinovero 2014-09-09 10:05:06 UTC
Did you have some TimeoutException before that? Or a partition?

The error means that the data found in Infinispan is not consistent with what we wrote before, so there was data loss.

Comment 4 Radim Vansa 2014-09-09 10:59:33 UTC
Yes, looking at the timestamps, loading the entries timed out several times on the other nodes:

11:36:18,482 ERROR [org.infinispan.interceptors.InvocationContextInterceptor] (Loader-7) ISPN000136: Execution error
org.infinispan.util.concurrent.TimeoutException: Node edg-perf01-27017 timed out
	at org.infinispan.remoting.transport.jgroups.CommandAwareRpcDispatcher.invokeRemoteCommand(CommandAwareRpcDispatcher.java:174)
	at org.infinispan.remoting.transport.jgroups.JGroupsTransport.invokeRemotely(JGroupsTransport.java:523)
	at org.infinispan.remoting.rpc.RpcManagerImpl.invokeRemotely(RpcManagerImpl.java:281)
	at org.infinispan.query.indexmanager.RemoteIndexingBackend.sendCommand(RemoteIndexingBackend.java:63)
	at org.infinispan.query.indexmanager.RemoteIndexingBackend.applyWork(RemoteIndexingBackend.java:54)
	at org.infinispan.query.indexmanager.InfinispanBackendQueueProcessor.applyWork(InfinispanBackendQueueProcessor.java:79)
	at org.hibernate.search.indexes.impl.DirectoryBasedIndexManager.performOperations(DirectoryBasedIndexManager.java:126)
	at org.hibernate.search.backend.impl.WorkQueuePerIndexSplitter.commitOperations(WorkQueuePerIndexSplitter.java:63)
	at org.hibernate.search.backend.impl.BatchedQueueingProcessor.performWorks(BatchedQueueingProcessor.java:99)
	at org.hibernate.search.backend.impl.TransactionalWorker.performWork(TransactionalWorker.java:105)
	at org.infinispan.query.backend.QueryInterceptor.performSearchWorks(QueryInterceptor.java:229)
	at org.infinispan.query.backend.QueryInterceptor.performSearchWork(QueryInterceptor.java:223)
	at org.infinispan.query.backend.QueryInterceptor.updateIndexes(QueryInterceptor.java:217)
	at org.infinispan.query.backend.QueryInterceptor.processPutKeyValueCommand(QueryInterceptor.java:435)
	at org.infinispan.query.backend.QueryInterceptor.visitPutKeyValueCommand(QueryInterceptor.java:158)
	at org.infinispan.commands.write.PutKeyValueCommand.acceptVisitor(PutKeyValueCommand.java:71)
	at org.infinispan.interceptors.base.CommandInterceptor.invokeNextInterceptor(CommandInterceptor.java:98)
	at org.infinispan.interceptors.locking.AbstractLockingInterceptor.visitPutKeyValueCommand(AbstractLockingInterceptor.java:46)
	at org.infinispan.commands.write.PutKeyValueCommand.acceptVisitor(PutKeyValueCommand.java:71)
	at org.infinispan.interceptors.base.CommandInterceptor.invokeNextInterceptor(CommandInterceptor.java:98)
	at org.infinispan.interceptors.base.CommandInterceptor.handleDefault(CommandInterceptor.java:112)
	at org.infinispan.commands.AbstractVisitor.visitPutKeyValueCommand(AbstractVisitor.java:32)
	at org.infinispan.commands.write.PutKeyValueCommand.acceptVisitor(PutKeyValueCommand.java:71)
	at org.infinispan.interceptors.base.CommandInterceptor.invokeNextInterceptor(CommandInterceptor.java:98)
	at org.infinispan.interceptors.base.CommandInterceptor.handleDefault(CommandInterceptor.java:112)
	at org.infinispan.commands.AbstractVisitor.visitPutKeyValueCommand(AbstractVisitor.java:32)
	at org.infinispan.commands.write.PutKeyValueCommand.acceptVisitor(PutKeyValueCommand.java:71)
	at org.infinispan.interceptors.base.CommandInterceptor.invokeNextInterceptor(CommandInterceptor.java:98)
	at org.infinispan.statetransfer.StateTransferInterceptor.handleNonTxWriteCommand(StateTransferInterceptor.java:178)
	at org.infinispan.statetransfer.StateTransferInterceptor.visitPutKeyValueCommand(StateTransferInterceptor.java:101)
	at org.infinispan.commands.write.PutKeyValueCommand.acceptVisitor(PutKeyValueCommand.java:71)
	at org.infinispan.interceptors.base.CommandInterceptor.invokeNextInterceptor(CommandInterceptor.java:98)
	at org.infinispan.interceptors.CacheMgmtInterceptor.updateStoreStatistics(CacheMgmtInterceptor.java:148)
	at org.infinispan.interceptors.CacheMgmtInterceptor.visitPutKeyValueCommand(CacheMgmtInterceptor.java:134)
	at org.infinispan.commands.write.PutKeyValueCommand.acceptVisitor(PutKeyValueCommand.java:71)
	at org.infinispan.interceptors.base.CommandInterceptor.invokeNextInterceptor(CommandInterceptor.java:98)
	at org.infinispan.interceptors.compat.BaseTypeConverterInterceptor.visitPutKeyValueCommand(BaseTypeConverterInterceptor.java:63)
	at org.infinispan.commands.write.PutKeyValueCommand.acceptVisitor(PutKeyValueCommand.java:71)
	at org.infinispan.interceptors.base.CommandInterceptor.invokeNextInterceptor(CommandInterceptor.java:98)
	at org.infinispan.interceptors.InvocationContextInterceptor.handleAll(InvocationContextInterceptor.java:110)
	at org.infinispan.interceptors.InvocationContextInterceptor.handleDefault(InvocationContextInterceptor.java:73)
	at org.infinispan.commands.AbstractVisitor.visitPutKeyValueCommand(AbstractVisitor.java:32)
	at org.infinispan.commands.write.PutKeyValueCommand.acceptVisitor(PutKeyValueCommand.java:71)
	at org.infinispan.interceptors.InterceptorChain.invoke(InterceptorChain.java:333)
	at org.infinispan.CacheImpl.executeCommandAndCommitIfNeeded(CacheImpl.java:1434)
	at org.infinispan.CacheImpl.putInternal(CacheImpl.java:912)
	at org.infinispan.CacheImpl.put(CacheImpl.java:904)
	at org.infinispan.DecoratedCache.put(DecoratedCache.java:405)
	at org.radargun.service.InfinispanOperations$Cache.put(InfinispanOperations.java:51)
	at org.radargun.stages.cache.test.LoadDataStage$Loader.run(LoadDataStage.java:176)
Caused by: org.jgroups.TimeoutException: timeout sending message to edg-perf01-27017
	at org.jgroups.blocks.MessageDispatcher.sendMessage(MessageDispatcher.java:419)
	at org.infinispan.remoting.transport.jgroups.CommandAwareRpcDispatcher.processSingleCall(CommandAwareRpcDispatcher.java:349)
	at org.infinispan.remoting.transport.jgroups.CommandAwareRpcDispatcher.invokeRemoteCommand(CommandAwareRpcDispatcher.java:167)
	... 49 more

Comment 5 Radim Vansa 2014-09-10 08:05:13 UTC
(To explicitly answer the second question:) no, there was no partition.

Should I provide any more info?

Comment 6 Sanne Grinovero 2014-09-10 10:58:30 UTC
I think you're hitting various other issues. Do you know what caused the timeout? I'm not sure how this relates to ISPN-4599.

BTW I've fixed the Search issue and deployed a fixed snapshot build.

Comment 7 Radim Vansa 2014-09-11 17:17:42 UTC
The index manager issue seems to be resolved in 6.3.1.CR1 (no workarounds needed anymore).

Moving further problems during indexing to https://bugzilla.redhat.com/show_bug.cgi?id=1140790

