Reproduced with the jdg-6.5 & jdg-6.6 branches. When performing queries on a cache whose index is stored within JDG, using multiple clustered server instances, the following exception appears in the client log:

WARN: ISPN004005: Error received from the server: org.hibernate.search.SearchException: Unable to reopen IndexReader
Exception in thread "Thread-6" org.infinispan.client.hotrod.exceptions.HotRodClientException:Request for message id[513] returned server error (status=0x85): org.hibernate.search.SearchException: Unable to reopen IndexReader
	at org.infinispan.client.hotrod.impl.protocol.Codec20.checkForErrorsInResponseStatus(Codec20.java:336)
	at org.infinispan.client.hotrod.impl.protocol.Codec20.readPartialHeader(Codec20.java:126)
	at org.infinispan.client.hotrod.impl.protocol.Codec20.readHeader(Codec20.java:112)
	at org.infinispan.client.hotrod.impl.operations.HotRodOperation.readHeaderAndValidate(HotRodOperation.java:56)
	at org.infinispan.client.hotrod.impl.operations.QueryOperation.executeOperation(QueryOperation.java:57)
	at org.infinispan.client.hotrod.impl.operations.QueryOperation.executeOperation(QueryOperation.java:24)
	at org.infinispan.client.hotrod.impl.operations.RetryOnFailureOperation.execute(RetryOnFailureOperation.java:52)
	at org.infinispan.client.hotrod.impl.query.RemoteQuery.executeQuery(RemoteQuery.java:72)
	at org.infinispan.client.hotrod.impl.query.RemoteQuery.list(RemoteQuery.java:62)
	at org.jboss.as.quickstarts.datagrid.hotrod.query.ReproducerManager$QueryThread.run(ReproducerManager.java:160)

The issue does not seem to occur when using a single server instance. I created a reproducer based on the remote-query quickstart, available at https://github.com/mcimbora/jboss-jdg-quickstarts/commits/jdg-6.5.x.

Steps to reproduce:

1. Build the remote-query quickstart from the aforementioned reproducer branch.
2. Add the following configuration snippet to the default "clustered.xml" configuration of the JDG server:

<distributed-cache name="dist_lucene" mode="SYNC" start="EAGER">
    <locking acquire-timeout="30000"/>
    <indexing index="LOCAL">
        <property name="default.directory_provider">infinispan</property>
        <property name="default.indexmanager">org.infinispan.query.indexmanager.InfinispanIndexManager</property>
        <!--<property name="default.exclusive_index_use">true</property>-->
        <property name="default.metadata_cachename">lucene_metadata_repl"</property>
        <property name="default.data_cachename">lucene_data_dist</property>
        <property name="default.locking_cachename">lucene_locking_repl</property>
        <property name="hibernate.search.lucene_version">LUCENE_36</property>
    </indexing>
</distributed-cache>
<replicated-cache name="lucene_metadata_repl" mode="SYNC">
    <locking acquire-timeout="30000"/>
</replicated-cache>
<replicated-cache name="lucene_locking_repl" mode="SYNC">
    <locking acquire-timeout="30000"/>
</replicated-cache>
<distributed-cache name="lucene_data_dist" mode="SYNC">
    <locking acquire-timeout="30000"/>
</distributed-cache>

3. Start 2 JDG server instances using the bin/clustered.sh script.
4. Start ReproducerManager.
That exception is only the client-side one; we'd really need to know what happened on the server side. Sorry, it doesn't ring a bell. The only wild guess I can make is that the index got corrupted, which in turn could be caused by Infinispan having lost some data, as can happen when multiple nodes shut down. In such cases you can have it rebuild the index using the MassIndexer, invoked via JMX:
- https://access.redhat.com/documentation/en-US/Red_Hat_JBoss_Data_Grid/6.5/html/Administration_and_Configuration_Guide/MassIndexer.html
I had a quick look at the reproducer; I'm wondering how you know it's reproducing the same problem?
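For reference, a minimal sketch of invoking the MassIndexer over JMX from Java. The exact ObjectName pattern (domain, key ordering, quoting) is an assumption based on the Infinispan query component naming and should be verified against your server's actual JMX tree (e.g. with JConsole); the MassIndexer MBean exposes a "start" operation that rebuilds the index.

```java
import javax.management.MBeanServerConnection;
import javax.management.MalformedObjectNameException;
import javax.management.ObjectName;
import javax.management.remote.JMXConnector;
import javax.management.remote.JMXConnectorFactory;
import javax.management.remote.JMXServiceURL;

public class MassIndexerClient {

    // Builds the JMX ObjectName of the MassIndexer component for a given cache.
    // NOTE: the exact name pattern is an assumption; check your server's JMX tree.
    static ObjectName massIndexerName(String cacheManager, String cache)
            throws MalformedObjectNameException {
        return new ObjectName(String.format(
                "jboss.infinispan:type=Query,manager=\"%s\",cache=\"%s\",component=MassIndexer",
                cacheManager, cache));
    }

    // Connects to the server over JMX and invokes the MassIndexer's "start"
    // operation to rebuild the Lucene index from the cache contents.
    static void rebuildIndex(String jmxUrl, String cacheManager, String cache) throws Exception {
        try (JMXConnector connector = JMXConnectorFactory.connect(new JMXServiceURL(jmxUrl))) {
            MBeanServerConnection conn = connector.getMBeanServerConnection();
            conn.invoke(massIndexerName(cacheManager, cache), "start",
                    new Object[0], new String[0]);
        }
    }
}
```

Rebuilding the index is a workaround for corruption, not a fix for whatever caused it, so it is mostly useful to confirm the corruption hypothesis.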
Thanks Sanne. Originally I was hitting the issue in RadarGun tests, and I tried to simulate my test by reusing the quickstart (it produces the same exception). Not sure whether that answers your question; if not, could you elaborate? I'll try to get some server-side logs and update the issue.
OK, problem solved. The server-side log indicated a problem with the metadata cache; double-checking the server configuration revealed that the value of the default.metadata_cachename property contained an additional quotation mark.

15:10:43,374 DEBUG [org.infinispan.server.hotrod.CacheDecodeContext] (HotRodServerWorker-2) Exception caught: io.netty.handler.codec.DecoderException: org.infinispan.server.hotrod.HotRodException: org.hibernate.search.SearchException: Unable to reopen IndexReader
	at io.netty.handler.codec.ReplayingDecoder.callDecode(ReplayingDecoder.java:417) [netty-all-4.0.18.Final-redhat-1.jar:4.0.18.Final-redhat-1]
	at io.netty.handler.codec.ByteToMessageDecoder.channelRead(ByteToMessageDecoder.java:141) [netty-all-4.0.18.Final-redhat-1.jar:4.0.18.Final-redhat-1]
	at org.infinispan.server.hotrod.HotRodDecoder.org$infinispan$server$core$transport$StatsChannelHandler$$super$channelRead(HotRodDecoder.scala:31) [infinispan-server-hotrod-6.4.0.DR1-redhat-1.jar:6.4.0.DR1-redhat-1]
	at org.infinispan.server.core.transport.StatsChannelHandler$class.channelRead(StatsChannelHandler.scala:32) [infinispan-server-core-6.4.0.DR1-redhat-1.jar:6.4.0.DR1-redhat-1]
	at org.infinispan.server.hotrod.HotRodDecoder.channelRead(HotRodDecoder.scala:31) [infinispan-server-hotrod-6.4.0.DR1-redhat-1.jar:6.4.0.DR1-redhat-1]
	at io.netty.channel.DefaultChannelHandlerContext.invokeChannelRead(DefaultChannelHandlerContext.java:341) [netty-all-4.0.18.Final-redhat-1.jar:4.0.18.Final-redhat-1]
	at io.netty.channel.DefaultChannelHandlerContext.fireChannelRead(DefaultChannelHandlerContext.java:327) [netty-all-4.0.18.Final-redhat-1.jar:4.0.18.Final-redhat-1]
	at io.netty.channel.DefaultChannelPipeline.fireChannelRead(DefaultChannelPipeline.java:785) [netty-all-4.0.18.Final-redhat-1.jar:4.0.18.Final-redhat-1]
	at io.netty.channel.nio.AbstractNioByteChannel$NioByteUnsafe.read(AbstractNioByteChannel.java:116) [netty-all-4.0.18.Final-redhat-1.jar:4.0.18.Final-redhat-1]
	at io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:494) [netty-all-4.0.18.Final-redhat-1.jar:4.0.18.Final-redhat-1]
	at io.netty.channel.nio.NioEventLoop.processSelectedKeysOptimized(NioEventLoop.java:461) [netty-all-4.0.18.Final-redhat-1.jar:4.0.18.Final-redhat-1]
	at io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:378) [netty-all-4.0.18.Final-redhat-1.jar:4.0.18.Final-redhat-1]
	at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:350) [netty-all-4.0.18.Final-redhat-1.jar:4.0.18.Final-redhat-1]
	at io.netty.util.concurrent.SingleThreadEventExecutor$2.run(SingleThreadEventExecutor.java:116) [netty-all-4.0.18.Final-redhat-1.jar:4.0.18.Final-redhat-1]
	at java.lang.Thread.run(Thread.java:745) [rt.jar:1.8.0_25]
Caused by: org.infinispan.server.hotrod.HotRodException: org.hibernate.search.SearchException: Unable to reopen IndexReader
	at org.infinispan.server.hotrod.CacheDecodeContext.createServerException(CacheDecodeContext.scala:98) [infinispan-server-hotrod-6.4.0.DR1-redhat-1.jar:6.4.0.DR1-redhat-1]
	at org.infinispan.server.hotrod.HotRodDecoder.decode(HotRodDecoder.scala:55) [infinispan-server-hotrod-6.4.0.DR1-redhat-1.jar:6.4.0.DR1-redhat-1]
	at io.netty.handler.codec.ReplayingDecoder.callDecode(ReplayingDecoder.java:362) [netty-all-4.0.18.Final-redhat-1.jar:4.0.18.Final-redhat-1]
	... 14 more
Caused by: org.hibernate.search.SearchException: Unable to reopen IndexReader
	at org.hibernate.search.indexes.impl.SharingBufferReaderProvider$PerDirectoryLatestReader.refreshAndGet(SharingBufferReaderProvider.java:254)
	at org.hibernate.search.indexes.impl.SharingBufferReaderProvider.openIndexReader(SharingBufferReaderProvider.java:85)
	at org.hibernate.search.reader.impl.MultiReaderFactory.openReader(MultiReaderFactory.java:53)
	at org.hibernate.search.query.engine.impl.HSQueryImpl.buildSearcher(HSQueryImpl.java:635)
	at org.hibernate.search.query.engine.impl.HSQueryImpl.buildSearcher(HSQueryImpl.java:535)
	at org.hibernate.search.query.engine.impl.HSQueryImpl.queryEntityInfos(HSQueryImpl.java:265)
	at org.infinispan.query.impl.CacheQueryImpl.list(CacheQueryImpl.java:161)
	at org.infinispan.query.remote.QueryFacadeImpl.executeIndexedQuery(QueryFacadeImpl.java:181) [infinispan-remote-query-server-6.4.0.DR1-redhat-1.jar:6.4.0.DR1-redhat-1]
	at org.infinispan.query.remote.QueryFacadeImpl.query(QueryFacadeImpl.java:85) [infinispan-remote-query-server-6.4.0.DR1-redhat-1.jar:6.4.0.DR1-redhat-1]
	at org.infinispan.server.hotrod.Decoder2x$.customReadKey(Decoder2x.scala:350) [infinispan-server-hotrod-6.4.0.DR1-redhat-1.jar:6.4.0.DR1-redhat-1]
	at org.infinispan.server.hotrod.HotRodDecoder.customDecodeKey(HotRodDecoder.scala:196) [infinispan-server-hotrod-6.4.0.DR1-redhat-1.jar:6.4.0.DR1-redhat-1]
	at org.infinispan.server.hotrod.HotRodDecoder.org$infinispan$server$hotrod$HotRodDecoder$$decodeKey(HotRodDecoder.scala:106) [infinispan-server-hotrod-6.4.0.DR1-redhat-1.jar:6.4.0.DR1-redhat-1]
	at org.infinispan.server.hotrod.HotRodDecoder$$anonfun$decode$1.apply$mcV$sp(HotRodDecoder.scala:48) [infinispan-server-hotrod-6.4.0.DR1-redhat-1.jar:6.4.0.DR1-redhat-1]
	at org.infinispan.server.hotrod.HotRodDecoder.wrapSecurity(HotRodDecoder.scala:208) [infinispan-server-hotrod-6.4.0.DR1-redhat-1.jar:6.4.0.DR1-redhat-1]
	at org.infinispan.server.hotrod.HotRodDecoder.decode(HotRodDecoder.scala:45) [infinispan-server-hotrod-6.4.0.DR1-redhat-1.jar:6.4.0.DR1-redhat-1]
	... 15 more
Caused by: java.io.FileNotFoundException: Error loading metadata for index file: segments_dx|M|org.infinispan.query.remote.indexing.ProtobufValueWrapper
	at org.infinispan.lucene.impl.DirectoryImplementor.openInput(DirectoryImplementor.java:142) [infinispan-lucene-directory-6.4.0.DR1-redhat-1.jar:6.4.0.DR1-redhat-1]
	at org.infinispan.lucene.impl.DirectoryLuceneV3.openInput(DirectoryLuceneV3.java:133) [infinispan-lucene-directory-6.4.0.DR1-redhat-1.jar:6.4.0.DR1-redhat-1]
	at org.apache.lucene.index.SegmentInfos.read(SegmentInfos.java:265)
	at org.apache.lucene.index.SegmentInfos$1.doBody(SegmentInfos.java:363)
	at org.apache.lucene.index.SegmentInfos$FindSegmentsFile.run(SegmentInfos.java:709)
	at org.apache.lucene.index.SegmentInfos$FindSegmentsFile.run(SegmentInfos.java:554)
	at org.apache.lucene.index.SegmentInfos.read(SegmentInfos.java:359)
	at org.apache.lucene.index.SegmentInfos.readCurrentVersion(SegmentInfos.java:483)
	at org.apache.lucene.index.DirectoryReader.isCurrent(DirectoryReader.java:891)
	at org.apache.lucene.index.DirectoryReader.doOpenNoWriter(DirectoryReader.java:455)
	at org.apache.lucene.index.DirectoryReader.doOpenIfChanged(DirectoryReader.java:434)
	at org.apache.lucene.index.DirectoryReader.doOpenIfChanged(DirectoryReader.java:375)
	at org.apache.lucene.index.IndexReader.openIfChanged(IndexReader.java:508)
	at org.hibernate.search.indexes.impl.SharingBufferReaderProvider$PerDirectoryLatestReader.refreshAndGet(SharingBufferReaderProvider.java:251)
	... 29 more
I'm not sure this isn't a bug! Isn't this invalid XML?

<property name="default.metadata_cachename">lucene_metadata_repl"</property>

And shouldn't an error be thrown when the configuration is read?
Well, '"' is a valid XML character in element content; it only needs escaping when used within an attribute value.
Sure Matej, but a single '"' character in a property value doesn't seem correct; similarly, a single ''' would need to be escaped as &apos; in some contexts to be valid. I'm just wondering if this should fail fast with a better error message, because it's not the easiest thing to figure out from the exception you got!
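To illustrate why no parse error is raised: a literal '"' in element content is well-formed XML, so a standard parser accepts the misconfigured snippet and the stray quote silently becomes part of the cache name. A small demonstration using the JDK's DOM parser (class and method names here are just for illustration):

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

public class QuoteInXml {

    // Parses the snippet and returns the text content of the <property> element.
    static String parseCacheName(String xml) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
        return doc.getDocumentElement().getTextContent();
    }

    public static void main(String[] args) throws Exception {
        String xml = "<property name=\"default.metadata_cachename\">"
                   + "lucene_metadata_repl\"</property>";
        // Parses without error: the '"' is well-formed in element content,
        // so the configured cache name ends with a literal quote character.
        System.out.println(parseCacheName(xml)); // lucene_metadata_repl"
    }
}
```

So the XML layer cannot catch this; any fail-fast check would have to happen at the semantic level, when the indexing properties are interpreted.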
Alan, that makes sense. I did some more investigation: if the cache (lucene_metadata_repl") is not among the pre-configured caches, each server creates it as 'local'. This explains why the issue happens only in a clustered environment: the metadata cannot be shared across the cluster. Sanne, is it technically possible to check whether the caches referenced via properties (e.g. default.metadata_cachename) are defined in the configuration, and fail fast if not? Does that make sense to you?
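The proposed fail-fast check could be as simple as comparing the cache names referenced by the indexing properties against the set of explicitly defined caches. A minimal sketch (a hypothetical helper, not an actual Infinispan API):

```java
import java.util.Set;

public class IndexCacheValidator {

    // Hypothetical fail-fast check: every cache referenced by the indexing
    // properties (metadata, data, locking) must be explicitly defined in the
    // server configuration. Otherwise each node would silently create it with
    // a default local configuration, breaking clustered index sharing.
    static void validateIndexCaches(Set<String> definedCaches, String... referencedCaches) {
        for (String name : referencedCaches) {
            if (!definedCaches.contains(name)) {
                throw new IllegalStateException("Cache '" + name
                        + "' is referenced by the indexing configuration but not defined;"
                        + " it would be created as a local cache on each node");
            }
        }
    }
}
```

With the misconfiguration above, the referenced name lucene_metadata_repl" (with the trailing quote) would not match the defined lucene_metadata_repl, so the server would fail at startup with a pointed message instead of corrupting the index at query time.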
Hi Matej, sorry for the delay. I agree it would be nice to improve on this, but technically having a non-defined cache could be valid, as the cache would inherit a (potentially valid) configuration from the default cache. In last week's Infinispan team meeting we discussed removing the default cache (in a future major version); in that case this would have failed, as you'd need an explicit cache configuration to be defined for each cache in use. That would help, would you agree?