Bug 1250571 - Remote query - Unable to reopen IndexReader exceptions in clustered environment
Status: CLOSED NOTABUG
Product: JBoss Data Grid 6
Classification: JBoss
Component: Infinispan
Version: 6.5.0, 6.6.0
Hardware/OS: Unspecified
Priority: unspecified
Severity: high
Assigned To: Tristan Tarrant
QA Contact: Martin Gencur
Reported: 2015-08-05 09:26 EDT by Matej Čimbora
Modified: 2015-10-05 12:55 EDT
CC: 3 users

Doc Type: Bug Fix
Last Closed: 2015-08-11 09:45:04 EDT
Type: Bug
Description Matej Čimbora 2015-08-05 09:26:05 EDT
Reproduced with jdg-6.5 & jdg-6.6 branches.

When performing queries on a cache whose index is stored within JDG, using multiple clustered server instances, the following exception messages appear in the client log:

WARN: ISPN004005: Error received from the server: org.hibernate.search.SearchException: Unable to reopen IndexReader
Exception in thread "Thread-6" org.infinispan.client.hotrod.exceptions.HotRodClientException:Request for message id[513] returned server error (status=0x85): org.hibernate.search.SearchException: Unable to reopen IndexReader
	at org.infinispan.client.hotrod.impl.protocol.Codec20.checkForErrorsInResponseStatus(Codec20.java:336)
	at org.infinispan.client.hotrod.impl.protocol.Codec20.readPartialHeader(Codec20.java:126)
	at org.infinispan.client.hotrod.impl.protocol.Codec20.readHeader(Codec20.java:112)
	at org.infinispan.client.hotrod.impl.operations.HotRodOperation.readHeaderAndValidate(HotRodOperation.java:56)
	at org.infinispan.client.hotrod.impl.operations.QueryOperation.executeOperation(QueryOperation.java:57)
	at org.infinispan.client.hotrod.impl.operations.QueryOperation.executeOperation(QueryOperation.java:24)
	at org.infinispan.client.hotrod.impl.operations.RetryOnFailureOperation.execute(RetryOnFailureOperation.java:52)
	at org.infinispan.client.hotrod.impl.query.RemoteQuery.executeQuery(RemoteQuery.java:72)
	at org.infinispan.client.hotrod.impl.query.RemoteQuery.list(RemoteQuery.java:62)
	at org.jboss.as.quickstarts.datagrid.hotrod.query.ReproducerManager$QueryThread.run(ReproducerManager.java:160)

The issue does not seem to occur when using a single server instance.

Created a reproducer based on the remote-query quickstart, available at https://github.com/mcimbora/jboss-jdg-quickstarts/commits/jdg-6.5.x.

Steps to reproduce:
1. Build remote-query quickstart using aforementioned reproducer
2. Add the following configuration snippet to the default "clustered.xml" configuration of the JDG server

<distributed-cache name="dist_lucene" mode="SYNC" start="EAGER">
    <locking acquire-timeout="30000"/>
    <indexing index="LOCAL">
        <property name="default.directory_provider">infinispan</property>
        <property name="default.indexmanager">org.infinispan.query.indexmanager.InfinispanIndexManager</property>
        <!--<property name="default.exclusive_index_use">true</property>-->
        <property name="default.metadata_cachename">lucene_metadata_repl"</property>
        <property name="default.data_cachename">lucene_data_dist</property>
        <property name="default.locking_cachename">lucene_locking_repl</property>
        <property name="hibernate.search.lucene_version">LUCENE_36</property>
    </indexing>
</distributed-cache>
<replicated-cache name="lucene_metadata_repl" mode="SYNC">
    <locking acquire-timeout="30000"/>
</replicated-cache>
<replicated-cache name="lucene_locking_repl" mode="SYNC">
    <locking acquire-timeout="30000"/>
</replicated-cache>
<distributed-cache name="lucene_data_dist" mode="SYNC">
    <locking acquire-timeout="30000"/>
</distributed-cache>
3. Start 2 JDG server instances using the bin/clustered.sh script.
4. Start the ReproducerManager.
Comment 3 Sanne Grinovero 2015-08-11 08:18:57 EDT
This exception is only the client-side one. We'd really need to know what happened on the server side.

Sorry, it doesn't ring a bell. The only wild guess I could make is that the index got corrupted, which in turn could be caused by Infinispan having lost some data, as would happen when multiple nodes shut down.

In such cases you can have it rebuild the index using the MassIndexer, invoked via JMX:
 - https://access.redhat.com/documentation/en-US/Red_Hat_JBoss_Data_Grid/6.5/html/Administration_and_Configuration_Guide/MassIndexer.html

I had a quick look at the reproducer; I'm wondering, how do you know it's reproducing the same problem?
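As a sketch of the JMX-based MassIndexer invocation mentioned above: the MBean domain, cache-manager name, cache name, service URL, and operation name below are assumptions to verify against your own server (e.g. by browsing the MBeans in JConsole), not values confirmed by this report.

```java
import javax.management.MBeanServerConnection;
import javax.management.ObjectName;
import javax.management.remote.JMXConnector;
import javax.management.remote.JMXConnectorFactory;
import javax.management.remote.JMXServiceURL;

public class MassIndexerInvoker {

    // Assumed MBean name pattern for the JDG server; the domain,
    // cache-manager name and cache name depend on your configuration.
    static ObjectName massIndexerName(String cacheManager, String cache) throws Exception {
        return new ObjectName(String.format(
            "jboss.infinispan:type=Query,manager=\"%s\",cache=\"%s\",component=MassIndexer",
            cacheManager, cache));
    }

    public static void main(String[] args) throws Exception {
        // Hypothetical host/port for the server's management endpoint.
        JMXServiceURL url = new JMXServiceURL("service:jmx:remoting-jmx://localhost:9999");
        try (JMXConnector connector = JMXConnectorFactory.connect(url)) {
            MBeanServerConnection conn = connector.getMBeanServerConnection();
            // Invoke the index-rebuild operation (assumed to be named "start")
            // on the MassIndexer MBean of the indexed cache.
            conn.invoke(massIndexerName("clustered", "dist_lucene"),
                        "start", new Object[0], new String[0]);
        }
    }
}
```

This only triggers a full rebuild of the index from the cache contents; it does not diagnose why the index became unreadable in the first place.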
Comment 4 Matej Čimbora 2015-08-11 08:27:58 EDT
Thanks Sanne. Originally I was hitting the issue in RadarGun tests, and I was trying to simulate my test by reusing the quickstart (the same exception). Not sure whether that answers your question; if not, could you elaborate? I'll try to get some server-side logs and update the issue.
Comment 5 Matej Čimbora 2015-08-11 09:45:04 EDT
OK, problem solved. The server-side log indicated a problem with the metadata cache; double-checking the server configuration revealed that the value of the default.metadata_cachename property contained an additional quotation mark.

15:10:43,374 DEBUG [org.infinispan.server.hotrod.CacheDecodeContext] (HotRodServerWorker-2) Exception caught: io.netty.handler.codec.DecoderException: org.infinispan.server.hotrod.HotRodException: org.hibernate.search.SearchException: Unable to reopen IndexReader
	at io.netty.handler.codec.ReplayingDecoder.callDecode(ReplayingDecoder.java:417) [netty-all-4.0.18.Final-redhat-1.jar:4.0.18.Final-redhat-1]
	at io.netty.handler.codec.ByteToMessageDecoder.channelRead(ByteToMessageDecoder.java:141) [netty-all-4.0.18.Final-redhat-1.jar:4.0.18.Final-redhat-1]
	at org.infinispan.server.hotrod.HotRodDecoder.org$infinispan$server$core$transport$StatsChannelHandler$$super$channelRead(HotRodDecoder.scala:31) [infinispan-server-hotrod-6.4.0.DR1-redhat-1.jar:6.4.0.DR1-redhat-1]
	at org.infinispan.server.core.transport.StatsChannelHandler$class.channelRead(StatsChannelHandler.scala:32) [infinispan-server-core-6.4.0.DR1-redhat-1.jar:6.4.0.DR1-redhat-1]
	at org.infinispan.server.hotrod.HotRodDecoder.channelRead(HotRodDecoder.scala:31) [infinispan-server-hotrod-6.4.0.DR1-redhat-1.jar:6.4.0.DR1-redhat-1]
	at io.netty.channel.DefaultChannelHandlerContext.invokeChannelRead(DefaultChannelHandlerContext.java:341) [netty-all-4.0.18.Final-redhat-1.jar:4.0.18.Final-redhat-1]
	at io.netty.channel.DefaultChannelHandlerContext.fireChannelRead(DefaultChannelHandlerContext.java:327) [netty-all-4.0.18.Final-redhat-1.jar:4.0.18.Final-redhat-1]
	at io.netty.channel.DefaultChannelPipeline.fireChannelRead(DefaultChannelPipeline.java:785) [netty-all-4.0.18.Final-redhat-1.jar:4.0.18.Final-redhat-1]
	at io.netty.channel.nio.AbstractNioByteChannel$NioByteUnsafe.read(AbstractNioByteChannel.java:116) [netty-all-4.0.18.Final-redhat-1.jar:4.0.18.Final-redhat-1]
	at io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:494) [netty-all-4.0.18.Final-redhat-1.jar:4.0.18.Final-redhat-1]
	at io.netty.channel.nio.NioEventLoop.processSelectedKeysOptimized(NioEventLoop.java:461) [netty-all-4.0.18.Final-redhat-1.jar:4.0.18.Final-redhat-1]
	at io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:378) [netty-all-4.0.18.Final-redhat-1.jar:4.0.18.Final-redhat-1]
	at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:350) [netty-all-4.0.18.Final-redhat-1.jar:4.0.18.Final-redhat-1]
	at io.netty.util.concurrent.SingleThreadEventExecutor$2.run(SingleThreadEventExecutor.java:116) [netty-all-4.0.18.Final-redhat-1.jar:4.0.18.Final-redhat-1]
	at java.lang.Thread.run(Thread.java:745) [rt.jar:1.8.0_25]
Caused by: org.infinispan.server.hotrod.HotRodException: org.hibernate.search.SearchException: Unable to reopen IndexReader
	at org.infinispan.server.hotrod.CacheDecodeContext.createServerException(CacheDecodeContext.scala:98) [infinispan-server-hotrod-6.4.0.DR1-redhat-1.jar:6.4.0.DR1-redhat-1]
	at org.infinispan.server.hotrod.HotRodDecoder.decode(HotRodDecoder.scala:55) [infinispan-server-hotrod-6.4.0.DR1-redhat-1.jar:6.4.0.DR1-redhat-1]
	at io.netty.handler.codec.ReplayingDecoder.callDecode(ReplayingDecoder.java:362) [netty-all-4.0.18.Final-redhat-1.jar:4.0.18.Final-redhat-1]
	... 14 more
Caused by: org.hibernate.search.SearchException: Unable to reopen IndexReader
	at org.hibernate.search.indexes.impl.SharingBufferReaderProvider$PerDirectoryLatestReader.refreshAndGet(SharingBufferReaderProvider.java:254)
	at org.hibernate.search.indexes.impl.SharingBufferReaderProvider.openIndexReader(SharingBufferReaderProvider.java:85)
	at org.hibernate.search.reader.impl.MultiReaderFactory.openReader(MultiReaderFactory.java:53)
	at org.hibernate.search.query.engine.impl.HSQueryImpl.buildSearcher(HSQueryImpl.java:635)
	at org.hibernate.search.query.engine.impl.HSQueryImpl.buildSearcher(HSQueryImpl.java:535)
	at org.hibernate.search.query.engine.impl.HSQueryImpl.queryEntityInfos(HSQueryImpl.java:265)
	at org.infinispan.query.impl.CacheQueryImpl.list(CacheQueryImpl.java:161)
	at org.infinispan.query.remote.QueryFacadeImpl.executeIndexedQuery(QueryFacadeImpl.java:181) [infinispan-remote-query-server-6.4.0.DR1-redhat-1.jar:6.4.0.DR1-redhat-1]
	at org.infinispan.query.remote.QueryFacadeImpl.query(QueryFacadeImpl.java:85) [infinispan-remote-query-server-6.4.0.DR1-redhat-1.jar:6.4.0.DR1-redhat-1]
	at org.infinispan.server.hotrod.Decoder2x$.customReadKey(Decoder2x.scala:350) [infinispan-server-hotrod-6.4.0.DR1-redhat-1.jar:6.4.0.DR1-redhat-1]
	at org.infinispan.server.hotrod.HotRodDecoder.customDecodeKey(HotRodDecoder.scala:196) [infinispan-server-hotrod-6.4.0.DR1-redhat-1.jar:6.4.0.DR1-redhat-1]
	at org.infinispan.server.hotrod.HotRodDecoder.org$infinispan$server$hotrod$HotRodDecoder$$decodeKey(HotRodDecoder.scala:106) [infinispan-server-hotrod-6.4.0.DR1-redhat-1.jar:6.4.0.DR1-redhat-1]
	at org.infinispan.server.hotrod.HotRodDecoder$$anonfun$decode$1.apply$mcV$sp(HotRodDecoder.scala:48) [infinispan-server-hotrod-6.4.0.DR1-redhat-1.jar:6.4.0.DR1-redhat-1]
	at org.infinispan.server.hotrod.HotRodDecoder.wrapSecurity(HotRodDecoder.scala:208) [infinispan-server-hotrod-6.4.0.DR1-redhat-1.jar:6.4.0.DR1-redhat-1]
	at org.infinispan.server.hotrod.HotRodDecoder.decode(HotRodDecoder.scala:45) [infinispan-server-hotrod-6.4.0.DR1-redhat-1.jar:6.4.0.DR1-redhat-1]
	... 15 more
Caused by: java.io.FileNotFoundException: Error loading metadata for index file: segments_dx|M|org.infinispan.query.remote.indexing.ProtobufValueWrapper
	at org.infinispan.lucene.impl.DirectoryImplementor.openInput(DirectoryImplementor.java:142) [infinispan-lucene-directory-6.4.0.DR1-redhat-1.jar:6.4.0.DR1-redhat-1]
	at org.infinispan.lucene.impl.DirectoryLuceneV3.openInput(DirectoryLuceneV3.java:133) [infinispan-lucene-directory-6.4.0.DR1-redhat-1.jar:6.4.0.DR1-redhat-1]
	at org.apache.lucene.index.SegmentInfos.read(SegmentInfos.java:265)
	at org.apache.lucene.index.SegmentInfos$1.doBody(SegmentInfos.java:363)
	at org.apache.lucene.index.SegmentInfos$FindSegmentsFile.run(SegmentInfos.java:709)
	at org.apache.lucene.index.SegmentInfos$FindSegmentsFile.run(SegmentInfos.java:554)
	at org.apache.lucene.index.SegmentInfos.read(SegmentInfos.java:359)
	at org.apache.lucene.index.SegmentInfos.readCurrentVersion(SegmentInfos.java:483)
	at org.apache.lucene.index.DirectoryReader.isCurrent(DirectoryReader.java:891)
	at org.apache.lucene.index.DirectoryReader.doOpenNoWriter(DirectoryReader.java:455)
	at org.apache.lucene.index.DirectoryReader.doOpenIfChanged(DirectoryReader.java:434)
	at org.apache.lucene.index.DirectoryReader.doOpenIfChanged(DirectoryReader.java:375)
	at org.apache.lucene.index.IndexReader.openIfChanged(IndexReader.java:508)
	at org.hibernate.search.indexes.impl.SharingBufferReaderProvider$PerDirectoryLatestReader.refreshAndGet(SharingBufferReaderProvider.java:251)
	... 29 more
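As implied by the comment above, the fix is simply removing the stray quotation mark from the property value in "clustered.xml":

```xml
<!-- incorrect (note the stray quote at the end of the value): -->
<property name="default.metadata_cachename">lucene_metadata_repl"</property>
<!-- corrected: -->
<property name="default.metadata_cachename">lucene_metadata_repl</property>
```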
Comment 6 Alan Field 2015-08-11 11:05:24 EDT
I'm not sure this isn't a bug! Isn't this invalid XML?:

<property name="default.metadata_cachename">lucene_metadata_repl"</property>

And shouldn't an error be thrown when the configuration is read?
Comment 7 Matej Čimbora 2015-08-11 11:14:40 EDT
Well, '"' is a valid XML character, unless used within an attribute value.
Comment 8 Alan Field 2015-08-11 11:27:36 EDT
Sure, Matej, but a single '"' character in a property value doesn't seem correct. A single ''' would need to be escaped as an apostrophe entity to be valid. I'm just wondering if this should fail fast with a better error message, because it's not the easiest thing to figure out from the exception you got!
Comment 9 Matej Čimbora 2015-08-12 07:23:11 EDT
Alan, that makes sense. I did some more investigation: if the cache (lucene_metadata_repl") is not among the pre-configured caches, each of the servers creates it as 'local'. This explains why the issue happens only in a clustered environment, since the metadata then cannot be shared across the cluster. Sanne, is it technically possible to check whether caches customized via properties (e.g. default.metadata_cachename) are defined in the configuration, and fail fast if not? Does that make sense to you?
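The fail-fast check proposed above could look roughly like the following sketch. The class name, method, and property list are hypothetical; a real implementation would hook into Infinispan's configuration parsing and use its configuration objects rather than the plain Properties and Set used here.

```java
import java.util.Properties;
import java.util.Set;

public class IndexCacheConfigValidator {

    // Properties of the Infinispan directory provider that name caches
    // which must be explicitly defined; otherwise the server silently
    // creates them as LOCAL caches, breaking clustered index sharing.
    static final String[] CACHE_PROPS = {
        "default.metadata_cachename",
        "default.data_cachename",
        "default.locking_cachename"
    };

    // Hypothetical fail-fast check: reject any indexing property that
    // references a cache name absent from the defined configurations.
    static void validate(Properties indexingProps, Set<String> definedCaches) {
        for (String prop : CACHE_PROPS) {
            String cacheName = indexingProps.getProperty(prop);
            if (cacheName != null && !definedCaches.contains(cacheName)) {
                throw new IllegalStateException(
                    "Property " + prop + " references undefined cache '"
                    + cacheName + "'");
            }
        }
    }
}
```

With a check like this, the stray quotation mark in the reproducer's configuration would have failed at startup with a message naming the bad cache name, instead of surfacing later as an "Unable to reopen IndexReader" query error.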
Comment 10 Sanne Grinovero 2015-10-05 12:55:43 EDT
Hi Matej,
sorry for the delay.
I agree it would be nice to improve on this, but technically, having an undefined cache could be valid, as the cache would inherit a (potentially valid) configuration from the default cache.

In last week's Infinispan team meeting we discussed removing the default cache (in a future major version). In that case this would have failed, as you'd need an explicit cache configuration to be defined for each cache usage. That would help; would you agree?
