Description of problem:

Running multiple clustered JPP 6.2.0 nodes on Windows 2012 leads to the following error on the first node after the second node has come up. Everything is fine until instance #2 starts up. That instance is able to get the index data from the coordinator (node #1):

---------------
08:49:46,353 INFO [exo.jcr.component.core.MultiIndex] (MSC service thread 1-4) Setting index OFFLINE (repository/repository_pc-system)
08:49:46,369 INFO [exo.jcr.component.core.MultiIndex] (MSC service thread 1-4) Retrieving index from coordinator (repository/repository_pc-system)...
08:49:46,493 INFO [exo.jcr.component.core.MultiIndex] (MSC service thread 1-4) Setting index ONLINE (repository/repository_pc-system)
---------------

We can see the corresponding activity logged on node #1:

---------------
08:49:46,369 INFO [exo.jcr.component.core.MultiIndex] (Incoming-1,172.24.44.249:56201) Setting index OFFLINE (repository/repository_pc-system)
08:49:46,400 INFO [exo.jcr.component.core.MultiIndex] (Incoming-1,172.24.44.249:56201) Setting index ONLINE (repository/repository_pc-system)
---------------

However, a few seconds later node #1 fails to write the new IndexInfo to disk:

---------------
08:49:56,491 ERROR [exo.jcr.component.core.MultiIndex] (MultiIndex Flush Timer) Unable to commit volatile index: java.io.IOException: Cannot delete C:\tmp\TESTCLUSTER\1\jboss-portal-6.2\standalone\data\gatein\jcr\lucene\portal-system_portal\indexes
    at org.apache.lucene.store.FSDirectory.deleteFile(FSDirectory.java:296) [lucene-core-3.5.0.jar:3.5.0 1204988 - simon - 2011-11-22 14:46:51]
    at org.exoplatform.services.jcr.impl.core.query.lucene.IndexInfos$2.run(IndexInfos.java:197) [exo.jcr.component.core-1.15.11-GA-redhat-1.jar:1.15.11-GA-redhat-1]
    at org.exoplatform.commons.utils.SecurityHelper.doPrivilegedExceptionAction(SecurityHelper.java:310) [exo.kernel.commons-2.4.11-GA-redhat-1.jar:2.4.11-GA-redhat-1]
    at org.exoplatform.commons.utils.SecurityHelper.doPrivilegedIOExceptionAction(SecurityHelper.java:57) [exo.kernel.commons-2.4.11-GA-redhat-1.jar:2.4.11-GA-redhat-1]
    at org.exoplatform.services.jcr.impl.core.query.lucene.IndexInfos.write(IndexInfos.java:166) [exo.jcr.component.core-1.15.11-GA-redhat-1.jar:1.15.11-GA-redhat-1]
    at org.exoplatform.services.jcr.impl.core.query.lucene.MultiIndex$9.run(MultiIndex.java:1692) [exo.jcr.component.core-1.15.11-GA-redhat-1.jar:1.15.11-GA-redhat-1]
    at org.exoplatform.services.jcr.impl.core.query.lucene.MultiIndex$9.run(MultiIndex.java:1662) [exo.jcr.component.core-1.15.11-GA-redhat-1.jar:1.15.11-GA-redhat-1]
    at org.exoplatform.commons.utils.SecurityHelper.doPrivilegedExceptionAction(SecurityHelper.java:310) [exo.kernel.commons-2.4.11-GA-redhat-1.jar:2.4.11-GA-redhat-1]
    at org.exoplatform.commons.utils.SecurityHelper.doPrivilegedIOExceptionAction(SecurityHelper.java:57) [exo.kernel.commons-2.4.11-GA-redhat-1.jar:2.4.11-GA-redhat-1]
    at org.exoplatform.services.jcr.impl.core.query.lucene.MultiIndex.flush(MultiIndex.java:1661) [exo.jcr.component.core-1.15.11-GA-redhat-1.jar:1.15.11-GA-redhat-1]
    at org.exoplatform.services.jcr.impl.core.query.lucene.MultiIndex.checkFlush(MultiIndex.java:2250) [exo.jcr.component.core-1.15.11-GA-redhat-1.jar:1.15.11-GA-redhat-1]
    at org.exoplatform.services.jcr.impl.core.query.lucene.MultiIndex.access$2100(MultiIndex.java:107) [exo.jcr.component.core-1.15.11-GA-redhat-1.jar:1.15.11-GA-redhat-1]
    at org.exoplatform.services.jcr.impl.core.query.lucene.MultiIndex$10.run(MultiIndex.java:1798) [exo.jcr.component.core-1.15.11-GA-redhat-1.jar:1.15.11-GA-redhat-1]
    at java.util.TimerThread.mainLoop(Timer.java:555) [rt.jar:1.8.0_45]
    at java.util.TimerThread.run(Timer.java:505) [rt.jar:1.8.0_45]
---------------

The node is not able to recover from this error.

Version-Release number of selected component (if applicable):

How reproducible:
always

Steps to Reproduce:
1. Copy the server directories:
   copy standalone standalone1
   copy standalone standalone2
   copy standalone standalone3
2. Start the H2 db:
   java -cp modules\system\layers\base\com\h2database\h2\main\h2-1.3.168.redhat-4.jar org.h2.tools.Server
3. Start the three instances (wait until each one has started up successfully):
   .\bin\standalone.bat -c standalone-ha.xml -b 127.0.0.1 -u 230.0.0.4 -D"jboss.server.base.dir=standalone1" -D"jboss.node.name=node1" -D"jboss.socket.binding.port-offset=100" -D"gatein.jgroups.udp.bind_port=56201"
   .\bin\standalone.bat -c standalone-ha.xml -b 127.0.0.1 -u 230.0.0.4 -D"jboss.server.base.dir=standalone2" -D"jboss.node.name=node2" -D"jboss.socket.binding.port-offset=200" -D"gatein.jgroups.udp.bind_port=56202"
   .\bin\standalone.bat -c standalone-ha.xml -b 127.0.0.1 -u 230.0.0.4 -D"jboss.server.base.dir=standalone3" -D"jboss.node.name=node3" -D"jboss.socket.binding.port-offset=300" -D"gatein.jgroups.udp.bind_port=56203"

Actual results:
Errors when node #2 or #3 is starting up

Expected results:
No errors

Additional info:
The same scenario works fine in a Linux environment.
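
One possible explanation for the Windows-only behaviour, not confirmed by anything in this report, is that the "Cannot delete" IOException from FSDirectory.deleteFile appears to be raised when a plain File.delete() call returns false, and on Windows that typically happens while another stream still holds the file open. The standalone check below is only a diagnostic sketch under that assumption. The class name DeleteWhileOpen and its helper method are hypothetical, and it does not touch any eXo JCR or Lucene code.

```java
import java.io.File;
import java.io.FileInputStream;
import java.io.IOException;
import java.nio.file.Files;

public class DeleteWhileOpen {

    // Hypothetical helper: tries File.delete() while a FileInputStream on the
    // same file is still open, and returns whether the delete succeeded.
    static boolean deleteWhileOpen(File file) throws IOException {
        try (FileInputStream in = new FileInputStream(file)) {
            in.read(); // read once so the handle is actually in use
            return file.delete();
        }
    }

    public static void main(String[] args) throws IOException {
        // A throwaway temp file stands in for the "indexes" file from the log.
        File file = File.createTempFile("indexes", ".tmp");
        Files.write(file.toPath(), new byte[] {1, 2, 3});

        boolean deleted = deleteWhileOpen(file);
        System.out.println("os.name = " + System.getProperty("os.name"));
        System.out.println("deleted while open = " + deleted);

        // Clean up if the delete was refused while the handle was open.
        if (!deleted) {
            Files.deleteIfExists(file.toPath());
        }
    }
}
```

Running this on both the Windows 2012 host and a Linux host shows whether the two platforms differ in deleting a file that is still open. If they do, it would be worth checking whether something on node #1 still has the indexes file open after the index has been served to the joining node.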
I tried to use different implementations for the Lucene FSDirectory class, but to no avail:

set "JAVA_OPTS=%JAVA_OPTS% -Dorg.exoplatform.jcr.lucene.FSDirectory.class=org.apache.lucene.store.MMapDirectory"

and

set "JAVA_OPTS=%JAVA_OPTS% -Dorg.exoplatform.jcr.lucene.FSDirectory.class=org.apache.lucene.store.NIOFSDirectory"
https://access.redhat.com/jbossnetwork/restricted/softwareDetail.html?softwareId=40551&product=jbportal&version=6.2.0&downloadType=patches