Description of problem:
On a machine with one or only a few CPUs, launching many servers (about 8 or more) at the same time, as happens at start-up, results in 'org.xnio.channels.ReadTimeoutException: Read timed out'. In EAP 6.0.0, this can be avoided by setting longer time-outs via the 'jboss.host.server.connection.timeout' and 'jboss.host.domain.connection.timeout' system properties. However, those time-out settings no longer take effect in EAP 6.0.1 because the mechanism for DC<->HC communication changed.

How reproducible:
Always in the customer's environment.

Steps to Reproduce:
1. In domain mode of EAP 6.0.1, define a number of servers (about 10) in host.xml.
2. Start the domain. All managed servers will start at the same time.
3. The following exception occurs in the host controller.

Actual results:
[Host Controller] 16:15:58,036 ERROR [org.jboss.remoting.remote.connection] (Remoting "nceaptint03:MANAGEMENT" read-1) JBREM000200: Remote connection failed: org.xnio.channels.ReadTimeoutException: Read timed out
[Server:ib-demo-server1-group2] 16:16:00,236 ERROR [org.jboss.remoting.remote.connection] (Remoting "int03master:ib-demo-server1-group2:MANAGEMENT" read-1) JBREM000200: Remote connection failed: java.io.IOException: JBREM000201: Received invalid message on Remoting connection 111bda67 to nceaptint03/172.16.139.60:9999
[Server:ib-demo-server1-group2] 16:16:00,704 ERROR [org.jboss.msc.service.fail] (MSC service thread 1-1) MSC00001: Failed to start service jboss.host.controller.client: org.jboss.msc.service.StartException in service jboss.host.controller.client: java.net.ConnectException: JBAS012174: Could not connect to remote://172.16.139.60:9999.
The connection failed
[Server:ib-demo-server1-group2] at org.jboss.as.server.mgmt.domain.HostControllerServerClient.start(HostControllerServerClient.java:172) [jboss-as-server-7.1.3.Final.jar:7.1.3.Final]
[Server:ib-demo-server1-group2] at org.jboss.msc.service.ServiceControllerImpl$StartTask.startService(ServiceControllerImpl.java:1811) [jboss-msc-1.0.2.GA.jar:1.0.2.GA]
[Server:ib-demo-server1-group2] at org.jboss.msc.service.ServiceControllerImpl$StartTask.run(ServiceControllerImpl.java:1746) [jboss-msc-1.0.2.GA.jar:1.0.2.GA]
[Server:ib-demo-server1-group2] at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886) [rt.jar:1.6.0_30]
[Server:ib-demo-server1-group2] at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908) [rt.jar:1.6.0_30]
[Server:ib-demo-server1-group2] at java.lang.Thread.run(Thread.java:662) [rt.jar:1.6.0_30]
[Server:ib-demo-server1-group2] Caused by: java.net.ConnectException: JBAS012174: Could not connect to remote://172.16.139.60:9999. The connection failed
[Server:ib-demo-server1-group2] at org.jboss.as.protocol.ProtocolConnectionUtils.connectSync(ProtocolConnectionUtils.java:118) [jboss-as-protocol-7.1.3.Final.jar:7.1.3.Final]
[Server:ib-demo-server1-group2] at org.jboss.as.protocol.ProtocolChannelClient.connectSync(ProtocolChannelClient.java:84) [jboss-as-protocol-7.1.3.Final.jar:7.1.3.Final]
[Server:ib-demo-server1-group2] at org.jboss.as.server.mgmt.domain.HostControllerServerConnection.openChannel(HostControllerServerConnection.java:158) [jboss-as-server-7.1.3.Final.jar:7.1.3.Final]
[Server:ib-demo-server1-group2] at org.jboss.as.server.mgmt.domain.HostControllerServerConnection.connect(HostControllerServerConnection.java:86) [jboss-as-server-7.1.3.Final.jar:7.1.3.Final]
[Server:ib-demo-server1-group2] at org.jboss.as.server.mgmt.domain.HostControllerServerClient.start(HostControllerServerClient.java:148) [jboss-as-server-7.1.3.Final.jar:7.1.3.Final]
[Server:ib-demo-server1-group2] ... 5 more
[Server:ib-demo-server1-group2] Caused by: java.io.IOException: JBREM000201: Received invalid message on Remoting connection 111bda67 to nceaptint03/172.16.139.60:9999
[Server:ib-demo-server1-group2] at org.jboss.remoting3.remote.ClientConnectionOpenListener$Capabilities.handleEvent(ClientConnectionOpenListener.java:424) [jboss-remoting-3.2.8.SP1.jar:3.2.8.SP1]
[Server:ib-demo-server1-group2] at org.jboss.remoting3.remote.ClientConnectionOpenListener$Capabilities.handleEvent(ClientConnectionOpenListener.java:226) [jboss-remoting-3.2.8.SP1.jar:3.2.8.SP1]
[Server:ib-demo-server1-group2] at org.xnio.ChannelListeners.invokeChannelListener(ChannelListeners.java:72) [xnio-api-3.0.6.GA.jar:3.0.6.GA]
[Server:ib-demo-server1-group2] at org.xnio.channels.TranslatingSuspendableChannel.handleReadable(TranslatingSuspendableChannel.java:189) [xnio-api-3.0.6.GA.jar:3.0.6.GA]
[Server:ib-demo-server1-group2] at org.xnio.channels.TranslatingSuspendableChannel$1.handleEvent(TranslatingSuspendableChannel.java:103) [xnio-api-3.0.6.GA.jar:3.0.6.GA]
[Server:ib-demo-server1-group2] at org.xnio.ChannelListeners.invokeChannelListener(ChannelListeners.java:72) [xnio-api-3.0.6.GA.jar:3.0.6.GA]
[Server:ib-demo-server1-group2] at org.xnio.channels.TranslatingSuspendableChannel.handleReadable(TranslatingSuspendableChannel.java:189) [xnio-api-3.0.6.GA.jar:3.0.6.GA]
[Server:ib-demo-server1-group2] at org.xnio.ssl.JsseConnectedSslStreamChannel.handleReadable(JsseConnectedSslStreamChannel.java:180) [xnio-api-3.0.6.GA.jar:3.0.6.GA]
[Server:ib-demo-server1-group2] at org.xnio.channels.TranslatingSuspendableChannel$1.handleEvent(TranslatingSuspendableChannel.java:103) [xnio-api-3.0.6.GA.jar:3.0.6.GA]
[Server:ib-demo-server1-group2] at org.xnio.ChannelListeners.invokeChannelListener(ChannelListeners.java:72) [xnio-api-3.0.6.GA.jar:3.0.6.GA]
[Server:ib-demo-server1-group2] at org.xnio.nio.NioHandle.run(NioHandle.java:90)
[Server:ib-demo-server1-group2] at org.xnio.nio.WorkerThread.run(WorkerThread.java:187)
[Server:ib-demo-server1-group2] at ...asynchronous invocation...(Unknown Source)
[Server:ib-demo-server1-group2] at org.jboss.remoting3.EndpointImpl.doConnect(EndpointImpl.java:270) [jboss-remoting-3.2.8.SP1.jar:3.2.8.SP1]
[Server:ib-demo-server1-group2] at org.jboss.remoting3.EndpointImpl.doConnect(EndpointImpl.java:251) [jboss-remoting-3.2.8.SP1.jar:3.2.8.SP1]
[Server:ib-demo-server1-group2] at org.jboss.remoting3.EndpointImpl.connect(EndpointImpl.java:349) [jboss-remoting-3.2.8.SP1.jar:3.2.8.SP1]
[Server:ib-demo-server1-group2] at org.jboss.remoting3.EndpointImpl.connect(EndpointImpl.java:337) [jboss-remoting-3.2.8.SP1.jar:3.2.8.SP1]
[Server:ib-demo-server1-group2] at org.jboss.as.protocol.ProtocolConnectionUtils.connect(ProtocolConnectionUtils.java:74) [jboss-as-protocol-7.1.3.Final.jar:7.1.3.Final]
[Server:ib-demo-server1-group2] at org.jboss.as.protocol.ProtocolConnectionUtils.connectSync(ProtocolConnectionUtils.java:88) [jboss-as-protocol-7.1.3.Final.jar:7.1.3.Final]
[Server:ib-demo-server1-group2] ... 9 more

Expected results:
All managed servers should start normally regardless of the number of servers.
In addition to case 00756264, the customer and a Red Hat developer discussed this issue on the community thread [1]. The developer introduced a new system property, 'org.jboss.as.host.start.servers.sequential', in the code [2]; it starts the servers sequentially, which avoids the exception. This feature works well for the customer, and I identified the 6 commits from his branch that are needed to apply it on EAP_6.0.1.GA:

pick 793ba23 [AS7-5556] Where we convert to a String ensure we set the charset to UTF-8 for both processes.
pick 25bfd87 Fix wrong unmarshalling order of process inventory data.
pick d3cdd74 [AS7-5887] reconnect servers automatically
pick f3e0b72 add managed server std.in state
pick 1ed0175 [AS7-6230] wait until the managed server opens it's mgmt channel by default
pick 2d956ec [AS7-6230] add a blocking start property for the host-controller

[1] https://community.jboss.org/thread/215769
[2] https://github.com/jbossas/jboss-as/pull/3794
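
To illustrate what sequential start-up means in practice, here is a minimal Java sketch. It is not the host controller's actual code: the class SequentialStartSketch, the ServerStarter interface and the server names are hypothetical, and treating 'org.jboss.as.host.start.servers.sequential' as a true/false flag read with Boolean.getBoolean is an assumption. Only the property name itself comes from this report.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical illustration only; not taken from the AS7/EAP code base.
class SequentialStartSketch {

    // Property name from this report; boolean semantics are an assumption.
    static final String SEQUENTIAL_PROP = "org.jboss.as.host.start.servers.sequential";

    /** Stand-in for "launch one managed server and wait for its mgmt channel". */
    interface ServerStarter {
        void startAndAwaitChannel(String serverName);
    }

    static void startServers(List<String> names, ServerStarter starter, boolean sequential) {
        if (sequential) {
            // One server at a time: the next start begins only after the
            // previous call returns, so start-ups never compete for the CPU.
            for (String name : names) {
                starter.startAndAwaitChannel(name);
            }
            return;
        }
        // Concurrent start: all servers are launched at once, which is the
        // situation that led to the read time-outs on small machines.
        ExecutorService pool = Executors.newFixedThreadPool(Math.max(1, names.size()));
        try {
            List<Future<?>> pending = new ArrayList<>();
            for (String name : names) {
                pending.add(pool.submit(() -> starter.startAndAwaitChannel(name)));
            }
            for (Future<?> f : pending) {
                f.get(); // wait for every start to finish
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } catch (ExecutionException e) {
            throw new IllegalStateException(e.getCause());
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        // True only when the JVM was launched with -D<property>=true.
        boolean sequential = Boolean.getBoolean(SEQUENTIAL_PROP);
        List<String> started = Collections.synchronizedList(new ArrayList<String>());
        startServers(Arrays.asList("server-one", "server-two", "server-three"),
                started::add, sequential);
        System.out.println("sequential=" + sequential + " started=" + started);
    }
}
```

In this sketch the flag would be switched on by passing -Dorg.jboss.as.host.start.servers.sequential=true to the JVM. Whether the real host controller expects exactly that value is not confirmed here; see [2] for the actual change.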
The fix is already included upstream in 7.2.0.CR1 via AS7-6230, so I am closing this as targeted to EAP 6.1.0.CR1 as well.