Bug 903453 - Remoting "Read timed out" when starting multiple servers on a host of single CPU
Summary: Remoting "Read timed out" when starting multiple servers on a host of single CPU
Keywords:
Status: CLOSED NEXTRELEASE
Alias: None
Product: JBoss Enterprise Application Platform 6
Classification: JBoss
Component: jbossas
Version: 6.0.1
Hardware: All
OS: All
Priority: unspecified
Severity: high
Target Milestone: CR1
Target Release: EAP 6.1.0
Assignee: Fernando Nasser
QA Contact:
URL:
Whiteboard:
Depends On:
Blocks: 903472
Reported: 2013-01-24 02:40 UTC by Osamu Nagano
Modified: 2018-12-01 17:36 UTC

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2013-01-28 02:59:27 UTC
Type: Bug
Embargoed:


Links
System ID: Red Hat Issue Tracker AS7-6230
Private: 0
Priority: Major
Status: Resolved
Summary: Remoting "Read timed out" when starting multiple servers on a host
Last Updated: 2018-04-02 06:43:19 UTC

Description Osamu Nagano 2013-01-24 02:40:21 UTC
Description of problem:
On a machine with a single CPU or only a few CPUs, launching many servers (about 8 or more) at the same time, as happens at domain start-up, results in 'org.xnio.channels.ReadTimeoutException: Read timed out'.  In EAP 6.0.0, this can be avoided by setting longer timeouts via the 'jboss.host.server.connection.timeout' and 'jboss.host.domain.connection.timeout' system properties.  However, those timeout settings have no effect in EAP 6.0.1 because the DC<->HC communication mechanism changed.

How reproducible:
Always in the customer's environment.

Steps to Reproduce:
1. In EAP 6.0.1 running in domain mode, define several servers (about 10) in host.xml.
2. Start the domain.  All managed servers start at the same time.
3. The following exception occurs in the host controller.
  
Actual results:
[Host Controller] 16:15:58,036 ERROR [org.jboss.remoting.remote.connection] (Remoting "nceaptint03:MANAGEMENT" read-1) JBREM000200: Remote connection failed: org.xnio.channels.ReadTimeoutException: Read timed out
[Server:ib-demo-server1-group2] 16:16:00,236 ERROR [org.jboss.remoting.remote.connection] (Remoting "int03master:ib-demo-server1-group2:MANAGEMENT" read-1) JBREM000200: Remote connection failed: java.io.IOException: JBREM000201: Received invalid message on Remoting connection 111bda67 to nceaptint03/172.16.139.60:9999
[Server:ib-demo-server1-group2] 16:16:00,704 ERROR [org.jboss.msc.service.fail] (MSC service thread 1-1) MSC00001: Failed to start service jboss.host.controller.client: org.jboss.msc.service.StartException in service jboss.host.controller.client: java.net.ConnectException: JBAS012174: Could not connect to remote://172.16.139.60:9999. The connection failed
[Server:ib-demo-server1-group2] 	at org.jboss.as.server.mgmt.domain.HostControllerServerClient.start(HostControllerServerClient.java:172) [jboss-as-server-7.1.3.Final.jar:7.1.3.Final]
[Server:ib-demo-server1-group2] 	at org.jboss.msc.service.ServiceControllerImpl$StartTask.startService(ServiceControllerImpl.java:1811) [jboss-msc-1.0.2.GA.jar:1.0.2.GA]
[Server:ib-demo-server1-group2] 	at org.jboss.msc.service.ServiceControllerImpl$StartTask.run(ServiceControllerImpl.java:1746) [jboss-msc-1.0.2.GA.jar:1.0.2.GA]
[Server:ib-demo-server1-group2] 	at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886) [rt.jar:1.6.0_30]
[Server:ib-demo-server1-group2] 	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908) [rt.jar:1.6.0_30]
[Server:ib-demo-server1-group2] 	at java.lang.Thread.run(Thread.java:662) [rt.jar:1.6.0_30]
[Server:ib-demo-server1-group2] Caused by: java.net.ConnectException: JBAS012174: Could not connect to remote://172.16.139.60:9999. The connection failed
[Server:ib-demo-server1-group2] 	at org.jboss.as.protocol.ProtocolConnectionUtils.connectSync(ProtocolConnectionUtils.java:118) [jboss-as-protocol-7.1.3.Final.jar:7.1.3.Final]
[Server:ib-demo-server1-group2] 	at org.jboss.as.protocol.ProtocolChannelClient.connectSync(ProtocolChannelClient.java:84) [jboss-as-protocol-7.1.3.Final.jar:7.1.3.Final]
[Server:ib-demo-server1-group2] 	at org.jboss.as.server.mgmt.domain.HostControllerServerConnection.openChannel(HostControllerServerConnection.java:158) [jboss-as-server-7.1.3.Final.jar:7.1.3.Final]
[Server:ib-demo-server1-group2] 	at org.jboss.as.server.mgmt.domain.HostControllerServerConnection.connect(HostControllerServerConnection.java:86) [jboss-as-server-7.1.3.Final.jar:7.1.3.Final]
[Server:ib-demo-server1-group2] 	at org.jboss.as.server.mgmt.domain.HostControllerServerClient.start(HostControllerServerClient.java:148) [jboss-as-server-7.1.3.Final.jar:7.1.3.Final]
[Server:ib-demo-server1-group2] 	... 5 more
[Server:ib-demo-server1-group2] Caused by: java.io.IOException: JBREM000201: Received invalid message on Remoting connection 111bda67 to nceaptint03/172.16.139.60:9999
[Server:ib-demo-server1-group2] 	at org.jboss.remoting3.remote.ClientConnectionOpenListener$Capabilities.handleEvent(ClientConnectionOpenListener.java:424) [jboss-remoting-3.2.8.SP1.jar:3.2.8.SP1]
[Server:ib-demo-server1-group2] 	at org.jboss.remoting3.remote.ClientConnectionOpenListener$Capabilities.handleEvent(ClientConnectionOpenListener.java:226) [jboss-remoting-3.2.8.SP1.jar:3.2.8.SP1]
[Server:ib-demo-server1-group2] 	at org.xnio.ChannelListeners.invokeChannelListener(ChannelListeners.java:72) [xnio-api-3.0.6.GA.jar:3.0.6.GA]
[Server:ib-demo-server1-group2] 	at org.xnio.channels.TranslatingSuspendableChannel.handleReadable(TranslatingSuspendableChannel.java:189) [xnio-api-3.0.6.GA.jar:3.0.6.GA]
[Server:ib-demo-server1-group2] 	at org.xnio.channels.TranslatingSuspendableChannel$1.handleEvent(TranslatingSuspendableChannel.java:103) [xnio-api-3.0.6.GA.jar:3.0.6.GA]
[Server:ib-demo-server1-group2] 	at org.xnio.ChannelListeners.invokeChannelListener(ChannelListeners.java:72) [xnio-api-3.0.6.GA.jar:3.0.6.GA]
[Server:ib-demo-server1-group2] 	at org.xnio.channels.TranslatingSuspendableChannel.handleReadable(TranslatingSuspendableChannel.java:189) [xnio-api-3.0.6.GA.jar:3.0.6.GA]
[Server:ib-demo-server1-group2] 	at org.xnio.ssl.JsseConnectedSslStreamChannel.handleReadable(JsseConnectedSslStreamChannel.java:180) [xnio-api-3.0.6.GA.jar:3.0.6.GA]
[Server:ib-demo-server1-group2] 	at org.xnio.channels.TranslatingSuspendableChannel$1.handleEvent(TranslatingSuspendableChannel.java:103) [xnio-api-3.0.6.GA.jar:3.0.6.GA]
[Server:ib-demo-server1-group2] 	at org.xnio.ChannelListeners.invokeChannelListener(ChannelListeners.java:72) [xnio-api-3.0.6.GA.jar:3.0.6.GA]
[Server:ib-demo-server1-group2] 	at org.xnio.nio.NioHandle.run(NioHandle.java:90)
[Server:ib-demo-server1-group2] 	at org.xnio.nio.WorkerThread.run(WorkerThread.java:187)
[Server:ib-demo-server1-group2] 	at ...asynchronous invocation...(Unknown Source)
[Server:ib-demo-server1-group2] 	at org.jboss.remoting3.EndpointImpl.doConnect(EndpointImpl.java:270) [jboss-remoting-3.2.8.SP1.jar:3.2.8.SP1]
[Server:ib-demo-server1-group2] 	at org.jboss.remoting3.EndpointImpl.doConnect(EndpointImpl.java:251) [jboss-remoting-3.2.8.SP1.jar:3.2.8.SP1]
[Server:ib-demo-server1-group2] 	at org.jboss.remoting3.EndpointImpl.connect(EndpointImpl.java:349) [jboss-remoting-3.2.8.SP1.jar:3.2.8.SP1]
[Server:ib-demo-server1-group2] 	at org.jboss.remoting3.EndpointImpl.connect(EndpointImpl.java:337) [jboss-remoting-3.2.8.SP1.jar:3.2.8.SP1]
[Server:ib-demo-server1-group2] 	at org.jboss.as.protocol.ProtocolConnectionUtils.connect(ProtocolConnectionUtils.java:74) [jboss-as-protocol-7.1.3.Final.jar:7.1.3.Final]
[Server:ib-demo-server1-group2] 	at org.jboss.as.protocol.ProtocolConnectionUtils.connectSync(ProtocolConnectionUtils.java:88) [jboss-as-protocol-7.1.3.Final.jar:7.1.3.Final]
[Server:ib-demo-server1-group2] 	... 9 more

Expected results:
All managed servers should start normally regardless of the number of servers.

Comment 1 Osamu Nagano 2013-01-24 02:53:31 UTC
In addition to case 00756264, the customer and a Red Hat developer discussed this issue on the community thread [1].  The developer then introduced a new system property, 'org.jboss.as.host.start.servers.sequential', in the code [2]; it starts the servers sequentially to avoid the exception.  This feature works well for the customer, and I identified the 6 commits from his branch that need to be applied on EAP_6.0.1.GA.

pick 793ba23 [AS7-5556] Where we convert to a String ensure we set the charset to UTF-8 for both processes.
pick 25bfd87 Fix wrong unmarshalling order of process inventory data.
pick d3cdd74 [AS7-5887] reconnect servers automatically
pick f3e0b72 add managed server std.in state
pick 1ed0175 [AS7-6230] wait until the managed server opens it's mgmt channel by default
pick 2d956ec [AS7-6230] add a blocking start property for the host-controller

[1] https://community.jboss.org/thread/215769
[2] https://github.com/jbossas/jboss-as/pull/3794
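
To illustrate the idea behind sequential start, here is a minimal Java sketch. It is not the code from the pull request: only the property name 'org.jboss.as.host.start.servers.sequential' comes from this report, while ManagedServer, FakeServer, startAll and the timeout values are hypothetical names and numbers chosen for the example. In sequential mode the launcher waits until one server reports that its management channel is open before it launches the next one, so only one server is booting at any time.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

public class SequentialStartSketch {

    // Property name taken from this report; everything else is illustrative.
    static final String SEQUENTIAL_PROP = "org.jboss.as.host.start.servers.sequential";

    /** Hypothetical stand-in for a managed server process. */
    interface ManagedServer {
        /** Begins start-up without blocking; counts the latch down once the mgmt channel is open. */
        void start(CountDownLatch channelOpen);
    }

    /**
     * Starts every server and returns how many were started. In sequential mode,
     * waits for each server's channel before launching the next one; otherwise
     * launches all of them first and waits for the channels afterwards.
     */
    static int startAll(List<ManagedServer> servers, boolean sequential, long timeoutMillis)
            throws InterruptedException {
        List<CountDownLatch> pending = new ArrayList<>();
        int started = 0;
        for (ManagedServer server : servers) {
            CountDownLatch channelOpen = new CountDownLatch(1);
            server.start(channelOpen);
            if (sequential) {
                awaitOrFail(channelOpen, timeoutMillis);
            } else {
                pending.add(channelOpen);
            }
            started++;
        }
        for (CountDownLatch channelOpen : pending) {
            awaitOrFail(channelOpen, timeoutMillis);
        }
        return started;
    }

    private static void awaitOrFail(CountDownLatch latch, long timeoutMillis)
            throws InterruptedException {
        if (!latch.await(timeoutMillis, TimeUnit.MILLISECONDS)) {
            throw new IllegalStateException("management channel did not open in time");
        }
    }

    /** Fake server that takes a fixed time to "boot" and records how many boots overlap. */
    static final class FakeServer implements ManagedServer {
        private final AtomicInteger inFlight;
        private final AtomicInteger peak;
        private final long bootMillis;

        FakeServer(AtomicInteger inFlight, AtomicInteger peak, long bootMillis) {
            this.inFlight = inFlight;
            this.peak = peak;
            this.bootMillis = bootMillis;
        }

        @Override
        public void start(CountDownLatch channelOpen) {
            // Record the highest number of servers booting at the same time.
            peak.accumulateAndGet(inFlight.incrementAndGet(), Math::max);
            new Thread(() -> {
                try {
                    Thread.sleep(bootMillis);
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
                inFlight.decrementAndGet();
                channelOpen.countDown();
            }).start();
        }
    }

    public static void main(String[] args) throws InterruptedException {
        // Defaults to false when the property is not set on the command line.
        boolean sequential = Boolean.getBoolean(SEQUENTIAL_PROP);
        AtomicInteger inFlight = new AtomicInteger();
        AtomicInteger peak = new AtomicInteger();
        List<ManagedServer> servers = new ArrayList<>();
        for (int i = 0; i < 10; i++) {
            servers.add(new FakeServer(inFlight, peak, 20));
        }
        int started = startAll(servers, sequential, 5000);
        System.out.println("started=" + started + " peakConcurrentBoots=" + peak.get());
    }
}
```

Running the sketch with -Dorg.jboss.as.host.start.servers.sequential=true keeps the peak number of overlapping boots at 1. Without the property, all ten fake servers are launched before any channel is awaited, which corresponds to the simultaneous start described in this report.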

Comment 2 Osamu Nagano 2013-01-28 02:59:27 UTC
The fix is already included upstream via AS7-6230, in 7.2.0.CR1, so I am closing this bug as targeted to EAP 6.1.0.CR1 as well.

