Description of problem:
If a client with an open JNDI context is disconnected from the network, a clean shutdown (Ctrl-C) of the server takes 15 minutes. This scenario occurs when the network connection is lost between the server and JMS clients that hold a JNDI context.

Version-Release number of selected component (if applicable):
jboss-remoting-3.3.3.Final-redhat-1.jar

How reproducible:
Always

Steps to Reproduce:
1. Start EAP 6.3.1.CP.CR1 on the first machine.
2. Start a client that creates a JNDI context on the second machine (use the attached JNDIContext.java).
3. Disconnect the network between client and server.
4. Try to cleanly shut down the EAP 6.3.1.CP.CR1 server (with Ctrl-C).

Actual results:
The server takes 15 minutes to shut down.

Expected results:
The server should shut down almost immediately.
Created attachment 935307: JNDIContext.java
Created attachment 938062: jboss-remoting test case
Stack trace:

19:03:52,217 TRACE [org.jboss.remoting.remote.connection] (Remoting "localhost" read-1) Connection error detail: java.io.IOException: Expiró el tiempo de conexión
    at sun.nio.ch.FileDispatcherImpl.read0(Native Method) [rt.jar:1.7.0_65]
    at sun.nio.ch.SocketDispatcher.read(SocketDispatcher.java:39) [rt.jar:1.7.0_65]
    at sun.nio.ch.IOUtil.readIntoNativeBuffer(IOUtil.java:223) [rt.jar:1.7.0_65]
    at sun.nio.ch.IOUtil.read(IOUtil.java:197) [rt.jar:1.7.0_65]
    at sun.nio.ch.SocketChannelImpl.read(SocketChannelImpl.java:379) [rt.jar:1.7.0_65]
    at org.xnio.nio.AbstractNioStreamChannel.read(AbstractNioStreamChannel.java:249)
    at org.xnio.channels.FramedMessageChannel.receive(FramedMessageChannel.java:87) [xnio-api-3.0.10.GA-redhat-1.jar:3.0.10.GA-redhat-1]
    at org.jboss.remoting3.remote.RemoteReadListener.handleEvent(RemoteReadListener.java:75) [jboss-remoting-3.3.3.Final-redhat-1.jar:3.3.3.Final-redhat-1]
    at org.jboss.remoting3.remote.RemoteReadListener.handleEvent(RemoteReadListener.java:45) [jboss-remoting-3.3.3.Final-redhat-1.jar:3.3.3.Final-redhat-1]
    at org.xnio.ChannelListeners.invokeChannelListener(ChannelListeners.java:72) [xnio-api-3.0.10.GA-redhat-1.jar:3.0.10.GA-redhat-1]
    at org.xnio.channels.TranslatingSuspendableChannel.handleReadable(TranslatingSuspendableChannel.java:189) [xnio-api-3.0.10.GA-redhat-1.jar:3.0.10.GA-redhat-1]
    at org.xnio.channels.TranslatingSuspendableChannel$1.handleEvent(TranslatingSuspendableChannel.java:103) [xnio-api-3.0.10.GA-redhat-1.jar:3.0.10.GA-redhat-1]
    at org.xnio.ChannelListeners.invokeChannelListener(ChannelListeners.java:72) [xnio-api-3.0.10.GA-redhat-1.jar:3.0.10.GA-redhat-1]
    at org.xnio.nio.NioHandle.run(NioHandle.java:90)
    at org.xnio.nio.WorkerThread.run(WorkerThread.java:198)

(The exception message is the Spanish-locale text for "Connection timed out".)

When the network is disconnected, the OS does not close the sockets immediately. It keeps them open until a maximum timeout is reached. This timeout is derived from the kernel parameter tcp_retries2. By default this parameter is set to 15, which corresponds to about 924.6 seconds (15.4 minutes). After that, the OS closes the socket and raises an error (an IOException in Java).

More info: https://www.kernel.org/doc/Documentation/networking/ip-sysctl.txt

When JBoss tries to shut down the endpoint while a connection is still open, the endpoint expects to read something from the socket (RemoteReadListener). That never happens, so the endpoint shutdown blocks until the operating system reports an error and closes the socket.

An example that reproduces the problem with the jboss-remoting library is attached. To run it:

    mvn compile
    mvn exec:java -Dexec.mainClass="example.Test"

To simulate a network disconnection, the following commands can be used:

    ip link set lo down    (disconnection)
    ip link set lo up      (restore the connection)

To change the value of tcp_retries2 on Linux:

    echo 8 > /proc/sys/net/ipv4/tcp_retries2

This also happens with the latest version of jboss-remoting.
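The 924.6-second figure for tcp_retries2 = 15 can be reproduced with a short calculation. The sketch below is only an illustration and rests on assumptions that are not stated in this report: the retransmission timeout starts at 200 ms (Linux TCP_RTO_MIN), doubles after every unacknowledged transmission, and is capped at 120 s (Linux TCP_RTO_MAX). The class and method names are hypothetical. The real timeout depends on the measured round-trip time, so the result is a lower bound rather than an exact value.

```java
public class TcpRetries2Timeout {

    // Assumed Linux defaults; not taken from this report.
    static final long RTO_MIN_MILLIS = 200;      // TCP_RTO_MIN
    static final long RTO_MAX_MILLIS = 120_000;  // TCP_RTO_MAX

    /**
     * Sums the waits for the original transmission plus `retries`
     * retransmissions, doubling the wait each time up to the cap.
     */
    static long hypotheticalTimeoutMillis(int retries) {
        long total = 0;
        long rto = RTO_MIN_MILLIS;
        for (int attempt = 0; attempt <= retries; attempt++) {
            total += rto;
            rto = Math.min(rto * 2, RTO_MAX_MILLIS);
        }
        return total;
    }

    public static void main(String[] args) {
        for (int retries : new int[] {8, 15}) {
            System.out.println("tcp_retries2=" + retries + " -> "
                    + hypotheticalTimeoutMillis(retries) + " ms");
        }
    }
}
```

Under these assumptions the default of 15 gives 924600 ms, which matches the 924.6 seconds quoted above, while the value 8 used in the echo command gives 102200 ms, so the shutdown would block for under two minutes instead of about 15.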
Created attachment 940036: Thread dump during shutdown
More info after the thread dump:

1) The WorkerThread is waiting in Selector::select (locked).
2) When the channel is closed by the OS, the key returned by the select operation is ready for OP_READ.
3) The RemoteReadListener is invoked to handle it.
4) When the listener tries to read from the channel, the read throws an IOException (stack trace attached).
5) The listener removes the key.
6) There are no more keys to process, and the WorkerThread executes its shutdown.
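The sequence above can be observed with plain java.nio, independent of XNIO and jboss-remoting. The following is a minimal sketch, not the code path used by the server: the class and method names are hypothetical, and it closes the peer cleanly over loopback instead of cutting the network. Because of that, the read normally reports end-of-stream (-1) rather than the IOException seen in the stack trace, but the point is the same: a dead connection only becomes visible to the selector thread as an OP_READ event, and nothing happens until that event arrives.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.InetSocketAddress;
import java.nio.ByteBuffer;
import java.nio.channels.SelectionKey;
import java.nio.channels.Selector;
import java.nio.channels.ServerSocketChannel;
import java.nio.channels.SocketChannel;

public class SelectorCloseDemo {

    /** Returns a short description of how the closed connection surfaced. */
    static String observePeerClose() {
        try (Selector selector = Selector.open();
             ServerSocketChannel server = ServerSocketChannel.open()) {
            server.bind(new InetSocketAddress("127.0.0.1", 0)); // ephemeral port
            try (SocketChannel client = SocketChannel.open(server.getLocalAddress());
                 SocketChannel accepted = server.accept()) {
                accepted.configureBlocking(false);
                SelectionKey key = accepted.register(selector, SelectionKey.OP_READ);

                client.close(); // the peer goes away

                // Steps 1-2: the thread waits in select() until the key is ready.
                if (selector.select(5000) == 0) {
                    return "no event within 5 s";
                }
                if (!key.isReadable()) {
                    return "unexpected readiness";
                }

                // Steps 3-4: only the read reveals that the connection is gone.
                String outcome;
                try {
                    int n = accepted.read(ByteBuffer.allocate(16));
                    outcome = (n == -1)
                            ? "OP_READ, then end-of-stream"
                            : "OP_READ, then " + n + " bytes";
                } catch (IOException e) {
                    outcome = "OP_READ, then IOException";
                }

                // Step 5: the key is removed so the selector has nothing left.
                key.cancel();
                return outcome;
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(observePeerClose());
    }
}
```

In the reported scenario no FIN or RST ever reaches the server, so the select call in this sketch would correspond to the 15-minute wait until the kernel gives up on retransmissions.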
Jira: https://issues.jboss.org/browse/WFCORE-112

PR sent and merged:
Master: https://github.com/jboss-remoting/jboss-remoting/pull/24
Pushed to 3.3: https://github.com/jboss-remoting/jboss-remoting/commit/d2de3549fa44a01e19896e233a6fbc20575407bd
Verified with EAP 6.3.3.CP.CR1. (Verification note: nice reproducer in upstream JIRA.)