Bug 1139197 - [QE] (6.3.z) Long server shut-down with unresponsive client with opened JNDI Context
Summary: [QE] (6.3.z) Long server shut-down with unresponsive client with opened JNDI ...
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: JBoss Enterprise Application Platform 6
Classification: JBoss
Component: Remoting
Version: 6.3.1
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: medium
Target Milestone: CR1
Target Release: EAP 6.3.3
Assignee: Enrique Gonzalez Martinez
QA Contact: Jitka Kozana
URL:
Whiteboard:
Depends On: 1139202
Blocks: 1149621 eap633-payload
Reported: 2014-09-08 11:19 UTC by Miroslav Novak
Modified: 2019-08-19 12:39 UTC (History)

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Clones: 1139202
Environment:
Last Closed: 2019-08-19 12:39:20 UTC
Type: Bug
Embargoed:


Attachments
JNDIContext.java (1.84 KB, text/plain)
2014-09-08 11:21 UTC, Miroslav Novak
Jboss-remoting test case (2.82 KB, application/octet-stream)
2014-09-16 15:01 UTC, Enrique Gonzalez Martinez
Thread dump during shutdown (2.90 KB, application/x-gzip)
2014-09-22 14:09 UTC, Enrique Gonzalez Martinez


Links
System ID Private Priority Status Summary Last Updated
Red Hat Bugzilla 1132189 1 None None None 2021-01-20 06:05:38 UTC
Red Hat Issue Tracker REM3-194 0 Major Resolved Long server shut-dow with unresponsive client with opened JNDI Context 2017-05-19 06:20:44 UTC

Internal Links: 1132189

Description Miroslav Novak 2014-09-08 11:19:01 UTC
Description of problem:
If a client with an open JNDI context is disconnected from the network, a clean shutdown (Ctrl-C) of the server takes 15 minutes.
This scenario occurs when the network connection is lost between the server and JMS clients that hold a JNDI context.

Version-Release number of selected component (if applicable):
jboss-remoting-3.3.3.Final-redhat-1.jar

How reproducible:
always

Steps to Reproduce:
1. Start EAP 6.3.1.CP.CR1 on the first machine
2. Start a client that creates a JNDI context on the second machine (use the attached JNDIContext.java)
3. Disconnect the network between the client and the server
4. Try to cleanly shut down the EAP 6.3.1.CP.CR1 server (with Ctrl-C)

Actual results:
It takes 15 minutes for the server to shut down.

Expected results:
The server should shut down almost immediately.

Comment 1 Miroslav Novak 2014-09-08 11:21:13 UTC
Created attachment 935307 [details]
JNDIContext.java

Comment 2 Enrique Gonzalez Martinez 2014-09-16 15:01:13 UTC
Created attachment 938062 [details]
Jboss-remoting test case

Comment 3 Enrique Gonzalez Martinez 2014-09-16 15:21:40 UTC
stack trace
19:03:52,217 TRACE [org.jboss.remoting.remote.connection] (Remoting "localhost" read-1) Connection error detail: java.io.IOException: Expiró el tiempo de conexión
        at sun.nio.ch.FileDispatcherImpl.read0(Native Method) [rt.jar:1.7.0_65]
        at sun.nio.ch.SocketDispatcher.read(SocketDispatcher.java:39) [rt.jar:1.7.0_65]
        at sun.nio.ch.IOUtil.readIntoNativeBuffer(IOUtil.java:223) [rt.jar:1.7.0_65]
        at sun.nio.ch.IOUtil.read(IOUtil.java:197) [rt.jar:1.7.0_65]
        at sun.nio.ch.SocketChannelImpl.read(SocketChannelImpl.java:379) [rt.jar:1.7.0_65]
        at org.xnio.nio.AbstractNioStreamChannel.read(AbstractNioStreamChannel.java:249)
        at org.xnio.channels.FramedMessageChannel.receive(FramedMessageChannel.java:87) [xnio-api-3.0.10.GA-redhat-1.jar:3.0.10.GA-redhat-1]
        at org.jboss.remoting3.remote.RemoteReadListener.handleEvent(RemoteReadListener.java:75) [jboss-remoting-3.3.3.Final-redhat-1.jar:3.3.3.Final-redhat-1]
        at org.jboss.remoting3.remote.RemoteReadListener.handleEvent(RemoteReadListener.java:45) [jboss-remoting-3.3.3.Final-redhat-1.jar:3.3.3.Final-redhat-1]
        at org.xnio.ChannelListeners.invokeChannelListener(ChannelListeners.java:72) [xnio-api-3.0.10.GA-redhat-1.jar:3.0.10.GA-redhat-1]
        at org.xnio.channels.TranslatingSuspendableChannel.handleReadable(TranslatingSuspendableChannel.java:189) [xnio-api-3.0.10.GA-redhat-1.jar:3.0.10.GA-redhat-1]
        at org.xnio.channels.TranslatingSuspendableChannel$1.handleEvent(TranslatingSuspendableChannel.java:103) [xnio-api-3.0.10.GA-redhat-1.jar:3.0.10.GA-redhat-1]
        at org.xnio.ChannelListeners.invokeChannelListener(ChannelListeners.java:72) [xnio-api-3.0.10.GA-redhat-1.jar:3.0.10.GA-redhat-1]
        at org.xnio.nio.NioHandle.run(NioHandle.java:90)
        at org.xnio.nio.WorkerThread.run(WorkerThread.java:198)

When the network is disconnected, the OS does not close the sockets immediately. It keeps them open until a timeout is reached. This timeout is derived from the kernel setting tcp_retries2.
By default this setting is 15, which corresponds to about 924.6 seconds (15.4 minutes). After that, the OS closes the socket and raises an error (an IOException in Java).

More info: https://www.kernel.org/doc/Documentation/networking/ip-sysctl.txt
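
As a rough check of the 924.6-second figure, the sketch below sums an exponentially growing retransmission timeout. It assumes the usual Linux bounds of 200 ms (TCP_RTO_MIN) and 120 s (TCP_RTO_MAX), which are not stated in this report, and the class and method names are made up for illustration. The real timeout also depends on the measured round-trip time, so treat the result as a hypothetical upper-bound estimate rather than an exact value.

```java
class TcpRetries2Timeout {
    // Assumed Linux bounds for the retransmission timeout (not taken from this report).
    static final long RTO_MIN_MILLIS = 200;       // TCP_RTO_MIN
    static final long RTO_MAX_MILLIS = 120_000;   // TCP_RTO_MAX

    /**
     * Sums the first timeout plus one timeout per retry, doubling each time
     * and capping at RTO_MAX_MILLIS.
     */
    public static long hypotheticalTimeoutMillis(int retries) {
        long total = 0;
        long rto = RTO_MIN_MILLIS;
        for (int i = 0; i <= retries; i++) {
            total += rto;
            rto = Math.min(rto * 2, RTO_MAX_MILLIS);
        }
        return total;
    }

    public static void main(String[] args) {
        for (int retries : new int[] {8, 15}) {
            System.out.println("tcp_retries2=" + retries + " -> "
                    + hypotheticalTimeoutMillis(retries) + " ms");
        }
    }
}
```

Under these assumptions the default of 15 gives 924600 ms, matching the figure above, while a value of 8 gives 102200 ms, which is why lowering tcp_retries2 shortens the shutdown delay.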

When JBoss tries to shut down the endpoint while a connection is still open, the endpoint expects to read something from the socket (RemoteReadListener). Nothing arrives, so the endpoint shutdown stays blocked until the operating system reports an error and closes the socket.


Attached is an example that reproduces the problem with the jboss-remoting library:
mvn compile
mvn exec:java -Dexec.mainClass="example.Test"

To simulate a network disconnection, the following commands can be used:
ip link set lo down (disconnection)
ip link set lo up (restore the connection)

To change the value of tcp_retries2 on Linux:
echo 8 > /proc/sys/net/ipv4/tcp_retries2

This also happens with the latest version of jboss-remoting.

Comment 4 Enrique Gonzalez Martinez 2014-09-22 14:09:08 UTC
Created attachment 940036 [details]
Thread dump during shutdown

Comment 5 Enrique Gonzalez Martinez 2014-09-23 08:39:03 UTC
More information from the thread dump:

1) The WorkerThread is waiting in Selector::select (locked).
2) When the channel is closed by the OS, the key returned by the select operation is ready for OP_READ.
3) The RemoteReadListener is invoked to handle it.
4) When the listener tries to read from the channel, an IOException is thrown (see the stack trace above).
5) The listener removes the key.
6) There are no more keys to process, and the WorkerThread completes its shutdown.
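
The sequence above can be imitated with plain java.nio. This is only a sketch, not the XNIO or jboss-remoting code: the class and method names are hypothetical, and a connection reset forced with SO_LINGER=0 stands in for the OS timing out the socket. The worker loop keeps selecting while keys are registered, gets a readable key when the connection fails, drops the key on a failed read, and then exits.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.InetAddress;
import java.net.InetSocketAddress;
import java.net.StandardSocketOptions;
import java.nio.ByteBuffer;
import java.nio.channels.SelectionKey;
import java.nio.channels.Selector;
import java.nio.channels.ServerSocketChannel;
import java.nio.channels.SocketChannel;
import java.util.Iterator;

class SelectorCloseSketch {
    /** Runs the loop and returns how many keys are still registered at the end. */
    public static int run() {
        try (Selector selector = Selector.open();
             ServerSocketChannel server = ServerSocketChannel.open()) {
            server.bind(new InetSocketAddress(InetAddress.getLoopbackAddress(), 0));
            SocketChannel client = SocketChannel.open(server.getLocalAddress());
            try (SocketChannel accepted = server.accept()) {
                accepted.configureBlocking(false);
                accepted.register(selector, SelectionKey.OP_READ);

                // Stand-in for the OS giving up on the connection:
                // with SO_LINGER=0, close() aborts the connection with a reset.
                client.setOption(StandardSocketOptions.SO_LINGER, 0);
                client.close();

                // Steps 1-6: wait in select, handle the readable key, stop when no keys remain.
                while (!selector.keys().isEmpty()) {
                    if (selector.select(5000) == 0) {
                        break; // nothing happened within 5 s; give up instead of hanging
                    }
                    Iterator<SelectionKey> it = selector.selectedKeys().iterator();
                    while (it.hasNext()) {
                        SelectionKey key = it.next();
                        it.remove();
                        if (key.isValid() && key.isReadable()) {
                            handleRead(key);
                        }
                    }
                    selector.selectNow(); // deregisters keys cancelled above
                }
                return selector.keys().size();
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Tries to read; on end-of-stream or IOException, removes the key and closes the channel. */
    static void handleRead(SelectionKey key) {
        SocketChannel channel = (SocketChannel) key.channel();
        try {
            if (channel.read(ByteBuffer.allocate(64)) >= 0) {
                return; // data (or nothing yet): keep the key registered
            }
        } catch (IOException e) {
            // The OS reported the failure, e.g. a reset or a connection timeout.
        }
        key.cancel();
        try {
            channel.close();
        } catch (IOException ignored) {
            // Nothing more to do for a channel that is already broken.
        }
    }

    public static void main(String[] args) {
        System.out.println("keys left after failure: " + run());
    }
}
```

The point of the sketch is that the loop can only finish once the failed read arrives, which in the reported scenario does not happen until the OS times out the socket.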

Comment 9 Ladislav Thon 2015-01-20 10:32:42 UTC
Verified with EAP 6.3.3.CP.CR1.

(Verification note: nice reproducer in upstream JIRA.)

