Bug 1262114 - [GSS](6.4.z) Deadlock when connection is closing while we are writing
Status: CLOSED CURRENTRELEASE
Product: JBoss Enterprise Application Platform 6
Classification: JBoss
Component: Remoting
Version: 6.4.3
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: unspecified
Target Milestone: CR1
Target Release: EAP 6.4.5
Assigned To: Aaron Ogburn
QA Contact: Jitka Kozana
Depends On:
Blocks: 1262449 1264927 1265008 1266518 1278889 1235745 1253482
 
Reported: 2015-09-10 16:14 EDT by Aaron Ogburn
Modified: 2017-01-17 06:43 EST (History)
Doc Type: Bug Fix
Type: Bug

External Trackers:
JBoss Issue Tracker REM3-204 (Priority: Major, Status: Resolved, Last Updated: 2017-11-22 01:57 EST): Deadlock when connection is closing while we are writing

Description Aaron Ogburn 2015-09-10 16:14:08 EDT
Description of problem:

https://issues.jboss.org/browse/REM3-204

When the connection is closed while we are sending content, a deadlock occurs between the monitor of RemoteConnection.RemoteWriteListener.queue and the monitor of the BufferPipeOutputStream. For example:

"Remoting read-1":
	at org.jboss.remoting3.remote.OutboundMessage.cancel(OutboundMessage.java:288)
	- waiting to lock <0xdd0ae4c0> (a org.xnio.streams.BufferPipeOutputStream)
	at org.jboss.remoting3.remote.RemoteConnectionChannel.closeMessages(RemoteConnectionChannel.java:560)
	at org.jboss.remoting3.remote.RemoteConnectionChannel.closeAction(RemoteConnectionChannel.java:542)
	at org.jboss.remoting3.spi.AbstractHandleableCloseable.closeAsync(AbstractHandleableCloseable.java:372)
	at org.jboss.remoting3.remote.RemoteConnectionHandler.closeAllChannels(RemoteConnectionHandler.java:429)
	at org.jboss.remoting3.remote.RemoteConnectionHandler.sendCloseRequest(RemoteConnectionHandler.java:233)
	at org.jboss.remoting3.remote.RemoteConnectionHandler.handleConnectionClose(RemoteConnectionHandler.java:113)
	at org.jboss.remoting3.remote.RemoteReadListener.handleEvent(RemoteReadListener.java:81)
	- locked <0xdd29f670> (a java.util.ArrayDeque)
	at org.jboss.remoting3.remote.RemoteReadListener.handleEvent(RemoteReadListener.java:45)
	at org.xnio.ChannelListeners.invokeChannelListener(ChannelListeners.java:72)
	at org.xnio.channels.TranslatingSuspendableChannel.handleReadable(TranslatingSuspendableChannel.java:189)
	at org.xnio.channels.TranslatingSuspendableChannel$1.handleEvent(TranslatingSuspendableChannel.java:103)
	at org.xnio.ChannelListeners.invokeChannelListener(ChannelListeners.java:72)
	at org.xnio.channels.TranslatingSuspendableChannel.handleReadable(TranslatingSuspendableChannel.java:189)
	at org.xnio.ssl.JsseConnectedSslStreamChannel.handleReadable(JsseConnectedSslStreamChannel.java:183)
	at org.xnio.channels.TranslatingSuspendableChannel$1.handleEvent(TranslatingSuspendableChannel.java:103)
	at org.xnio.ChannelListeners.invokeChannelListener(ChannelListeners.java:72)
	at org.xnio.nio.NioHandle.run(NioHandle.java:90)
	at org.xnio.nio.WorkerThread.run(WorkerThread.java:198)
"Remoting task-2":
	at org.jboss.remoting3.remote.RemoteConnection$RemoteWriteListener.send(RemoteConnection.java:294)
	- waiting to lock <0xdd29f670> (a java.util.ArrayDeque)
	at org.jboss.remoting3.remote.RemoteConnection.send(RemoteConnection.java:122)
	at org.jboss.remoting3.remote.OutboundMessage$1.accept(OutboundMessage.java:154)
	at org.xnio.streams.BufferPipeOutputStream.send(BufferPipeOutputStream.java:126)
	at org.xnio.streams.BufferPipeOutputStream.send(BufferPipeOutputStream.java:114)
	at org.xnio.streams.BufferPipeOutputStream.flush(BufferPipeOutputStream.java:143)
	- locked <0xdd0ae4c0> (a org.xnio.streams.BufferPipeOutputStream)
	at org.xnio.streams.BufferPipeOutputStream.close(BufferPipeOutputStream.java:161)
	- locked <0xdd0ae4c0> (a org.xnio.streams.BufferPipeOutputStream)
	at org.jboss.remoting3.remote.OutboundMessage.close(OutboundMessage.java:283)
	- locked <0xdd0ae4c0> (a org.xnio.streams.BufferPipeOutputStream)
	at org.jboss.as.ejb3.remote.protocol.versionone.ChannelAssociation.releaseChannelMessageOutputStream(ChannelAssociation.java:85)
	at org.jboss.as.ejb3.remote.EJBRemoteConnectorService.sendVersionMessage(EJBRemoteConnectorService.java:184)
	at org.jboss.as.ejb3.remote.EJBRemoteConnectorService.access$000(EJBRemoteConnectorService.java:73)
	at org.jboss.as.ejb3.remote.EJBRemoteConnectorService$ChannelOpenListener.channelOpened(EJBRemoteConnectorService.java:211)
	at org.jboss.remoting3.spi.SpiUtils$ServiceOpenTask.run(SpiUtils.java:126)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
	at java.lang.Thread.run(Thread.java:745)
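
The trace above shows a lock-order inversion: "Remoting read-1" holds the ArrayDeque monitor and waits for the BufferPipeOutputStream monitor, while "Remoting task-2" holds them the other way round. A minimal sketch of that pattern follows. It is not Remoting code: the class name, the two plain lock objects, and the thread names are hypothetical stand-ins, and the latch only forces the unlucky interleaving so the result is repeatable.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadMXBean;
import java.util.concurrent.CountDownLatch;

class LockOrderInversionDemo {

    /** Starts two daemon threads that take two monitors in opposite order
     *  and returns how many threads the JVM reports as monitor-deadlocked. */
    static int countDeadlockedThreads() {
        final Object queueLock = new Object();   // stand-in for the ArrayDeque monitor
        final Object streamLock = new Object();  // stand-in for the BufferPipeOutputStream monitor
        final CountDownLatch bothHoldFirstLock = new CountDownLatch(2);

        // Like "Remoting read-1": queue monitor first, then the stream monitor.
        Thread reader = new Thread(() -> {
            synchronized (queueLock) {
                rendezvous(bothHoldFirstLock);
                synchronized (streamLock) {
                    // never reached once the deadlock forms
                }
            }
        }, "demo-read");

        // Like "Remoting task-2": stream monitor first, then the queue monitor.
        Thread writer = new Thread(() -> {
            synchronized (streamLock) {
                rendezvous(bothHoldFirstLock);
                synchronized (queueLock) {
                    // never reached once the deadlock forms
                }
            }
        }, "demo-task");

        // Daemon threads, so the stuck pair does not keep the JVM alive.
        reader.setDaemon(true);
        writer.setDaemon(true);
        reader.start();
        writer.start();

        // Poll the JVM's own monitor-deadlock detector for up to about 5 seconds.
        ThreadMXBean threads = ManagementFactory.getThreadMXBean();
        for (int attempt = 0; attempt < 100; attempt++) {
            long[] ids = threads.findMonitorDeadlockedThreads();
            if (ids != null) {
                return ids.length;
            }
            sleepQuietly(50);
        }
        return 0;
    }

    // Waits until both threads hold their first monitor.
    private static void rendezvous(CountDownLatch latch) {
        latch.countDown();
        try {
            latch.await();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    private static void sleepQuietly(long millis) {
        try {
            Thread.sleep(millis);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    public static void main(String[] args) {
        System.out.println("deadlocked threads: " + countDeadlockedThreads());
    }
}
```

In the real trace the interleaving depends on timing, which is why the deadlock only appears when a close arrives while a write is in flight.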


Version-Release number of selected component (if applicable):

3.3.5
Comment 4 Enrique Gonzalez Martinez 2015-09-18 05:07:28 EDT
Hi Aaron:

The problem is that when an endpoint receives a CLOSE signal, it tries to send another CLOSE to the other end, which results in this behaviour. IMO the problem is here: sending is asynchronous, and closing the channel is also asynchronous, which causes the race condition.

This is the entry point when the endpoint receives the CLOSE signal
https://github.com/jboss-remoting/jboss-remoting/blob/3.3.5.Final/src/main/java/org/jboss/remoting3/remote/RemoteConnectionHandler.java#L112

This is the entry point when the endpoint sends the CLOSE signal
https://github.com/jboss-remoting/jboss-remoting/blob/3.3.5.Final/src/main/java/org/jboss/remoting3/remote/RemoteConnectionHandler.java#L409

IMO the offender is:
https://github.com/jboss-remoting/jboss-remoting/blob/3.3.5.Final/src/main/java/org/jboss/remoting3/remote/RemoteConnectionHandler.java#L113

That call should be executed only when the endpoint is closed (closeAction), not when it is receiving the CLOSE signal (handling close).
Comment 5 Aaron Ogburn 2015-09-21 16:00:41 EDT
Thinking about Enrique's suggestion, I'm not sure it explains the deadlock, because sendCloseRequest is only called after the connection is flagged as closing in RemoteReadListener.handleEvent.  So anything sent as a result of sendCloseRequest should not actually be sent, because of the closing check added to RemoteConnection$RemoteWriteListener.send.
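
To make the "closing check" argument concrete, here is a minimal sketch of a send method that refuses new messages once a closing flag has been set under the same monitor that guards the queue. This is an assumption about the general shape of such a check, not the actual RemoteConnection$RemoteWriteListener code; the class, field, and method names are hypothetical.

```java
import java.util.ArrayDeque;

class GuardedSendQueue {
    private final ArrayDeque<String> queue = new ArrayDeque<>();
    private boolean closing; // guarded by the queue monitor

    /** Queues a message unless the connection is already flagged as closing. */
    boolean send(String message) {
        synchronized (queue) {
            if (closing) {
                return false; // dropped: nothing is sent after the close flag is set
            }
            queue.add(message);
            return true;
        }
    }

    /** Flags the connection as closing; later sends are rejected. */
    void markClosing() {
        synchronized (queue) {
            closing = true;
        }
    }

    /** Number of messages accepted before the close flag was set. */
    int pending() {
        synchronized (queue) {
            return queue.size();
        }
    }
}
```

Note that the check itself needs the queue monitor, so a sender that already holds another lock still has to wait for whoever currently holds the queue.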

But I've noticed that the possibility of this deadlock was introduced by the changes for the RejectedExecutionException bug:

https://github.com/jboss-remoting/jboss-remoting/commit/97bddc0d16deb421f1dea0baa61aeaeaa7c504b4


We removed altogether the executor that used to handle the message cancels during channel close.  As a result, the RemoteConnectionHandler.closeAllChannels call now blocks, and can deadlock, in OutboundMessage.cancel on remoting 3.3.5.Final.

Thus, this deadlock is avoidable on EAP 6.4.1 and earlier for now.
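
The effect of handing the cancel off to an executor can be sketched as follows. This is an illustration under stated assumptions, not the Remoting implementation: the class name, the lock objects, and the single-thread executor are hypothetical, and the sketch does not deal with the RejectedExecutionException case that motivated removing the executor. The closing thread holds the queue monitor, but instead of taking the stream monitor inline it submits the cancel as a task, so it can release the queue monitor and let the writer finish.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

class DeferredCancelDemo {

    /** Returns true if both the write and the deferred cancel complete. */
    static boolean closeWithDeferredCancel() {
        final Object queueLock = new Object();   // stand-in for the write queue monitor
        final Object streamLock = new Object();  // stand-in for the output stream monitor
        final CountDownLatch bothHoldFirstLock = new CountDownLatch(2);
        final CountDownLatch written = new CountDownLatch(1);
        final CountDownLatch cancelled = new CountDownLatch(1);

        // Daemon worker so a stuck task could never keep the JVM alive.
        final ExecutorService executor = Executors.newSingleThreadExecutor(r -> {
            Thread t = new Thread(r, "demo-cancel");
            t.setDaemon(true);
            return t;
        });

        // Writer: holds the stream monitor, then needs the queue monitor.
        Thread writer = new Thread(() -> {
            synchronized (streamLock) {
                rendezvous(bothHoldFirstLock);
                synchronized (queueLock) {
                    written.countDown();
                }
            }
        }, "demo-task");

        // Closer: holds the queue monitor, but defers the cancel, which
        // needs the stream monitor, to the executor instead of blocking.
        Thread closer = new Thread(() -> {
            synchronized (queueLock) {
                rendezvous(bothHoldFirstLock);
                executor.execute(() -> {
                    synchronized (streamLock) {
                        cancelled.countDown();
                    }
                });
            }
        }, "demo-read");

        writer.setDaemon(true);
        closer.setDaemon(true);
        writer.start();
        closer.start();

        try {
            return written.await(5, TimeUnit.SECONDS)
                    && cancelled.await(5, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        } finally {
            executor.shutdownNow();
        }
    }

    // Waits until both threads hold their first monitor.
    private static void rendezvous(CountDownLatch latch) {
        latch.countDown();
        try {
            latch.await();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    public static void main(String[] args) {
        System.out.println("completed without deadlock: " + closeWithDeferredCancel());
    }
}
```

If the closer instead synchronized on streamLock inline, while still holding queueLock, the two threads would block each other in the way the thread dump in the description shows.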
Comment 7 Aaron Ogburn 2015-09-21 17:44:41 EDT
3.3 PR to revert prior changes:

https://github.com/jboss-remoting/jboss-remoting/pull/46
Comment 8 Aaron Ogburn 2015-09-21 18:12:51 EDT
Commit that also avoids the RejectedExecutionExceptions, but without removing the executor and so without introducing these deadlocks:

https://github.com/jboss-remoting/jboss-remoting/commit/61c32c01c7b9f893a50842f08d3ebe9d3ef81797
Comment 9 Enrique Gonzalez Martinez 2015-09-22 03:22:45 EDT
Hi Aaron:

If your fix for this issue is to revert the earlier changes, the revert is already in:

PR 3.3: https://github.com/jboss-remoting/jboss-remoting/pull/46
Upstream: not required.

Your comment https://bugzilla.redhat.com/show_bug.cgi?id=1262114#c8 is related to another BZ (https://bugzilla.redhat.com/show_bug.cgi?id=1238420)
Comment 11 Richard Janík 2015-11-05 10:09:46 EST
Verified.

For the record, I used the reproducer from https://bugzilla.redhat.com/show_bug.cgi?id=1264927 and tested both with the reproducer.btm provided there and with a reproducer.btm containing the following rule:

RULE trigger deadlock two
CLASS org.jboss.remoting3.remote.OutboundMessage
METHOD cancel
AT ENTRY
IF TRUE
DO Thread.sleep(3000)
ENDRULE
Comment 12 Petr Penicka 2017-01-17 06:43:32 EST
Retroactively bulk-closing issues from released EAP 6.4 cumulative patches.
Comment 13 Petr Penicka 2017-01-17 06:43:37 EST
Retroactively bulk-closing issues from released EAP 6.4 cumulative patches.
