Bug 1268185 - [GSS](6.4.z) Custom socket factory for JGroups subsystem not set correctly
Status: CLOSED CURRENTRELEASE
Product: JBoss Enterprise Application Platform 6
Classification: JBoss
Component: Clustering
Version: 6.4.3
Hardware: Unspecified
OS: Unspecified
Priority: high
Severity: high
Target Milestone: CR1
Target Release: EAP 6.4.5
Assigned To: Romain Pelisse
QA Contact: Ladislav Thon
Duplicates: 1268186
Depends On:
Blocks: 1276471 1235745
Reported: 2015-10-01 23:35 EDT by dereed
Modified: 2017-01-17 06:46 EST (History)

Doc Type: Bug Fix
Type: Bug


Attachments
Byteman script to show ManagedSocketFactory vs. DefaultSocketFactory usage (1.89 KB, text/plain), 2015-11-03 08:17 EST, Ladislav Thon
Node1(UDP) log (104.86 KB, text/plain), 2015-11-03 09:18 EST, Ladislav Thon
Node2(UDP) log (104.72 KB, text/plain), 2015-11-03 09:18 EST, Ladislav Thon
Node1(TCP) log (101.46 KB, text/plain), 2015-11-03 09:19 EST, Ladislav Thon
Node2(TCP) log (98.15 KB, text/plain), 2015-11-03 09:19 EST, Ladislav Thon


External Trackers
Tracker | ID | Priority | Status | Summary | Last Updated
JBoss Issue Tracker | WFCORE-1033 | Critical | Resolved | ManagedDatagramSocketBinding and ManagedMulticastSocketBinding throw NPE if created with bind address | 2017-03-22 07:58 EDT
JBoss Issue Tracker | WFCORE-1063 | Major | Resolved | Creating named datagram socket throws ISE | 2017-03-22 07:58 EDT
JBoss Issue Tracker | WFCORE-1064 | Major | Resolved | Creating named multicast socket doesn't properly register socket using name | 2017-03-22 07:58 EDT
JBoss Issue Tracker | WFLY-5449 | Blocker | Resolved | Custom socket factory for JGroups subsystem not set correctly | 2017-03-22 07:58 EDT

Description dereed 2015-10-01 23:35:07 EDT
EAP's JChannelFactory tries to set a custom socket factory on the JGroups transport.

This is not the correct API to use, and it gets overwritten when the JGroups channel starts.
A custom socket factory should be set on the JChannel.

The only time the custom socket factory is currently used is if there's a race condition where two channels are started at the same time, and the custom factory is set just before the other channel uses it.
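A minimal sketch of the overwrite described above. It uses hypothetical stand-in classes (Transport, Channel, NamedFactory), not the real JGroups API, and it assumes that starting the channel pushes the channel's own factory down to the transport; the report only states that the transport-level setting "gets overwritten when the JGroups channel starts".

```java
public class SocketFactoryOverwriteSketch {

    /** Stand-in for a socket factory; only its name matters in this sketch. */
    interface SocketFactory {
        String name();
    }

    static final class NamedFactory implements SocketFactory {
        private final String name;

        NamedFactory(String name) {
            this.name = name;
        }

        @Override
        public String name() {
            return name;
        }
    }

    /** Hypothetical transport: holds the factory actually used to create sockets. */
    static final class Transport {
        private SocketFactory factory = new NamedFactory("default");

        void setSocketFactory(SocketFactory factory) {
            this.factory = factory;
        }

        SocketFactory getSocketFactory() {
            return factory;
        }
    }

    /** Hypothetical channel: ASSUMED to push its own factory to the transport on start. */
    static final class Channel {
        private final Transport transport;
        private SocketFactory factory = new NamedFactory("default");

        Channel(Transport transport) {
            this.transport = transport;
        }

        void setSocketFactory(SocketFactory factory) {
            this.factory = factory;
        }

        void start() {
            // This is the step that discards anything set directly on the transport.
            transport.setSocketFactory(factory);
        }
    }

    /** Problematic pattern: custom factory set on the transport, then the channel starts. */
    static String setOnTransport() {
        Transport transport = new Transport();
        Channel channel = new Channel(transport);
        transport.setSocketFactory(new NamedFactory("managed"));
        channel.start();
        return transport.getSocketFactory().name();
    }

    /** Suggested pattern: custom factory set on the channel, which propagates it on start. */
    static String setOnChannel() {
        Transport transport = new Transport();
        Channel channel = new Channel(transport);
        channel.setSocketFactory(new NamedFactory("managed"));
        channel.start();
        return transport.getSocketFactory().name();
    }

    public static void main(String[] args) {
        System.out.println("set on transport -> " + setOnTransport()); // default
        System.out.println("set on channel   -> " + setOnChannel());   // managed
    }
}
```

In this model the factory set on the transport survives only until start() runs, which matches the observation that the custom factory is effectively never used outside the race condition.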
Comment 1 dereed 2015-10-01 23:42:11 EDT
This can be fixed either by just removing the custom socket factory (which is not currently used anyways except if the race condition triggers), or by setting it in the correct place.

If it's fixed by setting it in the correct place, BZ 1268186 is also required.
Comment 2 Enrique Gonzalez Martinez 2015-10-06 02:49:40 EDT
PRs sent upstream (not yet merged):

https://github.com/wildfly/wildfly-core/pull/1145
https://github.com/wildfly/wildfly/pull/8227
Comment 6 Enrique Gonzalez Martinez 2015-10-20 07:06:45 EDT
Update on this issue:

The PR in wildfly-core (upstream) has been merged:
https://github.com/wildfly/wildfly-core/pull/1145

The WildFly PR was closed without being merged:
https://github.com/wildfly/wildfly/pull/8227
Comment 9 Enrique Gonzalez Martinez 2015-10-21 07:17:52 EDT
Hi Paul, could you clarify whether BZ 1268186 needs to be fixed?
Comment 10 Enrique Gonzalez Martinez 2015-10-22 09:42:11 EDT
This commit will be reverted upstream once WildFly is upgraded to a wildfly-core version containing the fixes.

https://github.com/wildfly/wildfly/commit/36f5bd99c8893a75ae0338708a5bd41263676060
Comment 11 dereed 2015-10-22 22:22:09 EDT
https://github.com/wildfly/wildfly/pull/8297 also fixes BZ 1268186.
Comment 12 dereed 2015-10-22 22:25:24 EDT
(In reply to dereed from comment #11)
> https://github.com/wildfly/wildfly/pull/8297 also fixes BZ 1268186.

More specifically, the calls where BZ 1268186 was being triggered in this use case are removed in PR 8297.
Comment 13 Enrique Gonzalez Martinez 2015-10-23 02:40:31 EDT
*** Bug 1268186 has been marked as a duplicate of this bug. ***
Comment 15 dereed 2015-10-23 22:30:07 EDT
https://github.com/jbossas/jboss-eap/pull/2599
(backport of the 3 PRs listed in #7)
Comment 17 Ladislav Thon 2015-11-03 08:15:47 EST
Using the default UDP stack and the default TCP stack in the default standalone-ha.xml (with a single trivial change in the TCP case), I've verified that with this fix, a majority of sockets are created via the ManagedSocketFactory.

However, the FD_SOCK protocol still creates its sockets via the JGroups' own DefaultSocketFactory. I will attach a Byteman script that shows this.

How do we want to proceed?
Comment 19 Ladislav Thon 2015-11-03 08:17 EST
Created attachment 1088972
Byteman script to show ManagedSocketFactory vs. DefaultSocketFactory usage
Comment 20 Ladislav Thon 2015-11-03 08:37:13 EST
I just tried with WildFly 10.0.0.CR4 and it works perfectly there, even the FD_SOCK sockets are created via the ManagedSocketFactory. This means that the backport is wrong.
Comment 23 Ladislav Thon 2015-11-03 09:18 EST
Created attachment 1089013
Node1(UDP) log
Comment 24 Ladislav Thon 2015-11-03 09:18 EST
Created attachment 1089014
Node2(UDP) log
Comment 25 Ladislav Thon 2015-11-03 09:19 EST
Created attachment 1089015
Node1(TCP) log
Comment 26 Ladislav Thon 2015-11-03 09:19 EST
Created attachment 1089016
Node2(TCP) log
Comment 27 Ladislav Thon 2015-11-03 09:21:56 EST
I've attached logs from both servers of a 2-node cluster (with the default UDP stack and with the default TCP stack), running with the attached Byteman script, i.e. they show the stack traces. Look for the string "!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!! JGroups default".
Comment 29 Enrique Gonzalez Martinez 2015-11-04 06:58:32 EST
The shared attribute (on the transport tag) has a different default value in the two code bases:

In WildFly it defaults to false, so the transport is not a singleton:
https://github.com/wildfly/wildfly/blob/master/clustering/jgroups/extension/src/main/java/org/jboss/as/clustering/jgroups/subsystem/TransportResourceDefinition.java#L121

In EAP it defaults to true, so the transport is a singleton (point 2 in comment 28 is executed):
https://github.com/jbossas/jboss-eap/blob/6.x/clustering/jgroups/src/main/java/org/jboss/as/clustering/jgroups/subsystem/TransportResource.java#L65

This is why the issue cannot be reproduced upstream with the default values.

I'm reopening the issue upstream as well.
Comment 30 JBoss JIRA Server 2015-11-04 06:59:29 EST
Enrique González Martínez <elguardian@gmail.com> updated the status of jira WFLY-5449 to Reopened
Comment 31 JBoss JIRA Server 2015-11-04 11:10:01 EST
Dennis Reed <dereed@redhat.com> updated the status of jira WFLY-5449 to Resolved
Comment 32 dereed 2015-11-04 11:13:16 EST
I've confirmed Enrique's findings in #29.
The issue with FD_SOCK's socket factory is a bug in JGroups related to singletons.
It's a separate issue from this BZ -- EAP is setting the factory correctly now, and it's no longer breaking when used, but JGroups isn't using it for FD_SOCK.
Comment 33 dereed 2015-11-04 11:39:00 EST
Upstream for the FD_SOCK issue:
https://issues.jboss.org/browse/JGRP-1974

It's a separate bug in a different component, and shouldn't block this BZ.
Comment 34 Ladislav Thon 2015-11-05 02:25:29 EST
OK, makes sense.

Verified with EAP 6.4.5.CP.CR1.
Comment 40 Petr Penicka 2017-01-17 06:46:34 EST
Retroactively bulk-closing issues from released EAP 6.4 cumulative patches.
Comment 41 Petr Penicka 2017-01-17 06:46:38 EST
Retroactively bulk-closing issues from released EAP 6.4 cumulative patches.
