Bug 1268185
Summary: | [GSS](6.4.z) Custom socket factory for JGroups subsystem not set correctly
---|---
Product: | [JBoss] JBoss Enterprise Application Platform 6
Component: | Clustering
Status: | CLOSED CURRENTRELEASE
Severity: | high
Priority: | high
Version: | 6.4.3
Target Milestone: | CR1
Target Release: | EAP 6.4.5
Hardware: | Unspecified
OS: | Unspecified
Reporter: | dereed
Assignee: | Romain Pelisse <rpelisse>
QA Contact: | Ladislav Thon <lthon>
CC: | bmaxwell, brian.stansberry, cdewolf, dereed, egonzale, lthon, paul.ferraro, philfest, rpelisse, rsvoboda, smatasar, vtunka
Doc Type: | Bug Fix
Type: | Bug
Bug Blocks: | 1235745, 1276471
Description
dereed
2015-10-02 03:35:07 UTC
This can be fixed either by just removing the custom socket factory (which is not currently used anyway, except if the race condition triggers), or by setting it in the correct place. If it's fixed by setting it in the correct place, BZ 1268186 is also required.

PR sent upstream (still not merged):
https://github.com/wildfly/wildfly-core/pull/1145
https://github.com/wildfly/wildfly/pull/8227

Update on this issue: the PR in WildFly Core (upstream) is already merged:
https://github.com/wildfly/wildfly-core/pull/1145
The WildFly PR was not merged but closed:
https://github.com/wildfly/wildfly/pull/8227

Related upstream PRs:
https://github.com/wildfly/wildfly/pull/8297
https://github.com/wildfly/wildfly-core/pull/1145
https://github.com/wildfly/wildfly-core/pull/1181

Hi Paul, could you clarify BZ 1268186? Does it need to be fixed?

This commit will be reverted upstream once the wildfly-core version containing the fixes is upgraded in WildFly.
https://github.com/wildfly/wildfly/commit/36f5bd99c8893a75ae0338708a5bd41263676060

(In reply to dereed from comment #11)
> https://github.com/wildfly/wildfly/pull/8297 also fixes BZ 1268186.

More specifically, the calls where BZ 1268186 was being triggered in this use case are removed in PR 8297.

*** Bug 1268186 has been marked as a duplicate of this bug. ***

https://github.com/jbossas/jboss-eap/pull/2599 (backport of the 3 PRs listed in #7)

Using the default UDP stack and the default TCP stack in the default standalone-ha.xml (with a single trivial change in the TCP case), I've verified that with this fix, a majority of sockets are created via the ManagedSocketFactory. However, the FD_SOCK protocol still creates its sockets via JGroups' own DefaultSocketFactory. I will attach a Byteman script that shows this. How do we want to proceed?

Created attachment 1088972 [details]
Byteman script to show ManagedSocketFactory vs. DefaultSocketFactory usage
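
The symptom described above (most sockets come from the custom factory, one protocol still uses the default one) can arise whenever a component takes its factory reference before the custom factory is installed. The following is a minimal sketch of that ordering effect only. All class and method names in it (SocketSource, Transport, LookupProtocol, CachingProtocol, FactoryOrderingDemo) are hypothetical stand-ins and are not the JGroups or EAP classes; whether FD_SOCK behaves this way internally is not established in this bug.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-ins for illustration only; these are NOT the JGroups or EAP classes.
interface SocketSource {
    String name();
}

class Transport {
    // Starts with a default source, like a stack that has not been customized yet.
    private SocketSource source = () -> "default";

    void setSource(SocketSource s) { this.source = s; }
    SocketSource getSource() { return source; }
}

// Asks the transport for the source every time it opens a socket.
class LookupProtocol {
    private final Transport transport;
    LookupProtocol(Transport transport) { this.transport = transport; }
    String open() { return transport.getSource().name(); }
}

// Copies the source once, at construction time, and never looks again.
class CachingProtocol {
    private final SocketSource cached;
    CachingProtocol(Transport transport) { this.cached = transport.getSource(); }
    String open() { return cached.name(); }
}

class FactoryOrderingDemo {
    // Builds the protocols first and installs the custom source afterwards.
    static List<String> run() {
        Transport transport = new Transport();
        LookupProtocol lookup = new LookupProtocol(transport);
        CachingProtocol caching = new CachingProtocol(transport);
        transport.setSource(() -> "managed");

        List<String> used = new ArrayList<>();
        used.add(lookup.open());
        used.add(caching.open());
        return used;
    }

    public static void main(String[] args) {
        System.out.println(run());
    }
}
```

In this toy model the protocol that looks the source up on each use reports "managed", while the one that cached it at construction still reports "default". Installing the custom source before the protocols are built would make both report "managed", which is the general idea behind "setting it in the correct place".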
I just tried with WildFly 10.0.0.CR4 and it works perfectly there; even the FD_SOCK sockets are created via the ManagedSocketFactory. This means that the backport is wrong.

Created attachment 1089013 [details]
Node1(UDP) log
Created attachment 1089014 [details]
Node2(UDP) log
Created attachment 1089015 [details]
Node1(TCP) log
Created attachment 1089016 [details]
Node2(TCP) log
I've attached logs of both servers of a 2-node cluster (both with the default UDP stack and the default TCP stack), running with the attached Byteman script, i.e., they show the stack traces. Look for the string "!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!! JGroups default".

The shared attribute (transport tag) has a different default value in WildFly and in EAP.
In WildFly it is false, so the transport is not a singleton:
https://github.com/wildfly/wildfly/blob/master/clustering/jgroups/extension/src/main/java/org/jboss/as/clustering/jgroups/subsystem/TransportResourceDefinition.java#L121
In EAP it is true, so the transport is a singleton (point 2 in comment 28 is executed):
https://github.com/jbossas/jboss-eap/blob/6.x/clustering/jgroups/src/main/java/org/jboss/as/clustering/jgroups/subsystem/TransportResource.java#L65
This is the reason why it is not possible to reproduce the issue upstream with default values. I'm reopening the issue upstream as well.

Enrique González Martínez <elguardian> updated the status of jira WFLY-5449 to Reopened

Dennis Reed <dereed> updated the status of jira WFLY-5449 to Resolved

I've confirmed Enrique's findings in #29. The issue with FD_SOCK's socket factory is a bug in JGroups related to singletons. It's a separate issue from this BZ -- EAP is setting the factory correctly now, and it's no longer breaking when used, but JGroups isn't using it for FD_SOCK.
Upstream for the FD_SOCK issue: https://issues.jboss.org/browse/JGRP-1974
It's a separate bug in a different component, and shouldn't block this BZ.

OK, makes sense.

Verified with EAP 6.4.5.CP.CR1.

Retroactively bulk-closing issues from released EAP 6.4 cumulative patches.