1340342 – [GSS](6.4.z) Race condition in ServiceProviderRegistryService

Summary: [GSS](6.4.z) Race condition in ServiceProviderRegistryService

Keywords:
Status:	CLOSED CURRENTRELEASE
Alias:	None
Product:	JBoss Enterprise Application Platform 6
Classification:	JBoss
Component:	Clustering
Sub Component:
Version:	6.4.7
Hardware:	Unspecified
OS:	Unspecified
Priority:	unspecified
Severity:	unspecified
Target Milestone:	CR1
Target Release:	EAP 6.4.10
Assignee:	Fedor Gavrilov
QA Contact:	Michal Vinkler
Docs Contact:
URL:
Whiteboard:
Depends On:
Blocks:	eap6410-payload 1350064

Reported:	2016-05-27 06:06 UTC by dereed
Modified:	2017-01-17 12:56 UTC
CC List:	8 users
Fixed In Version:
Clone Of:
Environment:
Last Closed:	2017-01-17 12:56:40 UTC
Type:	Bug
Embargoed:

Attachments
singleton.btm (1.30 KB, text/plain) 2016-05-27 06:10 UTC, dereed	no flags

Description dereed 2016-05-27 06:06:30 UTC

org.jboss.as.clustering.service.ServiceProviderRegistryService is a multi-threaded class, but does not have any thread synchronization.

In particular, there is a race condition between local calls to "register" and remotely triggered calls to "modified".

This can result in the following order:
- ThreadA: "register" reads the current cache keySet
- ThreadB: "modified" call arrives for a new node started at the same time
- ThreadB: "modified" calls "notifyListeners" with the new (correct) list
- ThreadA: "register" calls "notifyListeners" with the old (wrong) list

Result: the listener receives the stale data last and keeps it.
For example, for singletons this can produce different election results on different nodes, leaving the cluster with multiple singletons or none.
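
A minimal sketch of one way to close this kind of race: take the snapshot of the keys and deliver the notification inside the same critical section, so a stale list can never be delivered after a newer one. Everything below is hypothetical and only illustrates the idea. SketchRegistry, its fields, and the method bodies are assumptions, not the ServiceProviderRegistryService source and not necessarily the fix that shipped.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;
import java.util.function.Consumer;

// Hypothetical stand-in for the registry; NOT the actual EAP class.
class SketchRegistry {
    // Stands in for the cache keySet (assumption: keys are node names).
    private final Set<String> cacheKeys = new TreeSet<>();
    private final Object lock = new Object();
    private final Consumer<List<String>> listener;

    SketchRegistry(Consumer<List<String>> listener) {
        this.listener = listener;
    }

    // Local registration: update, snapshot, and notify as one atomic step.
    void register(String localNode) {
        synchronized (lock) {
            cacheKeys.add(localNode);
            notifyListeners();
        }
    }

    // Remotely triggered change: guarded by the same lock as register().
    void modified(String remoteNode) {
        synchronized (lock) {
            cacheKeys.add(remoteNode);
            notifyListeners();
        }
    }

    // Must be called with the lock held, so the snapshot cannot go stale
    // between being read and being delivered.
    private void notifyListeners() {
        listener.accept(new ArrayList<>(cacheKeys));
    }
}
```

With this shape, whichever of the two calls runs second always delivers a list containing both nodes, which matches the requirement that the last call has the correct entries. The trade-off is that the listener runs while the lock is held, so a listener that blocks or calls back into the registry from another thread could stall or deadlock. A real fix would need to take that into account.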

Comment 1 dereed 2016-05-27 06:09:56 UTC

Reproduction steps:
Install attached singleton.btm in node1 of a two-node cluster.
Start node1.
(the byteman script pauses the register() call for 20 seconds)
A few seconds later start node2.

Result: the calls are in the wrong order, with the older data last
INFO  [stdout] (notification-thread-0) XXX SingletonService.election candidates [jboss1/singleton, jboss2/singleton]
...
INFO  [stdout] (ServerService Thread Pool -- 55) XXX SingletonService.election candidates [jboss1/singleton]

Expected result: the calls are in the correct order
INFO  [stdout] (ServerService Thread Pool -- 55) XXX SingletonService.election candidates [jboss1/singleton]
...
INFO  [stdout] (notification-thread-0) XXX SingletonService.election candidates [jboss1/singleton, jboss2/singleton]

Comment 2 dereed 2016-05-27 06:10:33 UTC

Created attachment 1162371
singleton.btm

Comment 3 dereed 2016-05-27 06:21:45 UTC

[continuation of previous comment]

An alternative expected result would be that both calls have the same key list.
The important part is just that the last call has the correct entries.

Comment 11 Jiří Bílek 2016-08-09 13:42:37 UTC

Verified with EAP 6.4.10.CP.CR1

Comment 12 Petr Penicka 2017-01-17 12:56:40 UTC

Retroactively bulk-closing issues from released EAP 6.4 cumulative patches.