org.jboss.as.clustering.service.ServiceProviderRegistryService is called from multiple threads, but does not have any thread synchronization. In particular, there is a race condition between local calls to "register" and remotely triggered calls to "modified". This can result in the following order:

- ThreadA: "register" reads the current cache keySet
- ThreadB: "modified" call arrives for a new node started at the same time
- ThreadB: "modified" calls "notifyListeners" with the new (correct) list
- ThreadA: "register" calls "notifyListeners" with the old (wrong) list

Result: the listener receives the stale data last. For example, for singletons this can lead to different election results on different nodes, and therefore to multiple singletons or no singletons.
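To illustrate the ordering problem above, here is a minimal sketch of one way to avoid it: take the key snapshot and deliver the notification under the same lock, so a stale snapshot can never be delivered after a newer one. This is not the actual ServiceProviderRegistryService code and not necessarily the fix that was applied. The class RegistrySketch, the field notifyLock, and the method readAndNotify are hypothetical names, and a plain concurrent set stands in for the clustered cache.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Consumer;

// Hypothetical stand-in for the registry; names and structure are assumptions.
class RegistrySketch {
    // Stand-in for the replicated cache keySet (assumption: a local concurrent set).
    private final Set<String> cache = ConcurrentHashMap.newKeySet();
    private final Consumer<Set<String>> listener;
    private final Object notifyLock = new Object();

    RegistrySketch(Consumer<Set<String>> listener) {
        this.listener = listener;
    }

    // Local registration of this node.
    void register(String localNode) {
        cache.add(localNode);
        readAndNotify();
    }

    // Remotely triggered change, e.g. another node joined.
    void modified(String remoteNode) {
        cache.add(remoteNode);
        readAndNotify();
    }

    // The snapshot is read and delivered while holding the same lock,
    // so the last notification always reflects the latest cache contents.
    private void readAndNotify() {
        synchronized (notifyLock) {
            listener.accept(new TreeSet<>(cache));
        }
    }

    public static void main(String[] args) throws InterruptedException {
        List<Set<String>> seen = new ArrayList<>(); // only written under notifyLock
        RegistrySketch registry = new RegistrySketch(seen::add);

        Thread a = new Thread(() -> registry.register("jboss1"));
        Thread b = new Thread(() -> registry.modified("jboss2"));
        a.start();
        b.start();
        a.join();
        b.join();

        // Whatever the interleaving, the last notification holds both nodes.
        System.out.println(seen.get(seen.size() - 1)); // prints [jboss1, jboss2]
    }
}
```

The first notification may contain one or both nodes depending on timing, but the last one always contains both, which is the property the listeners (for example the singleton election) depend on.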
Reproduction steps:

1. Install the attached singleton.btm in node1 of a two-node cluster.
2. Start node1. (The Byteman script pauses the register() call for 20 seconds.)
3. A few seconds later, start node2.

Result: the calls are in the wrong order, with the older data last

INFO [stdout] (notification-thread-0) XXX SingletonService.election candidates [jboss1/singleton, jboss2/singleton]
...
INFO [stdout] (ServerService Thread Pool -- 55) XXX SingletonService.election candidates [jboss1/singleton]

Expected result: the calls are in the correct order

INFO [stdout] (ServerService Thread Pool -- 55) XXX SingletonService.election candidates [jboss1/singleton]
...
INFO [stdout] (notification-thread-0) XXX SingletonService.election candidates [jboss1/singleton, jboss2/singleton]
Created attachment 1162371: singleton.btm
[continuation of previous comment] An alternative expected result would be that both calls have the same key list. The important part is just that the last call has the correct entries.
Verified with EAP 6.4.10.CP.CR1
Retroactively bulk-closing issues from released EAP 6.4 cumulative patches.