TP.registerProbeHandlers is not thread-safe because it modifies preregistered_probe_handlers without any synchronization. If one thread calls this method while another thread is inside startDiagnostics (which can easily happen with a shared transport), startDiagnostics can throw a NullPointerException while it is iterating over preregistered_probe_handlers. Access to preregistered_probe_handlers should be synchronized.
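For illustration only, here is a minimal sketch of the kind of synchronization meant above. It is not the JGroups code or the actual patch: the class name ProbeRegistrySketch, the String handlers, and the activeCount() helper are all hypothetical stand-ins for the transport and its probe handlers.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Hypothetical stand-in for the transport; NOT the real JGroups TP class.
class ProbeRegistrySketch {
    // Handlers registered before diagnostics are started; guarded by "this".
    private final Set<String> preregistered = new LinkedHashSet<>();
    // Handlers that are live once diagnostics are running; guarded by "this".
    private final List<String> active = new ArrayList<>();
    private boolean started;

    // Registration takes the same lock as startDiagnostics(), so it can
    // never modify the set while the loop below is iterating over it.
    synchronized void registerProbeHandler(String handler) {
        if (started) {
            active.add(handler);
        } else {
            preregistered.add(handler);
        }
    }

    // Moves all preregistered handlers to the active list under the lock.
    synchronized void startDiagnostics() {
        started = true;
        for (String handler : preregistered) {
            active.add(handler);
        }
        preregistered.clear();
    }

    // Hypothetical helper, used only to observe the result.
    synchronized int activeCount() {
        return active.size();
    }
}
```

Because both methods lock the same monitor, a registration that races with startup either lands in the preregistered set before the loop runs or goes straight to the active list afterwards; it cannot interleave with the iteration.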
This bug is inside JGroups.
Already fixed in upstream/EAP 7.
Backporting JGRP-1869 also included https://issues.jboss.org/browse/JGRP-1834.
Testing details:

To trigger the issue, diagnostics must be enabled:
- add a new socket-binding
  <socket-binding name="diag" port="0" multicast-address="${jboss.default.multicast.address:230.0.0.4}" multicast-port="12345"/>
- add diagnostics-socket-binding for that new socket binding
  <transport type="UDP" socket-binding="jgroups-udp" diagnostics-socket-binding="diag"/>

In addition, multiple JGroups channels (with the same shared transport) must be started. For example, deploy both a <distributable/> war and a @Clustered EJB.

After that it comes down to a timing race condition. I have not yet succeeded in forcing it to trigger with Byteman, but it has occasionally triggered when just starting EAP with the above configuration.
Hello Dennis, I cannot reproduce the issue. Could you attach the appropriate standalone.xml and deployment, please?
Created attachment 1174754 [details] test.ear
Created attachment 1174755 [details] standalone-ha.xml
Attached an example deployment to trigger the issue (an ear with a <distributable/> war and a @Clustered EJB), and standalone-ha.xml from EAP 6.4.6 (the version I had easily available) with the two changes as detailed in #4 to enable the diagnostics socket.
And as mentioned above, it's a race condition and I was not able to get a test case to consistently trigger it. With this simple application the errors will trigger occasionally on startup of EAP.
Thank you, Dennis. The error occurred in EAP 6.4.6 in 7 out of 10 starts; it did not occur in EAP 6.4.9 in 30 starts. Verified with EAP 6.4.9.CP.CR2.
Retroactively bulk-closing issues from released EAP 6.4 cumulative patches.