Bug 956805
Summary: | Failed to start the Infinispan subsystem with cause NullPointerException in domain mode
---|---
Product: | [JBoss] JBoss Enterprise Application Platform 6
Component: | EJB
Version: | 6.0.1
Status: | CLOSED CURRENTRELEASE
Severity: | unspecified
Priority: | unspecified
Target Milestone: | ER1
Target Release: | EAP 6.2.0
Hardware: | Unspecified
OS: | Unspecified
Doc Type: | Bug Fix
Type: | Bug
Last Closed: | 2013-12-15 16:14:08 UTC
Reporter: | wfink
Assignee: | David M. Lloyd <david.lloyd>
QA Contact: | Jan Martiska <jmartisk>
Docs Contact: | Russell Dickenson <rdickens>
CC: | brian.stansberry, cdewolf, dereed, ewertz, myarboro, rhusar
Description
wfink
2013-04-25 17:14:23 UTC
For this NPE to happen, EJBRemoteConnectorService.getEJBRemoteConnectorSocketBinding() would have to return null.

    SocketBinding getEJBRemoteConnectorSocketBinding() {
        if (this.remotingServer == null) {
            return null;
        }
        return this.remotingServer.getSocketBinding();
    }

So either this.remotingServer would have to be null, or this.remotingServer.getSocketBinding() returns null.

The latter is unlikely; it returns a value that is injected via a simple dependency in the RemotingService utility class:

    .addDependency(bindingName, SocketBinding.class, streamServerService.getSocketBindingInjector())

The former is a bit more unusual. The value of "remotingServer" is provided to EJBRemoteConnectorService in an odd way. The management operation handler EJB3RemoteServiceAdd adds a dependency:

    // add dependency on the remoting server (which allows remoting connector to connect to it)
    ejbRemoteConnectorServiceBuilder.addDependency(remotingServerServiceName);

and then in EJBRemoteConnectorService.start() the depended-on service is looked up from MSC:

    // get the remoting server (which allows remoting connector to connect to it) service
    final ServiceContainer serviceContainer = context.getController().getServiceContainer();
    final ServiceController streamServerServiceController = serviceContainer.getRequiredService(this.remotingConnectorServiceName);
    final AbstractStreamServerService streamServerService = (AbstractStreamServerService) streamServerServiceController.getService();

I don't understand why this is handled in this convoluted way. Why isn't an InjectedValue used, with EJB3RemoteServiceAdd setting up an injection? This unusual way of doing this should still work though.

(In reply to Brian Stansberry from comment #1)
> I don't understand why this is handled in this convoluted way. Why isn't an
> InjectedValue used, with EJB3RemoteServiceAdd setting up an injection?

Never mind; I get it. The Service<T> is needed, not the T.
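The null path discussed in comment #1 can be illustrated with a minimal, self-contained sketch. The classes below are hypothetical stand-ins, not the real JBoss AS classes; only the shape of getEJBRemoteConnectorSocketBinding() is taken from the comment, and the port number is an arbitrary example value. It shows a caller that dereferences the getter's result blindly, next to one that treats "not started yet" as an absent value.

```java
import java.util.Optional;

public class NullBindingSketch {

    /** Hypothetical stand-in for a socket binding; it only carries a port. */
    static final class SocketBinding {
        final int port;
        SocketBinding(int port) { this.port = port; }
    }

    /** Hypothetical stand-in for the remoting server that owns the binding. */
    static final class RemotingServer {
        private final SocketBinding binding;
        RemotingServer(SocketBinding binding) { this.binding = binding; }
        SocketBinding getSocketBinding() { return binding; }
    }

    /** Mirrors the quoted getter: remotingServer is only assigned in start(). */
    static final class ConnectorService {
        private volatile RemotingServer remotingServer;

        void start(RemotingServer server) { this.remotingServer = server; }

        SocketBinding getEJBRemoteConnectorSocketBinding() {
            if (this.remotingServer == null) {
                return null; // start() has not run yet
            }
            return this.remotingServer.getSocketBinding();
        }
    }

    /** A caller that dereferences the result without a check. */
    static int portUnchecked(ConnectorService service) {
        return service.getEJBRemoteConnectorSocketBinding().port; // NPE before start()
    }

    /** A defensive caller that reports an absent value instead of failing. */
    static Optional<Integer> portChecked(ConnectorService service) {
        SocketBinding binding = service.getEJBRemoteConnectorSocketBinding();
        return binding == null ? Optional.empty() : Optional.of(binding.port);
    }

    public static void main(String[] args) {
        ConnectorService service = new ConnectorService();
        System.out.println("checked before start: " + portChecked(service));
        try {
            portUnchecked(service);
        } catch (NullPointerException e) {
            System.out.println("unchecked before start: NullPointerException");
        }
        service.start(new RemotingServer(new SocketBinding(4447)));
        System.out.println("unchecked after start: " + portUnchecked(service));
    }
}
```

A null check in the caller only hides the symptom; whether the value is available still depends on start order.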
Confirmed in 6.1.1ER4.

Hit this while trying to dig into a variety of failures associated with org.jboss.as.test.clustering.cluster.ejb3.stateless.RemoteStatelessFailoverTestCase and BZ 921532.

It appears to be a race condition. As Brian theorized, this.remotingServer == null when this failure occurs. This is because EJBRemoteConnectorService.start(), which handles the unconventional dependency injection Brian described, hasn't been called yet when Infinispan attempts to get the remoting server information, so the lookup fails. The dependency injection itself doesn't fail, and shows up milliseconds later in the logs, but by then it's too late.

Wanted to get this information down somewhere while I'm still digging into how to actually fix it.

What I could never figure out is why EJBRemoteConnectorService.start() wouldn't have been called. You could have a race where it gets called, but too late, if there isn't a proper dependency somewhere. But I couldn't find a code path that would result in a missing dependency.

Whew. A 4-service dependency interaction is fun.

ClusteredBackingCacheEntryStoreSourceService starts and calls through to RegistryCollector, which notifies its listener LocalEjbReceiver, which needs the EJBRemoteConnectorService.remotingServer value.

Technically speaking, the LocalEjbReceiver implementation of the RegistryCollector listener is what requires the EJBRemoteConnectorService.remotingServer value, but right now the LocalEjbReceiver only has a 'ServiceLookupValue' attachment to the EJBRemoteConnectorService, not a dependency.

However, even if that dependency were added, I don't think it would stop the problem. The ClusteredBackingCacheEntryStoreSourceService only depends on the RegistryCollector and a 'ClientMappingRegistry'. Technically, if both of those have started, the ClusteredBackingCacheEntryStoreSourceService could start without the LocalEjbReceiver or the EJBRemoteConnectorService having started yet. I think.
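The ordering argument above can be illustrated with a toy model. This is not MSC and not the fix that was applied; the class, the service names "connector" and "cacheStore", and the scheduling list are all hypothetical. The sketch only shows that a service which looks another one up without declaring a dependency can legitimately be started first, while a declared dependency forces the provider to start before the reader.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

public class StartOrderSketch {

    /** Declared dependencies: service name -> names it must wait for. */
    private final Map<String, List<String>> dependencies = new LinkedHashMap<>();

    void addService(String name, String... dependsOn) {
        dependencies.put(name, Arrays.asList(dependsOn));
    }

    /**
     * Returns one valid start order. Services are visited in the given order,
     * which stands in for whatever order a container's thread pool happens to
     * pick; declared dependencies are always started first.
     * Assumes the dependency graph has no cycles (no cycle detection here).
     */
    List<String> startOrder(List<String> schedulingOrder) {
        Set<String> started = new LinkedHashSet<>();
        for (String name : schedulingOrder) {
            visit(name, started);
        }
        return new ArrayList<>(started);
    }

    private void visit(String name, Set<String> started) {
        if (started.contains(name)) {
            return;
        }
        for (String dep : dependencies.getOrDefault(name, Collections.<String>emptyList())) {
            visit(dep, started);
        }
        started.add(name);
    }

    /** True if 'reader' starts after 'provider', so it cannot observe an unset value. */
    static boolean startsAfter(List<String> order, String reader, String provider) {
        return order.indexOf(provider) < order.indexOf(reader);
    }

    public static void main(String[] args) {
        // One possible scheduling order; a real container gives no such guarantee.
        List<String> scheduling = Arrays.asList("cacheStore", "connector");

        StartOrderSketch lookupOnly = new StartOrderSketch();
        lookupOnly.addService("connector");
        lookupOnly.addService("cacheStore"); // reads the connector, but declares nothing
        List<String> racy = lookupOnly.startOrder(scheduling);
        System.out.println(racy + " safe=" + startsAfter(racy, "cacheStore", "connector"));

        StartOrderSketch declared = new StartOrderSketch();
        declared.addService("connector");
        declared.addService("cacheStore", "connector"); // declared dependency
        List<String> ordered = declared.startOrder(scheduling);
        System.out.println(ordered + " safe=" + startsAfter(ordered, "cacheStore", "connector"));
    }
}
```

In the toy model the dependency has to sit on the service that actually triggers the read; declaring it on an intermediate listener would not constrain the service that starts first, which matches the concern raised in the comment.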
But I can't reproduce this anymore, so I'm not entirely sure this is actually the solution.

Never mind that last part. Reproduction successful. Also confirmed theory and solution. Cleaning, checking upstream, and submitting pull requests tomorrow.

Verified in 6.2.0.ER1.

Assigning jpai EJB issues to david.lloyd. Please re-assign to Cheng or others as needed.

Customer tested workaround: start EAP without any deployments (or at least no clustered EJBs deployed), then add the deployments after it's fully started.