Bug 1027733

Summary: Failed to start service jboss.remoting.endpoint.management with NPE as cause
Product: [JBoss] JBoss Enterprise Application Platform 6 Reporter: Richard Janík <rjanik>
Component: Remoting Assignee: baranowb <bbaranow>
Status: CLOSED WORKSFORME QA Contact: Jitka Kozana <jkudrnac>
Severity: high Docs Contact: Russell Dickenson <rdickens>
Priority: unspecified    
Version: 6.2.0 CC: bbaranow, cdewolf, jmartisk, rjanik, rsvoboda
Target Milestone: ---   
Target Release: ---   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version: Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2015-06-19 09:58:05 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
Embargoed:

Description Richard Janík 2013-11-07 10:54:53 UTC
Description of problem:

EAP 6.2.0.ER7

Here is the stacktrace:

07:35:07,893 ERROR [org.jboss.msc.service.fail] (MSC service thread 1-10) MSC000001: Failed to start service jboss.remoting.endpoint.management: org.jboss.msc.service.StartException in service jboss.remoting.endpoint.management: Failed to start service
	at org.jboss.msc.service.ServiceControllerImpl$StartTask.run(ServiceControllerImpl.java:1767) [jboss-msc-1.0.4.GA-redhat-1.jar:1.0.4.GA-redhat-1]
	at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:895) [rt.jar:1.6.0_45]
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:918) [rt.jar:1.6.0_45]
	at java.lang.Thread.run(Thread.java:662) [rt.jar:1.6.0_45]
Caused by: java.lang.NullPointerException
	at sun.nio.ch.Util.atBugLevel(Util.java:448) [rt.jar:1.6.0_45]
	at sun.nio.ch.SelectorImpl.<init>(SelectorImpl.java:40) [rt.jar:1.6.0_45]
	at sun.nio.ch.WindowsSelectorImpl.<init>(WindowsSelectorImpl.java:102) [rt.jar:1.6.0_45]
	at sun.nio.ch.WindowsSelectorProvider.openSelector(WindowsSelectorProvider.java:26) [rt.jar:1.6.0_45]
	at java.nio.channels.Selector.open(Selector.java:209) [rt.jar:1.6.0_45]
	at org.xnio.nio.NioXnioWorker.<init>(NioXnioWorker.java:112)
	at org.xnio.nio.NioXnio.createWorker(NioXnio.java:126)
	at org.jboss.remoting3.EndpointImpl.construct(EndpointImpl.java:137) [jboss-remoting-3.2.17.GA-redhat-1.jar:3.2.17.GA-redhat-1]
	at org.jboss.remoting3.Remoting.createEndpoint(Remoting.java:60) [jboss-remoting-3.2.17.GA-redhat-1.jar:3.2.17.GA-redhat-1]
	at org.jboss.remoting3.Remoting.createEndpoint(Remoting.java:73) [jboss-remoting-3.2.17.GA-redhat-1.jar:3.2.17.GA-redhat-1]
	at org.jboss.as.remoting.EndpointService.start(EndpointService.java:71) [jboss-as-remoting-7.3.0.Final-redhat-10.jar:7.3.0.Final-redhat-10]
	at org.jboss.msc.service.ServiceControllerImpl$StartTask.startService(ServiceControllerImpl.java:1811) [jboss-msc-1.0.4.GA-redhat-1.jar:1.0.4.GA-redhat-1]
	at org.jboss.msc.service.ServiceControllerImpl$StartTask.run(ServiceControllerImpl.java:1746) [jboss-msc-1.0.4.GA-redhat-1.jar:1.0.4.GA-redhat-1]
	... 3 more

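I caught this while running the clustering testsuite (in DistributionWebFailoverTestCase), and only once across all configurations: on 64-bit Windows with Oracle JDK 1.6 (the stack trace above shows 1.6.0_45, and the Jenkins job below ran with jdk=java16_default).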

https://jenkins.mw.lab.eng.bos.redhat.com/hudson/job/eap-6x-jgroups-tcpgossip-win-matrix/68/jdk=java16_default,label=Win2k8_x86_64/testReport/org.jboss.as.test.clustering.cluster.web/DistributionWebFailoverTestCase%28SYNC-tcp%29/testGracefulSimpleFailover/

Comment 1 Jan Martiska 2014-01-07 14:54:47 UTC
I hit this too. RHEL 5 and 6. EAP 6.2.0.GA.

This is caused by a bug in HotSpot JDK 5 and 6: http://bugs.sun.com/view_bug.do?bug_id=6427854
 
It is fixed in JDK 1.7.
This can happen because the method sun.nio.ch.Util.atBugLevel is not thread-safe: if multiple threads run it simultaneously, an NPE can be thrown.
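
To illustrate the kind of race meant here, below is a minimal sketch of an unsafe lazy initialization next to a safer variant. It is not the JDK source: the class name BugLevelSketch, the property key example.bugLevel, and both method names are hypothetical and chosen only for illustration.

```java
public final class BugLevelSketch {
    // Hypothetical stand-in for a lazily initialized static field.
    private static volatile String level;

    private BugLevelSketch() {}

    // Stand-in for a property lookup that returns null when the key is unset.
    private static String lookup() {
        return System.getProperty("example.bugLevel"); // hypothetical key
    }

    // Unsafe shape: the shared field is written twice, possibly with null first.
    // Thread A can store "" and move on to equals() while thread B, which already
    // passed the null check, stores null again. Thread A then dereferences null.
    public static boolean atLevelUnsafe(String wanted) {
        if (level == null) {
            level = lookup();            // may store null into the shared field
            if (level == null) {
                level = "";
            }
        }
        return level.equals(wanted);     // NPE if another thread just stored null
    }

    // Safer shape: work on a local copy and publish only a non-null value.
    public static boolean atLevelSafe(String wanted) {
        String current = level;
        if (current == null) {
            String value = lookup();
            current = (value != null) ? value : "";
            level = current;             // never writes null
        }
        return current.equals(wanted);
    }
}
```

In the safer variant two threads may still both perform the lookup, but neither can observe or publish a null, so the equals() call cannot fail.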

The question is whether JBoss Remoting can ensure that this method is never run concurrently. Apparently it sometimes is, at least during an org.jboss.remoting3.EndpointImpl.construct call.
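
One possible shape for such a guard, as a sketch only: funnel every Selector.open() call through a single lock, and pre-set the sun.nio.ch.bugLevel system property to an empty string when it is unset, so that a lookup of that property cannot return null. The class SafeSelectorOpener is hypothetical and is not part of JBoss Remoting or XNIO. That pre-setting the property avoids the NPE is an assumption based on the description above, not something verified against the JDK 6 source.

```java
import java.io.IOException;
import java.nio.channels.Selector;

public final class SafeSelectorOpener {
    // Single lock shared by all callers, so selector creation is serialized.
    private static final Object LOCK = new Object();
    private static final String BUG_LEVEL_KEY = "sun.nio.ch.bugLevel";

    private SafeSelectorOpener() {}

    public static Selector open() throws IOException {
        synchronized (LOCK) {
            // Assumption: a non-null property value keeps the lazy lookup from
            // ever storing null. An existing value is left untouched.
            if (System.getProperty(BUG_LEVEL_KEY) == null) {
                System.setProperty(BUG_LEVEL_KEY, "");
            }
            return Selector.open();
        }
    }
}
```

The lock only protects callers that go through this helper. Any other code in the same JVM that calls Selector.open() directly can still race, which is why the property is set as a second line of defence.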

Comment 2 Jan Martiska 2014-01-07 14:58:15 UTC
High severity because it intermittently prevents EAP 6.2 from booting on JDK 1.6. It happened to me twice just today.

Comment 8 Rostislav Svoboda 2015-06-29 10:10:36 UTC
qa_nacking for CP, no longer reproducible.