+++ This bug was initially created as a clone of Bug #1064644 +++ Description of problem: Depending on when a thread context switch happens, an IIOP enabled EJB fails to be deployed by throwing java.lang.StackOverflowError. Here is the stack trace from the customer's log file. ~~~ 2014-02-09 07:11:06.412 ERROR [org.jboss.msc.service.fail] (MSC service thread 1-8) MSC000001: Failed to start service jboss.deployment.subunit."XXXX.ear"."YYYY.jar".POST_MODULE: org.jboss.msc.service.StartException in service jboss.deployment.subunit."XXXX.ear"."YYYY.jar".POST_MODULE: JBAS018733: Failed to process phase POST_MODULE of subdeployment "YYYY.jar" of deployment "XXXX.ear" at org.jboss.as.server.deployment.DeploymentUnitPhaseService.start(DeploymentUnitPhaseService.java:123) [jboss-as-server-7.1.3.Final-redhat-4.jar:7.1.3.Final-redhat-4] at org.jboss.msc.service.ServiceControllerImpl$StartTask.startService(ServiceControllerImpl.java:1811) [jboss-msc-1.0.2.GA-redhat-2.jar:1.0.2.GA-redhat-2] at org.jboss.msc.service.ServiceControllerImpl$StartTask.run(ServiceControllerImpl.java:1746) [jboss-msc-1.0.2.GA-redhat-2.jar:1.0.2.GA-redhat-2] at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:895) [rt.jar:1.6.0_45] at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:918) [rt.jar:1.6.0_45] at java.lang.Thread.run(Thread.java:662) [rt.jar:1.6.0_45] Caused by: java.lang.StackOverflowError at java.util.WeakHashMap.expungeStaleEntries(WeakHashMap.java:274) [rt.jar:1.6.0_45] at java.util.WeakHashMap.getTable(WeakHashMap.java:302) [rt.jar:1.6.0_45] at java.util.WeakHashMap.get(WeakHashMap.java:349) [rt.jar:1.6.0_45] at org.jboss.as.jacorb.rmi.WorkCacheManager.lookupDone(WorkCacheManager.java:145) at org.jboss.as.jacorb.rmi.WorkCacheManager.getAnalysis(WorkCacheManager.java:87) at org.jboss.as.jacorb.rmi.InterfaceAnalysis.getInterfaceAnalysis(InterfaceAnalysis.java:53) at org.jboss.as.jacorb.rmi.Util.isValidRMIIIOP(Util.java:152) at org.jboss.as.jacorb.rmi.OperationAnalysis.<init>(OperationAnalysis.java:67) at org.jboss.as.jacorb.rmi.AttributeAnalysis.<init>(AttributeAnalysis.java:87) at org.jboss.as.jacorb.rmi.AttributeAnalysis.<init>(AttributeAnalysis.java:102) at org.jboss.as.jacorb.rmi.ContainerAnalysis.analyzeAttributes(ContainerAnalysis.java:554) at org.jboss.as.jacorb.rmi.ContainerAnalysis.doAnalyze(ContainerAnalysis.java:184) at org.jboss.as.jacorb.rmi.InterfaceAnalysis.doAnalyze(InterfaceAnalysis.java:62) at sun.reflect.GeneratedMethodAccessor3.invoke(Unknown Source) [:1.6.0_45] at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25) [rt.jar:1.6.0_45] at java.lang.reflect.Method.invoke(Method.java:597) [rt.jar:1.6.0_45] at org.jboss.as.jacorb.rmi.WorkCacheManager.doTheWork(WorkCacheManager.java:177) at org.jboss.as.jacorb.rmi.WorkCacheManager.getAnalysis(WorkCacheManager.java:105) at org.jboss.as.jacorb.rmi.InterfaceAnalysis.getInterfaceAnalysis(InterfaceAnalysis.java:53) at org.jboss.as.jacorb.rmi.Util.getTypeIDLName(Util.java:104) at org.jboss.as.jacorb.rmi.ParameterAnalysis.<init>(ParameterAnalysis.java:50) at org.jboss.as.jacorb.rmi.OperationAnalysis.<init>(OperationAnalysis.java:91) at org.jboss.as.jacorb.rmi.InterfaceAnalysis.analyzeOperations(InterfaceAnalysis.java:116) at org.jboss.as.jacorb.rmi.ContainerAnalysis.doAnalyze(ContainerAnalysis.java:186) at org.jboss.as.jacorb.rmi.InterfaceAnalysis.doAnalyze(InterfaceAnalysis.java:62) at sun.reflect.GeneratedMethodAccessor3.invoke(Unknown Source) [:1.6.0_45] at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25) [rt.jar:1.6.0_45] at java.lang.reflect.Method.invoke(Method.java:597) [rt.jar:1.6.0_45] ... at org.jboss.as.jacorb.rmi.WorkCacheManager.doTheWork(WorkCacheManager.java:177) at org.jboss.as.jacorb.rmi.WorkCacheManager.getAnalysis(WorkCacheManager.java:105) at org.jboss.as.jacorb.rmi.InterfaceAnalysis.getInterfaceAnalysis(InterfaceAnalysis.java:53) at org.jboss.as.jacorb.rmi.Util.getTypeIDLName(Util.java:104) at org.jboss.as.jacorb.rmi.ParameterAnalysis.<init>(ParameterAnalysis.java:50) at org.jboss.as.jacorb.rmi.OperationAnalysis.<init>(OperationAnalysis.java:91) at org.jboss.as.jacorb.rmi.InterfaceAnalysis.analyzeOperations(InterfaceAnalysis.java:116) at org.jboss.as.jacorb.rmi.ContainerAnalysis.doAnalyze(ContainerAnalysis.java:186) at org.jboss.as.jacorb.rmi.InterfaceAnalysis.doAnalyze(InterfaceAnalysis.java:62) at sun.reflect.GeneratedMethodAccessor3.invoke(Unknown Source) [:1.6.0_45] at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25) [rt.jar:1.6.0_45] at java.lang.reflect.Method.invoke(Method.java:597) [rt.jar:1.6.0_45] ~~~ The last part including "Util.getTypeIDLName" has been repeated as many as possible and the bottom of the stack has been lost. The same stack has been appeared in a different MSC service thread which is also executing an IIOP enabled EJB deployment. The customer identified an improper synchronization in org.jboss.as.jacorb.rmi.WorkCacheManager#getAnalysis (). It is reproducible by manually controlling a thread context switch using a debugger. Version-Release number of selected component (if applicable): EAP 6.0.1 (this is a bug of org.jboss.as.jacorb, not JacORB) How reproducible: Once in the customer's environment. But intentionally reproducible using a debugger. I'll update with the reproducing step soon. --- Additional comment from Osamu Nagano on 2014-02-12 22:09:24 EST --- REPRODUING STEPS: 0. Prepare EAP 6.0.1 standalone-full instance and deploy the following 2 EJB jar archives from attached jts.zip. jts/ejbiiop01/target/ejbiiop01.jar jts/ejbiiop02/target/ejbiiop02.jar And enable a remote debug by commenting out the following line in bin/standalone.conf. JAVA_OPTS="$JAVA_OPTS -Xrunjdwp:transport=dt_socket,address=8787,server=y,suspend=y" 1. Restart the server. The process has been suspended for a debugger. Connect using a debugger (Eclipse is better) and set a breakpoint at the following line. jacorb/src/main/java/org/jboss/as/jacorb/rmi/Util.java:104 ~~~ 102 // remote interface? 103 if (cls.isInterface() && java.rmi.Remote.class.isAssignableFrom(cls)) { 104 InterfaceAnalysis ia = InterfaceAnalysis.getInterfaceAnalysis(cls); 105 106 return ia.getIDLModuleName() + "::" + ia.getIDLName(); 107 } ~~~ 2. Resume the server. Deployment of those EJBs runs in parallel and hit the breakpoint in 2 threads. The target class is "javax.ejb.EJBObject". Set a new breakpoint at the following line. jacorb/src/main/java/org/jboss/as/jacorb/rmi/WorkCacheManager.java:105 ~~~ 83 ContainerAnalysis getAnalysis(final Class cls) throws RMIIIOPViolationException { ... 104 // Do the work 105 doTheWork(cls, ret); 106 ~~~ Then disable the old breakpoint and resume those 2 threads. 3. Hit the new breakpoint in 2 threads. Let call those threads A and B. Resume A and B alternately. This process continues forever and a stack accumulates in both threads. --- Additional comment from Osamu Nagano on 2014-02-12 22:16:34 EST --- Here is my analysis. A key of hash map "workInProgress" is just a class object, but it should be a pair of class AND thread. Or the following method should be synchronized as a whole, not being devided by "doTheWork()". jacorb/src/main/java/org/jboss/as/jacorb/rmi/WorkCacheManager.java ~~~ 83 ContainerAnalysis getAnalysis(final Class cls) throws RMIIIOPViolationException { 84 ContainerAnalysis ret; 85 86 synchronized (this) { 87 ret = lookupDone(cls); 88 if (ret != null) 89 return ret; 90 91 // is it work-in-progress? 92 final InProgress inProgress = (InProgress) workInProgress.get(cls); 93 if (inProgress != null) { 94 if (inProgress.thread == Thread.currentThread()) 95 return inProgress.analysis; // return unfinished 96 97 // Do not wait for the other thread: We may deadlock 98 // Double work is better that deadlock... 99 } 100 101 ret = createWorkInProgress(cls); 102 } 103 104 // Do the work 105 doTheWork(cls, ret); 106 107 // We did it 108 synchronized (this) { 109 workInProgress.remove(cls); 110 workDone.put(cls, new SoftReference(ret)); 111 notifyAll(); 112 } 113 114 return ret; 115 } ~~~ --- Additional comment from Osamu Nagano on 2014-03-03 03:50:39 EST --- Pull request sent: https://github.com/jbossas/jboss-eap/pull/995
Pull request sent to 6.2.x branch: https://github.com/jbossas/jboss-eap/pull/1012
New PR has been sent to 6.2.x branch to reflect an additional fix in the upstream. https://github.com/jbossas/jboss-eap/pull/1160 The previous PR, #1012, has been closed without merging because the new one includes the same.
Stuart Douglas <stuart.w.douglas> updated the status of jira WFLY-2920 to Resolved
Verified in 6.2.3.CR2.