Bugzilla will be upgraded to version 5.0. The upgrade date is tentatively scheduled for 2 December 2018, pending final testing and feedback.
Bug 1064644 - [GSS] (6.3.0) StackOverflowError when org.jboss.as.jacorb.rmi.InterfaceAnalysis is analyzing javax.ejb.EJBObject
[GSS] (6.3.0) StackOverflowError when org.jboss.as.jacorb.rmi.InterfaceAnalys...
Status: CLOSED CURRENTRELEASE
Product: JBoss Enterprise Application Platform 6
Classification: JBoss
Component: IIOP (Show other bugs)
6.0.1
Unspecified Unspecified
unspecified Severity unspecified
: DR3
: EAP 6.3.0
Assigned To: Osamu Nagano
:
Depends On:
Blocks: 1065196 1072747
  Show dependency treegraph
 
Reported: 2014-02-12 21:44 EST by Osamu Nagano
Modified: 2014-06-28 11:38 EDT (History)
6 users (show)

See Also:
Fixed In Version:
Doc Type: Bug Fix
Doc Text:
In previous versions of JBoss EAP 6, it was found that, depending on the timing of a thread context switch, IIOP-enabled EJBs could fail to deploy correctly, causing a StackOverflowError. This issue was traced to improper thread synchronization in `org.jboss.as.jacorb.rmi.WorkCacheManager`. The issue has been corrected in this release and the StackOverflowError no longer presents.
Story Points: ---
Clone Of:
: 1065196 1072747 (view as bug list)
Environment:
Last Closed: 2014-06-28 11:38:55 EDT
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---


Attachments (Terms of Use)
Two IIOP-enabled EJB archives from Quickstarts (48.50 KB, application/zip)
2014-02-12 21:44 EST, Osamu Nagano
no flags Details


External Trackers
Tracker ID Priority Status Summary Last Updated
JBoss Issue Tracker WFLY-2920 Major Resolved StackOverflowError when org.jboss.as.jacorb.rmi.InterfaceAnalysis is analyzing javax.ejb.EJBObject 2015-09-14 22:51:19 EDT

  None (edit)
Description Osamu Nagano 2014-02-12 21:44:05 EST
Created attachment 862611 [details]
Two IIOP-enabled EJB archives from Quickstarts

Description of problem:
Depending on when a thread context switch happens, an IIOP enabled EJB fails to be deployed by throwing java.lang.StackOverflowError.  Here is the stack trace from the customer's log file.

~~~
2014-02-09 07:11:06.412 ERROR [org.jboss.msc.service.fail] (MSC service thread 1-8) MSC000001: Failed to start service jboss.deployment.subunit."XXXX.ear"."YYYY.jar".POST_MODULE: org.jboss.msc.service.StartException in service jboss.deployment.subunit."XXXX.ear"."YYYY.jar".POST_MODULE: JBAS018733: Failed to process phase POST_MODULE of subdeployment "YYYY.jar" of deployment "XXXX.ear"
	at org.jboss.as.server.deployment.DeploymentUnitPhaseService.start(DeploymentUnitPhaseService.java:123) [jboss-as-server-7.1.3.Final-redhat-4.jar:7.1.3.Final-redhat-4]
	at org.jboss.msc.service.ServiceControllerImpl$StartTask.startService(ServiceControllerImpl.java:1811) [jboss-msc-1.0.2.GA-redhat-2.jar:1.0.2.GA-redhat-2]
	at org.jboss.msc.service.ServiceControllerImpl$StartTask.run(ServiceControllerImpl.java:1746) [jboss-msc-1.0.2.GA-redhat-2.jar:1.0.2.GA-redhat-2]
	at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:895) [rt.jar:1.6.0_45]
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:918) [rt.jar:1.6.0_45]
	at java.lang.Thread.run(Thread.java:662) [rt.jar:1.6.0_45]
Caused by: java.lang.StackOverflowError
	at java.util.WeakHashMap.expungeStaleEntries(WeakHashMap.java:274) [rt.jar:1.6.0_45]
	at java.util.WeakHashMap.getTable(WeakHashMap.java:302) [rt.jar:1.6.0_45]
	at java.util.WeakHashMap.get(WeakHashMap.java:349) [rt.jar:1.6.0_45]
	at org.jboss.as.jacorb.rmi.WorkCacheManager.lookupDone(WorkCacheManager.java:145)
	at org.jboss.as.jacorb.rmi.WorkCacheManager.getAnalysis(WorkCacheManager.java:87)
	at org.jboss.as.jacorb.rmi.InterfaceAnalysis.getInterfaceAnalysis(InterfaceAnalysis.java:53)
	at org.jboss.as.jacorb.rmi.Util.isValidRMIIIOP(Util.java:152)
	at org.jboss.as.jacorb.rmi.OperationAnalysis.<init>(OperationAnalysis.java:67)
	at org.jboss.as.jacorb.rmi.AttributeAnalysis.<init>(AttributeAnalysis.java:87)
	at org.jboss.as.jacorb.rmi.AttributeAnalysis.<init>(AttributeAnalysis.java:102)
	at org.jboss.as.jacorb.rmi.ContainerAnalysis.analyzeAttributes(ContainerAnalysis.java:554)
	at org.jboss.as.jacorb.rmi.ContainerAnalysis.doAnalyze(ContainerAnalysis.java:184)
	at org.jboss.as.jacorb.rmi.InterfaceAnalysis.doAnalyze(InterfaceAnalysis.java:62)
	at sun.reflect.GeneratedMethodAccessor3.invoke(Unknown Source) [:1.6.0_45]
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25) [rt.jar:1.6.0_45]
	at java.lang.reflect.Method.invoke(Method.java:597) [rt.jar:1.6.0_45]
	at org.jboss.as.jacorb.rmi.WorkCacheManager.doTheWork(WorkCacheManager.java:177)
	at org.jboss.as.jacorb.rmi.WorkCacheManager.getAnalysis(WorkCacheManager.java:105)
	at org.jboss.as.jacorb.rmi.InterfaceAnalysis.getInterfaceAnalysis(InterfaceAnalysis.java:53)
	at org.jboss.as.jacorb.rmi.Util.getTypeIDLName(Util.java:104)
	at org.jboss.as.jacorb.rmi.ParameterAnalysis.<init>(ParameterAnalysis.java:50)
	at org.jboss.as.jacorb.rmi.OperationAnalysis.<init>(OperationAnalysis.java:91)
	at org.jboss.as.jacorb.rmi.InterfaceAnalysis.analyzeOperations(InterfaceAnalysis.java:116)
	at org.jboss.as.jacorb.rmi.ContainerAnalysis.doAnalyze(ContainerAnalysis.java:186)
	at org.jboss.as.jacorb.rmi.InterfaceAnalysis.doAnalyze(InterfaceAnalysis.java:62)
	at sun.reflect.GeneratedMethodAccessor3.invoke(Unknown Source) [:1.6.0_45]
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25) [rt.jar:1.6.0_45]
	at java.lang.reflect.Method.invoke(Method.java:597) [rt.jar:1.6.0_45]
      ...
	at org.jboss.as.jacorb.rmi.WorkCacheManager.doTheWork(WorkCacheManager.java:177)
	at org.jboss.as.jacorb.rmi.WorkCacheManager.getAnalysis(WorkCacheManager.java:105)
	at org.jboss.as.jacorb.rmi.InterfaceAnalysis.getInterfaceAnalysis(InterfaceAnalysis.java:53)
	at org.jboss.as.jacorb.rmi.Util.getTypeIDLName(Util.java:104)
	at org.jboss.as.jacorb.rmi.ParameterAnalysis.<init>(ParameterAnalysis.java:50)
	at org.jboss.as.jacorb.rmi.OperationAnalysis.<init>(OperationAnalysis.java:91)
	at org.jboss.as.jacorb.rmi.InterfaceAnalysis.analyzeOperations(InterfaceAnalysis.java:116)
	at org.jboss.as.jacorb.rmi.ContainerAnalysis.doAnalyze(ContainerAnalysis.java:186)
	at org.jboss.as.jacorb.rmi.InterfaceAnalysis.doAnalyze(InterfaceAnalysis.java:62)
	at sun.reflect.GeneratedMethodAccessor3.invoke(Unknown Source) [:1.6.0_45]
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25) [rt.jar:1.6.0_45]
	at java.lang.reflect.Method.invoke(Method.java:597) [rt.jar:1.6.0_45]
~~~

The last part including "Util.getTypeIDLName" has been repeated as many as possible and the bottom of the stack has been lost.  The same stack has been appeared in a different MSC service thread which is also executing an IIOP enabled EJB deployment.

The customer identified an improper synchronization in org.jboss.as.jacorb.rmi.WorkCacheManager#getAnalysis ().  It is reproducible by manually controlling a thread context switch using a debugger.


Version-Release number of selected component (if applicable):
EAP 6.0.1 (this is a bug of org.jboss.as.jacorb, not JacORB)


How reproducible:
Once in the customer's environment.  But intentionally reproducible using a debugger.  I'll update with the reproducing step soon.
Comment 1 Osamu Nagano 2014-02-12 22:09:24 EST
REPRODUING STEPS:

0. Prepare EAP 6.0.1 standalone-full instance and deploy the following 2 EJB jar archives from attached jts.zip.
  jts/ejbiiop01/target/ejbiiop01.jar
  jts/ejbiiop02/target/ejbiiop02.jar

And enable a remote debug by commenting out the following line in bin/standalone.conf.
  JAVA_OPTS="$JAVA_OPTS -Xrunjdwp:transport=dt_socket,address=8787,server=y,suspend=y"


1. Restart the server.
The process has been suspended for a debugger.  Connect using a debugger (Eclipse is better) and set a breakpoint at the following line.

jacorb/src/main/java/org/jboss/as/jacorb/rmi/Util.java:104
~~~
    102         // remote interface?
    103         if (cls.isInterface() && java.rmi.Remote.class.isAssignableFrom(cls)) {
    104             InterfaceAnalysis ia = InterfaceAnalysis.getInterfaceAnalysis(cls);
    105 
    106             return ia.getIDLModuleName() + "::" + ia.getIDLName();
    107         }
~~~


2. Resume the server.
Deployment of those EJBs runs in parallel and hit the breakpoint in 2 threads.  The target class is "javax.ejb.EJBObject".  Set a new breakpoint at the following line.

jacorb/src/main/java/org/jboss/as/jacorb/rmi/WorkCacheManager.java:105
~~~
     83     ContainerAnalysis getAnalysis(final Class cls) throws RMIIIOPViolationException {
    ...
    104         // Do the work
    105         doTheWork(cls, ret);
    106 
~~~

Then disable the old breakpoint and resume those 2 threads.


3. Hit the new breakpoint in 2 threads.
Let call those threads A and B.  Resume A and B alternately.  This process continues forever and a stack accumulates in both threads.
Comment 2 Osamu Nagano 2014-02-12 22:16:34 EST
Here is my analysis.  A key of hash map "workInProgress" is just a class object, but it should be a pair of class AND thread.  Or the following method should be synchronized as a whole, not being devided by "doTheWork()".

jacorb/src/main/java/org/jboss/as/jacorb/rmi/WorkCacheManager.java
~~~
     83     ContainerAnalysis getAnalysis(final Class cls) throws RMIIIOPViolationException {
     84         ContainerAnalysis ret;
     85 
     86         synchronized (this) {
     87             ret = lookupDone(cls);
     88             if (ret != null)
     89                 return ret;
     90 
     91             // is it work-in-progress?
     92             final InProgress inProgress = (InProgress) workInProgress.get(cls);
     93             if (inProgress != null) {
     94                 if (inProgress.thread == Thread.currentThread())
     95                     return inProgress.analysis; // return unfinished
     96 
     97                 // Do not wait for the other thread: We may deadlock
     98                 // Double work is better that deadlock...
     99             }
    100 
    101             ret = createWorkInProgress(cls);
    102         }
    103 
    104         // Do the work
    105         doTheWork(cls, ret);
    106 
    107         // We did it
    108         synchronized (this) {
    109             workInProgress.remove(cls);
    110             workDone.put(cls, new SoftReference(ret));
    111             notifyAll();
    112         }
    113 
    114         return ret;
    115     }
~~~
Comment 3 Osamu Nagano 2014-03-03 03:50:39 EST
Pull request sent:
https://github.com/jbossas/jboss-eap/pull/995
Comment 4 Stuart Douglas 2014-03-05 15:59:19 EST
As part of the TCK testing for Wildfly a problem was discovered with the patch for this issue. 

I have submitted a PR to fix it: https://github.com/jbossas/jboss-eap/pull/1022
Comment 5 Kabir Khan 2014-03-06 11:17:47 EST
N
Comment 6 Jan Martiska 2014-03-18 09:20:53 EDT
Verified in 6.3.0.DR4.
Comment 7 JBoss JIRA Server 2014-04-08 01:45:27 EDT
Stuart Douglas <stuart.w.douglas@gmail.com> updated the status of jira WFLY-2920 to Resolved
Comment 8 sgilda 2014-05-12 15:48:49 EDT
Changed <literal></literal> tags in Doc Text to ticks (`) to fix Bug 1096865

Note You need to log in before you can comment on or make changes to this bug.