Bug 1147948 - (6.4.z) Hanging EJB threads because of a persistent timer and failed deployment
Summary: (6.4.z) Hanging EJB threads because of a persistent timer and failed deployment
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: JBoss Enterprise Application Platform 6
Classification: JBoss
Component: EJB
Version: 6.4.0
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: medium
Target Milestone: CR1
Target Release: EAP 6.4.12
Assignee: Enrique Gonzalez Martinez
QA Contact: Jan Martiska
URL:
Whiteboard: eap6412-proposed
Depends On:
Blocks: eap6412-payload
 
Reported: 2014-09-30 11:43 UTC by Jan Martiska
Modified: 2017-01-17 13:10 UTC

Fixed In Version:
Clone Of:
Environment:
Last Closed: 2017-01-17 13:10:12 UTC
Type: Bug
Embargoed:


Attachments
reproducer (8.86 KB, application/zip)
2014-09-30 11:43 UTC, Jan Martiska


Links
System ID Private Priority Status Summary Last Updated
Red Hat Issue Tracker WFLY-4133 0 None None None 2018-05-30 15:31:05 UTC

Description Jan Martiska 2014-09-30 11:43:14 UTC
Created attachment 942672
reproducer

If an application with a persistent EJB timer is redeployed while timeouts for that timer are queued, the tasks serving those timeouts seem to be fired *before* the whole deployment has been processed. The threads running these tasks then wait for the EJB component to start (so that they can invoke the timeout method). However, if the redeployment fails for some reason, these threads remain stuck and never return.

Consequently, if 10 or more timeouts are queued before the failed deployment attempt, all EJB service threads get stuck (by default, the EJB subsystem uses a thread pool of at most 10 threads) and EAP is unable to process any EJB calls, including timer timeouts. EAP also hangs when it is asked to shut down.
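
The exhaustion described above can be reproduced outside EAP with plain java.util.concurrent. The following is a minimal sketch, not EAP code: the class name, the method name, the latch and the 500 ms timeout are illustrative assumptions, and the latch merely stands in for the component start that never happens after a failed deployment.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

public class PoolExhaustionSketch {

    /**
     * Submits queuedTimeouts tasks that block on a start signal which never
     * arrives, then submits one ordinary task and reports whether that task
     * completed within 500 ms.
     */
    public static boolean laterCallIsServed(int poolSize, int queuedTimeouts) {
        ExecutorService pool = Executors.newFixedThreadPool(poolSize);
        // Never counted down: models a deployment that failed before the
        // component was started.
        CountDownLatch componentStarted = new CountDownLatch(1);
        try {
            for (int i = 0; i < queuedTimeouts; i++) {
                pool.submit(() -> {
                    // Stands in for BasicComponent.waitForComponentStart().
                    componentStarted.await();
                    return null;
                });
            }
            // An unrelated call that needs a free worker thread.
            Future<String> call = pool.submit(() -> "served");
            try {
                return "served".equals(call.get(500, TimeUnit.MILLISECONDS));
            } catch (TimeoutException e) {
                return false; // every worker is parked on the latch
            } catch (InterruptedException | ExecutionException e) {
                throw new IllegalStateException(e);
            }
        } finally {
            // Interrupts the parked workers so the JVM can exit.
            pool.shutdownNow();
        }
    }

    public static void main(String[] args) {
        System.out.println("9 queued timeouts, later call served: " + laterCallIsServed(10, 9));
        System.out.println("10 queued timeouts, later call served: " + laterCallIsServed(10, 10));
    }
}
```

With a pool of 10, nine blocked tasks still leave one worker free, while ten blocked tasks starve every later submission. The sketch can clean up with shutdownNow() only because its workers wait interruptibly and nothing else depends on them.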

The stack trace of a stuck thread looks like this:
java.lang.Object.wait(Native Method)
java.lang.Object.wait(Object.java:502)
org.jboss.as.ee.component.BasicComponent.waitForComponentStart(BasicComponent.java:117)
org.jboss.as.ee.component.BasicComponent.constructComponentInstance(BasicComponent.java:147)
org.jboss.as.ee.component.BasicComponent.constructComponentInstance(BasicComponent.java:135)
org.jboss.as.ee.component.BasicComponent.createInstance(BasicComponent.java:90)
org.jboss.as.ejb3.component.stateless.StatelessSessionComponent$1.create(StatelessSessionComponent.java:64)
org.jboss.as.ejb3.component.stateless.StatelessSessionComponent$1.create(StatelessSessionComponent.java:61)
org.jboss.as.ejb3.pool.AbstractPool.create(AbstractPool.java:60)
org.jboss.as.ejb3.pool.strictmax.StrictMaxPool.get(StrictMaxPool.java:123)
org.jboss.as.ejb3.component.pool.PooledInstanceInterceptor.processInvocation(PooledInstanceInterceptor.java:47)
org.jboss.invocation.InterceptorContext.proceed(InterceptorContext.java:288)
org.jboss.as.ejb3.tx.CMTTxInterceptor.invokeInOurTx(CMTTxInterceptor.java:274)
org.jboss.as.ejb3.tx.CMTTxInterceptor.required(CMTTxInterceptor.java:341)
org.jboss.as.ejb3.tx.CMTTxInterceptor.processInvocation(CMTTxInterceptor.java:240)
org.jboss.invocation.InterceptorContext.proceed(InterceptorContext.java:288)
org.jboss.as.ejb3.component.interceptors.CurrentInvocationContextInterceptor.processInvocation(CurrentInvocationContextInterceptor.java:41)
org.jboss.invocation.InterceptorContext.proceed(InterceptorContext.java:288)
org.jboss.as.ejb3.component.interceptors.ShutDownInterceptorFactory$1.processInvocation(ShutDownInterceptorFactory.java:64)
org.jboss.invocation.InterceptorContext.proceed(InterceptorContext.java:288)
org.jboss.as.ee.component.NamespaceContextInterceptor.processInvocation(NamespaceContextInterceptor.java:50)
org.jboss.invocation.InterceptorContext.proceed(InterceptorContext.java:288)
org.jboss.as.ejb3.component.interceptors.AdditionalSetupInterceptor.processInvocation(AdditionalSetupInterceptor.java:55)
org.jboss.invocation.InterceptorContext.proceed(InterceptorContext.java:288)
org.jboss.as.ee.component.TCCLInterceptor.processInvocation(TCCLInterceptor.java:45)
org.jboss.invocation.InterceptorContext.proceed(InterceptorContext.java:288)
org.jboss.invocation.ChainedInterceptor.processInvocation(ChainedInterceptor.java:61)
org.jboss.as.ejb3.timerservice.TimedObjectInvokerImpl.callTimeout(TimedObjectInvokerImpl.java:101)
org.jboss.as.ejb3.timerservice.task.CalendarTimerTask.callTimeout(CalendarTimerTask.java:60)
org.jboss.as.ejb3.timerservice.task.TimerTask.run(TimerTask.java:132)
java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
java.util.concurrent.FutureTask.run(FutureTask.java:266)
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
java.lang.Thread.run(Thread.java:745)
org.jboss.threads.JBossThread.run(JBossThread.java:122)

Attaching a reproducer with two deployments - good/ejb.jar and bad/ejb.jar (they need to have the same filename).
- deploy the 'good' one - it creates a persistent timer firing every second
- undeploy it and wait ~10 seconds for the timeouts to queue up
- try to deploy the 'bad' one - the deployment will fail, because there is a @RequestScoped annotation on a @Stateless EJB, which is against the CDI spec (and that's the only difference from the 'good' one)
- observe the stuck EJB service threads, and that EAP cannot be stopped with Ctrl-C

Comment 1 Enrique Gonzalez Martinez 2014-11-20 16:31:37 UTC
The TimerService starts executing the pending timeouts after the BasicComponentCreateService starts. This happens before the deployment fails because of the WeldBootstrapService failure (in this case).

The lock happens before the ComponentStartService starts. This service invokes BasicComponent::start, which is responsible for releasing any thread waiting in BasicComponent.waitForComponentStart.
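
The wait/start handshake described in this comment can be sketched with a plain monitor. This is an illustration only, not the BasicComponent source and not the change made in the linked pull requests: StartGateSketch, Signal, startFailed() and runWaiter() are hypothetical names. The sketch only shows why a waiter hangs when neither start nor a failure notification ever arrives.

```java
public class StartGateSketch {

    /** What the simulated deployment does after the waiter has been dispatched. */
    public enum Signal { START, FAILURE, NONE }

    private boolean started;
    private boolean failed;

    /** Blocks until the component starts; throws if startup is reported as failed. */
    public synchronized void waitForStart() throws InterruptedException {
        while (!started && !failed) {
            wait();
        }
        if (failed) {
            throw new IllegalStateException("component failed to start");
        }
    }

    /** Counterpart of the start call that releases waiting threads. */
    public synchronized void start() {
        started = true;
        notifyAll();
    }

    /** Hypothetical failure hook: wakes the waiters so they can give up. */
    public synchronized void startFailed() {
        failed = true;
        notifyAll();
    }

    /** Runs one waiter thread and reports how it ended after at most one second. */
    public static String runWaiter(Signal signal) {
        StartGateSketch gate = new StartGateSketch();
        String[] outcome = new String[1];
        Thread waiter = new Thread(() -> {
            try {
                gate.waitForStart();
                outcome[0] = "invoked";
            } catch (IllegalStateException e) {
                outcome[0] = "rejected";
            } catch (InterruptedException e) {
                outcome[0] = "interrupted";
            }
        });
        waiter.setDaemon(true);
        waiter.start();
        if (signal == Signal.START) {
            gate.start();
        } else if (signal == Signal.FAILURE) {
            gate.startFailed();
        }
        try {
            waiter.join(1000);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        // A waiter that is still alive was never signalled.
        String result = waiter.isAlive() ? "stuck" : outcome[0];
        waiter.interrupt(); // release the thread if it is still parked
        return result;
    }

    public static void main(String[] args) {
        for (Signal s : Signal.values()) {
            System.out.println(s + " -> " + runWaiter(s));
        }
    }
}
```

The NONE case corresponds to the situation in this report: the deployment fails before the start service runs, so nothing ever notifies the waiting timer threads.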

Comment 2 Enrique Gonzalez Martinez 2015-03-13 09:38:00 UTC
upstream: https://github.com/wildfly/wildfly/pull/7006
6.4.x: https://github.com/jbossas/jboss-eap/pull/2351

Comment 4 Mike McCune 2016-03-28 22:33:16 UTC
This bug was accidentally moved from POST to MODIFIED via an error in automation. Please see mmccune with any questions.

Comment 6 Jiří Bílek 2016-11-04 14:26:30 UTC
Verified with EAP 6.4.12.CP.CR1

Comment 7 Petr Penicka 2017-01-17 13:10:12 UTC
Retroactively bulk-closing issues from released EAP 6.4 cumulative patches.

