Red Hat Bugzilla – Bug 1031199
EJB backing caches can generate large retention from cancelled tasks in their scheduled executor's DelayedWorkQueue
Last modified: 2014-07-04 01:58:26 EDT
Description of problem:
EJB backing caches frequently cancel their remove and passivation tasks on each access and replace them with new ones. Per http://docs.oracle.com/javase/7/docs/api/java/util/concurrent/ScheduledThreadPoolExecutor.html cancelled tasks are not removed from the queue until their scheduled delay passes. Because of this lazy removal in Java's ScheduledThreadPoolExecutor, the cancel-and-reschedule model can leave a large number of cancelled tasks sitting in the executor's DelayedWorkQueue. With longer timeouts and frequent EJB access, this can generate substantial heap overhead.

Version-Release number of selected component (if applicable):

How reproducible:

Steps to Reproduce:
1. Configure a long EJB timeout with the NonPassivatingBackingCache or PassivatingBackingCache
2. Put load on the EJB

Actual results:
Cancelled tasks can pile up in the executor's DelayedWorkQueue.

Expected results:
Cancelled tasks are flushed out of the executor's DelayedWorkQueue more eagerly.

Additional info:
It should be fairly easy to limit any such buildup by calling purge() [1] on the scheduled executor. We likely don't want to purge on every cancel, so perhaps purge() could be called on a configurable time delay or after a configurable number of cancels?

[1] http://docs.oracle.com/javase/6/docs/api/java/util/concurrent/ThreadPoolExecutor.html#purge%28%29
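To illustrate the "purge after a configurable number of cancels" idea from the additional info, here is a minimal sketch. It is not the code from the eventual fix: the class name PurgingScheduler, its methods, and the purgeThreshold parameter are hypothetical. Only ScheduledThreadPoolExecutor, schedule(), cancel(), purge() and getQueue() are real JDK API.

```java
import java.util.concurrent.ScheduledFuture;
import java.util.concurrent.ScheduledThreadPoolExecutor;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical wrapper: counts successful cancels and purges the queue
// once the count reaches a configurable threshold.
public class PurgingScheduler {
    private final ScheduledThreadPoolExecutor executor = new ScheduledThreadPoolExecutor(1);
    private final AtomicInteger cancelsSincePurge = new AtomicInteger();
    private final int purgeThreshold; // assumed to be a user-configurable value

    public PurgingScheduler(int purgeThreshold) {
        this.purgeThreshold = purgeThreshold;
    }

    public ScheduledFuture<?> schedule(Runnable task, long delay, TimeUnit unit) {
        return executor.schedule(task, delay, unit);
    }

    // Cancels the task; every purgeThreshold successful cancels, removes all
    // cancelled tasks from the work queue instead of waiting for their delay.
    public void cancel(ScheduledFuture<?> future) {
        if (future.cancel(false) && cancelsSincePurge.incrementAndGet() >= purgeThreshold) {
            // The counter is only approximate under contention; an extra
            // purge() from a racing thread is harmless.
            cancelsSincePurge.set(0);
            executor.purge();
        }
    }

    // Number of tasks (including cancelled ones) still held in the queue.
    public int queuedTasks() {
        return executor.getQueue().size();
    }

    public void shutdown() {
        executor.shutdownNow();
    }
}
```

With a threshold of 100, scheduling and immediately cancelling 250 long-delay tasks leaves at most 99 cancelled entries queued at any time, rather than all 250. The trade-off is that each purge() walks the whole queue, so a very low threshold shifts the cost from heap to CPU.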
Paul Ferraro <paul.ferraro@redhat.com> updated the status of jira WFLY-2521 to Coding In Progress
David Lloyd <david.lloyd@redhat.com> made a comment on jira WFLY-2521 Using STPE for this probably isn't a great fit as it just doesn't do well with lots of cancellations. A simple sorted fast-access queue of some sort would probably work much better.
Paul Ferraro <paul.ferraro@redhat.com> made a comment on jira WFLY-2521 [~dmlloyd] That's a good point. It looks like queue removal is an O(N) operation - so even setRemoveOnCancelPolicy(true) is not ideal. I'll open a separate jira to address this.
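For reference, the effect of the setRemoveOnCancelPolicy(true) option mentioned above (available since Java 7) can be seen with a small sketch. The class and method names here are hypothetical; the sketch only shows the queue size after cancellation and says nothing about the per-removal cost discussed in the comment.

```java
import java.util.concurrent.ScheduledFuture;
import java.util.concurrent.ScheduledThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

// Hypothetical demo: compares queue retention with and without
// the remove-on-cancel policy.
public class RemoveOnCancelDemo {
    private static final Runnable NOOP = new Runnable() {
        @Override
        public void run() {
            // intentionally empty
        }
    };

    // Schedules and immediately cancels `count` long-delay tasks, then
    // returns how many entries are still held in the executor's queue.
    public static int queuedAfterCancels(boolean removeOnCancel, int count) {
        ScheduledThreadPoolExecutor executor = new ScheduledThreadPoolExecutor(1);
        executor.setRemoveOnCancelPolicy(removeOnCancel);
        try {
            for (int i = 0; i < count; i++) {
                ScheduledFuture<?> future = executor.schedule(NOOP, 1, TimeUnit.HOURS);
                future.cancel(false);
            }
            return executor.getQueue().size();
        } finally {
            executor.shutdownNow();
        }
    }

    public static void main(String[] args) {
        System.out.println("default policy:   " + queuedAfterCancels(false, 1000));
        System.out.println("remove on cancel: " + queuedAfterCancels(true, 1000));
    }
}
```

With the default policy the cancelled tasks stay queued until their delay expires; with the policy enabled each task is removed from the queue at cancellation time.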
Paul Ferraro <paul.ferraro@redhat.com> made a comment on jira WFLY-2521 See: https://issues.jboss.org/browse/WFLY-2534
https://github.com/jbossas/jboss-eap/pull/706
Hey Paul, were you planning to add similar cancelled-task removal to the NonPassivatingBackingCache's executor as well? Thanks, Aaron
Good catch. I've updated the PR.
Verified in 6.3.0.DR0.