Bug 1416141

Summary: [RFE] Use JDK and EE builtins for scheduled tasks and threads
Product: [oVirt] ovirt-engine Reporter: Roy Golan <rgolan>
Component: Backend.Core Assignee: Ravi Nori <rnori>
Status: CLOSED CURRENTRELEASE QA Contact: Lukas Svaty <lsvaty>
Severity: high Docs Contact:
Priority: high    
Version: future CC: bgraveno, bugs, mgoldboi, mperina, pstehlik, rgolan, rnori, trichard
Target Milestone: ovirt-4.2.0 Keywords: CodeChange, FutureFeature, Performance
Target Release: 4.2.0 Flags: rule-engine: ovirt-4.2+
lsvaty: testing_plan_complete-
mgoldboi: planning_ack+
mperina: devel_ack+
pstehlik: testing_ack+
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version: Doc Type: Enhancement
Doc Text:
You can now schedule jobs using JBoss ManagedThreadFactory, ManagedExecutorService, and ManagedScheduledExecutorService instead of Java EE ExecutorService and Quartz.
Story Points: ---
Clone Of: Environment:
Last Closed: 2017-12-20 11:38:12 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: Infra RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:

Description Roy Golan 2017-01-24 16:47:12 UTC
Description of problem:

Java EE has ready-made implementations for scheduling tasks and threads. Today we use Quartz for scheduled tasks, with a custom thread pool exclusively for it, and also a few internal thread pools for one-off tasks.

All of this can be replaced with the EE builtins ManagedScheduledExecutorService and ManagedExecutorService. EJB timers are also an option.
These have a few advantages:
1. Configurable at runtime with existing tools - JMX or the management console on port 8706, using the CLI and web
2. Proven, supported for a long time
3. Less code - we don't need the engine Scheduler module and its glue code, the @OnTimer annotation and the annoying boilerplate
4. Cleaner API - it is also pretty similar to Quartz (the verbs are almost the same, but more concise, as it lets you schedule a Runnable, hence a lambda makes it a beauty)
5. Compatibility - the API lets you schedule with a rate or a delay, and cancel a job. ManagedExecutorService is a superset of our ThreadPoolUtils
6. Supports injection - easy to code and test
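
To illustrate points 4 and 5, here is a minimal sketch of scheduling a lambda at a fixed rate, scheduling a one-off delayed job, and cancelling. It is an illustration only, not engine code: ManagedScheduledExecutorService extends the JDK's ScheduledExecutorService, so a plain JDK scheduler stands in for the injected managed one here, and the class and method names are hypothetical.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.ScheduledFuture;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical demo class; in the engine the scheduler would be injected
// rather than created by hand (assumption).
class SchedulingSketch {

    // Runs a periodic job until it has fired at least three times, then
    // cancels it. Returns the number of runs observed after cancellation.
    static int runPeriodicThenCancel() {
        ScheduledExecutorService scheduler = Executors.newScheduledThreadPool(2);
        AtomicInteger runs = new AtomicInteger();
        CountDownLatch threeRuns = new CountDownLatch(3);
        try {
            // Fixed rate: first run immediately, then every 20 ms.
            ScheduledFuture<?> periodic = scheduler.scheduleAtFixedRate(
                    () -> {
                        runs.incrementAndGet();
                        threeRuns.countDown();
                    },
                    0, 20, TimeUnit.MILLISECONDS);

            // One-off job: runs once after a 10 ms delay.
            scheduler.schedule(
                    () -> System.out.println("one-off job ran"),
                    10, TimeUnit.MILLISECONDS);

            threeRuns.await(5, TimeUnit.SECONDS);

            // Cancel without interrupting a run that is already in progress.
            periodic.cancel(false);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        } finally {
            scheduler.shutdownNow();
        }
        return runs.get();
    }

    public static void main(String[] args) {
        System.out.println("periodic runs before cancel: " + runPeriodicThenCancel());
    }
}
```

The same three calls (scheduleAtFixedRate, schedule, cancel on the returned future) are what a Quartz job with a trigger would map to.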

TBD - Gluster services in the engine use the DB to store Quartz triggers. This means this change is probably irrelevant for Gluster services, so we won't totally remove the Quartz dependency.

Comment 1 Roy Golan 2017-01-25 08:06:48 UTC
continued description:

Scope:
1. Thread pools - Eliminate our own ThreadPoolUtil, eliminate redundant pools, and leave only those that need bounded resources (or create managed ones that answer the need - e.g. HostUpdateScheduerService)
2. Scheduled tasks - Replace Quartz where we can (Gluster needs more info).

pass criteria:
 - Engine schedules threads and tasks with no regression. Same delay between jobs, jobs fired with the same rate. 
 - No performance degradation.
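
When checking the "same delay" and "same rate" criteria, note that the two scheduling modes of ScheduledExecutorService are not interchangeable: a fixed rate measures start to start, while a fixed delay measures from the end of one run to the start of the next. The sketch below measures the start-to-start gap in both modes. It uses a plain JDK scheduler as a stand-in for the managed one, and all names and timings are made up for illustration.

```java
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

// Hypothetical helper, not part of the engine.
class RateVsDelaySketch {

    // Runs a job that takes about workMillis and returns the gap, in
    // milliseconds, between the start of the first and second run.
    static long startGapMillis(boolean fixedRate, long workMillis, long intervalMillis) {
        ScheduledExecutorService scheduler = Executors.newSingleThreadScheduledExecutor();
        List<Long> starts = new CopyOnWriteArrayList<>();
        CountDownLatch twoRuns = new CountDownLatch(2);
        Runnable job = () -> {
            starts.add(System.nanoTime());
            try {
                Thread.sleep(workMillis); // simulated work
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
            twoRuns.countDown();
        };
        try {
            if (fixedRate) {
                scheduler.scheduleAtFixedRate(job, 0, intervalMillis, TimeUnit.MILLISECONDS);
            } else {
                scheduler.scheduleWithFixedDelay(job, 0, intervalMillis, TimeUnit.MILLISECONDS);
            }
            twoRuns.await(10, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        } finally {
            scheduler.shutdownNow();
        }
        if (starts.size() < 2) {
            throw new IllegalStateException("job did not run twice in time");
        }
        return TimeUnit.NANOSECONDS.toMillis(starts.get(1) - starts.get(0));
    }

    public static void main(String[] args) {
        // With 30 ms of work and a 50 ms interval, the fixed-delay gap
        // includes the work time, while the fixed-rate gap does not.
        System.out.println("fixed rate gap (ms):  " + startGapMillis(true, 30, 50));
        System.out.println("fixed delay gap (ms): " + startGapMillis(false, 30, 50));
    }
}
```

A migrated job should therefore use the mode that matches how its old Quartz trigger behaved, otherwise its effective period changes with the job's run time.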

Comment 2 Pavel Stehlik 2017-09-05 07:37:28 UTC
CodeChange? if not, please share reprosteps, thx, P.

Comment 3 Lukas Svaty 2017-09-25 10:15:45 UTC
adding needinfo as per comment#2

Comment 4 Ravi Nori 2017-09-25 14:39:38 UTC
CodeChange

Comment 5 Lukas Svaty 2017-11-03 15:00:02 UTC
per CodeChange moving to VERIFIED

Comment 6 Sandro Bonazzola 2017-12-20 11:38:12 UTC
This bugzilla is included in oVirt 4.2.0 release, published on Dec 20th 2017.

Since the problem described in this bug report should be
resolved in oVirt 4.2.0 release, published on Dec 20th 2017, it has been closed with a resolution of CURRENT RELEASE.

If the solution does not work for you, please open a new bug report.

Comment 7 Tahlia Richardson 2018-01-25 05:51:53 UTC
Cutting down the doc text to appropriate release note size, so I'll include the rest of the text here: 


Use JBoss ManagedThreadFactory, ManagedExecutorService and ManagedScheduledExecutorService instead of Java EE ExecutorService and Quartz to schedule jobs.

There are two ManagedExecutorServices in engine: commandCoordinator and hostUpdatesChecker.

The size of the commandCoordinator executor service is defined by config variable COMMAND_COORDINATOR_THREAD_POOL_SIZE. The default size is 10. To change the size, create a new conf file 99-coco-thread-pool.conf in /etc/ovirt-engine/engine.conf.d/

The size of the hostUpdatesChecker executor service is defined by config variable HOST_CHECK_FOR_UPDATES_THREAD_POOL_SIZE. The default size is 5. To change the size, create a new conf file 99-host-thread-pool.conf in /etc/ovirt-engine/engine.conf.d/
 
Management clients, such as the WildFly CLI, can also be used to configure the ManagedExecutorService instances commandCoordinator and hostUpdatesChecker without having to restart engine.

The 'engine' executor service size is defined by config variables ENGINE_THREAD_POOL_MIN_SIZE and ENGINE_THREAD_POOL_MAX_SIZE. The default value for ENGINE_THREAD_POOL_MIN_SIZE is 50 and the default value for ENGINE_THREAD_POOL_MAX_SIZE is 500. To change the values permanently, create a conf file 99-engine-thread-pool.conf in /etc/ovirt-engine/engine.conf.d/.

jconsole can be used to change the size of the 'engine' executor service without having to restart engine.

The size of the 'engineScheduled' ManagedScheduledExecutorService is defined by config variable ENGINE_SCHEDULED_THREAD_POOL_SIZE. The default value is 100. To change the value permanently, create a conf file 99-engine-scheduled-thread-pool.conf in /etc/ovirt-engine/engine.conf.d/

Management clients, such as the WildFly CLI, can also be used to configure the ManagedScheduledExecutorService instance 'engineScheduled' without having to restart engine.

Engine creates a log entry of the thread pool usage at the interval defined by the config variable THREAD_POOL_MONITORING_INTERVAL_IN_SECONDS. The default value is 600 seconds, so every 10 minutes a log entry is created for each pool showing its usage, for example:

Thread pool 'engine' is using 7 threads out of 500, 1 threads waiting for tasks and 0 tasks in queue.
Thread pool 'engineScheduled' is using 0 threads out of 100 and 100 tasks are waiting in the queue.
Thread pool 'engineThreadMonitoring' is using 1 threads out of 1 and 0 tasks are waiting in the queue.
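
For readers who want to see where such numbers can come from, here is a rough sketch that builds a similar line from a plain JDK ThreadPoolExecutor. The engine's actual monitoring code is not shown in this bug, so the class name, the method name and the choice of counters are assumptions; only the message wording is copied from the sample lines above.

```java
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

// Hypothetical helper, not the engine's monitoring implementation.
class PoolUsageSketch {

    // Formats a usage line from the pool's own counters.
    // getActiveCount() is documented as approximate, so treat the
    // numbers as a snapshot, not an exact measurement.
    static String describe(String name, ThreadPoolExecutor pool) {
        return String.format(
                "Thread pool '%s' is using %d threads out of %d and %d tasks are waiting in the queue.",
                name,
                pool.getActiveCount(),
                pool.getMaximumPoolSize(),
                pool.getQueue().size());
    }

    public static void main(String[] args) {
        // A small pool with made-up sizes, only for the demo.
        ThreadPoolExecutor pool = new ThreadPoolExecutor(
                1, 1, 60, TimeUnit.SECONDS, new LinkedBlockingQueue<>());
        try {
            System.out.println(describe("demo", pool));
        } finally {
            pool.shutdown();
        }
    }
}
```

A monitoring job would call something like describe for each pool from a scheduled task that fires every THREAD_POOL_MONITORING_INTERVAL_IN_SECONDS.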