Note: This bug is displayed in read-only format because the product is no longer active in Red Hat Bugzilla.

Bug 975631

Summary: [Docs] [Tracker] Document the plug-in scheduler implementation that interfaces to external scheduler via scheduling API and SDK
Product: Red Hat Enterprise Virtualization Manager
Reporter: Andrew Burden <aburden>
Component: Documentation
Assignee: Zac Dover <zdover>
Status: CLOSED NOTABUG
QA Contact: ecs-bugs
Severity: unspecified
Docs Contact:
Priority: unspecified
Version: 3.3.0
CC: aberezin, acathrow, chetan, yeylon
Target Milestone: ---
Keywords: FutureFeature
Target Release: 3.3.0
Hardware: Unspecified
OS: Unspecified
URL: http://www.ovirt.org/Features/oVirtSchedulerAPI
Whiteboard: sla
Fixed In Version:
Doc Type: Enhancement
Doc Text:
Story Points: ---
Clone Of: external-scheduler
Environment:
Last Closed: 2014-04-07 01:59:52 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: SLA
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:
Bug Depends On: 912059, 912076
Bug Blocks:

Description Andrew Burden 2013-06-19 01:20:31 UTC
+++ This bug was initially created as a clone of Bug #912076 +++

This bug is a placeholder for this functionality; a detailed requirements list will be provided later. It is opened now to triage and block all the RFEs that depend on this infrastructure.

Basic concepts are: 
- The end user scheduler is isolated from the engine and the DB, unlike the actual plug-in.
- Must provide a way to collect enough information to form a decision.
- The API should have the option to create bindings to a scripting language, TBD (SDK).

The plug-in is based on the infrastructure in bug 912059, which provides a framework for scheduler plug-ins running within the engine scope. So the plug-in runs within the engine scope while providing an interface to an external scheduler.

Comment 2 Arthur Berezin 2013-08-25 11:37:12 UTC
Added oVirt feature page: http://www.ovirt.org/Features/oVirtSchedulerAPI.

Comment 3 Zac Dover 2013-08-26 05:03:57 UTC
*** Bug 975630 has been marked as a duplicate of this bug. ***

Comment 4 Cheryn Tan 2013-10-01 03:27:35 UTC
Might be useful information, copied from the doc text in the original dev bug:

Starting with RHEV 3.3, using the virtio balloon for memory optimization is allowed.

Every virtual machine in clusters of level 3.2 and higher includes a balloon
device, unless it is specifically removed. This device requires guest drivers and the guest agent to control the balloon size.

Ballooning optimization is a cluster-level policy attribute, which is disabled by default. So for a balloon to run, the virtual machine needs a balloon device with the relevant drivers, and the cluster it belongs to must enable ballooning optimization. Each host
in the cluster receives a balloon policy update when its status changes to 'up'. There is a manual option for emergencies, which allows forcing an
update for a specific host.

Once this is set, MoM will start ballooning where and when possible to
allow memory over-commitment, limited by the guaranteed memory size
that every VM has.

It is important to understand that in some scenarios ballooning may collide with KSM. In such cases MoM will try to adjust the balloon size
to minimize collisions. Additionally, in some scenarios ballooning may
cause sub-optimal performance for a VM. Administrators are advised to
use ballooning optimization with caution.

Comment 5 Cheryn Tan 2013-10-01 03:30:22 UTC
Saved before reading. Please ignore comment 4. This is the actual text: 

Red Hat Enterprise Virtualization Manager now includes a new scheduler to handle VM placement, allowing
users to create new scheduling policies, and also write their own logic in Python and include it in a policy.

The new oVirt scheduler serves VM scheduling requests when a VM is run or migrated.
The scheduling process applies hard constraints and soft constraints to find the optimal host for that request at that point in time.

Scheduling policy elements

* Filter: a basic logic unit which filters out hypervisors that do not satisfy the hard constraints for placing a given VM.

* Weight function: a function that calculates a score for a given host based on its internal logic. This is a way to implement soft constraints in the scheduling process. Since these are weights, a lower score is considered better.

* Load balancing module: code implementing logic to distribute load. So far the definition of load has been mostly CPU-related,
so migrating a VM would help resolve it. The new scheduler allows users to write their own logic to handle
other load types (network, I/O, etc.) by other means, such as integrating with third-party systems.

Scheduling process description

Every cluster has a scheduling policy. So far there were three main policies (None, Even Distribution and Power Saving),
and now administrators can create their own policies or use the built-in policies. Each policy contains a list of
filters, one or more weight functions, and a single load balancing module.
The scheduling process takes all relevant hosts and runs them through the relevant filters of a specific policy.
Note that filter order is meaningless.
The filtered host list is then used as input to the relevant weight functions of that policy, which create a cost table. The cost table indicates the host with the lowest weight (cost), which is the optimal solution for the
given request. Multiple weight functions may be prioritized using factors.
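The filter-then-weigh flow described above can be sketched in Python. This is a minimal illustration only: the host attributes and the sample filter and weight function are hypothetical, not the actual engine code.

```python
# Hypothetical sketch of the scheduling flow: hard constraints (filters)
# first, then factor-weighted soft constraints (a cost table).
# All names and host attributes here are illustrative, not engine APIs.

def run_filters(hosts, filters, vm):
    """Apply hard constraints: drop hosts failing any filter.
    Filter order does not affect the result."""
    for f in filters:
        hosts = [h for h in hosts if f(h, vm)]
    return hosts

def pick_host(hosts, weight_functions, vm):
    """Apply soft constraints: sum factor-weighted scores per host
    and pick the host with the lowest total cost."""
    def cost(host):
        return sum(factor * wf(host, vm) for factor, wf in weight_functions)
    return min(hosts, key=cost)

# Example: filter out hosts without enough free memory, then prefer
# the host with the lowest CPU load.
hosts = [{"name": "a", "free_mem": 8, "cpu": 70},
         {"name": "b", "free_mem": 16, "cpu": 30},
         {"name": "c", "free_mem": 2, "cpu": 10}]
vm = {"mem": 4}

enough_memory = lambda h, vm: h["free_mem"] >= vm["mem"]
cpu_load = lambda h, vm: h["cpu"]

candidates = run_filters(hosts, [enough_memory], vm)
best = pick_host(candidates, [(1, cpu_load)], vm)
print(best["name"])  # "b": lowest cost among hosts that pass the filter
```

Note that host "c" has the lowest CPU load but is removed by the hard constraint, which is exactly why filters run before weight functions.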

Adding user code

For more information on adding code, see bug 912059.

Important notes

- New scheduling policies created by administrators are not validated by the system. This may end up with unexpected
  results, so it is highly important to verify that a new policy does not introduce issues or instability to the system.
- User provided code is unsupported.
- Using user provided code may have a performance impact, so administrators are advised to carefully test their code
  and the general performance changes.

Comment 6 Cheryn Tan 2013-10-03 04:56:05 UTC
(more info from dev bug thanks to Doron)

Red Hat Enterprise Virtualization Manager now includes a new scheduler to handle VM placement, allowing users to create new scheduling policies, and also write their own logic in Python and include it in a policy.

For conceptual explanations of the new scheduler see bug 912076.

The infrastructure allowing users to extend the new scheduler is based on a service called ovirt-scheduler-proxy. The service's purpose is to let RHEV administrators extend the scheduling process with custom Python filters, weight functions and load balancing modules.

The daemon waits for engine requests over XML-RPC. An engine request may be one of:
- runDiscover: returns an XML containing all available policy units and configurations (configuration is optional).

- runFilters: executes a set of filter plugins sequentially (provided as a name list).

- runScores: executes a set of weight function plugins sequentially (provided as a name list), then calculates a cost table (using factors) and returns it to the engine.

- runBalance: executes the balance plugin named {balance name} on the hosts using the given properties_map.
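From the client side, the XML-RPC interface above could be exercised roughly as follows. This is a sketch only: the proxy's address and port are assumptions, and the call itself is shown commented out because it requires a running ovirt-scheduler-proxy.

```python
# Sketch of an XML-RPC client for the scheduler proxy.
# The endpoint URL (host and port) is an assumption, not documented here.
import xmlrpc.client

proxy = xmlrpc.client.ServerProxy("http://localhost:18781")
# discovered = proxy.runDiscover()  # would return the available policy units
```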

Any plugin file {NAME}.py the user writes must implement at least one of the functions do_filter, do_scores, or do_balance. These files reside in the $PYTHONPATH/ovirt_scheduler/plugins folder, unless changed in the proxy's configuration file /etc/ovirt/scheduler/scheduler.conf.
For more information on user code, check the provided samples.
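As a hedged sketch (not one of the shipped samples), a plugin file implementing two of the three entry points might look like the following. The (hosts, vm, args) signatures and return shapes are assumptions here; consult the provided samples for the exact contract expected by ovirt-scheduler-proxy.

```python
# Hypothetical {NAME}.py plugin sketch for ovirt-scheduler-proxy.
# The signatures and return shapes below are assumptions, not the
# documented proxy contract; see the shipped samples for the real one.

def do_filter(hosts, vm, args):
    """Hard constraint: keep only hosts with enough free memory for the VM."""
    return [h for h in hosts if h.get("free_mem", 0) >= vm.get("mem", 0)]

def do_scores(hosts, vm, args):
    """Soft constraint: return (host name, score) pairs; lower is better."""
    return [(h["name"], h.get("cpu", 0)) for h in hosts]
```

A file like this only needs to implement the entry points it cares about; a pure load-balancing plugin, for example, would define do_balance alone.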

During initialization, the daemon scans this folder to detect user files and analyzes them for the relevant functionality. The results are kept in the daemon's cache and provided to the engine when runDiscover is called. Note that the engine calls it only at startup, so in order to introduce new code, the administrator needs to restart the proxy service and then the RHEV engine.

The scheduling proxy is packaged as a separate, optional RPM which is not installed by default. After installing it, the administrator needs to enable it in the RHEV database by setting ExternalSchedulerEnabled to True using the configuration utility.
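Assuming engine-config is the configuration utility referred to above (an assumption; the key name is taken from the text), the enablement and restart steps might look like this:

```shell
# Enable the external scheduler in the RHEV database (utility name assumed).
engine-config -s ExternalSchedulerEnabled=True
# Restart the proxy and then the engine so new plugin code is discovered.
service ovirt-scheduler-proxy restart
service ovirt-engine restart
```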

Important notes:
- User provided code is unsupported.
- Using user provided code may have a performance impact, so administrators are advised to carefully test their code and the general performance changes before using it in live setups.