Bug 1104219 - [RFE][nova] Optionally allow nova-scheduler to be deployed as Active/Active or Active/Passive
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat OpenStack
Classification: Red Hat
Component: openstack-foreman-installer
Version: 5.0 (RHEL 7)
Hardware: Unspecified
OS: Unspecified
Priority: high
Severity: unspecified
Target Milestone: ga
Target Release: 5.0 (RHEL 7)
Assignee: Jason Guiditta
QA Contact: Leonid Natapov
URL:
Whiteboard:
Depends On:
Blocks: 1083890 1104884 1111701 1126447
Reported: 2014-06-03 14:28 UTC by Russell Bryant
Modified: 2019-09-09 14:27 UTC
CC List: 12 users

Fixed In Version: openstack-foreman-installer-2.0.7-1.el6ost
Doc Type: Enhancement
Doc Text:
With this release, the Compute scheduler runs in Active/Active mode for high availability (HA) by default, because Active/Active mode allows the Compute scheduler to scale better than Active/Passive mode. To run the Compute scheduler in Active/Passive mode instead, set scheduler_host_subset_size=1.
Clone Of:
Clones: 1104884 1111701 1126447
Environment:
Last Closed: 2014-08-04 18:32:48 UTC
Target Upstream Version:
Embargoed:


Links
System: Red Hat Product Errata
ID: RHEA-2014:1003
Private: 0
Priority: normal
Status: SHIPPED_LIVE
Summary: Red Hat Enterprise Linux OpenStack Platform Enhancement Advisory
Last Updated: 2014-08-04 22:31:07 UTC

Description Russell Bryant 2014-06-03 14:28:48 UTC
The nova-scheduler service requires special consideration when deploying it for HA.

The default should be an Active/Passive setup using Pacemaker.

For larger deployments, we should also provide the option of deploying nova-scheduler in an Active/Active mode.  Some notes about deploying it Active/Active:

* In this mode, the number of nova-scheduler instances should be configurable.  It should default to 2.  It could be up to the number of controller nodes.

* When using Active/Active, the following configuration option needs to be set in nova.conf: scheduler_host_subset_size.  This option defaults to 1.  I would recommend setting it to a default of "10" when using Active/Active.  However, it's something that should be allowed to be tweaked if needed.

Some more information for justification and documentation:

There are trade-offs between running nova-scheduler in Active/Passive or Active/Active HA modes.  The scheduler attempts to determine the *best* host to run a new instance on.  When running more than one scheduler, these requests can be processed in parallel.  Schedulers working in parallel act on the same state data and so tend to make the same decision, sending several new instances to the same host.  This results in a lot of conflicts and requires retries.  To mitigate this issue, the "scheduler_host_subset_size" configuration option can be used.  This tells the scheduler to randomly choose a host from the N best hosts instead of the default of the absolute best host.  The trade-off here of course is that the scheduler no longer picks the best host, but instead one that is good enough based on the requirements and scheduler configuration.
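
To illustrate the mitigation described above, here is a minimal Python sketch of choosing randomly among the N best hosts. It is not nova's actual code: the function names (pick_host, count_conflicts), the (name, weight) host representation, and the conflict count are hypothetical and exist only for this example.

```python
import random


def pick_host(weighed_hosts, subset_size=1, rng=random):
    """Pick one host at random from the subset_size best-weighted hosts.

    weighed_hosts is a list of (host_name, weight) pairs, where a higher
    weight means a better host. This representation is an assumption made
    for illustration only.
    """
    subset_size = max(1, subset_size)
    ranked = sorted(weighed_hosts, key=lambda hw: hw[1], reverse=True)
    candidates = ranked[:subset_size]
    return rng.choice(candidates)[0]


def count_conflicts(num_schedulers, subset_size, weighed_hosts, rng=random):
    """Count picks that duplicate another scheduler's pick.

    Every scheduler sees the same host state, as in the parallel case
    described above.
    """
    picks = [pick_host(weighed_hosts, subset_size, rng)
             for _ in range(num_schedulers)]
    return len(picks) - len(set(picks))


if __name__ == "__main__":
    hosts = [("host%d" % i, 100 - i) for i in range(40)]
    for size in (1, 30):
        print(size, count_conflicts(3, size, hosts))
```

With a subset size of 1, all three simulated schedulers choose the same host, so two of the three picks conflict. With a larger subset the picks are spread over more hosts, at the cost of sometimes choosing a host that is only good enough.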

Comment 3 Russell Bryant 2014-06-04 21:21:14 UTC
(In reply to Russell Bryant from comment #0)
> The nova-scheduler service requires special consideration when deploying it
> for HA.
> 
> The default should be an Active/Passive setup using Pacemaker.
> 
> For larger deployments, we should also provide the option of deploying
> nova-scheduler in an Active/Active mode.  Some notes about deploying it
> Active/Active:
> 
> * In this mode, the number of nova-scheduler instances should be
> configurable.  It should default to 2.  It could be up to the number of
> controller nodes.
> 
> * When using Active/Active, the following configuration option needs to be
> set in nova.conf: scheduler_host_subset_size.  This option defaults to 1.  I
> would recommend setting it to a default of "10" when using Active/Active. 
> However, it's something that should be allowed to be tweaked if needed.

After further discussion, consider the following changes to this:

1) Make the default A/A instead of A/P.

2) In the A/A configuration, use a default scheduler_host_subset_size of 30 instead of the 10 suggested before.

3) To keep things simple, just run nova-scheduler on every controller node in the A/A case.

4) Still work toward having A/P as an optional deployment choice.

Comment 4 Jason Guiditta 2014-06-16 13:01:09 UTC
I am currently implementing this in quickstack (openstack-foreman-installer component), and I just want to verify that the only configuration difference between A/A and A/P is really just:
1. A/A has scheduler running on all nodes instead of 1
2. A/A has scheduler_host_subset_size=30 (default, can be overridden).

Comment 5 Russell Bryant 2014-06-16 13:08:58 UTC
(In reply to Jason Guiditta from comment #4)
> I am currently implementing this in quickstack (openstack-foreman-installer
> component), and I just want to verify that the only configuration difference
> between A/A and A/P is really just:
> 1. A/A has scheduler running on all nodes instead of 1
> 2. A/A has scheduler_host_subset_size=30 (default, can be overridden).

Yes, that's right.

If you're also implementing A/P, then you need pacemaker config for that, but I think that's it.
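
The two differences confirmed above can be sketched in Python as follows. This is only an illustration, not the quickstack implementation: scheduler_plan and render_nova_conf are hypothetical helpers, the mode strings are made up for the example, and placing the option in the [DEFAULT] section of nova.conf is an assumption. The Pacemaker configuration needed for A/P is not shown.

```python
import configparser
import io


def scheduler_plan(mode, controller_nodes, subset_size=None):
    """Return how many schedulers run and which subset size to use.

    A/A: one scheduler per controller node, subset size 30 unless overridden.
    A/P: one running scheduler (Pacemaker would decide where), subset size 1
    unless overridden.
    """
    if mode == "active-active":
        running = len(controller_nodes)
        default_size = 30
    elif mode == "active-passive":
        running = 1
        default_size = 1
    else:
        raise ValueError("unknown mode: %r" % (mode,))
    size = default_size if subset_size is None else subset_size
    return {"running_schedulers": running,
            "scheduler_host_subset_size": size}


def render_nova_conf(plan):
    """Render the nova.conf fragment for a plan (section is an assumption)."""
    parser = configparser.ConfigParser()
    parser["DEFAULT"]["scheduler_host_subset_size"] = str(
        plan["scheduler_host_subset_size"])
    buf = io.StringIO()
    parser.write(buf)
    return buf.getvalue()


if __name__ == "__main__":
    plan = scheduler_plan("active-active", ["ctrl1", "ctrl2", "ctrl3"])
    print(plan)
    print(render_nova_conf(plan))
```

Keeping the subset size as an overridable argument mirrors the request that the value stay tunable rather than fixed.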

Comment 6 Jason Guiditta 2014-06-17 00:31:22 UTC
Patch merged:

https://github.com/redhat-openstack/astapor/pull/271

Comment 10 Jason Guiditta 2014-07-18 14:08:58 UTC
Russell, can you look at my doc text and correct it if you think the description is incorrect? Thanks

Comment 11 Russell Bryant 2014-07-18 14:23:14 UTC
(In reply to Jason Guiditta from comment #10)
> Russell, can you look at my doc text and correct if you think the description
> is incorrect? Thanks

Updated.  Let me know if you'd like to discuss further.

Comment 14 errata-xmlrpc 2014-08-04 18:32:48 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

http://rhn.redhat.com/errata/RHEA-2014-1003.html

