Bug 1328124 - [RFE][Deployment] Multi-cell Cells V2
Summary: [RFE][Deployment] Multi-cell Cells V2
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat OpenStack
Classification: Red Hat
Component: openstack-tripleo-heat-templates
Version: 11.0 (Ocata)
Hardware: Unspecified
OS: Unspecified
Priority: high
Severity: high
Target Milestone: beta
Target Release: 16.0 (Train on RHEL 8.1)
Assignee: Martin Schuppert
QA Contact: Paras Babbar
URL:
Whiteboard:
Duplicates: 1094868 1694074
Depends On: 1094868 1623683 1710426
Blocks: 1500237 1665046 1694074 1768944 1804899
Reported: 2016-04-18 14:10 UTC by Stephen Gordon
Modified: 2020-03-13 14:02 UTC

Fixed In Version: tripleo-ansible-0.4.1-0.20191022060226.3441a46.el8ost openstack-tripleo-heat-templates-11.3.1-0.20191022072831.698e7db.el8ost python3-tripleoclient-12.3.1-0.20191022063506.197daf1.el8ost puppet-tripleo-11.3.1-0.20191022051220.c97dbd1.el8ost
Doc Type: Enhancement
Doc Text:
Red Hat OpenStack Platform 16.0 director now supports multi-compute cell deployments. With this enhancement, your cloud is better positioned for scaling out, because each cell has its own database and message queue on a cell controller, which reduces the load on the central control plane. For more information, see "Scaling deployments with Compute cells" in the "Instances and Images" guide.
Clone Of:
Clones: 1665046 1804899
Environment:
Last Closed: 2020-02-06 14:37:21 UTC
Target Upstream Version:
Embargoed:



Links
System | ID | Private | Priority | Status | Summary | Last Updated
Github | olliewalsh/split/tree/fix_computes | 0 | None | None | None | 2020-08-16 06:43:01 UTC
Launchpad | 1840039 | 0 | None | None | None | 2019-08-14 10:17:43 UTC
Launchpad | 1840054 | 0 | None | None | None | 2019-08-14 10:17:43 UTC
Launchpad | 1840059 | 0 | None | None | None | 2019-08-14 10:17:43 UTC
OpenStack gerrit | 523459 | 0 | None | MERGED | Add split-controlplane spec | 2021-02-16 15:49:07 UTC
OpenStack gerrit | 600042 | 0 | None | MERGED | Export global_config for compute-only stack | 2021-02-16 15:49:08 UTC
OpenStack gerrit | 600588 | 0 | None | MERGED | cell_v2 multi-cell | 2021-02-16 15:49:07 UTC
OpenStack gerrit | 600589 | 0 | None | MERGED | cell_v2 multi-cell | 2021-02-16 15:49:08 UTC
OpenStack gerrit | 636286 | 0 | None | MERGED | [Documentation] Document procedure to manage additional nova cells v2 | 2021-02-16 15:49:07 UTC
OpenStack gerrit | 638991 | 0 | None | MERGED | Add OvnDbInternal to EndpointMap and use it for ovn_db_host | 2021-02-16 15:49:07 UTC
OpenStack gerrit | 659238 | 0 | None | ABANDONED | Add script to help export cellv2 multicell information | 2021-02-16 15:49:09 UTC
OpenStack gerrit | 668414 | 0 | None | ABANDONED | Add script to help export cellv2 multicell information | 2021-02-16 15:49:08 UTC
OpenStack gerrit | 670486 | 0 | None | MERGED | Create <service> _cell_node_names if nova_additional_cell | 2021-02-16 15:49:08 UTC
OpenStack gerrit | 670487 | 0 | None | MERGED | Set nova_additional_cell as global_vars | 2021-02-16 15:49:09 UTC
OpenStack gerrit | 672415 | 0 | None | MERGED | Move redis_vip to all_nodes.j2 | 2021-02-16 15:49:08 UTC
OpenStack gerrit | 672744 | 0 | None | MERGED | Enhancement to cell v2 doc with split for stein/train | 2021-02-16 15:49:09 UTC
OpenStack gerrit | 676192 | 0 | None | MERGED | Also assign default subnets to network segment | 2021-02-16 15:49:09 UTC
OpenStack gerrit | 676218 | 0 | None | MERGED | Fix vlan id assignment with additional subnets | 2021-02-16 15:49:10 UTC
OpenStack gerrit | 676226 | 0 | None | MERGED | Fix external resource usage in additional subnets | 2021-02-16 15:49:10 UTC
OpenStack gerrit | 678022 | 0 | None | MERGED | Create tripleo-cellv2 role | 2021-02-16 15:49:10 UTC
OpenStack gerrit | 688915 | 0 | None | MERGED | cellv2 post step ansible role description | 2021-02-16 15:49:10 UTC
Red Hat Product Errata | RHEA-2020:0283 | 0 | None | None | None | 2020-02-06 14:39:51 UTC

Description Stephen Gordon 2016-04-18 14:10:02 UTC
Description of problem:

In the Newton cycle it is expected that work will continue on Cells V2 with a view to making multi-cell deployments possible using the new framework.

This RFE is to track the need to validate this work with a view to offering it as a Technology Preview. Director enablement will be required in a later release to move to full support.

Version-Release number of selected component (if applicable):

Newton/RHOSP-10

Comment 1 Stephen Gordon 2016-04-20 19:31:27 UTC
Moving to rhos-11? based on input from engineering - we will re-visit the status of this work following completion of the Newton cycle.

Comment 2 Stephen Gordon 2016-09-29 16:03:16 UTC
Moving to 12/Pike. While we will continue to collaborate on this feature in the context of upstream development throughout Ocata I do not believe we will be in a position to offer a coherent TripleO supported technology preview in 12.

Comment 4 Stephen Gordon 2017-01-05 19:52:56 UTC
(In reply to Stephen Gordon from comment #2)
> Moving to 12/Pike. While we will continue to collaborate on this feature in
> the context of upstream development throughout Ocata I do not believe we
> will be in a position to offer a coherent TripleO supported technology
> preview in 12.

Dan, any initial predictions for Pike? Given the current state, I would be tempted to push the technology preview of this to Queens, to allow enough time for the Nova work to land and to get some basic TripleO support for multi-cell in place. Thoughts?

Comment 5 Dan Smith 2017-01-05 20:37:21 UTC
We're dangerously close to somewhat academic multi-cell capabilities with what is proposed and expected to merge in ocata. If that happens, I would love to be able to get early feedback on it in pike as a preview.

I'm not sure what the tripleo support will look like, but almost all of the work will be needed in order to support the required bits of ocata anyway.

Perhaps we should huddle with some tripleo people, describe the remaining changes and try to formulate a plan?

Comment 6 Stephen Gordon 2017-01-24 18:36:55 UTC
(In reply to Dan Smith from comment #5)
> We're dangerously close to somewhat academic multi-cell capabilities with
> what is proposed and expected to merge in ocata. If that happens, I would
> love to be able to get early feedback on it in pike as a preview.
> 
> I'm not sure what the tripleo support will look like, but almost all of the
> work will be needed in order to support the required bits of ocata anyway.

My concern is that for the multi-cell case that is not *really* true, in that while we will have most of the relevant pieces it is going to involve quite a lot of additional work to define the architecture we want to deploy and develop the orchestration to do it when we start talking about adding Cells.

For example currently we still generally deploy a single message broker and database, albeit in HA across three nodes, for the environment and while customizable roles give us increased flexibility in the general case these specific services are examples of ones that are still exceptions to that today. So to me it seems like on face value there would be quite a lot of work here not so much in the plumbing of loading up the new Cell's DB schema, mapping hosts to it etc., but in providing the infrastructure it will rely on in the first place.
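For illustration, the "plumbing" part mentioned above (registering a new cell and mapping hosts to it) roughly comes down to a couple of `nova-manage cell_v2` calls. The following is a minimal sketch that only builds those command lines; it is not the TripleO implementation, and the helper names, cell name, and connection URLs are hypothetical placeholders.

```python
import shlex
from typing import List, Optional


def create_cell_cmd(name: str, database_connection: str,
                    transport_url: str) -> List[str]:
    """Build the nova-manage call that registers a cell with its own
    database and message queue (both values are placeholders here)."""
    return [
        "nova-manage", "cell_v2", "create_cell",
        "--name", name,
        "--database_connection", database_connection,
        "--transport-url", transport_url,
    ]


def discover_hosts_cmd(cell_uuid: Optional[str] = None) -> List[str]:
    """Build the nova-manage call that maps unmapped compute hosts to a cell.
    Without a UUID, nova-manage searches all cells."""
    cmd = ["nova-manage", "cell_v2", "discover_hosts", "--verbose"]
    if cell_uuid:
        cmd += ["--cell_uuid", cell_uuid]
    return cmd


if __name__ == "__main__":
    # Hypothetical values; a real deployment would take these from the
    # per-cell database and message broker that must already exist.
    commands = [
        create_cell_cmd(
            "cell1",
            "mysql+pymysql://nova:secret@cell1-db/nova_cell1",
            "rabbit://nova:secret@cell1-mq:5672/",
        ),
        discover_hosts_cmd(),
    ]
    for cmd in commands:
        # Print the commands instead of running them.
        print(" ".join(shlex.quote(arg) for arg in cmd))
```

The sketch assumes the per-cell database and message broker are already reachable, which is exactly the infrastructure work the comment identifies as the larger effort.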

> Perhaps we should huddle with some tripleo people, describe the remaining
> changes and try to formulate a plan?

For 12 the focus for Ollie and Sven is likely going to be completing the RT KVM enablement work, and the OOTB live migration epic as a secondary task. I'm going to target this RFE for 13 *but* I think to be in a position to execute that we do indeed need to start thinking about what a multi-cell architecture would look like in the context of TripleO, what work items we need to raise to track that, etc. during the 12 release cycle.

I also think in the 12 timeframe it would be beneficial to work with performance and scale to work out how far we can scale such deployments (albeit doing the plumbing somewhat manually in the absence of TripleO integration) as that will be where we can see if we are delivering operator facing value.

Basically TL;DR we probably need an epic document to track this through multiple releases.

Comment 7 Stephen Gordon 2017-01-24 18:39:53 UTC
(In reply to Stephen Gordon from comment #6)
> (In reply to Dan Smith from comment #5)
> > We're dangerously close to somewhat academic multi-cell capabilities with
> > what is proposed and expected to merge in ocata. If that happens, I would
> > love to be able to get early feedback on it in pike as a preview.
> > 
> > I'm not sure what the tripleo support will look like, but almost all of the
> > work will be needed in order to support the required bits of ocata anyway.
> 
> My concern is that for the multi-cell case that is not *really* true, in
> that while we will have most of the relevant pieces it is going to involve
> quite a lot of additional work to define the architecture we want to deploy
> and develop the orchestration to do it when we start talking about adding
> Cells.
> 
> For example currently we still generally deploy a single message broker and
> database, albeit in HA across three nodes, for the environment and while
> customizable roles give us increased flexibility in the general case these
> specific services are examples of ones that are still exceptions to that
> today. So to me it seems like on face value there would be quite a lot of
> work here not so much in the plumbing of loading up the new Cell's DB
> schema, mapping hosts to it etc., but in providing the infrastructure it
> will rely on in the first place.

Just to be explicit here, I'm obviously talking about adding *additional* cells beyond the "cell of one" which we obviously have to have.

Comment 8 Stephen Gordon 2017-01-24 21:33:19 UTC
*** Bug 1094868 has been marked as a duplicate of this bug. ***

Comment 9 Stephen Gordon 2017-04-21 12:47:37 UTC
Adding the Triaged keyword to highlight that, while this has been open for > 1 year, it is something we triaged and intend to pursue.

Comment 10 Stephen Gordon 2017-07-25 18:00:55 UTC
Moving to 14 as a more realistic target from a director integration POV - which is where a lot of the work would be expected to be once the Nova work is complete.

Comment 29 Erwan Gallen 2019-05-15 10:41:25 UTC
*** Bug 1694074 has been marked as a duplicate of this bug. ***

Comment 31 Martin Schuppert 2019-07-10 08:09:14 UTC
Note: multi-cell deployment is currently broken in master due to https://review.opendev.org/665213.

* AllNodesConfig was removed
* The information that is now generated also lacks the cell service node names that we created before [1]

[1] https://review.opendev.org/#/c/665213/20/network/ports/net_ip_list_map.j2.yaml

Comment 43 errata-xmlrpc 2020-02-06 14:37:21 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHEA-2020:0283
