Bug 1469435 - [RFE][Swift] - Implement Swift ring rebalance as an Ansible-based Mistral workflow
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat OpenStack
Classification: Red Hat
Component: openstack-tripleo-common
Version: 13.0 (Queens)
Hardware: Unspecified
OS: Unspecified
unspecified
unspecified
Target Milestone: Upstream M1
Target Release: 13.0 (Queens)
Assignee: Christian Schwede (cschwede)
QA Contact: Mike Abrams
Kim Nylander
URL:
Whiteboard:
Depends On:
Blocks: 1469441 1469452
Reported: 2017-07-11 09:26 UTC by Christian Schwede (cschwede)
Modified: 2018-06-27 13:31 UTC (History)

Fixed In Version: openstack-tripleo-common-8.6.1-3.el7ost
Doc Type: Enhancement
Doc Text:
Clone Of:
Environment:
Last Closed: 2018-06-27 13:31:39 UTC
Target Upstream Version:


Links
System ID Priority Status Summary Last Updated
Launchpad 1759311 None None None 2018-03-27 15:54:33 UTC
OpenStack gerrit 510884 None master: MERGED tripleo-common: Add Ansible playbook/workflow to rebalance Swift rings (I7bb1d7d4f45bee36df1c435a11bda5d4cdaae896) 2018-04-24 21:55:11 UTC
OpenStack gerrit 556928 None master: MERGED tripleo-common: Fix missing permissions on Swift rebalance playbook (Ia766bc44a647fec15ff662f1ef9ffb67860b155b) 2018-04-24 21:55:06 UTC
OpenStack gerrit 560337 None stable/queens: NEW tripleo-common: Fix parameter indentation on Swift rebalance playbook (I31db903787feded6acecd67dc98ef10b6bf26ea8) 2018-04-24 21:56:10 UTC
Red Hat Product Errata RHEA-2018:2086 normal SHIPPED_LIVE Red Hat OpenStack Platform 13.0 Enhancement Advisory 2018-06-28 19:51:39 UTC

Description Christian Schwede (cschwede) 2017-07-11 09:26:51 UTC
As an operator, I want to trigger a Swift ring rebalance without executing a full overcloud update. 

Today, a rebalance is only executed when the devices change (for example, a changed IP address or an added device), and it runs as part of an overcloud update. This is a huge overhead if one only wants to rebalance the rings. To make things worse, a rebalance might need to be executed multiple times, so there should at least be an option to automate this.

Some recent changes in TripleO/Mistral make it possible to execute Ansible playbooks within a workflow. Implementing the rebalance as a workflow has multiple benefits:

1. It can be executed by an operator as a standalone operation on the undercloud
2. It can be executed automatically by Mistral multiple times (e.g. when a bigger rebalance is required)

The workflow itself could be nearly the same as today:

1. Download most recent rings from the undercloud
2. Rebalance
3. Upload updated rings to the overcloud
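The three steps above could be sketched roughly as follows. This is only an illustration, not the playbook that was eventually merged: the function names, the `download`/`run`/`upload` callables, and the builder directory layout are hypothetical, and the exit code handling assumes that swift-ring-builder returns 0 on success, 1 on a warning, and 2 on an error.

```python
from typing import Callable, List, Sequence

# The three standard Swift rings; each has its own builder file.
RINGS = ("account", "container", "object")


def rebalance_commands(builder_dir: str,
                       rings: Sequence[str] = RINGS) -> List[List[str]]:
    """Build one `swift-ring-builder <file> rebalance` command per ring."""
    return [
        ["swift-ring-builder", f"{builder_dir}/{ring}.builder", "rebalance"]
        for ring in rings
    ]


def run_rebalance_workflow(
    builder_dir: str,
    download: Callable[[str], None],   # hypothetical: fetch the latest rings
    run: Callable[[List[str]], int],   # hypothetical: run a command, return exit code
    upload: Callable[[str], None],     # hypothetical: publish the updated rings
) -> bool:
    """Download rings, rebalance each one, then upload the result.

    Returns True if the rings were uploaded, False if a rebalance failed.
    """
    download(builder_dir)
    for command in rebalance_commands(builder_dir):
        exit_code = run(command)
        # Assumption: 0 is success and 1 is a warning (for example,
        # nothing to rebalance); anything higher is treated as an error.
        if exit_code > 1:
            return False
    upload(builder_dir)
    return True
```

Passing the download, run, and upload steps in as callables keeps the sketch independent of how the rings are actually transferred, and `run` could be as simple as `lambda cmd: subprocess.run(cmd).returncode`.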

It could also be done in a more lightweight manner: for example, the above workflow is executed on a single node only, and once the updated rings are available, all nodes fetch them.

There should be an additional sanity check that uses the swift-dispersion-report tool to verify that the dispersion has already reached a given level (for example 95%), and that uses swift-recon data to ensure that at least one replication pass has finished since the last rebalance.
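A minimal sketch of such a check, assuming the dispersion percentage and the two timestamps have already been extracted from swift-dispersion-report and swift-recon output. The function and parameter names are hypothetical, and how the values are parsed is deliberately left out.

```python
def safe_to_rebalance(
    dispersion_pct: float,
    last_replication_end: float,
    last_rebalance: float,
    min_dispersion: float = 95.0,
) -> bool:
    """Decide whether another rebalance may be started.

    dispersion_pct:       percentage reported by the dispersion check
    last_replication_end: Unix timestamp of the last finished replication pass
    last_rebalance:       Unix timestamp of the previous rebalance
    min_dispersion:       required dispersion level (95% as in the example)
    """
    # The data must already be spread well enough across the cluster.
    if dispersion_pct < min_dispersion:
        return False
    # At least one replication pass must have finished after the last rebalance.
    return last_replication_end > last_rebalance
```

A workflow that runs the rebalance repeatedly could call this before every pass and stop, or wait and retry, whenever it returns False.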

Expected outcome
----------------
An operator can trigger a Mistral workflow that rebalances Swift rings in a safe manner and distributes them.

Work items
----------
1. Add option to swift-ring-builder to limit rebalance operations to a maximum of X percent
2. Make swift-dispersion-report importable
3. Write Ansible playbook & library to execute the described workflow

Comment 2 Christian Schwede (cschwede) 2017-12-21 17:32:34 UTC
Upstream patch merged, moving to POST.

Comment 18 errata-xmlrpc 2018-06-27 13:31:39 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHEA-2018:2086

