Bug 1310865 - Director might break Swift cluster when replacing / adding new nodes
Status: CLOSED ERRATA
Product: Red Hat OpenStack
Classification: Red Hat
Component: openstack-tripleo-heat-templates
Version: unspecified
Hardware: Unspecified
OS: Linux
Priority: unspecified
Severity: unspecified
Target Milestone: beta
Target Release: 11.0 (Ocata)
Assigned To: Christian Schwede (cschwede)
Docs Contact: Mike Abrams
Keywords: Triaged, ZStream
Depends On:
Blocks: 1300189 1319901 1321088 1385483
Reported: 2016-02-22 15:38 EST by Christian Schwede (cschwede)
Modified: 2017-05-17 15:27 EDT

Fixed In Version: openstack-tripleo-heat-templates-6.0.0-0.20170218023452.edbaaa9.el7ost
Doc Type: Bug Fix
Doc Text:
Cause: Swift rings became inconsistent when new storage or controller nodes were added or existing ones were replaced. Consequence: Data became unavailable, and replication between storage nodes increased without end, leading to higher load and network traffic. Fix: A new process stores a copy of the rings on the undercloud after each deployment and retrieves it before any new deployment or update, ensuring consistency across all nodes. This removes the need to manually maintain and copy the rings across nodes. Result: Simplified deployment of new or replaced nodes using Swift storage.
Clones: 1321088
Last Closed: 2017-05-17 15:27:09 EDT
Type: Bug


External Trackers:
OpenStack gerrit 295314 (last updated 2016-03-24 11:27 EDT)
OpenStack gerrit 414460 (last updated 2017-01-16 08:29 EST)

Description Christian Schwede (cschwede) 2016-02-22 15:38:57 EST
Description of problem:

Director-managed Swift clusters might break if nodes are replaced or new nodes are added.

Version-Release number of selected component (if applicable):

Probably all.

How reproducible:

Most likely always when new nodes are added to an existing cluster.

Steps to Reproduce:
1. Deploy Swift cluster using multiple nodes and director.
2. Remove one (or more) Swift storage nodes.
3. Redeploy to rebalance the rings.
4. Add new nodes to the existing cluster and deploy them.

Actual results:

Differing Swift rings (/etc/swift/[account|container|object].[ring.gz|builder]) on new nodes.

Expected results:

Rings (/etc/swift/[account|container|object].[ring.gz|builder]) are identical on all nodes.

Additional info:

Note: this concern is based on reading the code in puppet-swift and tripleo-heat-templates; due to a lack of (hardware) resources I was not able to verify it. If it applies (and I'm quite sure that it does), it will have a severe impact on Swift clusters whenever the cluster topology changes.

The rings in Swift define where data is stored within a cluster. They are used both when storing and retrieving data, and also by background processes like the replicators. When the rings are not identical on all nodes, objects might not be found and/or replicators might copy data around in an endless circle, overloading the cluster.

When a new node is added by tripleo, it is configured using puppet-swift. There are no existing ring files on new nodes, and therefore no "history". However, the ring-builder depends on the "history" of previous runs, and since the already existing nodes carry a different history, the balanced rings are very likely to differ.
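
To illustrate the point, here is a minimal sketch (assuming the swift Python package is importable; the device addresses are placeholders, and the exact partition layout depends on the Swift version). It builds the same final set of four devices twice: once with the intermediate rebalance an existing node would carry in its history, and once from scratch as a fresh node would:

    # Why ring-builder "history" matters: the same final device set,
    # balanced with and without an intermediate rebalance, almost
    # always yields different partition assignments.
    from swift.common.ring import RingBuilder

    def make_devs(count):
        return [{'id': i, 'region': 1, 'zone': 1, 'weight': 100,
                 'ip': '192.0.2.%d' % (i + 1), 'port': 6000,
                 'device': 'sda'} for i in range(count)]

    # Existing node: the ring grew over time (3 devices, then a 4th).
    grown = RingBuilder(10, 3, 0)  # part_power, replicas, min_part_hours
    for dev in make_devs(3):
        grown.add_dev(dev)
    grown.rebalance()
    grown.add_dev(make_devs(4)[3])
    grown.rebalance()

    # New node: the ring is built from scratch with all 4 devices at once.
    fresh = RingBuilder(10, 3, 0)
    for dev in make_devs(4):
        fresh.add_dev(dev)
    fresh.rebalance()

    # Compare the partition-to-device tables; expected output: False.
    print(grown.get_ring()._replica2part2dev_id ==
          fresh.get_ring()._replica2part2dev_id)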

AFAICT, puppet-swift has ringserver and ringsync classes to avoid exactly this situation (they ensure every node uses the same ring source), but they are not used in director.

A possible (somewhat dirty) workaround is to copy the existing ring files to new nodes before puppet runs.
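
For example (a hypothetical helper, to be run on the new node before the puppet run; 'existing-node' is a placeholder for any node that already carries the full ring history):

    # Copy both the packed rings and the .builder files, so the new
    # node inherits the ring history. Assumes SSH access to the source.
    import subprocess

    for name in ('account', 'container', 'object'):
        for suffix in ('builder', 'ring.gz'):
            subprocess.check_call([
                'scp', 'existing-node:/etc/swift/%s.%s' % (name, suffix),
                '/etc/swift/'])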

I recommend checking that the rings are identical on all nodes, especially after changing the cluster topology.
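
A quick way to do that is to checksum the ring files on each node and diff the output (swift-recon --md5 provides a similar built-in comparison). A minimal sketch:

    # Print an MD5 checksum per ring file; run this on every node and
    # compare the output lines. A healthy cluster prints identical sums.
    import glob
    import hashlib

    for path in sorted(glob.glob('/etc/swift/*.ring.gz')):
        with open(path, 'rb') as f:
            print(hashlib.md5(f.read()).hexdigest(), path)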
Comment 2 Christian Schwede (cschwede) 2016-03-05 09:32:46 EST
Possible workarounds for this:

1. Disable ring building on the nodes; please see the linked patch review
2. Use a customized template and copy the .builder files from another node before puppet runs
Comment 3 Dan Macpherson 2016-03-21 22:19:15 EDT
@cschwede, do you know if this patch will be backported to OSP 7?
Comment 6 Christian Schwede (cschwede) 2016-04-01 08:02:42 EDT
@Dan: No, I don't think this will be backported to OSP7. However, it was backported upstream to Liberty (https://review.openstack.org/#/c/295426/), and is included in OSP8 (just checked the last puddle; it's included in openstack-tripleo-heat-templates/0.8.14-1.el7ost).
Comment 7 Christian Schwede (cschwede) 2016-04-01 08:24:14 EDT
Today Giulio and I discussed the next steps to improve Swift support in Director.

The idea for solving the issue described in this BZ is to use the ringsync mechanism already provided by puppet-swift. The rings will be managed on one node, and the other nodes will fetch the .ring.gz files from that node.

https://github.com/openstack/puppet-swift/blob/master/manifests/ringsync.pp
https://github.com/openstack/puppet-swift/blob/master/manifests/ringserver.pp

It is important that this is done on a node with the oldest ring files (including the whole history), for example the first node that was deployed. The managing node will also need information about the IPs and devices of all nodes.
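
In effect, every other node would then do something like the following (an illustration of the ringsync idea only, not the actual puppet implementation; the host name and rsync module name are placeholders):

    # Fetch the packed rings from the ring-managing node over rsync,
    # which is roughly what swift::ringsync automates.
    import subprocess

    for ring in ('account', 'container', 'object'):
        subprocess.check_call([
            'rsync', '-a',
            'ring-master.example.com::swift_server/%s.ring.gz' % ring,
            '/etc/swift/'])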

There are a few more RFEs that will be worked on in the future. These are (ordered by priority):

https://bugzilla.redhat.com/show_bug.cgi?id=1276691 multi disks on swift node 
There is already a workaround: https://mojo.redhat.com/community/consulting-customer-training/services-innovation-and-incubation/technical-advanced-content/blog/2015/11/02/director-multiple-disks-for-swift-nodes

https://bugzilla.redhat.com/show_bug.cgi?id=1303093 Add ability to disable Swift from overcloud deployment

https://bugzilla.redhat.com/show_bug.cgi?id=1303093 Permit usage of unmanaged Swift clusters

The ideas in the last two RFEs were used for a customer recently.

https://bugzilla.redhat.com/show_bug.cgi?id=1320185 Allow for customization of the swift nodes disk topology 

This would make it possible to deploy a cluster with a more customized setup without manually managing the Swift rings. For example:
- different number of disks per node
- SSDs for account/containers
- different regions and zones based on the datacenter layout.
Comment 8 Christian Schwede (cschwede) 2016-04-01 11:27:46 EDT
There is a wrong BZ reference above (thanks, Thiago!). The correct one is:

https://bugzilla.redhat.com/show_bug.cgi?id=1320209 Permit usage of unmanaged Swift clusters
Comment 9 Jaromir Coufal 2016-04-04 10:34:15 EDT
This should probably be fixed by the patch from comment #5 plus documentation. Future work should be tracked in a separate bugzilla. Correct?
Comment 10 Mike Burns 2016-04-07 17:11:06 EDT
This bug did not make the OSP 8.0 release.  It is being deferred to OSP 10.
Comment 12 Christian Schwede (cschwede) 2016-04-08 01:47:13 EDT
It seems that the upstream workaround (disabling ring management) is included in openstack-tripleo-heat-templates-0.8.14-7.el7ost.noarch.rpm (from the GA release puddle)? This doesn't fix the bug itself, but at least there is a known workaround for it.
Comment 14 Dave Maley 2016-05-03 12:54:26 EDT
I see we have the osp7 fix for this ON_QA (bug 1321088); however, I don't see osp8 or osp9 clones. Are they needed?
Comment 15 Dave Maley 2016-05-03 12:59:52 EDT
Never mind, I see openstack-tripleo-heat-templates-0.8.14-9.el7ost is available in the channel, and according to comment 12 it includes the needed changes.
Comment 20 Christian Schwede (cschwede) 2017-01-16 08:29:35 EST
Added a link to an upstream patch that actually fixes this issue.
Comment 33 errata-xmlrpc 2017-05-17 15:27:09 EDT
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHEA-2017:1245
