Bug 1310865

Summary: Director might break Swift cluster when replacing / adding new nodes
Product: Red Hat OpenStack Reporter: Christian Schwede (cschwede) <cschwede>
Component: openstack-tripleo-heat-templates    Assignee: Christian Schwede (cschwede) <cschwede>
Status: CLOSED ERRATA QA Contact: Mike Abrams <mabrams>
Severity: unspecified Docs Contact:
Priority: unspecified    
Version: unspecified    CC: augol, cschwede, dbecker, ddomingo, dmaley, egafford, felipe.alfaro, gfidente, jcoufal, jliberma, jraju, jschluet, markmc, mburns, mcornea, morazi, pgrist, rhel-osp-director-maint, scohen, thiago, zaitcev
Target Milestone: beta    Keywords: Triaged, ZStream
Target Release: 11.0 (Ocata)   
Hardware: Unspecified   
OS: Linux   
Whiteboard:
Fixed In Version: openstack-tripleo-heat-templates-6.0.0-0.20170218023452.edbaaa9.el7ost Doc Type: Bug Fix
Doc Text:
Cause: Swift rings became inconsistent when new storage or controller nodes were added or existing ones were replaced. Consequence: Unavailability of data and increased, potentially endless replication between storage nodes, leading to higher load and network traffic. Fix: A new process stores a copy of the rings on the undercloud after each deployment and retrieves them before any new deployment or update to ensure consistency across all nodes. This removes the need to manually maintain and copy them across nodes. Result: Simplified deployment of new or replaced nodes using Swift storage.
Story Points: ---
Clone Of:
: 1321088 (view as bug list) Environment:
Last Closed: 2017-05-17 19:27:09 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Bug Depends On:    
Bug Blocks: 1300189, 1319901, 1321088, 1385483    

Description Christian Schwede (cschwede) 2016-02-22 20:38:57 UTC
Description of problem:

director-managed Swift clusters might break if nodes are replaced or new nodes are added.

Version-Release number of selected component (if applicable):

Probably all.

How reproducible:

Most likely always when new nodes are added to an existing cluster.

Steps to Reproduce:
1. Deploy Swift cluster using multiple nodes and director.
2. Remove one (or more) Swift storage nodes.
3. Redeploy to rebalance the rings.
4. Add new nodes to the existing cluster and deploy them (see the example commands after this list).
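
For illustration, steps 3-4 could be driven from the undercloud roughly as follows. This is only a sketch; the environment file name, the node count, and any additional -e options are assumptions and depend on the actual deployment:

    # Hypothetical environment file ~/swift-scale.yaml (name and count are illustrative):
    #   parameter_defaults:
    #     ObjectStorageCount: 2
    # Redeploy to rebalance after the node removal (step 3):
    openstack overcloud deploy --templates -e ~/swift-scale.yaml
    # Then raise ObjectStorageCount (e.g. to 3) in the same file and redeploy (step 4):
    openstack overcloud deploy --templates -e ~/swift-scale.yaml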

Actual results:

Differing Swift rings (/etc/swift/[account|container|object].[ring.gz|builder]) on new nodes.

Expected results:

Rings (/etc/swift/[account|container|object].[ring.gz|builder]) are identical on all nodes.

Additional info:

Note: this concern is based on reading the code in puppet-swift and tripleo-heat-templates, and due to lack of (hardware) resources I was not able to verify this. If it applies (and I'm quite sure that it does), it will have a severe impact on Swift clusters when changing the cluster topology.

The rings in Swift define where data is stored within a cluster. They are used both when storing and retrieving data, and also by background processes like the replicators. If the rings are not identical on all nodes, objects might not be found and/or the replicators might copy data around in an endless loop, overloading the cluster.

When a new node is added by tripleo it will be configured using puppet-swift. There are no existing ring files on new nodes, and therefore no "history". However, the ring builder depends on the "history" of previous runs, and since the already existing nodes have a different history, it is very likely that the rebalanced rings will differ.
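
To make the "history" problem more concrete, here is a rough sketch (part power, replica count, IPs, ports, devices and weights are placeholders only) of what effectively happens on a fresh node, which starts from empty builder files instead of the builders that produced the existing rings:

    # On a brand-new node there are no /etc/swift/*.builder files, so ring
    # building starts from scratch:
    swift-ring-builder object.builder create 10 3 1
    swift-ring-builder object.builder add r1z1-192.0.2.11:6000/d1 100
    swift-ring-builder object.builder add r1z1-192.0.2.12:6000/d1 100
    swift-ring-builder object.builder rebalance
    # Existing nodes rebalance builders that already carry the cluster's history
    # (earlier add/remove/rebalance operations), so the partition assignment in
    # their object.ring.gz can differ from the one produced above.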

AFAICT, there are ringserver and ringsync classes in puppet-swift to avoid these situations (they ensure every node uses the same ring source), but they are not used in director.

A possible (somewhat dirty) workaround is to copy the existing ring files to new nodes before puppet runs.
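
A hedged sketch of that workaround, run from the undercloud (node names and the heat-admin user are placeholders; adjust to the environment):

    # Grab the builder and ring files from a node that already has the full history
    ssh heat-admin@<existing-node> "sudo sh -c 'cd /etc/swift && tar czf - *.builder *.ring.gz'" > swift-rings.tar.gz
    # Unpack them on the new node *before* puppet runs there
    ssh heat-admin@<new-node> 'sudo tar xzf - -C /etc/swift' < swift-rings.tar.gz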

I recommend checking that the rings are identical on all nodes, especially after changing the cluster topology.
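
One simple check, run from the undercloud (node names and the heat-admin user are placeholders), is to compare the ring file checksums; swift-recon --md5 offers a similar consistency check from inside the cluster:

    # Every node should report identical checksums for each ring file
    for node in overcloud-controller-0 overcloud-objectstorage-0 overcloud-objectstorage-1; do
        echo "== $node"
        ssh heat-admin@$node 'md5sum /etc/swift/*.ring.gz'
    done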

Comment 2 Christian Schwede (cschwede) 2016-03-05 14:32:46 UTC
Possible workarounds for this:

1. Disable ring building on the nodes; please see the linked patch review (an illustrative environment-file sketch follows this list).
2. Use a customized template and copy the .builder files from another node before puppet runs.
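
For workaround 1, a minimal sketch of such an environment file; the parameter names are assumptions based on the linked patch review and may not match the merged version exactly:

    # Hypothetical environment file swift-manual-rings.yaml with contents like:
    #   parameter_defaults:
    #     SwiftRingBuild: false
    #     RingBuild: false
    # Pass it on every deploy/update together with the other environment files:
    openstack overcloud deploy --templates -e swift-manual-rings.yaml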

Comment 3 Dan Macpherson 2016-03-22 02:19:15 UTC
@cschwede, do you know if this patch will be backported to OSP 7?

Comment 6 Christian Schwede (cschwede) 2016-04-01 12:02:42 UTC
@Dan: No, I don't think this will be backported to OSP7. However, it was backported upstream to Liberty (https://review.openstack.org/#/c/295426/), and is included in OSP8 (just checked the last puddle; it's included in openstack-tripleo-heat-templates/0.8.14-1.el7ost).

Comment 7 Christian Schwede (cschwede) 2016-04-01 12:24:14 UTC
Today Giulio and I discussed the next steps to improve Swift support in Director.

The idea to solve the issue described in this BZ is to use the ringsync mechanism already provided by puppet-swift. The ring will be managed on one node, and other nodes will fetch the .ring.gz from that node.

https://github.com/openstack/puppet-swift/blob/master/manifests/ringsync.pp
https://github.com/openstack/puppet-swift/blob/master/manifests/ringserver.pp

It is important that this is done on a node with the oldest ring files (including the whole history), for example the first node that was deployed. The managing node will also need information about the IPs and devices of all nodes.
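
For reference, swift::ringserver exports the ring directory over rsync and swift::ringsync pulls the .ring.gz files from it; conceptually, the non-managing nodes end up doing something like the following (the rsync module name and address are illustrative, not verified against the manifests):

    # Fetch the rings from the managing node
    rsync -av <ring-master-ip>::swift_server/account.ring.gz   /etc/swift/
    rsync -av <ring-master-ip>::swift_server/container.ring.gz /etc/swift/
    rsync -av <ring-master-ip>::swift_server/object.ring.gz    /etc/swift/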

There are a few more RFEs that will be worked on in the future. These are (ordered by priority):

https://bugzilla.redhat.com/show_bug.cgi?id=1276691 multi disks on swift node 
There is already a workaround: https://mojo.redhat.com/community/consulting-customer-training/services-innovation-and-incubation/technical-advanced-content/blog/2015/11/02/director-multiple-disks-for-swift-nodes

https://bugzilla.redhat.com/show_bug.cgi?id=1303093 Add ability to disable Swift from overcloud deployment

https://bugzilla.redhat.com/show_bug.cgi?id=1303093 Permit usage of unmanaged Swift clusters

The ideas in the last two RFEs were used for a customer recently.

https://bugzilla.redhat.com/show_bug.cgi?id=1320185 Allow for customization of the swift nodes disk topology 

This would make it possible to deploy a cluster with a more customized setup without manually managing the Swift rings. For example:
- different number of disks per node
- SSDs for account/containers
- different regions and zones based on the datacenter layout.

Comment 8 Christian Schwede (cschwede) 2016-04-01 15:27:46 UTC
There is a wrong BZ reference above (thanks, Thiago!). The correct one is:

https://bugzilla.redhat.com/show_bug.cgi?id=1320209 Permit usage of unmanaged Swift clusters

Comment 9 Jaromir Coufal 2016-04-04 14:34:15 UTC
This should probably be fixed by the patch from comment #5 plus documentation. Future work should be a separate bugzilla. Correct?

Comment 10 Mike Burns 2016-04-07 21:11:06 UTC
This bug did not make the OSP 8.0 release.  It is being deferred to OSP 10.

Comment 12 Christian Schwede (cschwede) 2016-04-08 05:47:13 UTC
It seems that the upstream workaround (disabling ring management) is included in openstack-tripleo-heat-templates-0.8.14-7.el7ost.noarch.rpm (from the GA release puddle). This doesn't fix the bug itself, but at least there is a known workaround for it.

Comment 14 Dave Maley 2016-05-03 16:54:26 UTC
I see we have the OSP 7 fix for this ON_QA (bug 1321088); however, I don't see OSP 8 or OSP 9 clones. Are they needed?

Comment 15 Dave Maley 2016-05-03 16:59:52 UTC
Never mind, I see openstack-tripleo-heat-templates-0.8.14-9.el7ost is available in the channel, and according to comment 12 this would include the needed changes.

Comment 20 Christian Schwede (cschwede) 2017-01-16 13:29:35 UTC
Added a link to an upstream patch that actually fixes this issue.

Comment 33 errata-xmlrpc 2017-05-17 19:27:09 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHEA-2017:1245