Bug 1292225 - [RFE] Non-disruptive scaling-out operations
Summary: [RFE] Non-disruptive scaling-out operations
Keywords:
Status: CLOSED DUPLICATE of bug 1395308
Alias: None
Product: Red Hat OpenStack
Classification: Red Hat
Component: rhosp-director
Version: 7.0 (Kilo)
Hardware: Unspecified
OS: Unspecified
Priority: high
Severity: urgent
Target Milestone: Upstream M1
Target Release: 12.0 (Pike)
Assignee: Hugh Brock
QA Contact: Shai Revivo
URL:
Whiteboard:
Depends On:
Blocks:
 
Reported: 2015-12-16 19:08 UTC by Marius Cornea
Modified: 2016-12-14 21:03 UTC
CC: 8 users

Fixed In Version:
Doc Type: Enhancement
Doc Text:
Clone Of:
Environment:
Last Closed: 2016-12-14 21:03:12 UTC
Target Upstream Version:
Embargoed:



Description Marius Cornea 2015-12-16 19:08:09 UTC
Description of problem:
When scaling out an updated overcloud, the cluster gets restarted, which brings down the control plane for a few minutes (this issue has been described in BZ#1287812).

Version-Release number of selected component (if applicable):
openstack-tripleo-heat-templates-0.8.6-94.el7ost.noarch


Steps to Reproduce:
1. Deploy 7.1 by using 7.1 templates:
openstack overcloud deploy \
    --templates ~/templates/my-overcloud \
    --control-scale 3 --compute-scale 1 --ceph-storage-scale 3 \
    --ntp-server clock.redhat.com \
    --libvirt-type qemu \
    -e ~/templates/my-overcloud/environments/network-isolation.yaml \
    -e ~/templates/network-environment.yaml \
    -e ~/templates/firstboot-environment.yaml \
    -e ~/templates/ceph.yaml 

2. Update the undercloud to 7.2 and run the update procedure to 7.2 with 7.2 templates:
/usr/bin/yes '' | openstack overcloud update stack overcloud -i \
         --templates ~/templates/my-overcloud \
         -e ~/templates/my-overcloud/overcloud-resource-registry-puppet.yaml \
         -e ~/templates/my-overcloud/environments/network-isolation.yaml \
         -e ~/templates/network-environment.yaml \
         -e ~/templates/firstboot-environment.yaml \
         -e ~/templates/ceph.yaml \
         -e ~/templates/my-overcloud/environments/updates/update-from-vip.yaml \
         -e ~/templates/ctrlport.yaml

Wait for the update to complete

3. Scale out with an additional node:

openstack overcloud deploy \
    --templates ~/templates/my-overcloud \
    --control-scale 3 --compute-scale 2 --ceph-storage-scale 3 \
    --ntp-server clock.redhat.com \
    --libvirt-type qemu \
    -e ~/templates/my-overcloud/overcloud-resource-registry-puppet.yaml \
    -e ~/templates/my-overcloud/environments/network-isolation.yaml \
    -e ~/templates/network-environment.yaml \
    -e ~/templates/firstboot-environment.yaml \
    -e ~/templates/ceph.yaml \
    -e ~/templates/my-overcloud/environments/updates/update-from-vip.yaml \
    -e ~/templates/ctrlport.yaml

Actual results:
During the scale-out, the cluster gets restarted, which brings down all the APIs exposed via HAProxy for a few minutes.

Expected results:
The APIs remain available while adding a compute node.
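One way to check this expectation (or to observe the outage from the actual results) is to probe an API endpoint through the HAProxy VIP while the scale-out runs in another terminal. A minimal sketch; the VIP address, port, and sample count below are placeholders for this environment, not values from the report:

```shell
#!/bin/sh
# Probe an API endpoint through the HAProxy VIP while the scale-out runs;
# a run of DOWN lines in the log marks the control-plane outage.
VIP="${VIP:-192.0.2.10}"        # placeholder control-plane VIP
PORT="${PORT:-5000}"            # Keystone public API port (assumed)
SAMPLES="${SAMPLES:-3}"         # how many one-second samples to take

probe() {
    # Print UP if http://$1:$2/ answers within 2 seconds, DOWN otherwise.
    if curl -s -o /dev/null --max-time 2 "http://$1:$2/"; then
        echo UP
    else
        echo DOWN
    fi
}

i=1
while [ "$i" -le "$SAMPLES" ]; do
    printf '%s %s\n' "$(date -u +%H:%M:%S)" "$(probe "$VIP" "$PORT")"
    sleep 1
    i=$((i + 1))
done
```

Running this in a loop for the duration of the `openstack overcloud deploy` scale-out gives a timestamped record of exactly when the APIs dropped and recovered.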

Comment 1 James Slagle 2016-01-28 14:49:40 UTC
Puppet will restart services even during a scale-out attempt due to configuration changes. There is currently no synchronization in place to make sure that happens on one controller node at a time, so outages such as you describe are likely to happen.

Moving to OSP 8 as something to consider.
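The missing synchronization described above amounts to applying disruptive changes strictly one controller at a time, so HAProxy always has healthy backends. A rough illustration of that sequencing; the node names are placeholders, the real per-node steps (e.g. `pcs cluster standby`/`unstandby` to drain and restore a node) are stubbed out as echoes, and this is not how the director actually behaves today (that is this RFE):

```shell
#!/bin/sh
# Illustration only: serialize a disruptive update across controllers
# instead of restarting services on all of them at once.
CONTROLLERS="overcloud-controller-0 overcloud-controller-1 overcloud-controller-2"

update_one() {
    # $1 = controller hostname. In a real run each step would be an ssh
    # call, e.g. "sudo pcs cluster standby $1" to drain resources off the
    # node, then the config change, then "sudo pcs cluster unstandby $1".
    echo "draining $1"
    echo "updating $1"
    echo "restoring $1"
}

# The key property: strictly one node at a time, so the APIs exposed
# via HAProxy stay reachable throughout.
for node in $CONTROLLERS; do
    update_one "$node"
done
```

The loop body is sequential on purpose; parallelizing it is exactly what causes the outage reported in this bug.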

Comment 3 Hugh Brock 2016-02-05 12:30:11 UTC
RFE, removing blocker flag.

Comment 6 Jaromir Coufal 2016-12-14 19:36:45 UTC
Summary of the request:

When scaling out or down, ensure that OpenStack services are not interrupted and that changes happen only on the node being scaled (not on all the nodes).

Comment 7 Jaromir Coufal 2016-12-14 21:03:12 UTC

*** This bug has been marked as a duplicate of bug 1395308 ***

