Bug 1231777

Summary: It's possible to scale up beyond the number of free nodes
Product: Red Hat OpenStack
Reporter: Amit Ugol <augol>
Component: python-rdomanager-oscplugin
Assignee: Jan Provaznik <jprovazn>
Status: CLOSED ERRATA
QA Contact: Amit Ugol <augol>
Severity: high
Priority: medium
Version: 7.0 (Kilo)
CC: brad, calfonso, dmacpher, jslagle, mburns, rhel-osp-director-maint, yeylon
Target Milestone: y1
Keywords: Triaged, ZStream
Target Release: 7.0 (Kilo)
Hardware: Unspecified
OS: Unspecified
Fixed In Version: python-rdomanager-oscplugin-0.0.10
Doc Type: Bug Fix
Doc Text:
The "openstack overcloud deploy" command did not check the available nodes before deployment, which caused deployments to fail when not enough nodes were available. This fix adds a pre-deployment check to the CLI that verifies the number of available nodes before creating or updating the Overcloud stack. Now, if not enough nodes are available, users get an error message before Heat creates or updates the stack.
Last Closed: 2015-10-08 12:09:44 UTC
Type: Bug

Description Amit Ugol 2015-06-15 11:56:21 UTC
Scaling up a role proceeds even if the requested number of nodes is higher than the number of free nodes.

In my example:

$ ironic node-list
+--------------------------------------+------+----------...
| UUID   | Name | Instance UUID        | Power State     ...
+--------------------------------------+------+----------...
| ...... | None | ...... | power on    | active          ...
| ...... | None | ...... | power on    | active          ...
| ...... | None | ...... | power on    | active          ...
| ...... | None | ...... | power on    | active          ...
| ...... | None | None   | power off   | available       ...
| ...... | None | ...... | power on    | active          ...
| ...... | None | None   | power off   | available       ...
| ...... | None | None   | power off   | available       ...
+--------------------------------------+------+----------...

6 nodes in total, out of which 3 are Ceph nodes.
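
For reference, a minimal sketch of counting the nodes that are actually free for deployment, assuming python-ironicclient and the usual OS_* credentials in the environment (count_free_nodes is an illustrative helper, not code from the plugin):

    import os

    from ironicclient import client as ironic_client

    def count_free_nodes():
        ironic = ironic_client.get_client(
            1,
            os_username=os.environ['OS_USERNAME'],
            os_password=os.environ['OS_PASSWORD'],
            os_tenant_name=os.environ['OS_TENANT_NAME'],
            os_auth_url=os.environ['OS_AUTH_URL'],
        )
        # A node is considered free if it is not in maintenance, has no
        # instance assigned to it, and is in the "available" provision state.
        return sum(
            1 for node in ironic.node.list(detail=True)
            if not node.maintenance
            and node.instance_uuid is None
            and node.provision_state == 'available'
        )

    print('Free nodes: %d' % count_free_nodes())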

I ran:
$ openstack overcloud scale stack -r Ceph-Storage-1 -n 9 overcloud overcloud
Scaling out role Ceph-Storage-1 in stack overcloud to 9 nodes

The scale-up process started, but it should not have begun in this case.

Trying to lower this with:
$ openstack overcloud scale stack -r Ceph-Storage-1 -n 4 overcloud overcloud
gives me:
Scaling out role Ceph-Storage-1 in stack overcloud to 4 nodes
ERROR: openstack Role Ceph-Storage-1 has already 9 nodes, can't set lower value

Comment 3 Jay Dobies 2015-06-16 15:13:30 UTC
Moving this to an oscplugin bug since the CLI is the only place that speaks across both Tuskar and Ironic.

Comment 4 Ana Krivokapic 2015-06-18 14:11:23 UTC
Assigning this to Jan as he's the one who implemented stack scaling.

Comment 5 Jan Provaznik 2015-06-23 10:55:30 UTC
The scale command is being replaced with the newly added CLI commands, which are now used for any update of the stack:

    openstack management plan set $PLAN_UUID -S Compute-1=2
    openstack overcloud deploy --plan-uuid $PLAN_UUID

I can implement a check in the "openstack overcloud deploy" command which would check the number of available nodes, but TBH I'm not sure this is high priority - at the moment we don't implement any other validation of input, e.g. flavors, images, ...
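
A rough sketch of the kind of pre-deployment check discussed here (the function name and exception are illustrative, not the actual python-rdomanager-oscplugin code):

    class NotEnoughNodes(Exception):
        pass

    def check_nodes_count(available, requested):
        # Fail early instead of letting Heat create or update a stack that
        # can never get enough Ironic nodes.
        if requested > available:
            raise NotEnoughNodes(
                'Not enough nodes - available: %d, requested: %d'
                % (available, requested))

    # E.g. "--control-scale 1 --compute-scale 2" requests 3 nodes in total;
    # with 0 available nodes this fails before the stack is touched.
    try:
        check_nodes_count(available=0, requested=1 + 2)
    except NotEnoughNodes as exc:
        print('Deployment failed: %s' % exc)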

Comment 8 Amit Ugol 2015-09-06 08:29:38 UTC
Latest version:
$ openstack overcloud deploy --templates --control-scale 1 --compute-scale 2
Deployment failed:  Not enough nodes - available: 0, requested: 3

Comment 10 errata-xmlrpc 2015-10-08 12:09:44 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2015:1862