Bug 1393969 - Using parameters ComputeCount and ControllerCount in templates instead of --control-scale and --compute-scale leads to scale out issues after running openstack node delete
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: Red Hat OpenStack
Classification: Red Hat
Component: python-rdomanager-oscplugin
Version: 7.0 (Kilo)
Hardware: Unspecified
OS: Unspecified
Target Milestone: ---
Assignee: RHOS Maint
QA Contact: Shai Revivo
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2016-11-10 18:06 UTC by Andreas Karis
Modified: 2020-01-17 16:09 UTC (History)

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2016-11-15 19:00:30 UTC
Target Upstream Version:


Description Andreas Karis 2016-11-10 18:06:44 UTC
Description of problem:
I deployed the overcloud with the following script, and I think I reproduced the issue:
~~~
#!/bin/bash
if [ $PWD != /home/stack ] ; then echo "USAGE: $0 this script needs to be executed in /home/stack"; exit 1 ; fi

DIR="$( cd "$( dirname "${BASH_SOURCE[0]}" )" && pwd )"
template_base_dir="$DIR"

ntpserver=10.5.26.10  #RH LAB

openstack overcloud deploy --templates \
-e /usr/share/openstack-tripleo-heat-templates/environments/network-isolation.yaml \
-e ${template_base_dir}/network-environment.yaml \
-e ${template_base_dir}/enable-debug.yaml \
--ntp-server $ntpserver \
--neutron-network-type vxlan --neutron-tunnel-types vxlan
~~~

With the following network-environment.yaml
~~~
[stack@undercloud-7 ~]$ tail templates/network-environment.yaml 
  StorageNetworkVlanID: 903
  StorageNetCidr: 172.18.0.0/24
  StorageAllocationPools: [{'start': '172.18.0.10', 'end': '172.18.0.200'}]
  
  DnsServers: ["192.0.2.1"]

  ComputeCount: 1
  ControllerCount: 1
  OvercloudComputeFlavor: compute
  OvercloudControlFlavor: control
~~~

~~~
[stack@undercloud-7 ~]$ templates/deploy.sh 
Deploying templates in the directory /usr/share/openstack-tripleo-heat-templates
~~~

This created a 1+1 overcloud.

* I then scaled down by one node, using this script:
~~~
#!/bin/bash
if [ $PWD != /home/stack ] ; then echo "USAGE: $0 this script needs to be executed in /home/stack"; exit 1 ; fi

if ! [ $# -eq 1 ]; then
  echo "Please provide a uuid for a node that you wish to delete"
  exit 1
fi

uuid=$1

echo "Deleting $uuid"

DIR="$( cd "$( dirname "${BASH_SOURCE[0]}" )" && pwd )"
template_base_dir="$DIR"

openstack overcloud node delete --stack overcloud --templates \
-e /usr/share/openstack-tripleo-heat-templates/environments/network-isolation.yaml \
-e ${template_base_dir}/network-environment.yaml ${uuid}
~~~

Before running it, I also reduced ComputeCount to 0:
~~~
[stack@undercloud-7 templates]$ grep ComputeCount network-environment.yaml 
  ComputeCount: 0
~~~

~~~
[stack@undercloud-7 ~]$ templates/delete-node.sh
~~~

* I then incremented ComputeCount back to 1 and reran the initial deploy script.
~~~
[stack@undercloud-7 templates]$ grep ComputeCount network-environment.yaml 
  ComputeCount: 1
~~~

~~~
[stack@undercloud-7 ~]$ templates/deploy.sh 
Deploying templates in the directory /usr/share/openstack-tripleo-heat-templates
~~~

This did not add a new compute node. I ran the same scale-out a second time, and that did not change anything either.

* I then raised ComputeCount to 2 and scaled out once more.
~~~
[stack@undercloud-7 ~]$ grep ComputeCount templates/network-environment.yaml 
  ComputeCount: 2
~~~
Result: no new compute nodes

* After that, I changed `openstack overcloud deploy (...)` to `openstack overcloud deploy (...) --compute-scale 1`, but kept `ComputeCount: 2`.
Result: scale out by 2 nodes

This looks like either a bug or an unsupported configuration. The issue was confirmed in a lab environment and at a customer site.
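
The report does not show how the node count was checked after each run. One possible check is sketched below, assuming overcloud servers carry their role in the server name (for example `overcloud-compute-0`). The helper name `count_role_nodes` is hypothetical and not part of any OpenStack tool.

```shell
# Hypothetical helper: count servers whose name contains "-<role>-".
# Reads `nova list` table output on stdin and prints the number of matches.
count_role_nodes() {
  local role="$1"
  # grep -c prints the match count; it exits non-zero when the count is 0,
  # so `|| true` keeps a zero result from being treated as a failure.
  grep -c -- "-${role}-" || true
}

# Usage on the undercloud (assumes ~/stackrc holds the undercloud credentials):
#   source ~/stackrc
#   nova list | count_role_nodes compute
```

Running this before and after each deploy gives a quick way to see whether a change to ComputeCount had any effect.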

Comment 2 Andreas Karis 2016-11-14 12:47:37 UTC
Some clarification:

i) The parameters are specified in the parameter_defaults section of network-environment.yaml.

ii) Versions of components:
[root@undercloud-7 ~]# rpm -qa | grep python-rdomanager-oscplugin
python-rdomanager-oscplugin-0.0.10-29.el7ost.noarch


Thanks,

Andreas

Comment 3 James Slagle 2016-11-14 15:13:45 UTC
(In reply to Andreas Karis from comment #2)
> Some clarification:
> 
> i) Parameters are specified in 
> parameter_defaults
> of file network_environments.yaml

In that case, you might try setting the Count parameters under a parameters section instead of parameter_defaults and see whether that works around the issue.
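
A minimal sketch of what that suggestion could look like, written as a shell snippet that generates a separate environment file. The file name `node-counts.yaml` and the `ENV_FILE` variable are hypothetical, and this thread does not confirm that the approach works on OSP 7.

```shell
# Hypothetical file name; override with ENV_FILE if desired.
env_file="${ENV_FILE:-./node-counts.yaml}"

# Put the counts under `parameters` rather than `parameter_defaults`.
cat > "$env_file" <<'EOF'
parameters:
  ComputeCount: 1
  ControllerCount: 1
EOF

# Then pass the file as an additional environment on the existing deploy command:
#   openstack overcloud deploy --templates (...) -e "$env_file"
```

If this is tried, the same two keys should presumably be removed from the parameter_defaults section of network-environment.yaml so that only one definition remains.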

Comment 4 Andreas Karis 2016-11-14 15:23:44 UTC
Hi James,

We worked around it via the CLI parameters (--control-scale --compute-scale).

Can you confirm that this is fixed in later versions of OSP (>= 8)?

Thanks,

Andreas
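
For reference, a sketch of the CLI workaround mentioned above, based on the deploy script from the description. The function name `build_deploy_cmd` is hypothetical, and the sketch omits the debug, NTP and Neutron options from the original script for brevity. It prints the command instead of running it so that it can be reviewed first.

```shell
# Hypothetical helper: print a deploy command that sets the node counts with
# --control-scale/--compute-scale instead of ControllerCount/ComputeCount.
build_deploy_cmd() {
  local control_scale="$1" compute_scale="$2" template_base_dir="$3"
  echo openstack overcloud deploy --templates \
    -e /usr/share/openstack-tripleo-heat-templates/environments/network-isolation.yaml \
    -e "${template_base_dir}/network-environment.yaml" \
    --control-scale "$control_scale" \
    --compute-scale "$compute_scale"
}

# Print the command for a 1 controller + 1 compute overcloud.
build_deploy_cmd 1 1 /home/stack/templates
```

Since the description shows odd results when `--compute-scale` and `ComputeCount` were both set, it seems safest to keep the counts in one place only.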

Comment 5 James Slagle 2016-11-15 12:53:21 UTC
(In reply to Andreas Karis from comment #4)
> Hi James,
> 
> We worked around it via the CLI parameters (--control-scale --compute-scale).
> 
> Can you confirm that this is fixed for later versions of OSP? (>= 8)?
> 
> Thanks,
> 
> Andreas

This is fixed in python-tripleoclient in OSP 8. Note that the client package was renamed from python-rdomanager-oscplugin to python-tripleoclient between OSP 7 and 8.

Comment 6 Andreas Karis 2016-11-15 16:11:17 UTC
Hi James,

Thanks for the info. You can go ahead and close this ticket :-)

Regards,

Andreas

