Bug 1269005 - rhe-osp-director: HA overcloud deployment with 5 controllers fails.
Status: CLOSED CURRENTRELEASE
Product: Red Hat OpenStack
Classification: Red Hat
Component: rhosp-director
Version: 7.0 (Kilo)
Hardware: x86_64 Linux
Priority: urgent Severity: high
Target Milestone: ---
Target Release: 10.0 (Newton)
Assigned To: James Slagle
QA Contact: Alexander Chuzhoy
Keywords: Triaged
Depends On:
Blocks:
Reported: 2015-10-05 18:23 EDT by Alexander Chuzhoy
Modified: 2016-10-04 15:03 EDT

See Also:
Fixed In Version:
Doc Type: Known Issue
Doc Text:
In this release, RHEL OpenStack Platform director only supports a High Availability (HA) overcloud deployment using three controller nodes.
Story Points: ---
Clone Of:
Environment:
Last Closed: 2016-10-04 15:03:40 EDT
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---


Attachments
/var/log/messages from controller (86.29 KB, application/x-gzip)
2015-10-05 18:27 EDT, Alexander Chuzhoy

Description Alexander Chuzhoy 2015-10-05 18:23:37 EDT
rhe-osp-director: HA overcloud deployment with 5 controllers fails.


Environment:
instack-undercloud-2.1.2-29.el7ost.noarch


Steps to reproduce:
Attempt to deploy overcloud with 5 controllers:

openstack overcloud deploy --templates --control-scale 5 --compute-scale 1 --ceph-storage-scale 1 -e /usr/share/openstack-tripleo-heat-templates/environments/storage-environment.yaml -e /home/stack/network-environment.yaml --ntp-server x.x.x.x  --timeout 90


Result:
Stack failed with status: Resource CREATE failed: Error: resources.ControllerNodesPostDeployment.resources.ControllerServicesBaseDeployment_Step2.resources[1]: Deployment to server failed: deploy_status_code: Deployment exited with non-zero status code: 6
ERROR: openstack Heat Stack create failed.

See this repeating message on controllers:
Oct  5 16:10:31 localhost galera(galera)[38739]: ERROR: MySQL is not running
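
To quantify how often the message above repeats, the controller log can be scanned for galera resource-agent errors. This is a minimal sketch, assuming the syslog line format shown above; the function name `count_galera_errors` and the second and third sample lines are hypothetical and only illustrate the matching.

```python
import re
from collections import Counter

# Matches syslog lines such as:
#   Oct  5 16:10:31 localhost galera(galera)[38739]: ERROR: MySQL is not running
GALERA_ERROR = re.compile(
    r"^(?P<stamp>\w{3}\s+\d+\s[\d:]{8})\s(?P<host>\S+)\s"
    r"galera\(galera\)\[\d+\]:\sERROR:\s(?P<msg>.*)$"
)

def count_galera_errors(lines):
    """Return a Counter mapping each galera ERROR message to its occurrence count."""
    counts = Counter()
    for line in lines:
        match = GALERA_ERROR.match(line.rstrip("\n"))
        if match:
            counts[match.group("msg")] += 1
    return counts

sample = [
    "Oct  5 16:10:31 localhost galera(galera)[38739]: ERROR: MySQL is not running\n",
    # Hypothetical lines, added only to exercise the matcher:
    "Oct  5 16:10:41 localhost galera(galera)[38801]: ERROR: MySQL is not running\n",
    "Oct  5 16:10:42 localhost systemd: Started Session 12 of user root.\n",
]
print(count_galera_errors(sample))
```

For the gzipped attachment, the lines could be read with `gzip.open(path, "rt")` from the standard library, where `path` is wherever the file was saved.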
Comment 2 Alexander Chuzhoy 2015-10-05 18:27 EDT
Created attachment 1080084
/var/log/messages from controller
Comment 5 Jaromir Coufal 2016-01-07 04:37:05 EST
The doc_text is wrong. We have to support 5 controllers; it is just not recommended due to performance issues.
Comment 6 jliberma@redhat.com 2016-01-14 22:56:58 EST
Jarda, where can I learn more about the performance issues? Is this the Galera database replication overhead issue, where 3 controllers seem to be the sweet spot?
Comment 7 Jaromir Coufal 2016-01-27 06:36:07 EST
Hey Jacob, sorry for the late answer. I would reach out to the performance team. I am sure there are multiple constraints, and the DB would be one of them.
Comment 9 Mike Burns 2016-04-07 16:54:03 EDT
This bug did not make the OSP 8.0 release.  It is being deferred to OSP 10.
Comment 10 Joe Talerico 2016-04-14 09:51:49 EDT
Jarda / Jacob - The performance issues we saw were with >= 7 controllers; however, this was when we were still deploying with OFI. We have yet to do a deployment with > 3 controllers with director. The DB became problematic because we do not dynamically configure max_connections in MariaDB based on the number of controllers/services.
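
To illustrate why a fixed max_connections can become a problem as controllers are added, here is a rough back-of-the-envelope sketch. The formula, the function name `estimated_db_connections`, and every number below are illustrative assumptions, not how director or MariaDB actually sizes connections; it assumes a worst case in which every worker on every controller can fill its connection pool against the same database endpoint.

```python
def estimated_db_connections(controllers, services_per_controller,
                             workers_per_service, pool_size_per_worker):
    """Worst-case count of simultaneous DB connections (illustrative only)."""
    return (controllers * services_per_controller
            * workers_per_service * pool_size_per_worker)

# Hypothetical figures, chosen only to show that demand grows linearly
# with the controller count while a static max_connections does not.
for n in (3, 5, 7):
    demand = estimated_db_connections(
        n, services_per_controller=10, workers_per_service=8,
        pool_size_per_worker=5)
    print(f"{n} controllers -> up to {demand} connections")
```

Under these assumed figures, a limit tuned for 3 controllers would be exceeded well before 7 controllers unless it is scaled with the deployment.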
Comment 13 David Hill 2016-04-17 18:53:12 EDT
This can be configured through Pacemaker by modifying:
/usr/share/openstack-tripleo-heat-templates/puppet/manifests/overcloud_controller_pacemaker.pp
and replacing:
meta_params     => "master-max=3 ordered=true",
by:
meta_params     => "master-max=5 ordered=true",

I'm trying this right now and I'll keep this case updated.
Comment 15 David Hill 2016-04-17 20:13:11 EDT
This can be configured through Pacemaker by modifying:
/usr/share/openstack-tripleo-heat-templates/puppet/manifests/overcloud_controller_pacemaker.pp
and replacing:
meta_params     => "master-max=3 ordered=true",
by:
meta_params     => "master-max=5 ordered=true",

I tested it and everything's working as expected. In my test environment, I don't see a major performance hit.
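
The manual edit described above can be sketched as a small text rewrite. This is a minimal sketch under stated assumptions: the helper name `set_master_max` is hypothetical, and it only touches `master-max=N` occurrences that are followed by `ordered=true`, matching the line quoted above. Whether other resources in the manifest share that pattern is not checked here, so the result should be reviewed before use.

```python
import re

# Only rewrite master-max values that are followed by "ordered=true",
# i.e. the meta_params line quoted in the comment above.
MASTER_MAX = re.compile(r"(master-max=)\d+(\s+ordered=true)")

def set_master_max(manifest_text, controller_count):
    """Return manifest_text with matching master-max values set to controller_count."""
    return MASTER_MAX.sub(
        lambda m: f"{m.group(1)}{controller_count}{m.group(2)}",
        manifest_text,
    )

line = '      meta_params     => "master-max=3 ordered=true",'
print(set_master_max(line, 5))
```

To apply it to the manifest, the file contents would be read, passed through `set_master_max`, and written back, ideally after keeping a backup copy of the original file.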
Comment 16 Michele Baldessari 2016-05-24 08:23:38 EDT
Note that this is fixed upstream where we have:
puppet/manifests/overcloud_controller_pacemaker.pp:      meta_params     => "master-max=${galera_nodes_count} ordered=true"

I *think* this was post-kilo, but I can't find old git history in the tht repo.
Comment 17 Jaromir Coufal 2016-10-04 15:03:40 EDT
Thanks Michele!
