Red Hat Bugzilla – Bug 1269005
rhe-osp-director: HA overcloud deployment with 5 controllers fails.
Last modified: 2016-10-04 15:03:57 EDT
Environment:
instack-undercloud-2.1.2-29.el7ost.noarch

Steps to reproduce:
Attempt to deploy an overcloud with 5 controllers:
    openstack overcloud deploy --templates --control-scale 5 --compute-scale 1 --ceph-storage-scale 1 -e /usr/share/openstack-tripleo-heat-templates/environments/storage-environment.yaml -e /home/stack/network-environment.yaml --ntp-server x.x.x.x --timeout 90

Result:
    Stack failed with status: Resource CREATE failed: Error: resources.ControllerNodesPostDeployment.resources.ControllerServicesBaseDeployment_Step2.resources[1]: Deployment to server failed: deploy_status_code: Deployment exited with non-zero status code: 6
    ERROR: openstack Heat Stack create failed.

The following message repeats on the controllers:
    Oct 5 16:10:31 localhost galera(galera)[38739]: ERROR: MySQL is not running
Created attachment 1080084: /var/log/messages from controller
The doc_text is wrong: we have to support 5 controllers. It is just not recommended due to performance issues.
Jarda where can I learn more about the performance issues? Is this the Galera database replication overhead issue, where 3 controllers seems to be the sweet spot?
Hey Jacob, sorry for the late answer. I would reach out to the performance team. I am sure there are multiple constraints, and the DB would be one of them.
This bug did not make the OSP 8.0 release. It is being deferred to OSP 10.
Jarda / Jacob - The performance issues we saw were with >= 7 controllers; however, this was when we were still deploying with OFI. We have yet to do a deployment with > 3 controllers with director. The DB became problematic because we do not dynamically configure max_connections for MariaDB based on the number of controllers/services.
This can be configured through Pacemaker by modifying:
    /usr/share/openstack-tripleo-heat-templates/puppet/manifests/overcloud_controller_pacemaker.pp
and replacing:
    meta_params => "master-max=3 ordered=true",
with:
    meta_params => "master-max=5 ordered=true",
I'm trying this right now and I'll keep this case updated.
This can be configured through Pacemaker by modifying:
    /usr/share/openstack-tripleo-heat-templates/puppet/manifests/overcloud_controller_pacemaker.pp
and replacing:
    meta_params => "master-max=3 ordered=true",
with:
    meta_params => "master-max=5 ordered=true",
I tested it and everything's working as expected. In my test environment, I don't see a major performance hit.
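
For anyone who wants to script the workaround above, here is a minimal sketch of one way to do it. The function name set_galera_master_max and the ".orig" backup suffix are made up for illustration and are not part of director. It assumes the manifest contains the meta_params line exactly as quoted in this bug, and it rewrites every line matching that pattern, so compare the result against the backup before deploying.

```shell
#!/bin/sh
# Hypothetical helper (not shipped with director): set the Galera master-max
# value in a Puppet manifest such as overcloud_controller_pacemaker.pp.
set_galera_master_max() {
    manifest="$1"   # path to the manifest to edit
    count="$2"      # desired master-max value, e.g. 5

    # Accept digits only, so nothing unexpected ends up in the sed expression.
    case "$count" in
        ''|*[!0-9]*) echo "count must contain digits only" >&2; return 1 ;;
    esac

    # Keep a backup, then rewrite only the number in "master-max=N ordered=true".
    cp -p "$manifest" "$manifest.orig" || return 1
    sed "s/master-max=[0-9][0-9]* ordered=true/master-max=$count ordered=true/" \
        "$manifest.orig" > "$manifest"
}

# Example (run as root on the undercloud before deploying):
# set_galera_master_max /usr/share/openstack-tripleo-heat-templates/puppet/manifests/overcloud_controller_pacemaker.pp 5
```

Running diff against the .orig copy afterwards shows exactly which lines changed.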
Note that this is fixed upstream, where puppet/manifests/overcloud_controller_pacemaker.pp has:
    meta_params => "master-max=${galera_nodes_count} ordered=true"
I *think* this was post-kilo, but I can't find old git history in the tht repo.
Thanks Michele!