Bug 1240824
Summary: | MariaDB max_connections made configurable in yaml files | |||
---|---|---|---|---|
Product: | Red Hat OpenStack | Reporter: | Ofer Blaut <oblaut> | |
Component: | rhosp-director | Assignee: | Giulio Fidente <gfidente> | |
Status: | CLOSED ERRATA | QA Contact: | Asaf Hirshberg <ahirshbe> | |
Severity: | urgent | Docs Contact: | ||
Priority: | high | |||
Version: | Director | CC: | ahirshbe, akrzos, dmacpher, fdinitto, gfidente, jliberma, jslagle, jstransk, mandreou, markmc, mbayer, mburns, mwagner, oblaut, ohochman, rhel-osp-director-maint, rohara, rrosa, sclewis, vcojot | |
Target Milestone: | ga | Keywords: | Triaged | |
Target Release: | Director | |||
Hardware: | Unspecified | |||
OS: | Unspecified | |||
Whiteboard: | ||||
Fixed In Version: | openstack-tripleo-heat-templates-0.8.6-33.el7ost | Doc Type: | Known Issue | |
Doc Text: |
The number of connections to the database scaled depending on the number of Controllers and the cores of each Controller. In HA scenarios with three controllers where each has more than 12 cores, the database connections could reach the default max_connections value set to 1024. This caused services to not respond to requests. As a workaround, increase the max_connections limit with the following command:
$ openstack management plan set [tuskar_plan_uuid] -P "Controller-1::MysqlMaxConnections=4096"
Replace [tuskar_plan_uuid] with the actual plan UUID, which you can find with:
$ openstack management plan list
To increase the max_connections value when deploying with the --templates argument, provide to the deploy command an additional customization environment file containing the following:
parameters:
MysqlMaxConnections: 4096
Add it to the deploy command with:
$ openstack management deploy --plan overcloud -e /path/to/custom_environment_file.yaml
|
Story Points: | --- | |
Clone Of: | ||||
: | 1240826 1247924 (view as bug list) | Environment: | ||
Last Closed: | 2015-08-05 13:58:44 UTC | Type: | Bug | |
Regression: | --- | Mount Type: | --- | |
Documentation: | --- | CRM: | ||
Verified Versions: | Category: | --- | ||
oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | ||
Cloudforms Team: | --- | Target Upstream Version: | ||
Embargoed: | ||||
Bug Depends On: | ||||
Bug Blocks: | 1266253 |
Description
Ofer Blaut
2015-07-07 21:01:12 UTC
*** Bug 1241050 has been marked as a duplicate of this bug. *** There are 3 things we could do here, I think: 1. make the value configurable generically (template, not hiera) 2. Set a very high default 3. Make the value smarter like the summary says. I think we need #1 regardless #2 would be a temporary stopgap solution #3 is a significant amount of background work. To start with we make it configurable If we are kicking up the number of connections, do we need to look at things like the amount of memory allocated as well ? Each connection requires a certain amount of memory and can possibly use more than that while running queries. hi Mark, we still don't have a formula to scale the value automatically so the patch currently makes max_connections configurable but does not change the setting which is left to 1024. We know that 1024 fails on a 3 node HA configuration with fairly standard hardware. Please make the default 4096 so it works "out of the box". People not using it in production or on their laptops can tune it smaller. The default needs to work for production use. Please note https://bugzilla.redhat.com/show_bug.cgi?id=1241569#c13 where we discovered in a setup consisting of 3 controllers with 96 logical CPUs, that the steady state was over 2000 connections to the DB We also discovered that the haproxy max connections setting for the DB proxy frontend was set to 2000, so we increased that to over 8192 since the DB max connections was set to 8192 hi, thanks for adding some real data into the thread; there are similar problems with rabbitmq as well so I pushed a bunch of additional changes to make parametric the rabbitmq fd limit [1], increase the default mysql max_connections setting [2] and haproxy frontend maxconn [3] 1. https://review.openstack.org/#/c/201796/ 2. https://review.openstack.org/#/c/202516/ 3. https://review.openstack.org/#/c/202525/ With the patch attached to the BZ it is possible to customize the Galera max_connections setting but it is not scaled by default with the number of cores. To customze max_connections one can deploy using a custom environment file containing: parameter_defaults: MysqlMaxConnections: 2048 Use of parameter_defaults: is unsupported when deploying with Tuskar; alternatively the following will work when deploying with Tuskar and also without Tuskar: parameters: MysqlMaxConnections: 2048 this should be edited into a .yaml file passed with the -e argument when deploying or it can be appended into the parameters: section of a pre-existing .yaml (eg. network-isolation.yaml) @gfidente I just took a look at the patches in gerrit, and the value is now configurable but the default is set to 1024. Is that what is being presently QA'd? If so, I have to agree with MarkW in comment #9 that we should set the default higher, otherwise every person doing a realistic deployment is going to have to remember to set this to 4096 or higher. I think the only negative to setting to 4096 by default is the memory consumption of mariadb when it starts up? Is there any other reason why we shouldn't do this by default so that out of the box this is optimized for production usage. @mbayer, can you comment on the above? Secondly... we also found that HAProxy needed it's connection limit adjusted from the default of 2000 per VIP, because the mariadb connections are long-lived. So, even if you set the max connections in mariadb to 4096, if the HAProxy settings are left as defaults, you'll be capped at 2000 simultaneous connections anyhow. @rohara has suggested to either set this value to at or higher than the DB max connections value, or to leave the max connections for HAProxy completely unbounded. @rohara, please chime in. I agree with Perry and Mark. Default should be "enterprise/production ready". Ack, I think it is safe to cherry-pick at least: https://review.openstack.org/#/c/202516/ which increases the db max_connections to 4096. There is also a matching change for haproxy to increase the per-frontend limit to same amount and this should be safe to merge as well (both landed upstream already): https://review.openstack.org/#/c/202525/ The change which increases the RabbitMQ FD limit instead is not passing CI and it needs an update to make puppet echo the actual limit into rabbitmq-env.conf so it might need a little more time before it's ready. I don't disagree with the comments at all. My one question is whether these are release blockers or should we clone this to a bug for the next release to increase the defaults. Or should we respin and likely slip the release? We do not want haproxy refusing connections for galera. If anything is going to enforce maximum connections, if should be the database itself. Therefore haproxy maxconn limit for the galera proxy should be greater than the database max_connections. Also, I am not entirely sure what is setting the max_connections in TripleO (puppet?), but the puppet module we are using in RHOS6 (read: old) will set max_connections based on hardware resources. I think the script included with the puppet-mysql module is mysqltuner.pl. Perhaps this has changed in newer versions of puppet-myself, and perhaps if you explicitly set max_connections as a parameter in the puppet class if will avoid tuning this value. Hi The bug is on_qa and i'm not sure what is expected here ( comment #15) ? yaml files should use default values without need to configure them ? test increase the default values by editing network-isolation.yaml ? Which parameters ? MysqlMaxConnections to 4096 $ sudo grep -i MysqlMaxConnections /usr/share/openstack-tripleo-heat-templates/puppet/controller-puppet.yaml -A 4 Still 1024 haproxy_global_maxconn = 20480 haproxy_default_maxconn = 4096 [stack@puma33 ~]$ sudo grep -ir haproxy_global_maxconn /usr/share/openstack-puppet/modules/tripleo/manifests/loadbalancer.pp -A 4 Still 10000 I came here with the intent to help out on the doctext... didn't quite get there but to that extent I have another workaround to offer for time.now(). From my local deploy of latest poodle it seems Giulio's patches for making this configurable ARE downstream, whereas his patches for making this default to 4096 are not. Besides Giulio's workaround above at comment 13 you can also patch the openstack overcloud deployment plan - I just tested this with (what i got when i deployed latest i think) openstack-heat-templates-0-0.6.20150605git.el7ost.noarch and it works like so: (before deploying an overcloud): # get the plan id: openstack management plan list # set the mysql connections max to 4096: openstack management plan set 44874dcc-9b5b-404e-929d-471e36c7b5e0 -P "Controller-1::MysqlMaxConnections=4096" # deploy your overcloud: openstack management deploy --plan overcloud --control-scale 3 --compute-scale 3 # after 'Overcloud Deployed' is reported, confirm this limit on the controllers: controller_ips=`nova list | grep control | grep ctlplane | awk '{print $12}' | tr "ctlplane=" "\n"` for i in ${controller_ips[@]}; do ssh heat-admin@$i "sudo grep -rni 'max_connections' /etc/*"; echo " ===== "; done hi Ofer, only the patch which allows for customization of max_connections is included in the initial release so to test this the only option we have is to customize the default setting and check if that is increased in galera.cnf. I have pased into doc_text some commands to customize it before the deployment; it would indeed be nice to check if the instructions work as intended. option suggested in comment 13 didn't work parameters: MysqlMaxConnections: 2048 openstack overcloud deploy -e /home/stack/network-environment.yaml -e /usr/share/openstack-tripleo-heat-templates/environments/network-isolation.yaml --plan overcloud --control-scale 3 --compute-scale 1 --neutron-network-type vxlan --neutron-tunnel-types vxlan --ntp-server clock.redhat.com /tmp/tmp7NIm1z/extraconfig/post_deploy/default.yaml /tmp/tmp7NIm1z/puppet/all-nodes-config.yaml ERROR: openstack ERROR: The Parameter (MysqlMaxConnections) was not defined in template. only thing worked is MariaDB [(none)]> show variables like "max_connections"; +-----------------+-------+ | Variable_name | Value | +-----------------+-------+ | max_connections | 4096 | +-----------------+-------+ 1 row in set (0.00 sec) parameters: Controller-1::NeutronExternalNetworkBridge: "''" Controller-1::MysqlMaxConnections: 4096 please advise openstack-tripleo-heat-templates-0.8.6-45.el7ost.noarch hi Ofer, thanks for the feedback, comment #13 would still only work with standard templated, the instructions in the doc text should be correct though: Controller-1::MysqlMaxConnections is the parameter name to be used when deploying with tuskar and it seems to have worked as per your comment #23, while: MysqlMaxConnections is the parameter name to be used when deploying with regular templates (passing --templates to the deploy command) Workaround works, bug is cloned to https://bugzilla.redhat.com/show_bug.cgi?id=1247924 , for automated solution Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHEA-2015:1549 *** Bug 1246084 has been marked as a duplicate of this bug. *** adding the link to the downstream patch that made the default 4096 If we determine that the MysqlMaxConnections should be 8192, and the haproxy maxconn default is still 4096, what is the best way to increase the HAproxy maxconn through the templates in a repeatable way? Is this exposed anywhere? I see haproxy_default_maxconn is set in /usr/share/openstack-puppet/modules/tripleo/manifests/loadbalancer.pp, but is there a way to change it in Heat templates? Is this exposed anywhere or do I have to manually change the puppet manifest? (Gfidente is just getting back from PTO today so i'll reduce the e-mail flood on him by answering this one :) ) It can be set in a way described here: https://access.redhat.com/documentation/en/red-hat-openstack-platform/8/director-installation-and-usage/616-customizing-puppet-configuration-data So e.g. parameter_defaults: controllerExtraConfig: tripleo::loadbalancer::haproxy_default_maxconn: 8192 There are two relevant changes coming in Newton: `controllerExtraConfig` will be deprecated in favor of `ControllerExtraConfig`, and `tripleo::loadbalancer` will become `tripleo::haproxy`. |