Bug 1472249 - MariaDB adjustment parameter missing
Summary: MariaDB adjustment parameter missing
Keywords:
Status: CLOSED DUPLICATE of bug 1483656
Alias: None
Product: Red Hat OpenStack
Classification: Red Hat
Component: openstack-tripleo-heat-templates
Version: 13.0 (Queens)
Hardware: Unspecified
OS: Unspecified
Priority: medium
Severity: medium
Target Milestone: ---
Assignee: Michael Bayer
QA Contact: Gurenko Alex
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2017-07-18 10:04 UTC by Aviv Guetta
Modified: 2020-12-14 09:08 UTC (History)

Fixed In Version:
Doc Type: Enhancement
Doc Text:
Clone Of:
Environment:
Last Closed: 2017-10-02 14:49:15 UTC
Target Upstream Version:
Embargoed:



Links
System ID Private Priority Status Summary Last Updated
Launchpad 1704978 0 None None None 2017-07-18 10:04:15 UTC

Description Aviv Guetta 2017-07-18 10:04:16 UTC
Description of problem:

The MariaDB adjustment settings were moved out of the controller.yaml template into /puppet/services/database/mysql.yaml [1].

Unfortunately, the 'MysqlInnodbBufferPoolSize' parameter wasn't moved along with the other parameters, so it is now missing.

'MysqlInnodbBufferPoolSize' should be added and properly exposed in mysql.pp [2].

[1] http://git.openstack.org/cgit/openstack/tripleo-heat-templates/commit/?id=58bf3932a86f2f5582937e3da8cb74dfd29c116b
[2] https://github.com/openstack/puppet-tripleo/blob/stable/newton/manifests/profile/pacemaker/database/mysql.pp

Version-Release number of selected component (if applicable):
Red Hat OpenStack Platform 10

Comment 1 Fabio Massimo Di Nitto 2017-07-19 08:53:00 UTC
Mike, can you please take a look at this?

This sounds like a regression to me, unless hiding the parameter was done on purpose.

Comment 2 Michael Bayer 2017-07-19 14:43:52 UTC
I'd look into adding it to puppet-tripleo in the same way I added innodb_flush_log_at_trx_commit in https://review.openstack.org/#/c/479849/.   I'm assuming we don't need the .yaml flag here in tripleo-heat-templates as it will be accessible via a hiera variable / ControllerExtraConfig.

Comment 3 Michael Bayer 2017-07-19 15:01:54 UTC
I'm being told by @dprince that the rationale for the removal above was that this setting was only added to hiera and never consumed, so the MysqlInnodbBufferPoolSize heat template setting had no actual effect. If true, you would not see this setting in the customer's galera.cnf / my.cnf.d/* files, and this would not be a regression. Is this something that can be easily confirmed on the customer side (e.g. can we see their /etc/my.cnf*)?
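
For whoever has access to the customer's files, the check asked for above could be scripted roughly as follows. This is only a sketch: the helper names find_buffer_pool_settings and read_option_files are made up for illustration, and it assumes the setting would appear as a plain "name = value" line in an option file.

```python
import os
import re

# Option files accept dashes and underscores interchangeably in option names.
# Lines starting with '#' or ';' are comments and will not match.
SETTING = re.compile(
    r"^\s*innodb[_-]buffer[_-]pool[_-]size\s*=\s*(\S+)", re.IGNORECASE
)


def find_buffer_pool_settings(lines_by_file):
    """Return (filename, value) pairs for every uncommented setting line."""
    found = []
    for name, lines in lines_by_file.items():
        for line in lines:
            match = SETTING.match(line)
            if match:
                found.append((name, match.group(1)))
    return found


def read_option_files(paths):
    """Read each path into a list of lines; directories are walked recursively.

    Files that cannot be opened are skipped silently.
    """
    contents = {}
    for path in paths:
        if os.path.isdir(path):
            candidates = [
                os.path.join(root, filename)
                for root, _, files in os.walk(path)
                for filename in files
            ]
        else:
            candidates = [path]
        for candidate in candidates:
            try:
                with open(candidate, encoding="utf-8", errors="replace") as handle:
                    contents[candidate] = handle.read().splitlines()
            except OSError:
                continue
    return contents
```

On a controller this would be called as find_buffer_pool_settings(read_option_files(glob.glob("/etc/my.cnf*"))) after importing glob. An empty result would support the "never had any effect" theory; it would not prove it, since the value could also be set on the mysqld command line.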

Comment 4 Michael Bayer 2017-07-19 18:54:06 UTC
So I've confirmed w/ two different tripleo engineers that this configuration variable never did anything and tripleo has never had the ability to change this admittedly important setting.  So here we need to pursue this via the RFE process and it would be targeted first at Queens.

Comment 10 Anthony Herr 2017-08-02 12:52:42 UTC
What is the customer's tolerance for upgrading to OSP 12/13 if this change is made in the product at that time?

Comment 12 Anthony Herr 2017-08-02 15:23:49 UTC
Is the customer willing to continue performing the manual operation, with the expectation that this will be enhanced in OSP 12/13? I understand that the customer is of high strategic value. The issue is backporting new features: although there was an expectation that the feature was in place, we understand that it was not, and there is the potential that the enhancement will break something else. The reason we go through exhaustive testing during the release is to ensure new features do not impact old ones. I am reluctant to authorize this if there is a currently supported workaround, especially if the customer is comfortable with that workaround.

Comment 15 Radosław Śmigielski 2017-08-03 09:00:46 UTC
Anthony, let me disagree on the backporting: MysqlInnodbBufferPoolSize was in OSP 8.0 (Mitaka) and it's gone now. It looks like it was slowly vanishing over time in OSP 9.0 and is missing in OSP 10.0. So to me it looks like a regression, not like backporting a new feature.
This is what git grep shows on OSP 8.0:

❯ git grep MysqlInnodbBufferPoolSize
deprecated/overcloud-source.yaml:  MysqlInnodbBufferPoolSize:
deprecated/overcloud-source.yaml:          innodb_buffer_pool_size: {get_param: MysqlInnodbBufferPoolSize}
deprecated/undercloud-source.yaml:  MysqlInnodbBufferPoolSize:
deprecated/undercloud-source.yaml:          innodb_buffer_pool_size: {get_param: MysqlInnodbBufferPoolSize}
os-apply-config/controller.yaml:  MysqlInnodbBufferPoolSize:
os-apply-config/controller.yaml:        mysql_innodb_buffer_pool_size: {get_param: MysqlInnodbBufferPoolSize}
overcloud.yaml:  MysqlInnodbBufferPoolSize:
overcloud.yaml:          MysqlInnodbBufferPoolSize: {get_param: MysqlInnodbBufferPoolSize}
puppet/controller.yaml:  MysqlInnodbBufferPoolSize:
puppet/controller.yaml:        mysql_innodb_buffer_pool_size: {get_param: MysqlInnodbBufferPoolSize}



OSP 10 (Newton) is Red Hat's long-term support release, and we are working on a supported version of our product which is based on OSP 10, so going to 12/13 now is not an option for us.

The default MariaDB InnoDB buffer pool size is 128MB. With this size you can't really scale the OC (overcloud) to more than 20 computes, and at that number the controllers will have a really hard time handling the users' load. For me this is a really major problem.

Comment 16 Fabio Massimo Di Nitto 2017-08-03 12:27:07 UTC
(In reply to Radosław Śmigielski from comment #15)
> Anthony, let me disagree on the backporting, MysqlInnodbBufferPoolSize was
> in OSP 8.0 (mitaka) and it's gone now.

Based on our information, this is not correct either.

The option was there in OSP8 but was never functional and has never been working upstream or downstream in OSP8. According to the information provided to us by Yolanda, you had a forked version of puppet modules to handle it internally.
Upstream removed the feature and since it was not functional in the first place it did not cause any regression, except in your environment.

We proposed a patch upstream to re-include the feature as mentioned above.

Also, with this default setting we have been able to deploy 3 controllers with over 300 compute nodes without any problem.
So, setting aside the backport request, it would also be interesting to understand why you are hitting this limit of 20 compute nodes. The RCA might be completely different, with the mysql option only masking / hiding the problem.

Comment 17 Michael Bayer 2017-08-03 14:31:46 UTC
There have also been lots of database misconfigurations and inefficient programming patterns that have been corrected since earlier OSP versions: poor performance of the DB driver under high concurrency, connection pool settings that led to lots of requests waiting too long, haproxy settings that would time out connections too early, leading to disconnects and transaction contention between galera nodes, things like that. OpenStack is not actually a database-intensive application from a MySQL perspective.

Comment 18 Michael Bayer 2017-08-03 16:09:48 UTC
This is proposed upstream as a hiera parameter, innodb_buffer_pool_size, for pike, ocata and newton: https://review.openstack.org/#/q/Iabdcb6f76510becb98cba35c95db550ffce44ff3,n,z
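
To illustrate how such a hiera parameter might be fed in through ControllerExtraConfig (mentioned in comment 2), here is a small sketch that renders a Heat environment file. The helper name render_extra_config is made up, and whether the bare key innodb_buffer_pool_size is the exact hiera key a given release consumes is an assumption taken from the wording above, so check it against the merged patch before relying on it.

```python
import re

# A positive integer with an optional size suffix, e.g. "2G" or "512M".
SIZE_PATTERN = re.compile(r"[1-9][0-9]*[KMGT]?")


def render_extra_config(size):
    """Return the text of an environment file that sets one hiera key.

    The key name comes from this bug's discussion and is an assumption,
    not something verified against a particular puppet-tripleo release.
    """
    if not SIZE_PATTERN.fullmatch(size):
        raise ValueError(f"unexpected size value: {size!r}")
    return (
        "parameter_defaults:\n"
        "  ControllerExtraConfig:\n"
        f"    innodb_buffer_pool_size: '{size}'\n"
    )
```

The returned text would be written to an environment file (any name, e.g. a hypothetical mysql-tuning.yaml) and passed to the deploy command with -e. The size check is only there to catch typos before a long deployment run.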

Comment 19 Radosław Śmigielski 2017-08-04 08:54:42 UTC
> We proposed a patch upstream to re-include the feature as mentioned above.
So I was the first one who gave +1 to that patch :) 
https://review.openstack.org/#/c/490046/

>> Also, with this default setting we have been able to deploy 3 controllers
>> with over 300 compute nodes without any problem.
I bet all your controllers were running on SSDs and not on traditional drives?

With a small InnoDB buffer, MariaDB needs to do many more fsync() calls, and at some point it hits the IOPS limit of the local drive. So having a bigger buffer is really essential.
The default InnodbBufferPoolSize value of 128MB is very conservative, and general MariaDB/MySQL good practice is to give it 80% of memory on a dedicated server. We don't run MariaDB on a dedicated server, but 128MB is still way too low.

Comment 20 Michael Bayer 2017-08-04 13:46:27 UTC
(In reply to Radosław Śmigielski from comment #19)
> > We proposed a patch upstream to re-include the feature as mentioned above.
> So I was the first one who gave +1 to that patch :) 
> https://review.openstack.org/#/c/490046/
> 
> >> Also, with this default setting we have been able to deploy 3 controllers
> >> with over 300 compute nodes without any problem.
> I bet all your controllers were running on SSD? and not on traditional
> drives?
> 
> With small InnoDB buffer MariaDB needs to do much more fsync() and at some
> point it hits limit of IOPS of local drive. So having bigger buffer is
> really essential.

The innodb_buffer_pool_size is only about caching pages from disk files in memory for reads, whereas fsyncs are for flushing newly written data to the filesystem; there is no documented correlation between the two. If your problem is excess fsync(), you want to be looking at innodb_flush_log_at_trx_commit=2, assuming a Galera cluster is in use, which is also a setting we've recently added to tripleo. This makes it so that the fsync() call occurs only once per second rather than once per commit, and it can provide extremely dramatic performance improvements immediately, at the cost of a slight degradation of durability, which is ameliorated by the fact that a Galera cluster is replicating writesets to all nodes.
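
As a rough back-of-the-envelope illustration of the once-per-commit versus once-per-second difference, the arithmetic can be sketched like this. The function name and the workload numbers are made up, and the model is deliberately simplified: it ignores group commit and any other flushing the server does.

```python
def estimated_log_fsyncs(commits_per_second, seconds, flush_log_at_trx_commit):
    """Estimate redo-log fsync() calls under a simplified model.

    1: one fsync per commit.
    2: roughly one fsync per second, regardless of commit rate.
    """
    if flush_log_at_trx_commit == 1:
        return commits_per_second * seconds
    if flush_log_at_trx_commit == 2:
        return seconds
    raise ValueError("this sketch only models settings 1 and 2")


# Hypothetical workload: 500 commits per second over one minute.
per_commit = estimated_log_fsyncs(500, 60, 1)  # 30000
per_second = estimated_log_fsyncs(500, 60, 2)  # 60
```

Under this model the number of log fsyncs with setting 2 no longer depends on the commit rate at all, which is why the gain is largest on drives with a low IOPS ceiling.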

> The default value 128MB of InnodbBufferPoolSize is very conservative and in
> general MariaDB/MySQL good practice is to give 80% of memory on dedicated
> server. 

That also doesn't apply in this case, because we bundle the galera/mysql instances on a controller that has dozens of other Python and C-based services running. mysqld will still be the big memory user on the controller node where galera is active, but 80% would be way too much.
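
To put numbers on the sizing discussion, here is a small sketch that turns a percentage of RAM into a buffer pool value. The function name, the 64 GiB controller, and the 25% figure are illustrative assumptions only; nothing in this bug establishes what share is right for a shared controller.

```python
MIB = 2**20
GIB = 2**30


def buffer_pool_setting(total_ram_bytes, percent):
    """Return a size string in whole MiB for the given share of RAM."""
    if not 0 < percent <= 100:
        raise ValueError("percent must be in the range (0, 100]")
    pool_bytes = total_ram_bytes * percent // 100
    return f"{pool_bytes // MIB}M"


# Hypothetical controller with 64 GiB of RAM.
dedicated_rule = buffer_pool_setting(64 * GIB, 80)  # "52428M"
smaller_share = buffer_pool_setting(64 * GIB, 25)   # "16384M"
```

On such a node the 80% rule would hand over 51 GiB to mysqld, while the 128MB default is about 0.2% of memory, so the workable value for a controller that also runs the other services lies somewhere in between and has to be found by measurement.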

Comment 21 Michael Bayer 2017-10-02 14:19:04 UTC
Note that bz#1483656 is the RHOS 10 patch; this issue needs to be either targeted correctly or marked as a dupe.

Comment 22 Michael Bayer 2017-10-02 14:49:15 UTC
For ocata, puppet-tripleo 6.5.1 has the feature (https://docs.openstack.org/releasenotes/puppet-tripleo/ocata.html); downstream is built and tagged to rhos 11 at https://brewweb.engineering.redhat.com/brew/buildinfo?buildID=598578. For pike, the feature is merged upstream and released in 7.3.0, and downstream is on 7.4.x now.

*** This bug has been marked as a duplicate of bug 1483656 ***

