Bug 1991644 - [RFE] Allow tuning of galera gcache size
Summary: [RFE] Allow tuning of galera gcache size
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat OpenStack
Classification: Red Hat
Component: puppet-tripleo
Version: 16.2 (Train)
Hardware: x86_64
OS: Linux
Priority: medium
Severity: medium
Target Milestone: z2
Target Release: 16.2 (Train on RHEL 8.4)
Assignee: Luca Miccini
QA Contact: dabarzil
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2021-08-09 15:16 UTC by Cristian Muresanu
Modified: 2022-03-23 22:11 UTC (History)

Fixed In Version: puppet-tripleo-11.7.0-2.20211224004900.be47189.el8ost
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2022-03-23 22:11:12 UTC
Target Upstream Version:
Embargoed:



Links
System                  ID              Private  Priority  Status  Summary  Last Updated
OpenStack gerrit        804155          0        None      None    None     2021-08-11 06:03:16 UTC
Red Hat Issue Tracker   OSP-6925        0        None      None    None     2021-11-18 15:01:43 UTC
Red Hat Product Errata  RHBA-2022:1001  0        None      None    None     2022-03-23 22:11:30 UTC

Description Cristian Muresanu 2021-08-09 15:16:03 UTC
Description of problem:

Would it be possible to make `$wsrep_provider_options` in `https://opendev.org/openstack/puppet-tripleo/src/branch/master/manifests/profile/pacemaker/database/mysql_bundle.pp` a hash and collapse it to a string after the `deep_merge` on line 346?
I think this would make it possible to set individual `wsrep_provider_options` without messing around with `gmcast.listen_addr` or potential TLS settings.
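
The idea of keeping the options as a hash and collapsing them to a string only after merging can be sketched as follows. This is a minimal Python illustration, not the Puppet code in `mysql_bundle.pp`: the names `DEFAULT_PROVIDER_OPTIONS` and `collapse_provider_options` are hypothetical, and the listen address is just an example value.

```python
# Hypothetical defaults; the real manifest computes its own values.
DEFAULT_PROVIDER_OPTIONS = {
    "gmcast.listen_addr": "tcp://172.17.1.140:4567",
}


def collapse_provider_options(defaults, overrides):
    """Merge overrides over the defaults, then join them as 'key=value;' pairs."""
    merged = {**defaults, **overrides}
    # Sort the keys so the resulting string is deterministic.
    return "".join(f"{key}={value};" for key, value in sorted(merged.items()))


# Setting only gcache.size leaves gmcast.listen_addr at its default.
provider_options = collapse_provider_options(
    DEFAULT_PROVIDER_OPTIONS, {"gcache.size": "512M"}
)
print(provider_options)
```

With a merge like this, an operator would only need to supply the one option they care about, and the defaults for everything else would survive.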

Reason: 
Because of the high instance create/delete rate in some of our environments, the default `gcache.size` of 128 MB leaves us with maintenance windows of less than 30 minutes.
Hardware replacements regularly take us more than one hour.
To avoid an SST, we increased `gcache.size`; see the attached `mysql.yml` [1].
We don't want to change `gmcast.listen_addr`, but as far as I can tell there is currently no way to set an individual option because `wsrep_provider_options` is a string.
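
As a rough back-of-the-envelope check on these numbers, the sketch below assumes the write rate stays constant, so the window covered by the gcache scales linearly with its size. That is a simplification (real churn varies), and the 30-minute figure is treated as an upper bound for the 128 MB default. The function name is hypothetical.

```python
def estimated_window_minutes(gcache_mb, reference_mb=128, reference_minutes=30):
    """Scale the observed window linearly with the gcache size (an assumption)."""
    return gcache_mb * reference_minutes / reference_mb


for size_mb in (128, 256, 512):
    minutes = estimated_window_minutes(size_mb)
    print(f"gcache.size={size_mb}M -> at most about {minutes:.0f} minutes")
```

Under that assumption, 512 MB would cover at most about two hours, which is enough headroom for a hardware replacement that takes a bit more than one hour.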

[1] mysql.yml
~~~
parameter_defaults:
  ControllerExtraConfig:
    tripleo::profile::base::database::mysql::mysql_server_options:
      mysqld:
        wsrep_provider_options: "gcache.size=512mb;gmcast.listen_addr=tcp://%{tripleo::profile::pacemaker::database::mysql_bundle::gmcast_listen_addr}:4567;"
~~~

We need to set gcache.size to an appropriate value to prevent SSTs from happening when we need to reboot a single controller.
We would like to do that by simply setting `wsrep_provider_options` and having it merged with the default config instead of overwriting it.

The issue we are trying to fix with the above-mentioned procedure happens during maintenance in OpenStack regions with a high instance create/delete rate.

If you know of a different, clean way to avoid an SST or increase `gcache.size`, I would be happy to give it a try.

Comment 1 Luca Miccini 2021-08-10 06:18:32 UTC
I think the request is valid, I'll see how we can expose the gcache.size parameter in puppet-tripleo.

Comment 4 Luca Miccini 2021-08-10 07:35:10 UTC
https://review.opendev.org/c/openstack/puppet-tripleo/+/804045

ControllerExtraConfig:
  tripleo::profile::pacemaker::database::mysql_bundle::gcache_size: 512M

Comment 15 dabarzil 2022-02-22 12:40:34 UTC
tested on:
[root@controller-0 ~]# rpm -qa |grep puppet-tripleo
puppet-tripleo-11.7.0-2.20211224004901.el8ost.noarch


[root@controller-0 ~]# grep gcache.size /var/lib/config-data/puppet-generated/mysql/etc/my.cnf.d/galera.cnf
wsrep_provider_options = gcache.size=512M;gmcast.listen_addr=tcp://172.17.1.140:4567;
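
The same check can be scripted. The following is a minimal Python sketch that parses a `wsrep_provider_options` line like the one above into individual options; the helper name `parse_provider_options` is hypothetical and the line is copied from the output above rather than read from the file.

```python
def parse_provider_options(line):
    """Split a 'wsrep_provider_options = k=v;k=v;' line into a dict."""
    # Everything after the first '=' is the option string itself.
    _, _, value = line.partition("=")
    options = {}
    for pair in value.strip().split(";"):
        if pair:  # skip the empty piece after the trailing ';'
            key, _, val = pair.partition("=")
            options[key.strip()] = val.strip()
    return options


line = "wsrep_provider_options = gcache.size=512M;gmcast.listen_addr=tcp://172.17.1.140:4567;"
options = parse_provider_options(line)
print(options["gcache.size"])
print(options["gmcast.listen_addr"])
```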

Comment 20 errata-xmlrpc 2022-03-23 22:11:12 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Release of components for Red Hat OpenStack Platform 16.2.2), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2022:1001

