Description of problem:

puppet-tripleo-5.2.0-2.el7ost.noarch is missing the following patch:

commit 39d88a49bf83a7a3437edc82f42986596356d331
Author: Juan Antonio Osorio Robles <jaosorior>
Date:   Wed Oct 5 10:48:32 2016 +0300

    Enable usage of "short names" for Galera cluster

    We're not able to use FQDNs yet, so to work around this, we give
    precedence to a "short name" list we'll get from t-h-t.

    Change-Id: I4ef7786474c229d5212a0deb2ca02ee992b030d8
    Related-Bug: #1628521

diff --git a/manifests/profile/pacemaker/database/mysql.pp b/manifests/profile/pacemaker/database/mysql.pp
index 0169e1600a3b..7464854ee608 100644
--- a/manifests/profile/pacemaker/database/mysql.pp
+++ b/manifests/profile/pacemaker/database/mysql.pp
@@ -45,7 +45,12 @@ class tripleo::profile::pacemaker::database::mysql (
   # use only mysql_node_names when we land a patch in t-h-t that
   # switches to autogenerating these values from composable services
-  $galera_node_names_lookup = hiera('mysql_node_names', hiera('galera_node_names', $::hostname))
+  # The galera node names need to match the pacemaker node names... so if we
+  # want to use FQDNs for this, the cluster will not finish bootstrapping,
+  # since all the nodes will be marked as slaves. For now, we'll stick to the
+  # short name which is already registered in pacemaker until we get around
+  # this issue.
+  $galera_node_names_lookup = hiera('mysql_short_node_names', hiera('mysql_node_names', $::hostname))
   if is_array($galera_node_names_lookup) {
     $galera_nodes = downcase(join($galera_node_names_lookup, ','))
   } else {

The symptom is that the galera db does not come up correctly:

Notice: /Stage[main]/Glance::Deps/Anchor[glance::db::end]: Dependency Exec[galera-ready] has failures: true
Notice: /Stage[main]/Glance::Deps/Anchor[glance::dbsync::begin]: Dependency Exec[galera-ready] has failures: true
Notice: /Stage[main]/Glance::Deps/Anchor[glance::dbsync::end]: Dependency Exec[galera-ready] has failures: true
Notice: /Stage[main]/Glance::Deps/Anchor[glance::service::begin]: Dependency Exec[galera-ready] has failures: true
Notice: /Firewall[998 log all]: Dependency Exec[galera-ready] has failures: true
Notice: /Firewall[999 drop all]: Dependency Exec[galera-ready] has failures: true

The reason is that the names passed to the galera resource agent are not the hostnames:

Oct 18 03:11:21 localhost galera(galera)[20794]: ERROR: MySQL is not running
Oct 18 03:11:21 localhost galera(galera)[20794]: INFO: Waiting on node <overcloud-controller01.internalapi.localdomain> to report database status before Master instances can start.
Oct 18 03:11:21 localhost galera(galera)[20794]: INFO: Waiting on node <overcloud-controller02.internalapi.localdomain> to report database status before Master instances can start.
Oct 18 03:11:21 localhost galera(galera)[20794]: INFO: Waiting on node <overcloud-controller03.internalapi.localdomain> to report database status before Master instances can start.

These messages are logged repeatedly until clustercheck eventually fails. In other words, we cannot have galera traffic on any dedicated network until https://bugzilla.redhat.com/show_bug.cgi?id=1381836 is fixed.
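For readers less familiar with nested hiera() defaults, here is a rough Python sketch of the lookup order the patch introduces. It is only an illustration, not the puppet-tripleo code: the hiera() helper is a toy stand-in for Puppet's function, the sample host names are hypothetical (short forms of the FQDNs in the log above), and because the else branch is cut off in the quoted diff, the sketch assumes a non-array value is simply lowercased.

```python
def hiera(data, key, default=None):
    """Toy stand-in for Puppet's hiera(key, default) lookup (assumption)."""
    return data[key] if key in data else default


def galera_nodes(data, hostname):
    """Resolve the galera node list with the precedence from the patch:
    mysql_short_node_names, then mysql_node_names, then the local hostname."""
    lookup = hiera(
        data,
        "mysql_short_node_names",
        hiera(data, "mysql_node_names", hostname),
    )
    if isinstance(lookup, list):
        # Mirrors downcase(join($galera_node_names_lookup, ','))
        return ",".join(lookup).lower()
    # Assumed behavior: the else branch is not shown in the quoted diff.
    return str(lookup).lower()


if __name__ == "__main__":
    # Hypothetical hiera data, for illustration only.
    sample = {
        "mysql_short_node_names": ["overcloud-controller01", "overcloud-controller02"],
        "mysql_node_names": [
            "overcloud-controller01.internalapi.localdomain",
            "overcloud-controller02.internalapi.localdomain",
        ],
    }
    print(galera_nodes(sample, "overcloud-controller01"))
```

With the short-name key present, the FQDN list is ignored, which is what lets the galera node names match the names already registered in pacemaker.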
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://rhn.redhat.com/errata/RHEA-2016-2948.html