Bug 1381836 - galera resource agent cannot bootstrap cluster when galera and pacemaker nodes' name differ
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: Red Hat Enterprise Linux 7
Classification: Red Hat
Component: resource-agents
Version: 7.3
Hardware: Unspecified
OS: Unspecified
Priority: medium
Severity: medium
Target Milestone: rc
Target Release: 7.4
Assignee: Damien Ciabrini
QA Contact: cluster-qe@redhat.com
URL:
Whiteboard:
Depends On:
Blocks:
 
Reported: 2016-10-05 07:44 UTC by Damien Ciabrini
Modified: 2018-07-20 08:31 UTC (History)

Fixed In Version:
Doc Text:
Clone Of:
Environment:
Last Closed: 2018-07-20 08:31:38 UTC




Links
System ID Priority Status Summary Last Updated
Github ClusterLabs resource-agents pull 903 None None None 2016-12-19 14:04:21 UTC

Description Damien Ciabrini 2016-10-05 07:44:37 UTC
Description of problem:

In order to bootstrap a Galera cluster, the galera resource agent needs to retrieve the last sequence number of each galera node to find out which node is the most up-to-date.

For each galera node, the last sequence number is stored as a pacemaker attribute attached to the corresponding pacemaker node. The resource agent currently
assumes a one-to-one mapping between each galera node's name and its pacemaker node's name, and
enforces it.
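
As a rough illustration of the selection described above (not the agent's actual code, which is a shell script), the sketch below picks the bootstrap candidate from a table of last sequence numbers keyed by pacemaker node name. The function name, the dictionary layout, the tie-break rule and the sample values are all assumptions made for the example.

```python
from typing import Dict, Optional


def pick_bootstrap_node(last_seqno: Dict[str, Optional[int]]) -> Optional[str]:
    """Return the pacemaker node holding the highest last sequence number.

    Returns None while at least one node has not reported a value yet,
    since the most up-to-date node cannot be determined before that.
    """
    if not last_seqno or any(seqno is None for seqno in last_seqno.values()):
        return None
    # Iterate over sorted names so that ties are broken deterministically
    # (alphabetically first node wins); this tie-break is an assumption.
    return max(sorted(last_seqno), key=lambda node: last_seqno[node])


# Hypothetical values: controller-1 has seen the most recent transaction.
candidate = pick_bootstrap_node({
    "overcloud-controller-0": 1024,
    "overcloud-controller-1": 1031,
    "overcloud-controller-2": 1024,
})
```

The table is keyed by pacemaker node name, so every galera node must be resolvable to exactly one pacemaker node for the comparison to complete.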

If the galera server has to be started on a different network than
the one pacemaker is using, the galera and pacemaker node names
will likely differ (e.g. gcomm://overcloud-controller-0.internalapi.localdomain vs overcloud-controller-0.localdomain). In that case, the resource agent bails out
and the bootstrap never finishes.
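
The mismatch can be shown with a small sketch that extracts the host names from a gcomm:// address and reports those that do not match any pacemaker node name. This is only an illustration of the failure mode, not the agent's logic; the helper names are hypothetical, and the parsing assumes plain host names with an optional port and no URL options.

```python
from typing import List, Set


def galera_nodes(wsrep_cluster_address: str) -> List[str]:
    """Extract host names from an address such as 'gcomm://a,b,c'."""
    prefix = "gcomm://"
    if not wsrep_cluster_address.startswith(prefix):
        raise ValueError("expected an address starting with gcomm://")
    hosts = wsrep_cluster_address[len(prefix):].split(",")
    # Drop an optional ':port' suffix; IPv6 literals are not handled here.
    return [h.strip().split(":")[0] for h in hosts if h.strip()]


def unknown_to_pacemaker(wsrep_cluster_address: str,
                         pacemaker_nodes: Set[str]) -> List[str]:
    """Return galera host names with no identically named pacemaker node."""
    return [n for n in galera_nodes(wsrep_cluster_address)
            if n not in pacemaker_nodes]


# Names taken from the example in this report.
missing = unknown_to_pacemaker(
    "gcomm://overcloud-controller-0.internalapi.localdomain",
    {"overcloud-controller-0.localdomain"},
)
```

Here `missing` is non-empty, which corresponds to the situation where the agent cannot associate the galera node with a pacemaker node.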

Version-Release number of selected component (if applicable):
RHEL 7

How reproducible:
Always

Steps to Reproduce:
1. have the galera nodes' names differ from the pacemaker nodes' names
2. set up a galera resource with the appropriate wsrep_cluster_address
3. enable the galera resource

Actual results:
galera resource is stuck in state "Slave" and never goes "Master" in pacemaker

Expected results:
galera resource goes "Master", i.e. the galera cluster is bootstrapped

Comment 2 Juan Antonio Osorio 2017-06-28 14:16:36 UTC
This was already addressed in the resource agent by Damien's introduction of the cluster_host_map parameter.
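
For readers unfamiliar with the parameter, here is a minimal sketch of how a pacemaker-to-galera name map could be parsed and applied. The `pcmk_name:galera_name;...` format used below is an assumption for the example and should be checked against the agent's metadata; the function names are hypothetical and this is not the agent's implementation.

```python
from typing import Dict


def parse_cluster_host_map(value: str) -> Dict[str, str]:
    """Parse 'pcmk1:galera1;pcmk2:galera2' into {pcmk_name: galera_name}.

    The separator characters are assumed, not taken from this report.
    """
    mapping: Dict[str, str] = {}
    for entry in value.split(";"):
        entry = entry.strip()
        if not entry:
            continue  # tolerate a trailing ';'
        pcmk, sep, galera = entry.partition(":")
        if not sep or not pcmk or not galera:
            raise ValueError(f"malformed map entry: {entry!r}")
        mapping[pcmk] = galera
    return mapping


def galera_name_for(pcmk_node: str, mapping: Dict[str, str]) -> str:
    """Translate a pacemaker node name, falling back to the same name."""
    return mapping.get(pcmk_node, pcmk_node)


# Node names taken from the example in the description.
host_map = parse_cluster_host_map(
    "overcloud-controller-0.localdomain:"
    "overcloud-controller-0.internalapi.localdomain"
)
galera_name = galera_name_for("overcloud-controller-0.localdomain", host_map)
```

The fallback to the unchanged name keeps the original one-to-one behaviour for nodes that are absent from the map.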

