Bug 1687879

Summary: Cannot join a cluster any more when using IPv6 in gmcast.listen_addr
Product: Red Hat Enterprise Linux 8 Reporter: Damien Ciabrini <dciabrin>
Component: galera Assignee: Michal Schorm <mschorm>
Status: CLOSED ERRATA QA Contact: Anna Khaitovich <akhaitov>
Severity: urgent Docs Contact:
Priority: unspecified    
Version: 8.0 CC: akhaitov, chjones, databases-maint, hhorak, lhh, lmiksik, mbayer, michele
Target Milestone: rc   
Target Release: 8.0   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard: OtherQA
Fixed In Version: Doc Type: Bug Fix
Doc Text:
Cause: Source code bug; see comment 0.
Consequence: One cannot force Galera to use a specific NIC anymore for replication traffic with IPv6; see comment 0.
Fix: Rebase.
Result: IPv6 works as expected.
Story Points: ---
Clone Of: Environment:
Last Closed: 2020-03-04 10:04:17 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:

Description Damien Ciabrini 2019-03-12 14:27:22 UTC
In our OpenStack cloud deployments, we configure Galera to listen on a specific IP address so that database traffic runs on a predictable network. We have the option to use either IPv4 or IPv6 in the MySQL configuration.

Since Galera 25.3.21 [1], it is mandatory to enclose IPv6 addresses
in square brackets when used in option gmcast.listen_addr, e.g.:

wsrep_provider_options = gmcast.listen_addr=tcp://[fd00:50c5:8119:5564:0:ee39:f850:c5f3]:4567;

Unfortunately, this change introduced a bug that makes
gcomm::AsioTcpSocket::connect(const gu::URI& uri) fail, due to the way
it configures the source address of the socket that is created to
connect to the galera cluster.

Before the IPv6 rework commit [1], it was implicitly expected that the source
address (local variable bind_ip) always receives an IP address as
configured in the GMCast object [2]. Since the rework, however, the
string passed as a source address may contain brackets when an
IPv6 address is used, and this is considered invalid input when calling
asio::ip::address::from_string() (which internally maps to inet_pton).

Consequently, one cannot force Galera to use a specific NIC anymore for
replication traffic with IPv6.
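
To illustrate the failure mode described above, here is a minimal Python sketch. It is not Galera code and not the upstream fix; it only shows that inet_pton rejects a bracketed IPv6 literal and accepts the same address once the brackets are removed. The helper names parse_ipv6 and strip_brackets are hypothetical and exist only for this example.

```python
import socket


def parse_ipv6(text):
    """Return the packed 16-byte address, or None if inet_pton rejects the string."""
    try:
        return socket.inet_pton(socket.AF_INET6, text)
    except OSError:
        return None


def strip_brackets(host):
    """Hypothetical helper: drop one pair of enclosing square brackets, if present."""
    if host.startswith("[") and host.endswith("]"):
        return host[1:-1]
    return host


# Address taken from the gmcast.listen_addr example in this report.
bracketed = "[fd00:50c5:8119:5564:0:ee39:f850:c5f3]"

rejected = parse_ipv6(bracketed)                  # brackets are not valid input
accepted = parse_ipv6(strip_brackets(bracketed))  # bare literal parses fine

print("bracketed form parsed:", rejected is not None)
print("bare form parsed:", accepted is not None)
```

A bind_ip string that still carries the brackets from the URI would hit the first case, which matches the connect failure reported here.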

Comment 1 Damien Ciabrini 2019-03-12 14:58:47 UTC
Additional info:

Issue reported upstream in Codership's GitHub and in MariaDB Jira [1]
[1] https://jira.mariadb.org/browse/MDEV-18890

As for the justification of the criticality:

Currently in upstream RDO (OpenStack) we're using a version of Galera
[2] that matches what will be provided by RHEL-8, and we had to stop
specifying the IPv6 address to bind to when using Galera. This means
that any host with access to the network could potentially connect
to the Galera cluster (I'll spare you the firewall details), which is a
regression compared to what we have had since RHOS-8, when we first
enabled IPv6 configuration of Galera.

Our hope is to revert to the original behaviour as soon as Galera is
fixed upstream to avoid any regression when we ship downstream.

[2] https://cbs.centos.org/koji/buildinfo?buildID=24424

Comment 2 Damien Ciabrini 2019-05-24 07:26:33 UTC
Galera 25.3.26 includes a fix for the reported problem [1].

I tested it locally and I confirm that it fixes our issue for joining a cluster with IPv6 only.

I used the following galera configuration:

[mysql]
...
bind_address=ra2.v6.ratester
wsrep_cluster_address="gcomm://ra1.v6.ratester,ra2.v6.ratester,ra3.v6.ratester"
wsrep_provider_options = gmcast.listen_addr=tcp://[fd00:be38:af4c:a9a6:0:ee39:f850:c5f2]:4567;
...

Having FQDNs that resolve to IPv6 addresses, and an IPv6 address inside brackets, makes the connection work for us.

[1] https://github.com/codership/documentation/blob/master/release-notes/release-notes-galera-25.3.26.txt#L26
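
For anyone generating this option from deployment tooling, the bracket rule can be sketched as follows. This is only an illustration under assumptions: the function name listen_addr is hypothetical, the default port 4567 is taken from the configuration above, and the output format simply mirrors the gmcast.listen_addr value shown in this report.

```python
import ipaddress


def listen_addr(ip, port=4567):
    """Build a gmcast.listen_addr value, bracketing the host only for IPv6."""
    # ip_address() raises ValueError for anything that is not a bare IP literal,
    # so an already-bracketed string is rejected instead of being double-wrapped.
    addr = ipaddress.ip_address(ip)
    host = f"[{ip}]" if addr.version == 6 else ip
    return f"tcp://{host}:{port}"


# IPv6 address from the tested configuration; the IPv4 one is a documentation address.
print(listen_addr("fd00:be38:af4c:a9a6:0:ee39:f850:c5f2"))
print(listen_addr("192.0.2.10"))
```

The resulting string would then be placed after gmcast.listen_addr= in wsrep_provider_options.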