Bug 1391470
| Summary: | Galera resource agent cannot recover WSREP last commit with recent mariadb version | | |
|---|---|---|---|
| Product: | Red Hat Enterprise Linux 7 | Reporter: | Damien Ciabrini <dciabrin> |
| Component: | resource-agents | Assignee: | Damien Ciabrini <dciabrin> |
| Status: | CLOSED ERRATA | QA Contact: | Asaf Hirshberg <ahirshbe> |
| Severity: | high | Priority: | medium |
| Version: | 7.3 | CC: | agk, cluster-maint, dciabrin, fdinitto, michele, mnovacek, oalbrigt, royoung, rscarazz, ushkalim |
| Target Milestone: | rc | Target Release: | 7.4 |
| Hardware: | All | OS: | Linux |
| Fixed In Version: | resource-agents-3.9.5-84.el7 | Type: | Bug |
| Last Closed: | 2017-08-01 14:55:11 UTC | | |
Description
Damien Ciabrini
2016-11-03 11:55:49 UTC
For the time being in upstream we will back out the commit that changed this behaviour in "mysqld_safe --wsrep-recover" via https://bugs.launchpad.net/tripleo/+bug/1638864

Once we land a fix in the RA we can remove the revert of the offending patch.

Proposed fix in https://github.com/ClusterLabs/resource-agents/pull/884 has been merged in upstream.

Proposed high level steps for testing:

- Test with RDO to verify that fix works with mariadb 10:
  . Deploy a Newton stack with 3-node HA controllers
  . Once done, stop galera service: "pcs resource disable galera"
  . install the test resource-agents package
  . downgrade mariadb* packages to 10.1.18-2 on all controllers, to reintroduce the breaking change in mariadb [1]
  . restart galera: "pcs resource enable galera"
  . ensure that all galera nodes have restarted with "pcs status"

- Test with RHEL to verify that fix doesn't introduce regression with mariadb 5.5
  . Deploy an OSP10 with 3-node HA controllers
  . Once done, stop galera service: "pcs resource disable galera"
  . install the test resource-agents package
  . restart galera: "pcs resource enable galera"
  . ensure that all galera nodes have restarted with "pcs status"

[1] see https://bugs.launchpad.net/tripleo/+bug/1638864

(In reply to Damien Ciabrini from comment #7)
> Proposed high level steps for testing:
> - Test with RDO to verify that fix works with mariadb 10:
> . Deploy a Newton stack with 3-node HA controllers
> . Once done, stop galera service: "pcs resource disable galera"
> . install the test resource-agents package
> . downgrade mariadb* packages to 10.1.18-2 on all controllers, to
> reintroduce the breaking change in mariadb [1]
> . restart galera: "pcs resource enable galera"
> . ensure that all galera nodes have restarted with "pcs status"

I ran this validation test on the latest CentOS (7.3.1611): I stopped galera, downgraded the mariadb packages on all the controllers (yum -y downgrade mariadb*-10.1.18-2.el7.x86_64), and then successfully restarted the galera resource with pcs.

> - Test with RHEL to verify that fix doesn't introduce regression with
> mariadb 5.5
> . Deploy an OSP10 with 3-node HA controllers
> . Once done, stop galera service: "pcs resource disable galera"
> . install the test resource-agents package
> . restart galera: "pcs resource enable galera"
> . ensure that all galera nodes have restarted with "pcs status"
>
> [1] see https://bugs.launchpad.net/tripleo/+bug/1638864

Verified on osp-10 latest on rhel, using steps from comment #8.

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2017:1844
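
For anyone repeating the verification, the test steps quoted in the comments above could be scripted roughly as follows. This is only a sketch, not part of the fix: the helper names `run` and `verify_galera_restart` and the `DRY_RUN` switch are made up for illustration, and installing the test resource-agents package is left out because it depends on where the package comes from. The pcs commands act on the whole cluster, but the yum downgrade only affects the node it runs on, so it would still have to be repeated on every controller.

```shell
#!/bin/sh
# Sketch of the galera restart test from this report (hypothetical helpers).
# DRY_RUN defaults to 1, so commands are printed instead of executed.
DRY_RUN="${DRY_RUN:-1}"

# run: print the command in dry-run mode, otherwise execute it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# verify_galera_restart: stop galera, optionally downgrade mariadb, restart.
# $1 (optional): package spec handed to "yum downgrade" on this node only.
verify_galera_restart() {
    run pcs resource disable galera
    # Install the test resource-agents package here (site-specific, omitted).
    if [ -n "${1:-}" ]; then
        run yum -y downgrade "$1"
    fi
    run pcs resource enable galera
    # Check in the output that all galera nodes have restarted.
    run pcs status
}
```

With the default dry run, `verify_galera_restart 'mariadb*-10.1.18-2.el7.x86_64'` prints the mariadb 10 sequence, and calling it with no argument prints the mariadb 5.5 regression sequence. Setting `DRY_RUN=0` would execute the commands for real.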