Bug 1451097
Summary: | First galera cluster bootstrap may fail if cluster has no data | ||||||
---|---|---|---|---|---|---|---|
Product: | Red Hat Enterprise Linux 7 | Reporter: | Damien Ciabrini <dciabrin> | ||||
Component: | resource-agents | Assignee: | Damien Ciabrini <dciabrin> | ||||
Status: | CLOSED ERRATA | QA Contact: | Asaf Hirshberg <ahirshbe> | ||||
Severity: | urgent | Docs Contact: | |||||
Priority: | urgent | ||||||
Version: | 7.4 | CC: | agk, cfeist, chjones, cluster-maint, fdinitto, mbayer, mkrcmari, oalbrigt, royoung, rscarazz, tlavigne, ushkalim | ||||
Target Milestone: | rc | Keywords: | ZStream | ||||
Target Release: | --- | ||||||
Hardware: | Unspecified | ||||||
OS: | Unspecified | ||||||
Whiteboard: | |||||||
Fixed In Version: | resource-agents-3.9.5-99.el7 | Doc Type: | If docs needed, set a value | ||||
Doc Text: | Story Points: | --- | |||||
Clone Of: | |||||||
: | 1451414 | Environment: | ||||
Last Closed: | 2017-08-01 15:00:11 UTC | Type: | Bug | ||||
Regression: | --- | Mount Type: | --- | ||||
Documentation: | --- | CRM: | |||||
Verified Versions: | Category: | --- | |||||
oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |||||
Cloudforms Team: | --- | Target Upstream Version: | |||||
Embargoed: | |||||||
Bug Depends On: | |||||||
Bug Blocks: | 1451414 | ||||||
Description
Damien Ciabrini
2017-05-15 19:07:37 UTC
Here [1] are the sosreports of two different deployments that failed the same way.

[1] http://file.rdu.redhat.com/~rscarazz/BZ1451097/

Created attachment 1279321
commit a09b17f79ef3dcde8843ee25bb52b506b6948e8e for downstream

One needs to apply the following upstream commits in sequence:

    a09b17f79ef3dcde8843ee25bb52b506b6948e8e (downstream version attached)
    1fe6d7e9aca159a04e984d693dfcc47580bf5def
    0265bcebb24427f5c76cfaa58c281c3aaead7d52

Build with patches from comment #5.

Instructions for testing:

1. Create a 3-node pacemaker cluster:

    pcs cluster setup --name foo centos1 centos2 centos3 --force
    pcs cluster start --all

2. On all nodes, start from a clean mysql database in /var/lib/mysql:

    rm -rf /var/lib/mysql
    mkdir /var/lib/mysql
    chown mysql. /var/lib/mysql
    restorecon /var/lib/mysql

3. Create a galera resource, but don't start it yet:

    pcs resource create galera galera enable_creation=true wsrep_cluster_address='gcomm://centos1,centos2,centos3' meta master-max=3 --master --disable

4. Monitor the cluster after the resource is enabled:

    crm_mon -RrA
    pcs resource enable galera

The last-commit attribute on all nodes will be set to -1 because no WSREP commit has been integrated yet. With the fix, under this start condition, the 3 nodes will always choose centos3 as the bootstrap node, as expected.

Additional comment: if the test is being run on an OpenStack HA overcloud, one should run this additional step on all nodes:

2b. Let the resource agent use the default user for polling state:

    rm /etc/sysconfig/clustercheck

Verified on resource-agents-3.9.5-99.el7.x86_64 with steps from comment #8.

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2017:1844
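
For readers following the test instructions, the expected outcome (every node reports a last-commit of -1, and all nodes agree on centos3) can be illustrated with a small shell sketch. This is a hypothetical illustration only, not the resource agent's actual code: the function name `pick_bootstrap_node`, the `node=commit` input format, and the tie-breaking rule (the later node in the list wins on equal commits) are assumptions chosen to reproduce the result described in this report.

```shell
#!/bin/sh
# Hypothetical sketch; NOT the galera resource agent's implementation.
# Arguments: "node=last_commit" pairs, in cluster-address order.
# Picks the node with the highest last-commit. On a tie, the later node
# in the list wins (assumed rule, matching the centos3 outcome reported).
pick_bootstrap_node() {
    best_node=""
    best_commit=""
    for pair in "$@"; do
        node=${pair%%=*}      # text before the first '='
        commit=${pair#*=}     # text after the first '='
        if [ -z "$best_node" ] || [ "$commit" -ge "$best_commit" ]; then
            best_node=$node
            best_commit=$commit
        fi
    done
    printf '%s\n' "$best_node"
}

# Empty cluster: no WSREP commit has been integrated on any node.
pick_bootstrap_node centos1=-1 centos2=-1 centos3=-1   # prints centos3
```

The point of the sketch is that the choice depends only on the node list and the reported values, so every node evaluating the same inputs reaches the same answer even when all values are -1.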