Bug 1508632

Summary: galera is in 'failed' state after rebooting overcloud nodes.
Product: Red Hat OpenStack
Reporter: Alexander Chuzhoy <sasha>
Component: galera
Assignee: Damien Ciabrini <dciabrin>
Status: CLOSED DUPLICATE
QA Contact: Shai Revivo <srevivo>
Severity: unspecified
Docs Contact:
Priority: unspecified
Version: 12.0 (Pike)
CC: mbayer, srevivo
Target Milestone: rc
Target Release: 12.0 (Pike)
Hardware: Unspecified
OS: Unspecified
Whiteboard:
Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2017-11-01 20:32:32 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:

Description Alexander Chuzhoy 2017-11-01 19:56:41 UTC
galera is in 'failed' state after rebooting overcloud nodes.


Environment:
galera-25.3.16-3.el7ost.x86_64
mariadb-galera-common-5.5.42-7.el7ost.x86_64
mariadb-galera-server-5.5.42-7.el7ost.x86_64
openstack-tripleo-heat-templates-7.0.3-0.20171023134948.el7ost.noarch
instack-undercloud-7.4.2-0.20171010064304.el7ost.noarch
openstack-puppet-modules-11.0.0-0.20170828113154.el7ost.noarch



Steps to reproduce:

1. Deploy overcloud with:
openstack overcloud deploy --templates \
--libvirt-type kvm \
-e /home/stack/templates/nodes_data.yaml \
-e /usr/share/openstack-tripleo-heat-templates/environments/ceph-ansible/ceph-ansible.yaml \
-e /usr/share/openstack-tripleo-heat-templates/environments/network-isolation.yaml \
-e /home/stack/virt/network/network-environment.yaml \
-e /usr/share/openstack-tripleo-heat-templates/environments/ssl/enable-tls.yaml \
-e /home/stack/virt/public_vip.yaml \
-e /usr/share/openstack-tripleo-heat-templates/environments/ssl/tls-endpoints-public-ip.yaml \
-e /home/stack/inject-trust-anchor-hiera.yaml \
-e /home/stack/rhos12.yaml


2. Reboot all overcloud (OC) nodes.

3. Check the status of pcs resources.
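
For step 3, a minimal sketch of how the check could be narrowed to failed resources only. The helper name `failed_resources` is hypothetical (it is not part of pcs), and it assumes the `pcs status` line layout shown in the Result section, where the state is the third whitespace-separated field.

```shell
# Hypothetical helper: print the names of resources that pcs reports as FAILED.
# It reads `pcs status` output on stdin, so it also works on saved output.
# Assumes the layout "<name> (<agent>): <state> ...", as in this report.
failed_resources() {
  awk '$3 == "FAILED" { print $1 }'
}

# Usage on a controller node (needs root and a running cluster):
#   pcs status | failed_resources
```

Reading from stdin keeps the filter independent of the cluster, so it can be run against output pasted into a file.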


Result:

[root@overcloud-controller-2 ~]# pcs status
Cluster name: tripleo_cluster
Stack: corosync
Current DC: overcloud-controller-2 (version 1.1.16-12.el7_4.4-94ff4df) - partition with quorum
Last updated: Wed Nov  1 19:52:57 2017
Last change: Wed Nov  1 19:13:16 2017 by root via cibadmin on overcloud-controller-0

12 nodes configured
37 resources configured

Online: [ overcloud-controller-0 overcloud-controller-1 overcloud-controller-2 ]
GuestOnline: [ galera-bundle-0@overcloud-controller-0 galera-bundle-1@overcloud-controller-1 galera-bundle-2@overcloud-controller-2 rabbitmq-bundle-0@overcloud-controller-0 rabbitmq-bundle-1@overcloud-controller-1 rabbitmq-bundle-2@overcloud-controller-2 redis-bundle-0@overcloud-controller-0 redis-bundle-1@overcloud-controller-1 redis-bundle-2@overcloud-controller-2 ]

Full list of resources:

 Docker container set: rabbitmq-bundle [192.168.24.1:8787/rhosp12/openstack-rabbitmq-docker:pcmklatest]
   rabbitmq-bundle-0    (ocf::heartbeat:rabbitmq-cluster):      Started overcloud-controller-0
   rabbitmq-bundle-1    (ocf::heartbeat:rabbitmq-cluster):      Started overcloud-controller-1
   rabbitmq-bundle-2    (ocf::heartbeat:rabbitmq-cluster):      Started overcloud-controller-2
 Docker container set: galera-bundle [192.168.24.1:8787/rhosp12/openstack-mariadb-docker:pcmklatest]
   galera-bundle-0      (ocf::heartbeat:galera):        FAILED Master overcloud-controller-0 (blocked)
   galera-bundle-1      (ocf::heartbeat:galera):        Master overcloud-controller-1
   galera-bundle-2      (ocf::heartbeat:galera):        Master overcloud-controller-2
 Docker container set: redis-bundle [192.168.24.1:8787/rhosp12/openstack-redis-docker:pcmklatest]
   redis-bundle-0       (ocf::heartbeat:redis): Master overcloud-controller-0
   redis-bundle-1       (ocf::heartbeat:redis): Slave overcloud-controller-1
   redis-bundle-2       (ocf::heartbeat:redis): Slave overcloud-controller-2
 ip-192.168.24.8        (ocf::heartbeat:IPaddr2):       Started overcloud-controller-0
 ip-10.0.0.101  (ocf::heartbeat:IPaddr2):       Started overcloud-controller-1
 ip-172.17.1.19 (ocf::heartbeat:IPaddr2):       Started overcloud-controller-2
 ip-172.17.1.16 (ocf::heartbeat:IPaddr2):       Started overcloud-controller-0
 ip-172.17.3.10 (ocf::heartbeat:IPaddr2):       Started overcloud-controller-1
 ip-172.17.4.12 (ocf::heartbeat:IPaddr2):       Started overcloud-controller-2
 Docker container set: haproxy-bundle [192.168.24.1:8787/rhosp12/openstack-haproxy-docker:pcmklatest]
   haproxy-bundle-docker-0      (ocf::heartbeat:docker):        Started overcloud-controller-0
   haproxy-bundle-docker-1      (ocf::heartbeat:docker):        Started overcloud-controller-1
   haproxy-bundle-docker-2      (ocf::heartbeat:docker):        Started overcloud-controller-2
 openstack-cinder-volume        (systemd:openstack-cinder-volume):      Started overcloud-controller-0

Failed Actions:
* galera_promote_0 on galera-bundle-0 'unknown error' (1): call=442, status=complete, exitreason='MySQL server failed to start (pid=1863) (rc=0), please check your installation',
    last-rc-change='Wed Nov  1 19:42:27 2017', queued=0ms, exec=6164ms


Daemon Status:
  corosync: active/enabled
  pacemaker: active/enabled
  pcsd: active/enabled



Expected result:

Galera is not in a failed state.
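
The expected result could be verified with a sketch along these lines. The function name `galera_healthy` is hypothetical, and the check only assumes that galera replicas appear on lines containing `ocf::heartbeat:galera`, as in the output above. It does not confirm that the database itself is serving queries.

```shell
# Hypothetical check: exit 0 only if at least one galera replica is listed
# and none of the galera lines contains FAILED.
# Reads `pcs status` output on stdin.
galera_healthy() {
  awk '
    /ocf::heartbeat:galera/ { total++; if ($0 ~ /FAILED/) failed++ }
    END { exit (total > 0 && failed == 0) ? 0 : 1 }
  '
}

# Usage on a controller node (needs root and a running cluster):
#   pcs status | galera_healthy && echo "galera OK" || echo "galera FAILED"
```

Requiring at least one galera line avoids reporting success when the bundle is missing from the output altogether.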

Comment 1 Alexander Chuzhoy 2017-11-01 20:32:32 UTC

*** This bug has been marked as a duplicate of bug 1499677 ***