Description of problem: See the upstream bug.
Hi, an OSP9 to OSP10 upgrade fails because the cluster doesn't restart. The upgrade trace shows:

  + node_states=' galera (ocf::heartbeat:galera): Started overcloud-controller-0
   * galera_start_0 on overcloud-controller-2 '\''not installed'\'' (5): call=240, status=complete, exitreason='\''Datadir /var/lib/mysql doesn'\''t exist'\'',
   * galera_start_0 on overcloud-controller-1 '\''not installed'\'' (5): call=240, status=complete, exitreason='\''Datadir /var/lib/mysql doesn'\''t exist'\'','
  + echo ' galera (ocf::heartbeat:galera): Started overcloud-controller-0
   * galera_start_0 on overcloud-controller-2 '\''not installed'\'' (5): call=240, status=complete, exitreason='\''Datadir /var/lib/mysql doesn'\''t exist'\'',
   * galera_start_0 on overcloud-controller-1 '\''not installed'\'' (5): call=240, status=complete, exitreason='\''Datadir /var/lib/mysql doesn'\''t exist'\'','

And I can confirm that on the nodes other than the bootstrap node, the /var/lib/mysql directory has been moved aside entirely as a backup directory.
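For reference, the failed-action check can be reproduced offline with a small grep over the captured trace text. This is a minimal sketch, not the upgrade script itself: the `node_states` sample below is quoted from the trace above, and the variable name is simply reused from it.

```shell
# Hypothetical offline reproduction of the failed-action check; the sample
# text is copied from the upgrade trace, not queried from a live cluster.
node_states="* galera_start_0 on overcloud-controller-2 'not installed' (5): call=240, status=complete, exitreason='Datadir /var/lib/mysql doesn't exist',
* galera_start_0 on overcloud-controller-1 'not installed' (5): call=240, status=complete, exitreason='Datadir /var/lib/mysql doesn't exist',"

# Count controllers whose galera start failed because the datadir is gone
failed=$(printf '%s\n' "$node_states" | grep -c "Datadir /var/lib/mysql doesn't exist")
echo "$failed"   # prints 2: both non-bootstrap controllers hit the error
```

Any non-zero count here means at least one node lost its /var/lib/mysql datadir during the upgrade, which is exactly the symptom reported.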
Deployed RHOS 9 latest, upgraded to RHOS 10 with latest puddle (2016-11-14.1). I no longer see this issue.

[stack@undercloud-0 ~]$ ssh heat-admin.2.10
Last login: Tue Nov 15 19:04:39 2016 from gateway
[heat-admin@controller-0 ~]$ sudo -i
[root@controller-0 ~]# pcs status
Cluster name: tripleo_cluster
Stack: corosync
Current DC: controller-2 (version 1.1.15-11.el7_3.2-e174ec8) - partition with quorum
Last updated: Tue Nov 15 19:08:40 2016
Last change: Tue Nov 15 01:10:37 2016 by root via crm_resource on controller-0

3 nodes and 19 resources configured

Online: [ controller-0 controller-1 controller-2 ]

Full list of resources:

 ip-fd00.fd00.fd00.4000..10    (ocf::heartbeat:IPaddr2):    Started controller-0
 ip-192.0.2.6    (ocf::heartbeat:IPaddr2):    Started controller-1
 Clone Set: haproxy-clone [haproxy]
     Started: [ controller-0 controller-1 controller-2 ]
 Master/Slave Set: galera-master [galera]
     Masters: [ controller-0 controller-1 controller-2 ]
 ip-2620.52.0.13b8.5054.ff.fe3e.1    (ocf::heartbeat:IPaddr2):    Started controller-2
 Clone Set: rabbitmq-clone [rabbitmq]
     Started: [ controller-0 controller-1 controller-2 ]
 Master/Slave Set: redis-master [redis]
     Masters: [ controller-0 ]
     Slaves: [ controller-1 controller-2 ]
 ip-fd00.fd00.fd00.3000..10    (ocf::heartbeat:IPaddr2):    Started controller-0
 ip-fd00.fd00.fd00.2000..10    (ocf::heartbeat:IPaddr2):    Started controller-1
 ip-fd00.fd00.fd00.2000..11    (ocf::heartbeat:IPaddr2):    Started controller-2
 openstack-cinder-volume    (systemd:openstack-cinder-volume):    Started controller-0

Daemon Status:
  corosync: active/enabled
  pacemaker: active/enabled
  pcsd: active/enabled
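The key verification point in that output is the galera-master set: all three controllers must be promoted. As a hedged sketch, that check can be scripted against captured `pcs status` text; the `status` sample below is copied from the output above rather than queried live.

```shell
# Offline sanity check of the galera state; the text is a sample copied from
# the pcs status output, not a live cluster query.
status="Master/Slave Set: galera-master [galera]
     Masters: [ controller-0 controller-1 controller-2 ]"

# Extract the Masters list; a healthy 3-node galera cluster should report
# all three controllers as masters after the upgrade.
masters=$(printf '%s\n' "$status" | sed -n 's/.*Masters: \[ \(.*\) \].*/\1/p')
echo "$masters" | wc -w   # prints 3
```

If this count drops below the controller count after an upgrade, the missing nodes are the ones to inspect for a moved-aside /var/lib/mysql datadir.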
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://rhn.redhat.com/errata/RHEA-2016-2948.html