Bug 1516415

Summary: RHOS 10 (newton): overcloud deployment -> Galera unable to detect last known write sequence number
Product: Red Hat OpenStack
Reporter: Francisco Javier Lopez Y Grueber <flg>
Component: mariadb-galera
Assignee: Damien Ciabrini <dciabrin>
Status: CLOSED NOTABUG
QA Contact: Udi Shkalim <ushkalim>
Severity: urgent
Docs Contact:
Priority: unspecified
Version: 10.0 (Newton)
CC: chjones, fdinitto, flg, mbayer, mschuppe, srevivo
Target Milestone: ---
Target Release: ---
Hardware: x86_64
OS: Linux
Whiteboard:
Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2017-11-27 13:44:08 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:
Attachments:
Description | Flags
Screenshot of the journal messages | none
SOSREPORT MYSQL,PACEMAKER,COROSYNC | none
output mysqd_save --wsrep-recover | none
oc2 | none
oc0 | none
oc1 | none
hosts | none
oc2 | none
oc1 | none
oc0 | none
templates in use | none
Verification Undercloud Domain Settings | none

Description Francisco Javier Lopez Y Grueber 2017-11-22 15:10:44 UTC
Created attachment 1357598 [details]
Screenshot of the journal messages

Description of problem:

Overcloud deployment fails due to Galera cluster problems.

Main message: 

 Galera unable to detect last known write sequence number
~
crmd: Result of start operation of galera on ${node} (unknown error) 

Version-Release number of selected component (if applicable):


How reproducible:


Steps to Reproduce:
1. Delete the stack and redeploy.
2. The error occurs after 1.5h.

Actual results:

Deployment fails as none of the OpenStack services can be installed properly.

Expected results:

Deployment succeeds

Additional info:

Comment 1 Fabio Massimo Di Nitto 2017-11-22 15:29:49 UTC
Please provide full sosreports.

Comment 2 Francisco Javier Lopez Y Grueber 2017-11-22 15:48:06 UTC
Created attachment 1357643 [details]
SOSREPORT MYSQL,PACEMAKER,COROSYNC

Comment 3 Francisco Javier Lopez Y Grueber 2017-11-22 16:04:41 UTC
Created attachment 1357648 [details]
output mysqd_save --wsrep-recover

Comment 4 Francisco Javier Lopez Y Grueber 2017-11-22 16:19:15 UTC
Created attachment 1357662 [details]
oc2

Comment 5 Francisco Javier Lopez Y Grueber 2017-11-22 16:20:32 UTC
Created attachment 1357665 [details]
oc0

Comment 6 Francisco Javier Lopez Y Grueber 2017-11-22 16:24:44 UTC
Created attachment 1357666 [details]
oc1

Comment 7 Francisco Javier Lopez Y Grueber 2017-11-22 16:54:47 UTC
Created attachment 1357678 [details]
hosts

Comment 8 Michael Bayer 2017-11-22 20:43:13 UTC
We would need the complete sosreports uploaded via the customer portal (including ps commands, all installed rpms, etc.) so that we can pull them into collab-shell. We also need to see the full overcloud deploy command, as well as all configurations and heat templates used to create the stack. Additionally, please provide a directory listing of all of /var/lib/mysql.

Comment 9 Francisco Javier Lopez Y Grueber 2017-11-23 11:21:05 UTC
Created attachment 1358142 [details]
oc2

Comment 10 Francisco Javier Lopez Y Grueber 2017-11-23 11:21:44 UTC
Created attachment 1358143 [details]
oc1

Comment 11 Francisco Javier Lopez Y Grueber 2017-11-23 11:22:23 UTC
Created attachment 1358144 [details]
oc0

Comment 12 Francisco Javier Lopez Y Grueber 2017-11-23 11:23:09 UTC
Created attachment 1358145 [details]
templates in use

Comment 13 Francisco Javier Lopez Y Grueber 2017-11-23 11:26:51 UTC
Created attachment 1358148 [details]
Verification Undercloud Domain Settings

Domain-relevant settings verified on the undercloud.

Comment 14 Damien Ciabrini 2017-11-23 12:51:15 UTC
Francisco, the last sosreports that you uploaded lack important files for the investigation; they only include galera/haproxy/cluster logs.

We need _all_ logs that sosreports can provide, e.g. processes running, network settings, etc. Could you get those uploaded?

Comment 15 Francisco Javier Lopez Y Grueber 2017-11-23 13:02:53 UTC
Hi, please see connected for full sosreports. These only contain rpm, yum, corosync, pacemaker and system.

Comment 18 Martin Schuppert 2017-11-23 13:58:10 UTC
Most likely the issue is that the CloudDomain does not match what is configured for the dhcp_domain in nova.conf of the undercloud.
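
For anyone who wants to verify the mismatch described in the comment above, the following is a minimal sketch of such a check, not an official tool. It assumes that dhcp_domain lives in the [DEFAULT] section of the undercloud's nova.conf and that the CloudDomain value has already been read from the deployment templates. The function names and the sample configuration text are hypothetical and only serve as illustration.

```python
import configparser

# Hypothetical nova.conf excerpt, for illustration only.
SAMPLE_NOVA_CONF = """
[DEFAULT]
dhcp_domain = example.com
"""


def read_dhcp_domain(nova_conf_text):
    """Return dhcp_domain from the [DEFAULT] section, or None if it is not set."""
    # interpolation=None keeps '%' characters in other options from raising
    # errors; strict=False tolerates duplicate options in the file.
    parser = configparser.ConfigParser(interpolation=None, strict=False)
    parser.read_string(nova_conf_text)
    return parser.defaults().get("dhcp_domain")


def domains_match(nova_conf_text, cloud_domain):
    """Return True when dhcp_domain is set and equals the given CloudDomain."""
    dhcp_domain = read_dhcp_domain(nova_conf_text)
    return dhcp_domain is not None and dhcp_domain.strip() == cloud_domain.strip()
```

Calling domains_match() with the contents of the undercloud's nova.conf and the CloudDomain value from the templates in use returns False whenever the two differ, which is the situation this comment suspects.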

Comment 19 Damien Ciabrini 2017-11-27 13:44:08 UTC
Closing this particular bz: as per comment #18, it appears the deployment error was due to a misconfiguration.