Bug 1516415 - RHOS 10 (newton): overcloud deployment -> Galera unable to detect last known write sequence number
Keywords:
Status: CLOSED NOTABUG
Alias: None
Product: Red Hat OpenStack
Classification: Red Hat
Component: mariadb-galera
Version: 10.0 (Newton)
Hardware: x86_64
OS: Linux
Priority: unspecified
Severity: urgent
Target Milestone: ---
Target Release: ---
Assignee: Damien Ciabrini
QA Contact: Udi Shkalim
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2017-11-22 15:10 UTC by Francisco Javier Lopez Y Grueber
Modified: 2021-03-11 16:24 UTC (History)

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2017-11-27 13:44:08 UTC
Target Upstream Version:
Embargoed:


Attachments
Screenshot of the journal messages (181.94 KB, application/pdf)
2017-11-22 15:10 UTC, Francisco Javier Lopez Y Grueber
SOSREPORT MYSQL,PACEMAKER,COROSYNC (349.80 KB, application/x-xz)
2017-11-22 15:48 UTC, Francisco Javier Lopez Y Grueber
output mysqld_safe --wsrep-recover (205.31 KB, image/png)
2017-11-22 16:04 UTC, Francisco Javier Lopez Y Grueber
oc2 (1.21 MB, application/x-xz)
2017-11-22 16:19 UTC, Francisco Javier Lopez Y Grueber
oc0 (1.28 MB, application/x-xz)
2017-11-22 16:20 UTC, Francisco Javier Lopez Y Grueber
oc1 (490.92 KB, application/x-xz)
2017-11-22 16:24 UTC, Francisco Javier Lopez Y Grueber
hosts (216.27 KB, image/png)
2017-11-22 16:54 UTC, Francisco Javier Lopez Y Grueber
oc2 (1.23 MB, application/x-xz)
2017-11-23 11:21 UTC, Francisco Javier Lopez Y Grueber
oc1 (1.23 MB, application/x-xz)
2017-11-23 11:21 UTC, Francisco Javier Lopez Y Grueber
oc0 (1.24 MB, application/x-xz)
2017-11-23 11:22 UTC, Francisco Javier Lopez Y Grueber
templates in use (425.28 KB, application/x-gzip)
2017-11-23 11:23 UTC, Francisco Javier Lopez Y Grueber
Verification Undercloud Domain Settings (164.83 KB, image/png)
2017-11-23 11:26 UTC, Francisco Javier Lopez Y Grueber


Links
System | ID | Private | Priority | Status | Summary | Last Updated
Red Hat Bugzilla | 1464114 | 0 | urgent | CLOSED | Failed to connect to mysql at overcloud deploy [pacemaker] | 2021-02-22 00:41:40 UTC
Red Hat Knowledge Base (Article) | 2089051 | 0 | None | None | None | 2017-11-23 13:04:26 UTC

Description Francisco Javier Lopez Y Grueber 2017-11-22 15:10:44 UTC
Created attachment 1357598 [details]
Screenshot of the journal messages

Description of problem:

Overcloud deployment fails due to Galera cluster problems.

Main message: 

 Galera unable to detect last known write sequence number
~
crmd: Result of start operation of galera on ${node} (unknown error) 
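
For context on the message above: a Galera node records its last committed write sequence number (seqno) in grastate.dat under the MariaDB data directory, and a value of -1 there means the position is unknown. In that case the position can usually be recovered by running mysqld_safe --wsrep-recover and reading the "Recovered position: <uuid>:<seqno>" line it logs. The following is only a minimal sketch of that lookup order, not the code of the galera resource agent; the function names are hypothetical and the inputs are passed in as text so that nothing depends on paths from this deployment.

```python
import re
from typing import Optional

# "seqno: <n>" line as written in grastate.dat
GRASTATE_SEQNO = re.compile(r"^seqno:\s*(-?\d+)\s*$", re.MULTILINE)
# "Recovered position: <uuid>:<seqno>" line logged by --wsrep-recover
RECOVERED_POS = re.compile(r"Recovered position:\s*([0-9a-fA-F-]+):(-?\d+)")


def seqno_from_grastate(text: str) -> Optional[int]:
    """Return the seqno stored in grastate.dat, or None if missing or -1."""
    match = GRASTATE_SEQNO.search(text)
    if match is None:
        return None
    seqno = int(match.group(1))
    return None if seqno < 0 else seqno


def seqno_from_recover_log(text: str) -> Optional[int]:
    """Return the seqno from the last 'Recovered position' line, if any."""
    matches = RECOVERED_POS.findall(text)
    if not matches:
        return None
    seqno = int(matches[-1][1])
    return None if seqno < 0 else seqno


def last_known_seqno(grastate_text: str, recover_log_text: str) -> Optional[int]:
    """Prefer grastate.dat; fall back to the --wsrep-recover output."""
    seqno = seqno_from_grastate(grastate_text)
    if seqno is not None:
        return seqno
    return seqno_from_recover_log(recover_log_text)
```

If both sources yield nothing (the function returns None), the node has no usable position to report, which matches the symptom described here.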

Version-Release number of selected component (if applicable):


How reproducible:


Steps to Reproduce:
1. Delete the stack and redeploy.
2. The error occurs after 1.5h.

Actual results:

Deployment fails as none of the OpenStack services can be installed properly.

Expected results:

Deployment succeeds

Additional info:

Comment 1 Fabio Massimo Di Nitto 2017-11-22 15:29:49 UTC
Please provide full sosreports.

Comment 2 Francisco Javier Lopez Y Grueber 2017-11-22 15:48:06 UTC
Created attachment 1357643 [details]
SOSREPORT MYSQL,PACEMAKER,COROSYNC

Comment 3 Francisco Javier Lopez Y Grueber 2017-11-22 16:04:41 UTC
Created attachment 1357648 [details]
output mysqld_safe --wsrep-recover

Comment 4 Francisco Javier Lopez Y Grueber 2017-11-22 16:19:15 UTC
Created attachment 1357662 [details]
oc2

Comment 5 Francisco Javier Lopez Y Grueber 2017-11-22 16:20:32 UTC
Created attachment 1357665 [details]
oc0

Comment 6 Francisco Javier Lopez Y Grueber 2017-11-22 16:24:44 UTC
Created attachment 1357666 [details]
oc1

Comment 7 Francisco Javier Lopez Y Grueber 2017-11-22 16:54:47 UTC
Created attachment 1357678 [details]
hosts

Comment 8 Michael Bayer 2017-11-22 20:43:13 UTC
We need complete sosreports uploaded via the customer portal (including ps output, all installed RPMs, etc.) so that we can pull them into collab-shell. We also need the full overcloud deploy command, as well as all configurations and Heat templates used to create the stack. A full directory listing of /var/lib/mysql would help as well.

Comment 9 Francisco Javier Lopez Y Grueber 2017-11-23 11:21:05 UTC
Created attachment 1358142 [details]
oc2

Comment 10 Francisco Javier Lopez Y Grueber 2017-11-23 11:21:44 UTC
Created attachment 1358143 [details]
oc1

Comment 11 Francisco Javier Lopez Y Grueber 2017-11-23 11:22:23 UTC
Created attachment 1358144 [details]
oc0

Comment 12 Francisco Javier Lopez Y Grueber 2017-11-23 11:23:09 UTC
Created attachment 1358145 [details]
templates in use

Comment 13 Francisco Javier Lopez Y Grueber 2017-11-23 11:26:51 UTC
Created attachment 1358148 [details]
Verification Undercloud Domain Settings

Domain-relevant settings verified on the undercloud.

Comment 14 Damien Ciabrini 2017-11-23 12:51:15 UTC
Francisco, the latest sosreports you uploaded lack files that are important for the investigation; they only include galera/haproxy/cluster logs.

We need _all_ the logs that sosreport can provide, e.g. running processes, network settings, etc. Could you get those uploaded?

Comment 15 Francisco Javier Lopez Y Grueber 2017-11-23 13:02:53 UTC
Hi, please see connected for full sosreports. These only contain rpm, yum, corosync, pacemaker, and system.

Comment 18 Martin Schuppert 2017-11-23 13:58:10 UTC
Most likely the issue is that the CloudDomain does not match what is configured for the dhcp_domain in nova.conf of the undercloud.
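
One way to check for this mismatch is to compare the two values directly. The sketch below is a rough illustration under stated assumptions: that CloudDomain is set as a plain "CloudDomain: value" line in one of the Heat environment files, and that dhcp_domain sits in the [DEFAULT] section of the undercloud's nova.conf. The function names are hypothetical, and the file contents are passed in as text rather than read from fixed paths.

```python
import configparser
import re
from typing import Optional

# Matches a plain "CloudDomain: value" line, optionally quoted
CLOUD_DOMAIN = re.compile(r"^\s*CloudDomain:\s*['\"]?([^'\"\s#]+)", re.MULTILINE)


def cloud_domain_from_env(env_yaml: str) -> Optional[str]:
    """Extract CloudDomain from the text of a Heat environment file."""
    match = CLOUD_DOMAIN.search(env_yaml)
    return match.group(1) if match else None


def dhcp_domain_from_nova_conf(nova_conf: str) -> Optional[str]:
    """Extract dhcp_domain from the [DEFAULT] section of nova.conf text."""
    parser = configparser.ConfigParser(interpolation=None, strict=False)
    parser.read_string(nova_conf)
    return parser.defaults().get("dhcp_domain")


def domains_match(env_yaml: str, nova_conf: str) -> bool:
    """True only if both values are present and equal (case-insensitive)."""
    cloud = cloud_domain_from_env(env_yaml)
    dhcp = dhcp_domain_from_nova_conf(nova_conf)
    if cloud is None or dhcp is None:
        return False
    return cloud.lower() == dhcp.lower()
```

A False result only says that the two values differ or that one of them could not be found in the given text; it does not by itself confirm the root cause.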

Comment 19 Damien Ciabrini 2017-11-27 13:44:08 UTC
Closing this particular bz: as per comment #18, it appears the deployment error was due to a misconfiguration.

