Bug 2116381 - OSP 17.0 - Galera resource fails to restart on specific node.
Summary: OSP 17.0 - Galera resource fails to restart on specific node.
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat OpenStack
Classification: Red Hat
Component: puppet-tripleo
Version: 17.0 (Wallaby)
Hardware: x86_64
OS: Linux
Priority: medium
Severity: high
Target Milestone: z1
Target Release: 17.0
Assignee: Damien Ciabrini
QA Contact: nlevinki
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2022-08-08 12:45 UTC by Julia Marciano
Modified: 2023-01-25 12:29 UTC

Fixed In Version: puppet-tripleo-14.2.3-0.20221122110114.41752a3.el9ost
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2023-01-25 12:28:51 UTC
Target Upstream Version:
Embargoed:


Links
System | ID | Private | Priority | Status | Summary | Last Updated
Launchpad | 1984264 | 0 | None | None | None | 2022-08-10 16:16:28 UTC
OpenStack gerrit | 852778 | 0 | None | MERGED | galera: make gcache recovery configurable | 2022-09-21 16:24:13 UTC
OpenStack gerrit | 858622 | 0 | None | MERGED | galera: make gcache recovery configurable | 2022-10-25 15:02:46 UTC
Red Hat Issue Tracker | OSP-18080 | 0 | None | None | None | 2022-08-08 12:55:30 UTC
Red Hat Product Errata | RHBA-2023:0271 | 0 | None | None | None | 2023-01-25 12:29:08 UTC

Description Julia Marciano 2022-08-08 12:45:02 UTC
Description of problem:
Reported by Florella fyanac.

Every time we try to restart the galera resource with "pcs resource restart ..." the restart fails, and the logs show the following: http://pastebin.test.redhat.com/1069412

It happens on one galera node.


Version-Release number of selected component (if applicable):


How reproducible:


Steps to Reproduce:
1.
2.
3.

Actual results:
Galera has failed on one node:
 * galera-bundle-2	(ocf:heartbeat:galera):	 FAILED Promoted controller-0 

Expected results:
Galera cluster is healthy - all galera resources are in 'Promoted' state.
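
A minimal sketch of how the actual and expected results above could be checked from captured status text. It assumes each replica line has the same shape as the line quoted in this report and that the last word on the line is the node name. The function name `failed_resources` and the healthy sample lines in the usage note are hypothetical, not taken from the logs.

```python
import re

# Matches replica lines shaped like the one quoted in this report, e.g.
#  * galera-bundle-2	(ocf:heartbeat:galera):	 FAILED Promoted controller-0
LINE_RE = re.compile(
    r"^\s*\*\s+(?P<name>\S+)\s+\((?P<agent>[^)]+)\):\s+(?P<state>.+?)\s*$"
)


def failed_resources(status_text, agent="ocf:heartbeat:galera"):
    """Return (resource, node) pairs whose state contains FAILED.

    Assumption: the last word of the state field is the node name.
    """
    failed = []
    for line in status_text.splitlines():
        match = LINE_RE.match(line)
        if not match or match.group("agent") != agent:
            continue  # not a replica line for the requested agent
        words = match.group("state").split()
        if "FAILED" in words:
            failed.append((match.group("name"), words[-1]))
    return failed
```

Given status text that contains the failing line from this report alongside healthy replica lines, the function returns only the `galera-bundle-2` entry; an empty list corresponds to the expected result.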

Additional info:

Comment 3 Luca Miccini 2022-08-08 14:52:01 UTC
This has been tracked down to corruption of the galera gcache file, which prevents one of the nodes from starting.
We will try to find a way to handle this condition, possibly in the pacemaker resource agent.
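
For illustration only: the linked gerrit changes are titled "galera: make gcache recovery configurable", and Galera has a provider option named `gcache.recover` that controls whether the gcache file is recovered at startup. The sketch below shows how such an option could be set inside a semicolon-separated `wsrep_provider_options` string. That the fix works through this option is an assumption, the helper `set_provider_option` is hypothetical, and the puppet-tripleo parameter that exposes the setting is not shown in this report.

```python
def set_provider_option(options, key, value):
    """Return a provider-options string with `key` set to `value`.

    `options` is a semicolon-separated list of key=value pairs.
    Any existing entry for `key` is dropped and the new one is appended.
    """
    parts = [p.strip() for p in options.split(";") if p.strip()]
    kept = [p for p in parts if p.split("=", 1)[0].strip() != key]
    kept.append(f"{key}={value}")
    return ";".join(kept)


# Illustrative value only, not taken from the affected deployment.
current = "gmcast.listen_addr=tcp://0.0.0.0:4567;gcache.recover=yes"
updated = set_provider_option(current, "gcache.recover", "no")
```

With recovery disabled, a node would not try to reuse a damaged gcache file on restart; confirming that behaviour against the Galera documentation for the deployed version is advisable.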

Comment 16 errata-xmlrpc 2023-01-25 12:28:51 UTC
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA.

For information on the advisory (Red Hat OpenStack Platform 17.0.1 bug fix and enhancement advisory), and where to find the updated files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2023:0271
