Bug 2156886

Summary: OSPdO 17.0: After a controller reset, Galera is in an unhealthy state
Product: Red Hat OpenStack
Reporter: pkomarov
Component: osp-director-operator-container
Assignee: Damien Ciabrini <dciabrin>
Status: NEW ---
QA Contact: pkomarov
Severity: medium
Docs Contact:
Priority: medium
Version: 17.0 (Wallaby)
CC: dciabrin, jmarcian, lmiccini
Target Milestone: ---
Keywords: Triaged
Target Release: ---
Flags: pkomarov: needinfo? (dciabrin)
       ifrangs: needinfo? (dciabrin)
Hardware: x86_64   
OS: Linux   
Whiteboard:
Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed:
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:

Description pkomarov 2022-12-29 08:36:17 UTC
Description of problem:
After a fresh OSPdO 17.0 deployment, Galera is in an unhealthy state:

  * Container bundle set: galera-bundle [cluster.common.tag/mariadb:pcmklatest]:
    * galera-bundle-0	(ocf:heartbeat:galera):	 Unpromoted controller-1
    * galera-bundle-1	(ocf:heartbeat:galera):	 FAILED Promoted controller-2 (blocked)
    * galera-bundle-2	(ocf:heartbeat:galera):	 Unpromoted controller-0
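
For anyone triaging similar output, the status lines above can be checked mechanically. The following is only a minimal sketch, not part of any shipped tooling: the function name unhealthy_galera_replicas is hypothetical, and it assumes that a healthy replica is reported as exactly "Promoted <node>", as in the later comment where the issue did not reproduce.

```python
import re

# Matches bundle replica lines from the cluster status output, for example:
#   * galera-bundle-1 (ocf:heartbeat:galera): FAILED Promoted controller-2 (blocked)
REPLICA_RE = re.compile(
    r"\*\s+(?P<name>galera-bundle-\d+)\s+\(ocf:heartbeat:galera\):\s+(?P<state>.+?)\s*$"
)


def unhealthy_galera_replicas(status_text):
    """Return (replica, state) pairs that are not cleanly Promoted.

    Hypothetical helper: "healthy" is assumed to mean the state is
    exactly "Promoted <node>" with no FAILED or (blocked) markers.
    """
    bad = []
    for line in status_text.splitlines():
        match = REPLICA_RE.search(line)
        if not match:
            continue  # not a galera replica line
        state = match.group("state")
        words = state.split()
        if len(words) != 2 or words[0] != "Promoted":
            bad.append((match.group("name"), state))
    return bad
```

Fed the three replica lines from this report, it would flag all three replicas, since two are Unpromoted and one is FAILED and blocked.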
 

Version-Release number of selected component (if applicable):


 DIRECTOR_OPERATOR_CSV_VERSION=17.0.35-17.0

osp_release_auto_version: 17.0-RHEL-9

osp_release_defaults:
  base_image_url: http://download.devel.redhat.com/brewroot/packages/rhel-guest-image/9.0/20221216.0/images/rhel-guest-image-9.0-20221216.0.x86_64.qcow2

Additional info:
sosreports and /var/log from all overcloud nodes are at: http://file.tlv.redhat.com/~pkomarov/sos_reports_2126730

Comment 2 pkomarov 2022-12-29 13:51:28 UTC
I did a pcs cluster restart on all controllers and reran the HA controller reboot tests.
This time the issue did not reproduce:
    * galera-bundle-0   (ocf:heartbeat:galera):  Promoted controller-1
    * galera-bundle-1   (ocf:heartbeat:galera):  Promoted controller-2
    * galera-bundle-2   (ocf:heartbeat:galera):  Promoted controller-0
I am not sure what the root cause is, though.