1975271 – Minor update does not restart HA resource when it is in failed state

Keywords:
Status:	CLOSED ERRATA
Alias:	None
Product:	Red Hat OpenStack
Classification:	Red Hat
Component:	openstack-tripleo-heat-templates
Sub Component:
Version:	16.2 (Train)
Hardware:	x86_64
OS:	Linux
Priority:	high
Severity:	medium
Target Milestone:	z2
Target Release:	16.2 (Train on RHEL 8.4)
Assignee:	OSP Team
QA Contact:	Joe H. Rahme
Docs Contact:
URL:
Whiteboard:
Depends On:
Blocks:

Reported:	2021-06-23 11:23 UTC by Damien Ciabrini
Modified:	2022-03-23 22:28 UTC
CC List:	3 users
Fixed In Version:	openstack-tripleo-heat-templates-11.6.1-2.20220116004909.64b2e88.el8ost
Doc Type:	If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed:	2022-03-23 22:28:33 UTC
Target Upstream Version:
Embargoed:

Links
System	ID	Priority	Status	Summary	Last Updated
OpenStack gerrit	795704	None	MERGED	HA minor update: fix bad pcs invocation	2021-06-23 11:23:51 UTC
Red Hat Issue Tracker	OSP-5410	None	None	None	2022-01-26 07:59:41 UTC
Red Hat Product Errata	RHSA-2022:0995	None	None	None	2022-03-23 22:28:57 UTC

Description Damien Ciabrini 2021-06-23 11:23:52 UTC

Description of problem:
During a minor update of an HA controller, a script called pacemaker_restart_bundle.sh is in charge of restarting the local instance of an HA resource after a configuration change.

The way the resource is restarted depends on the type of the resource and its current state in the cluster.

When the resource is in a failed state, the script invokes pcs with an invalid command line. pcs logs an error and the resource is not restarted as expected, e.g.:

    "stdout: Wed Jun  9 16:45:29 UTC 2021: openstack-cinder-volume is currently not running on 'controller-0', cleaning up its state to restart it if necessary",
        "",
        "stderr: Error: Specified option '--node' is not supported in this command"
    ]
}
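
For illustration only, here is a minimal sketch of how the cleanup call could be built so that pcs accepts it. It is not the code from pacemaker_restart_bundle.sh: the helper name cleanup_cmd is hypothetical, and the node=<name> form is an assumption based on the pcs 0.10 syntax shipped with RHEL 8, where '--node' is rejected as shown in the error above. The helper only prints the command, so it can be tried without a cluster.

```shell
#!/bin/bash
# Hypothetical helper, not taken from pacemaker_restart_bundle.sh.
# Prints the cleanup command for one resource on one node.
# Assumption: pcs 0.10 takes the node as "node=<name>" rather than
# "--node <name>", which is the option rejected in the error above.
cleanup_cmd() {
    local resource="$1" host="$2"
    printf 'pcs resource cleanup %s node=%s\n' "$resource" "$host"
}

# Print the command instead of running it, so no cluster is needed.
cleanup_cmd openstack-cinder-volume controller-0
```

On a real controller, the printed command would be reviewed and then run by hand; whether it clears the failed state depends on the pcs version in use.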


Version-Release number of selected component (if applicable):
16.2

How reproducible:
Whenever a resource is in a failed state on a node

Steps to Reproduce:
1. Make a clone resource fail to start on a controller; this blocks the resource and leaves it in a failed state.
2. Perform a minor update on that node.


Actual results:
The resource stays in failed state

Expected results:
The resource should have been given a chance to restart

Additional info:

Comment 13 errata-xmlrpc 2022-03-23 22:28:33 UTC

Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Moderate: Red Hat OpenStack Platform 16.2 (openstack-tripleo-heat-templates) security update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2022:0995
