Bug 1571549

Summary:	Updating pcs resource bundle names during OSP13 upgrade fails if resource doesn't exist
Product:	Red Hat OpenStack	Reporter:	Marios Andreou <mandreou>
Component:	openstack-tripleo-heat-templates	Assignee:	Marios Andreou <mandreou>
Status:	CLOSED ERRATA	QA Contact:	Yurii Prokulevych <yprokule>
Severity:	high	Docs Contact:
Priority:	high
Version:	13.0 (Queens)	CC:	ccamacho, dciabrin, mbultel, mburns, scohen
Target Milestone:	beta	Keywords:	Triaged
Target Release:	13.0 (Queens)	Flags:	scohen: needinfo+
Hardware:	Unspecified
OS:	Unspecified
Whiteboard:
Fixed In Version:	openstack-tripleo-heat-templates-8.0.2-5.el7ost	Doc Type:	If docs needed, set a value
Doc Text:		Story Points:	---
Clone Of:		Environment:
Last Closed:	2018-06-27 13:53:37 UTC	Type:	Bug
Regression:	---	Mount Type:	---
Documentation:	---	CRM:
Verified Versions:		Category:	---
oVirt Team:	---	RHEL 7.3 requirements from Atomic Host:
Cloudforms Team:	---	Target Upstream Version:
Embargoed:

Description Marios Andreou 2018-04-25 06:20:46 UTC

Description of problem:
The fix for BZ 1564449 at https://review.openstack.org/#/q/Ic87a66753b104b9f15db70fdccbd66d88cef94df
allows us to update the container image name used by pcs bundle resources if it is changed as part of the upgrade configuration. However, if the upgrade is interrupted, the target pacemaker resource bundle may not have been created yet, so a re-run of the upgrade tasks may fail because the cluster resource doesn't exist. This change groups the stop/update/start tasks for the bundle resource and adds a new conditional that checks whether the cluster resource exists before trying to update the container image being used.
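
The guard described above can be sketched roughly as follows. This is only an illustration in Python, not the actual change, which lives in the tripleo-heat-templates upgrade tasks. The function names, the injectable `run` callable, and the image value used in the usage note are hypothetical, and the sketch assumes that `pcs resource show <name>` exits non-zero when the resource is not defined.

```python
import subprocess
from typing import Callable, List, Tuple

# A runner takes an argv list and returns (exit code, combined output).
Runner = Callable[[List[str]], Tuple[int, str]]


def run_command(argv: List[str]) -> Tuple[int, str]:
    """Run a command and return its exit code and combined stdout/stderr."""
    proc = subprocess.run(
        argv,
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,
        universal_newlines=True,
    )
    return proc.returncode, proc.stdout


def resource_exists(name: str, run: Runner = run_command) -> bool:
    """Return True if the cluster resource is defined.

    Assumption: `pcs resource show <name>` exits non-zero for an
    undefined resource.
    """
    rc, _ = run(["pcs", "resource", "show", name])
    return rc == 0


def update_bundle_image(name: str, image: str, run: Runner = run_command) -> bool:
    """Disable, update and re-enable a bundle, but only if it exists.

    Returns False (and does nothing else) when the resource is missing,
    so a re-run of an interrupted upgrade does not fail at this step.
    """
    if not resource_exists(name, run):
        return False
    steps = [
        ["pcs", "resource", "disable", name],
        ["pcs", "resource", "bundle", "update", name, "container", "image=" + image],
        ["pcs", "resource", "enable", name],
    ]
    for argv in steps:
        rc, output = run(argv)
        if rc != 0:
            raise RuntimeError("command failed: %s\n%s" % (" ".join(argv), output))
    return True
```

For example, `update_bundle_image("openstack-cinder-volume", "registry.example/cinder-volume:pcmklatest")` would return False without touching the cluster when the bundle is not defined, instead of retrying the disable step until it fails. The image reference here is a placeholder.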

How reproducible:
Every time

Steps to Reproduce:
1. Deploy OSP12
2. Before starting the upgrade, take a bundle resource down (e.g. with pcs, or even remove or stop the cluster altogether). Note that resources could already be down if this is a re-run of the upgrade.
3. Upgrade to OSP13 and change name of container image for a bundle resource (e.g. rabbit).

Actual results:

u'TASK [Disable the cinder_volume cluster resource before container upgrade] *****',
u'FAILED - RETRYING: Disable the cinder_volume cluster resource before container upgrade (5 retries left).']
[u'FAILED - RETRYING: Disable the cinder_volume cluster resource before container upgrade (4 retries left).',
u'FAILED - RETRYING: Disable the cinder_volume cluster resource before container upgrade (3 retries left).',
u'FAILED - RETRYING: Disable the cinder_volume cluster resource before container upgrade (2 retries left).',
u'FAILED - RETRYING: Disable the cinder_volume cluster resource before container upgrade (1 retries left).']
[u'fatal: [192.168.24.14]: FAILED! => {"attempts": 5, "changed": false, "error": "Error: resource/clone/master/group/bundle \'openstack-cinder-volume\' does not exist\\n", "msg": "Failed, to set the resource openstack-cinder-volume to the state disable", "output": "", "rc": 1}',
u'',
u'PLAY RECAP *********************************************************************',
u'192.168.24.14 : ok=124 changed=76 unreachable=0 failed=1 ',

more debug info: some selected tasks I could pick out of the logs:

Expected results:

The upgrade shouldn't try to update a cluster resource if it isn't currently defined/active.

Comment 3 Marios Andreou 2018-04-27 13:28:20 UTC

Information for build openstack-tripleo-heat-templates-8.0.2-5.el7ost https://brewweb.engineering.redhat.com/brew/buildinfo?buildID=679206

Comment 9 Yurii Prokulevych 2018-05-31 12:11:39 UTC

Verified with openstack-tripleo-heat-templates-8.0.2-28.el7ost.noarch

Resource rabbitmq-bundle was disabled during one run and completely removed during another one.

Comment 11 errata-xmlrpc 2018-06-27 13:53:37 UTC

Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHEA-2018:2086