Bug 1571549 - Updating pcs resource bundle names during OSP13 upgrade fails if resource doesn't exist
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat OpenStack
Classification: Red Hat
Component: openstack-tripleo-heat-templates
Version: 13.0 (Queens)
Hardware: Unspecified
OS: Unspecified
Priority: high
Severity: high
Target Milestone: beta
Target Release: 13.0 (Queens)
Assignee: Marios Andreou
QA Contact: Yurii Prokulevych
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2018-04-25 06:20 UTC by Marios Andreou
Modified: 2018-06-27 13:55 UTC (History)

Fixed In Version: openstack-tripleo-heat-templates-8.0.2-5.el7ost
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2018-06-27 13:53:37 UTC
Target Upstream Version:
scohen: needinfo+


Links
System | ID | Priority | Status | Summary | Last Updated
OpenStack gerrit | 563588 | None | stable/queens: MERGED | tripleo-heat-templates: Make pcs resource bundle image name update tolerant of rerun (Ifc6f78d73bc71a5b5edfadfbfacaa3560... | 2018-04-27 19:03:36 UTC
Red Hat Product Errata | RHEA-2018:2086 | None | None | None | 2018-06-27 13:55:00 UTC

Description Marios Andreou 2018-04-25 06:20:46 UTC
Description of problem:
The fix for BZ 1564449 at https://review.openstack.org/#/q/Ic87a66753b104b9f15db70fdccbd66d88cef94df 
lets us update the container image name for pcs bundle resources if it changes as part of the upgrade configuration. If the upgrade is interrupted, however, the target pacemaker resource bundle may not have been created yet. The change tracked here groups the stop/update/start tasks for the bundle resource and adds a new conditional that checks whether the cluster resource exists before trying to update the container image in use. Without that check, a re-run of the upgrade tasks may fail if the cluster resource doesn't exist.
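
A minimal sketch of the guarded stop/update/start sequence described above, written in Python for illustration only. The actual fix lives in the Ansible upgrade tasks of tripleo-heat-templates and may look different. The function name `update_bundle_image`, the `runner` parameter, and the exact `pcs` invocations (`pcs resource show`, `pcs resource bundle update ... container image=...`) are assumptions about the pcs version on the overcloud nodes, not something taken from the patch.

```python
import subprocess
from typing import Callable, List

# A runner takes a command (list of arguments) and returns its exit code.
Runner = Callable[[List[str]], int]


def run(cmd: List[str]) -> int:
    """Default runner: execute the command and return its exit code."""
    return subprocess.call(cmd)


def update_bundle_image(resource: str, image: str, runner: Runner = run) -> bool:
    """Update the container image of a pacemaker bundle, if the bundle exists.

    Returns False (and touches nothing) when the cluster resource is not
    defined, so a re-run of the upgrade does not fail on a missing bundle.
    """
    # Assumption: `pcs resource show <name>` exits non-zero for an unknown resource.
    if runner(["pcs", "resource", "show", resource]) != 0:
        return False

    # Grouped stop / update / start, only reached when the resource exists.
    steps = [
        ["pcs", "resource", "disable", resource],
        ["pcs", "resource", "bundle", "update", resource,
         "container", "image=" + image],
        ["pcs", "resource", "enable", resource],
    ]
    for cmd in steps:
        if runner(cmd) != 0:
            raise RuntimeError("command failed: " + " ".join(cmd))
    return True
```

Passing the runner in as a parameter keeps the existence check separate from command execution, so the skip-if-missing behaviour can be exercised without a running cluster.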

How reproducible:
Every time

Steps to Reproduce:
1. Deploy OSP12
2. Before starting the upgrade, take a bundle resource down (e.g. using pcs, or even remove or stop the cluster altogether). Note that resources could already be down if this is a rerun of the upgrade.
3. Upgrade to OSP13 and change the name of the container image for a bundle resource (e.g. rabbit).


Actual results:

 u'TASK [Disable the cinder_volume cluster resource before container upgrade] *****',
 u'FAILED - RETRYING: Disable the cinder_volume cluster resource before container upgrade (5 retries left).']
[u'FAILED - RETRYING: Disable the cinder_volume cluster resource before container upgrade (4 retries left).',
 u'FAILED - RETRYING: Disable the cinder_volume cluster resource before container upgrade (3 retries left).',
 u'FAILED - RETRYING: Disable the cinder_volume cluster resource before container upgrade (2 retries left).',
 u'FAILED - RETRYING: Disable the cinder_volume cluster resource before container upgrade (1 retries left).']
[u'fatal: [192.168.24.14]: FAILED! => {"attempts": 5, "changed": false, "error": "Error: resource/clone/master/group/bundle \'openstack-cinder-volume\' does not exist\\n", "msg": "Failed, to set the resource openstack-cinder-volume to the state disable", "output": "", "rc": 1}',
 u'',
 u'PLAY RECAP *********************************************************************',
 u'192.168.24.14              : ok=124  changed=76   unreachable=0    failed=1   ',

more debug info: some selected tasks I could pick out of the logs:

Expected results:

The upgrade tasks shouldn't try to update a cluster resource that isn't currently defined/active.

Comment 3 Marios Andreou 2018-04-27 13:28:20 UTC
Information for build openstack-tripleo-heat-templates-8.0.2-5.el7ost https://brewweb.engineering.redhat.com/brew/buildinfo?buildID=679206

Comment 9 Yurii Prokulevych 2018-05-31 12:11:39 UTC
Verified with openstack-tripleo-heat-templates-8.0.2-28.el7ost.noarch

Resource rabbitmq-bundle was disabled during one run and completely removed during another one.

Comment 11 errata-xmlrpc 2018-06-27 13:53:37 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHEA-2018:2086

