1280094 – Uncaught exceptions can leave stacks hanging UPDATE_IN_PROGRESS

Summary: Uncaught exceptions can leave stacks hanging UPDATE_IN_PROGRESS

Keywords:
Status:	CLOSED ERRATA
Alias:	None
Product:	Red Hat OpenStack
Classification:	Red Hat
Component:	openstack-heat
Sub Component:
Version:	7.0 (Kilo)
Hardware:	Unspecified
OS:	Unspecified
Priority:	unspecified
Severity:	unspecified
Target Milestone:	z4
Target Release:	7.0 (Kilo)
Assignee:	Ryan Brown
QA Contact:	Amit Ugol
Docs Contact:
URL:
Whiteboard:
Depends On:
Blocks:

Reported:	2015-11-10 22:48 UTC by Steve Baker
Modified:	2023-02-22 23:02 UTC
CC List:	12 users
Fixed In Version:	openstack-heat-2015.1.2-3.el7ost
Doc Type:	Bug Fix
Doc Text:
Clone Of:
Clones:	1293117 1293316 1293421
Environment:
Last Closed:	2016-02-18 16:41:50 UTC
Target Upstream Version:
Embargoed:

Links
System	ID	Priority	Status	Summary	Last Updated
Launchpad	1492433	None	None	None	Never
OpenStack gerrit	244982	None	None	None	Never
Red Hat Issue Tracker	OSP-16725	None	None	None	2022-07-09 08:12:30 UTC
Red Hat Product Errata	RHSA-2016:0266	normal	SHIPPED_LIVE	Moderate: openstack-heat bug fix and security advisory	2016-02-18 21:41:02 UTC

Description Steve Baker 2015-11-10 22:48:58 UTC

If an exception occurs after a stack has been put into UPDATE_IN_PROGRESS, but outside the tasks that perform the actual resource changes, then the operation can be aborted, leaving the stack stuck in the IN_PROGRESS state.

While this shouldn't happen (and is always a bug if it does), we have encountered bugs of this type when attempting to upgrade rhos-director from 7.0 to 7.x.

We should ensure that any exception raised after the stack goes into the IN_PROGRESS state results in the stack being moved to FAILED with the reason.
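
A minimal sketch of the behaviour requested above, not the actual Heat patch. The `Stack` class, its `state_set` method, the `fail_on_uncaught` guard and the `update` helper are all hypothetical names made up for illustration. The idea is simply that everything done after the stack enters IN_PROGRESS runs inside a guard that records FAILED and the reason before re-raising.

```python
import contextlib


class Stack:
    """Hypothetical stand-in for a stack record; not Heat's real class."""

    def __init__(self, name):
        self.name = name
        self.action = "CREATE"
        self.status = "COMPLETE"
        self.status_reason = ""

    def state_set(self, action, status, reason):
        self.action, self.status, self.status_reason = action, status, reason


@contextlib.contextmanager
def fail_on_uncaught(stack, action):
    """Move the stack to FAILED if any exception escapes while IN_PROGRESS."""
    stack.state_set(action, "IN_PROGRESS", f"Stack {action} started")
    try:
        yield stack
    except Exception as exc:
        # Record the reason so the stack does not stay IN_PROGRESS forever,
        # then re-raise so the underlying bug is still visible in the logs.
        stack.state_set(action, "FAILED", f"{type(exc).__name__}: {exc}")
        raise
    else:
        # Only mark success if the body did not already set a final state.
        if stack.status == "IN_PROGRESS":
            stack.state_set(action, "COMPLETE",
                            f"Stack {action} completed successfully")


def update(stack, prepare):
    """Run hypothetical pre-task work (e.g. validation) under the guard."""
    with fail_on_uncaught(stack, "UPDATE"):
        prepare(stack)
```

With this guard in place, a caller whose `prepare` step raises would find the stack in UPDATE_FAILED with the exception text as the reason, rather than left in UPDATE_IN_PROGRESS.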

Comment 2 Steve Baker 2015-11-10 22:50:40 UTC

If the upstream fix shapes up in time, I'll be asking that this be a blocker for 7.2.

Comment 4 Zane Bitter 2015-12-03 00:07:16 UTC

Oops: https://bugs.launchpad.net/heat/+bug/1521881

It appears the patch was harmless, but did not fix the issue. (This is really hard to test, because it requires another bug to trigger it.)

Comment 6 Steve Baker 2015-12-14 20:28:13 UTC

We really need the heat-engine.log stack trace of the exception which caused this so that we can replicate it locally by deliberately raising a similar exception.

Comment 7 Amit Ugol 2015-12-16 14:20:40 UTC

Alas, no. Local issues at the lab broke the VMs they were in. I am re-creating the system to be exactly as it was.
If you have a system ready for testing (7.0 GA), then please try as well, but do not apply the workaround for bug #1272347.

Comment 10 Jaromir Coufal 2015-12-16 16:46:15 UTC

Doc/workaround works for me.

Comment 12 Amit Ugol 2015-12-16 16:58:38 UTC

To clear up the confusion: there were a few issues here that caused a specific state while updating, which ended in a failure. Those issues seem to be fixed here, with only one (?) issue remaining, which I hit when I missed a workaround (see comment 7). Since we MUST do the workaround anyway when upgrading from 7.0 GA, and from what I have tested thus far, it is safe to assume that we don't hit the original issue.

Comment 14 Amit Ugol 2015-12-20 10:34:43 UTC

Yes, and I'm testing it. If the issue does not reproduce over a few more attempts, I will mark this as verified.

Comment 16 Amit Ugol 2015-12-20 12:28:10 UTC

Marking this as verified because it is enough for RHOS 7. Cloning it to RHOS 8.

Comment 18 errata-xmlrpc 2016-02-18 16:41:50 UTC

Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://rhn.redhat.com/errata/RHSA-2016-0266.html
