Bug 1280094 - Uncaught exceptions can leave stacks hanging UPDATE_IN_PROGRESS
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat OpenStack
Classification: Red Hat
Component: openstack-heat
Version: 7.0 (Kilo)
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: unspecified
Target Milestone: z4
Target Release: 7.0 (Kilo)
Assignee: Ryan Brown
QA Contact: Amit Ugol
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2015-11-10 22:48 UTC by Steve Baker
Modified: 2023-02-22 23:02 UTC

Fixed In Version: openstack-heat-2015.1.2-3.el7ost
Doc Type: Bug Fix
Doc Text:
Clone Of:
Clones: 1293117 1293316 1293421
Environment:
Last Closed: 2016-02-18 16:41:50 UTC
Target Upstream Version:
Embargoed:


Links
System ID Private Priority Status Summary Last Updated
Launchpad 1492433 0 None None None Never
OpenStack gerrit 244982 0 None None None Never
Red Hat Issue Tracker OSP-16725 0 None None None 2022-07-09 08:12:30 UTC
Red Hat Product Errata RHSA-2016:0266 0 normal SHIPPED_LIVE Moderate: openstack-heat bug fix and security advisory 2016-02-18 21:41:02 UTC

Description Steve Baker 2015-11-10 22:48:58 UTC
If an exception occurs after a stack has been put into UPDATE_IN_PROGRESS, but outside the tasks that perform the actual resource changes, the operation can be aborted, leaving the stack stuck in the IN_PROGRESS state.

While this shouldn't happen (and is always a bug if it does), we have encountered bugs of this type when attempting to upgrade rhos-director from 7.0 to 7.x.

We should ensure that any exception raised after the stack enters the IN_PROGRESS state moves it to FAILED, with the exception recorded as the reason.
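
A minimal sketch of the behaviour requested above, not the actual upstream patch. The `Stack` class, its `state_set` method, and the `fail_on_exception` guard are hypothetical stand-ins written for illustration; they only show the idea of wrapping everything that runs after the IN_PROGRESS transition so that an unexpected exception ends in FAILED with a reason.

```python
import contextlib


class Stack:
    """Hypothetical stand-in for a stack record; not Heat's real class."""

    def __init__(self, name):
        self.name = name
        self.action = "CREATE"
        self.status = "COMPLETE"
        self.status_reason = ""

    def state_set(self, action, status, reason):
        # Store the new action/status pair together with a readable reason.
        self.action, self.status, self.status_reason = action, status, reason

    @property
    def state(self):
        return f"{self.action}_{self.status}"


@contextlib.contextmanager
def fail_on_exception(stack, action):
    """Set IN_PROGRESS on entry; set FAILED if the guarded body raises."""
    stack.state_set(action, "IN_PROGRESS", f"Stack {action} started")
    try:
        yield stack
    except Exception as exc:
        # Record why the operation died, then re-raise so the underlying
        # bug still shows up in the engine log.
        stack.state_set(action, "FAILED", f"{type(exc).__name__}: {exc}")
        raise
    else:
        stack.state_set(action, "COMPLETE", f"Stack {action} completed")


stack = Stack("overcloud")
try:
    with fail_on_exception(stack, "UPDATE"):
        # Simulates a bug in pre-task setup code (assumed example).
        raise KeyError("missing parameter")
except KeyError:
    pass
print(stack.state, "-", stack.status_reason)
```

Re-raising after recording the failure keeps the original traceback available for debugging while still leaving the stack in a state from which another update can be attempted.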

Comment 2 Steve Baker 2015-11-10 22:50:40 UTC
If the upstream fix shapes up in time, I'll be asking that this be a blocker for 7.2.

Comment 4 Zane Bitter 2015-12-03 00:07:16 UTC
Oops: https://bugs.launchpad.net/heat/+bug/1521881

It appears the patch was harmless, but did not fix the issue. (This is really hard to test, because it requires another bug to trigger it.)

Comment 6 Steve Baker 2015-12-14 20:28:13 UTC
We really need the heat-engine.log stack trace of the exception which caused this so that we can replicate it locally by deliberately raising a similar exception.
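
One way to do the local replication described above is fault injection with `unittest.mock`. The sketch below is an assumption-laden illustration: the `Engine` class and its `_prepare` and `update` methods are hypothetical and are not Heat's API. It only demonstrates forcing a chosen exception at a point after the IN_PROGRESS transition and observing the state left behind.

```python
from unittest import mock


class Engine:
    """Hypothetical engine; method names are illustrative, not Heat's API."""

    def __init__(self):
        self.state = "CREATE_COMPLETE"

    def _prepare(self):
        # Pre-task setup that normally succeeds.
        return {}

    def update(self):
        self.state = "UPDATE_IN_PROGRESS"
        self._prepare()  # an uncaught exception here strands the state
        self.state = "UPDATE_COMPLETE"


def reproduce(exc):
    """Force _prepare() to raise exc and return the state left behind."""
    engine = Engine()
    with mock.patch.object(engine, "_prepare", side_effect=exc):
        try:
            engine.update()
        except type(exc):
            pass
    return engine.state


print(reproduce(RuntimeError("boom")))  # prints UPDATE_IN_PROGRESS
```

With a real stack trace in hand, the injected exception type and the patched call site would be chosen to match the trace rather than the placeholders used here.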

Comment 7 Amit Ugol 2015-12-16 14:20:40 UTC
Alas, no. Local issues at the lab broke the VMs they were in. I am re-creating the system to be exactly as it was.
If you have a 7.0 GA system ready for testing, please try as well, but do not apply the workaround for bug #1272347.

Comment 10 Jaromir Coufal 2015-12-16 16:46:15 UTC
Doc/workaround works for me.

Comment 12 Amit Ugol 2015-12-16 16:58:38 UTC
To clear up the confusion: a few issues here caused a specific state while updating, which ended in a failure. Those issues seem to be fixed, with only one (?) issue remaining, which I hit when I missed a workaround (see comment 7). Since we MUST apply the workaround anyway when upgrading from 7.0 GA, and given what I have tested thus far, it is safe to assume that we don't hit the original issue.

Comment 14 Amit Ugol 2015-12-20 10:34:43 UTC
Yes, and I'm testing it. If the issue does not reproduce after a few more attempts, I will mark this as verified.

Comment 16 Amit Ugol 2015-12-20 12:28:10 UTC
Marking this as verified because it is enough for RHOS 7. Cloning it to RHOS 8.

Comment 18 errata-xmlrpc 2016-02-18 16:41:50 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://rhn.redhat.com/errata/RHSA-2016-0266.html

