Bug 1280094 - Uncaught exceptions can leave stacks hanging UPDATE_IN_PROGRESS
Product: Red Hat OpenStack
Classification: Red Hat
Component: openstack-heat
Version: 7.0 (Kilo)
Hardware: Unspecified OS: Unspecified
Priority: unspecified Severity: unspecified
Target Milestone: z4
Target Release: 7.0 (Kilo)
Assigned To: Ryan Brown
QA Contact: Amit Ugol
Keywords: ZStream
Reported: 2015-11-10 17:48 EST by Steve Baker
Modified: 2018-02-08 06:02 EST
Fixed In Version: openstack-heat-2015.1.2-3.el7ost
Doc Type: Bug Fix
Clone Of:
: 1293117 1293316 1293421
Last Closed: 2016-02-18 11:41:50 EST
Type: Bug


External Trackers
Tracker           ID       Priority  Status  Summary  Last Updated
Launchpad         1492433  None      None    None     Never
OpenStack gerrit  244982   None      None    None     Never

Description Steve Baker 2015-11-10 17:48:58 EST
If an exception occurs after a stack has been put into UPDATE_IN_PROGRESS, but outside the tasks that perform the actual resource changes, the operation can be aborted, leaving the stack stuck in the UPDATE_IN_PROGRESS state.

While this shouldn't happen (and is always a bug if it does), we have encountered bugs of this type when attempting to upgrade rhos-director from 7.0 to 7.x.

We should ensure that any exception raised after the stack goes into the IN_PROGRESS state results in the stack being moved to FAILED, with the reason recorded.
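
For illustration only, here is a minimal sketch of the behaviour requested above. It is not Heat's code: the `Stack` class, `set_state`, `update_stack`, and the `prepare`/`run_tasks` callables are all hypothetical stand-ins. The idea is that everything that runs after the transition to IN_PROGRESS sits inside one guard, so an exception raised before the resource tasks start still moves the stack to FAILED with a reason.

```python
class Stack:
    """Hypothetical stand-in for a stack record; not Heat's Stack class."""

    def __init__(self, name):
        self.name = name
        self.action = "CREATE"
        self.status = "COMPLETE"
        self.status_reason = ""

    def set_state(self, action, status, reason):
        # Record the action, status and human-readable reason together.
        self.action = action
        self.status = status
        self.status_reason = reason

    @property
    def state(self):
        return f"{self.action}_{self.status}"


def update_stack(stack, prepare, run_tasks):
    """Run an update; any exception after IN_PROGRESS ends in FAILED.

    `prepare` and `run_tasks` are assumed callables taking the stack:
    the first models pre-task work, the second the resource changes.
    """
    stack.set_state("UPDATE", "IN_PROGRESS", "Stack update started")
    try:
        prepare(stack)    # pre-task work, where the stuck state originated
        run_tasks(stack)  # the actual resource changes
    except Exception as exc:
        # Never leave the stack in IN_PROGRESS: record why it failed.
        stack.set_state("UPDATE", "FAILED", f"{type(exc).__name__}: {exc}")
        return False
    stack.set_state("UPDATE", "COMPLETE", "Stack update completed")
    return True


def broken_prepare(stack):
    # Simulates a bug that fires before any resource task runs.
    raise ValueError("bad template")


stack = Stack("overcloud")
update_stack(stack, broken_prepare, lambda s: None)
print(stack.state, "-", stack.status_reason)
```

Whether to swallow the exception (as in this sketch) or re-raise it after recording the FAILED state is a design choice; the sketch returns a boolean only to keep the example short.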
Comment 2 Steve Baker 2015-11-10 17:50:40 EST
If the upstream fix shapes up in time, I'll be asking that this be a blocker for 7.2.
Comment 4 Zane Bitter 2015-12-02 19:07:16 EST
Oops: https://bugs.launchpad.net/heat/+bug/1521881

It appears the patch was harmless, but did not fix the issue. (This is really hard to test, because it requires another bug to trigger it.)
Comment 6 Steve Baker 2015-12-14 15:28:13 EST
We really need the heat-engine.log stack trace of the exception which caused this so that we can replicate it locally by deliberately raising a similar exception.
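
As a rough illustration of what "deliberately raising a similar exception" could look like, the sketch below injects a fault into a toy, unguarded update path using `unittest.mock`. `ToyEngine`, `_load_new_template`, and `reproduce` are hypothetical names invented for this example; they do not correspond to Heat internals, and the real injection point would depend on the stack trace requested above.

```python
from unittest import mock


class ToyEngine:
    """Hypothetical, deliberately unguarded update path (not Heat code)."""

    def __init__(self):
        self.state = "CREATE_COMPLETE"

    def _load_new_template(self):
        # Stand-in for work done before any resource task runs.
        return {}

    def update(self):
        self.state = "UPDATE_IN_PROGRESS"
        self._load_new_template()  # an exception here skips the line below
        self.state = "UPDATE_COMPLETE"


def reproduce(exc):
    """Inject `exc` into the pre-task step and return the resulting state."""
    engine = ToyEngine()
    with mock.patch.object(engine, "_load_new_template", side_effect=exc):
        try:
            engine.update()
        except type(exc):
            pass  # the caller sees the error, but the state is never reset
    return engine.state


print(reproduce(RuntimeError("injected failure")))
```

In this toy model the returned state stays at UPDATE_IN_PROGRESS, which mirrors the symptom described in this bug.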
Comment 7 Amit Ugol 2015-12-16 09:20:40 EST
Alas, no. Local issues at the lab broke the VMs they were in. I am re-creating the system exactly as it was.
If you have a system ready for testing (7.0 GA), please try as well, but do not apply the workaround for bug #1272347.
Comment 10 Jaromir Coufal 2015-12-16 11:46:15 EST
Doc/workaround works for me.
Comment 12 Amit Ugol 2015-12-16 11:58:38 EST
To clear up the confusion: several issues here led to a specific state during the update, which ended in a failure. Those issues appear to be fixed, with only one (?) issue remaining, which I hit when I missed a workaround (see comment 7). Since we MUST apply the workaround anyway when upgrading from 7.0 GA, and based on what I have tested so far, it is safe to assume that we no longer hit the original issue.
Comment 14 Amit Ugol 2015-12-20 05:34:43 EST
Yes, and I'm testing it. If the issue does not reproduce over a few more runs, I will mark this as verified.
Comment 16 Amit Ugol 2015-12-20 07:28:10 EST
Marking this as verified because it is enough for RHOS 7. Cloning it to RHOS 8.
Comment 18 errata-xmlrpc 2016-02-18 11:41:50 EST
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.