Note: This bug is displayed in read-only format because the product is no longer active in Red Hat Bugzilla.

Bug 1280094

Summary: Uncaught exceptions can leave stacks hanging UPDATE_IN_PROGRESS
Product: Red Hat OpenStack Reporter: Steve Baker <sbaker>
Component: openstack-heat Assignee: Ryan Brown <rybrown>
Status: CLOSED ERRATA QA Contact: Amit Ugol <augol>
Severity: unspecified Docs Contact:
Priority: unspecified    
Version: 7.0 (Kilo) CC: augol, jcoufal, jschluet, jstransk, kbasil, mburns, mcornea, rhel-osp-director-maint, sbaker, sclewis, shardy, yeylon
Target Milestone: z4 Keywords: ZStream
Target Release: 7.0 (Kilo)   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version: openstack-heat-2015.1.2-3.el7ost Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of:
: 1293117 1293316 1293421 Environment:
Last Closed: 2016-02-18 16:41:50 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:

Description Steve Baker 2015-11-10 22:48:58 UTC
If an exception occurs after a stack has been put into UPDATE_IN_PROGRESS, but outside the tasks that perform the actual resource changes, then the operation can be aborted, leaving the stack stuck in the IN_PROGRESS state.

While this shouldn't happen (and is always a bug if it does), we have encountered bugs of this type when attempting to upgrade rhos-director from 7.0 to 7.x.

We should ensure that any exception that happens after the stack goes into the IN_PROGRESS state results in the stack being moved to FAILED with the reason recorded.
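
A minimal sketch of the behaviour requested above, assuming a toy stack object. ToyStack, fail_on_exception and update are hypothetical names used for illustration only; this is not Heat's actual code or the upstream patch. The point is that the guard wraps everything that runs after the state change, not just the resource tasks.

```python
import contextlib

IN_PROGRESS, COMPLETE, FAILED = "IN_PROGRESS", "COMPLETE", "FAILED"


class ToyStack:
    """Hypothetical stand-in for a stack; not Heat's real Stack class."""

    def __init__(self):
        self.action = None
        self.status = None
        self.status_reason = ""

    def state_set(self, action, status, reason):
        self.action, self.status, self.status_reason = action, status, reason


@contextlib.contextmanager
def fail_on_exception(stack, action):
    """Move the stack to FAILED if anything raises while it is IN_PROGRESS."""
    stack.state_set(action, IN_PROGRESS, "%s started" % action)
    try:
        yield stack
    except Exception as exc:
        # Record the reason, then re-raise so the caller still sees the error.
        stack.state_set(action, FAILED, "%s: %s" % (type(exc).__name__, exc))
        raise
    else:
        stack.state_set(action, COMPLETE, "%s completed" % action)


def update(stack, prepare, run_tasks):
    # Both the preparation step and the resource tasks sit inside the guard,
    # so a failure in either one leaves the stack in FAILED, not IN_PROGRESS.
    with fail_on_exception(stack, "UPDATE"):
        prepare()
        run_tasks()
```

With this shape, an exception raised in prepare() (the kind of code path described above, outside the resource tasks) still ends with the stack in FAILED and the exception text in status_reason.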

Comment 2 Steve Baker 2015-11-10 22:50:40 UTC
If the upstream fix shapes up in time, I'll be asking that this be a blocker for 7.2.

Comment 4 Zane Bitter 2015-12-03 00:07:16 UTC
Oops: https://bugs.launchpad.net/heat/+bug/1521881

It appears the patch was harmless, but did not fix the issue. (This is really hard to test, because it requires another bug to trigger it.)

Comment 6 Steve Baker 2015-12-14 20:28:13 UTC
We really need the heat-engine.log stack trace of the exception which caused this so that we can replicate it locally by deliberately raising a similar exception.
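
For illustration, deliberately raising an exception to replicate this kind of failure could look roughly like the sketch below. ToyEngine, its _prepare method and update_with_injected_fault are hypothetical names, not Heat's engine API; the only real library call is unittest.mock.patch.object, which is used to make an internal step raise a chosen exception.

```python
from unittest import mock


class ToyEngine:
    """Hypothetical engine; the method names are illustrative, not Heat's."""

    def __init__(self):
        self.status = "UPDATE_COMPLETE"
        self.reason = ""

    def _prepare(self):
        # Stands in for whatever step raised in the real stack trace.
        pass

    def update(self):
        self.status = "UPDATE_IN_PROGRESS"
        try:
            self._prepare()
        except Exception as exc:
            self.status = "UPDATE_FAILED"
            self.reason = str(exc)
            return
        self.status = "UPDATE_COMPLETE"


def update_with_injected_fault(engine, exc):
    """Run an update while forcing the internal step to raise exc."""
    with mock.patch.object(engine, "_prepare", side_effect=exc):
        engine.update()
    return engine.status
```

Which step to patch and which exception to raise would have to come from the real heat-engine.log trace; the sketch only shows the mechanism.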

Comment 7 Amit Ugol 2015-12-16 14:20:40 UTC
Alas, no. Local issues at the lab broke the VMs they were in. I am re-creating the system to be exactly as it was.
If you have a system ready for testing (7.0 GA), then please try as well, but do not apply the workaround for bug #1272347.

Comment 10 Jaromir Coufal 2015-12-16 16:46:15 UTC
Doc/workaround works for me.

Comment 12 Amit Ugol 2015-12-16 16:58:38 UTC
To clear up the confusion: there were a few issues here that caused a specific state while updating, which ended in a failure. Those issues seem to be fixed here, with only one (?) issue remaining, which I hit when I missed a workaround (see comment 7). Since we MUST do the workaround anyway when upgrading from 7.0 GA, and given what I have tested thus far, it is safe to assume that we don't hit the original issue.

Comment 14 Amit Ugol 2015-12-20 10:34:43 UTC
Yes, and I'm testing it. If the issue does not reproduce after a few more attempts, I will mark this as verified.

Comment 16 Amit Ugol 2015-12-20 12:28:10 UTC
Marking this as verified because it is enough for rhos 7. Cloning it to rhos 8.

Comment 18 errata-xmlrpc 2016-02-18 16:41:50 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://rhn.redhat.com/errata/RHSA-2016-0266.html