Cloned from Launchpad blueprint https://blueprints.launchpad.net/heat/+spec/update-failure-recovery

Description:

Currently, stack updates are handled in an all-or-nothing way. If a failure occurs, we attempt to roll back to the previous state if rollback is enabled. If the rollback fails or is disabled, we leave the stack in its failed state, but accept the old or new template (respectively) as a true representation of the current state of the stack. (This means that we could lose track of some resources and not be able to delete them.) We also prohibit updates to the stack from this point on; once an update has failed, you can only delete the stack.

We need to incrementally update the current template as resources are added, removed or modified. This will give us a valid picture of the true state when a failure occurs, allowing us to safely run updates in the future.

Specification URL (additional information): None
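To illustrate the incremental bookkeeping described above, here is a minimal sketch. It is not Heat's actual code: the function `apply_update`, the callback `apply_resource`, and the representation of a template as a plain dict of resource names to definitions are all hypothetical simplifications. It also ignores resources left behind by a partially completed create.

```python
import copy


def apply_update(current, new, apply_resource):
    """Move `current` toward `new` one resource at a time.

    `current` and `new` are dicts mapping resource names to definitions
    (a hypothetical stand-in for a template). `apply_resource(name,
    definition)` is a hypothetical callback that raises on failure; it is
    called with definition=None to delete a resource.

    Returns (tracked, error), where `tracked` reflects only the changes
    that actually succeeded.
    """
    tracked = copy.deepcopy(current)
    try:
        # Create or modify resources, recording each success immediately.
        for name, definition in new.items():
            if tracked.get(name) != definition:
                apply_resource(name, definition)
                tracked[name] = copy.deepcopy(definition)
        # Remove resources that the new template no longer contains.
        for name in [n for n in tracked if n not in new]:
            apply_resource(name, None)
            del tracked[name]
    except Exception as exc:
        # Stop at the first failure but keep the partial picture.
        return tracked, exc
    return tracked, None
```

Because `tracked` is updated after every successful step, a later call with the same or a different target template can start from what was really applied rather than from the old or new template as a whole.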
The idea is that even if a resource fails during create or update, we should still be able to successfully run another update - with the same or a different template - and have the stack recover to the right state.

So some things that would be interesting to test are:

- Updating after a create failure
- Updating after an update failure with rollback disabled
- Updating after a rollback failure
- Update failures where the new template has added new parameters
- Update failures where the new template has removed existing parameters
- Update failures where parameter values are changing

BTW one thing to note is that when something fails, we now wait for up to 4 minutes for other in-progress resources to complete rather than killing them immediately, since we hope to be able to recover.
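The grace period mentioned in the note above can be sketched roughly as follows. This is an illustration only and does not reflect how Heat schedules its tasks: the helper `wait_after_failure` and the constant `GRACE_SECONDS` are hypothetical, and the sketch uses Python's standard `concurrent.futures` module as a stand-in for whatever task mechanism is really in use.

```python
import concurrent.futures as cf

GRACE_SECONDS = 240  # "up to 4 minutes", per the note above


def wait_after_failure(futures, grace=GRACE_SECONDS):
    """Wait for all futures; once one fails, allow the rest `grace` seconds.

    Returns (finished, unfinished) as two sets of futures.
    """
    # Returns as soon as any future raises, or when all have completed.
    done, pending = cf.wait(futures, return_when=cf.FIRST_EXCEPTION)
    if pending:
        # Something failed while other work was still in progress:
        # give that work a bounded chance to complete.
        more_done, pending = cf.wait(pending, timeout=grace)
        done |= more_done
    for future in pending:
        # cancel() only stops work that has not started running yet.
        future.cancel()
    return done, pending
```

Anything still in the second returned set after the grace period did not finish in time and would have to be treated as being in an unknown state.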
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://rhn.redhat.com/errata/RHEA-2015-0147.html