Description of problem:
Unable to update overcloud deployment or scale it out any further due to heat resources that think I have a deployment still running.
What I am seeing:
Version-Release number of selected component (if applicable):
Steps to Reproduce:
1. Deploy overcloud, mariadb runs out of file descriptors which causes the deployment to fail, and leaves heat in a bad state.
heat resources to be reaped/cleaned up.
Running out of file descriptors will be difficult to reproduce. This particular state can be replicated by setting some resources to IN_PROGRESS while their stacks are in an UPDATE_FAILED state.
I'm suggesting a heat-manage command which acts on a single stack and traverses all nested stacks to put any IN_PROGRESS things to FAILED, and clear hooks.
*** Bug 1379716 has been marked as a duplicate of this bug. ***
The command to fix a stack landed in the first Newton milestone:
heat-manage reset_stack_status --help
usage: heat-manage reset_stack_status [-h] stack_id
stack_id Stack id
-h, --help show this help message and exit
This bug is fixed though it did uncover a new one in https://bugs.launchpad.net/heat/+bug/1638476
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.
For information on the advisory, and where to find the updated
files, follow the link below.
If the solution does not work for you, open a new bug report.