Bug 1830173 - Running enable-ssh-admin.sh for pre-provisioned nodes will loop forever if workflow fails
Summary: Running enable-ssh-admin.sh for pre-provisioned nodes will loop forever if wor...
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat OpenStack
Classification: Red Hat
Component: openstack-tripleo-heat-templates
Version: 16.1 (Train)
Hardware: All
OS: All
Priority: high
Severity: high
Target Milestone: z3
Target Release: 16.1 (Train on RHEL 8.2)
Assignee: Emilien Macchi
QA Contact: David Rosenfeld
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2020-05-01 01:04 UTC by David Sedgmen
Modified: 2023-09-07 23:05 UTC

Fixed In Version: openstack-tripleo-common-11.4.1-1.20200917023444.el8ost openstack-tripleo-heat-templates-11.3.2-1.20200914170158.el8ost
Doc Type: No Doc Update
Doc Text:
Clone Of:
Environment:
Last Closed: 2020-12-15 18:35:44 UTC
Target Upstream Version:
Embargoed:


Links
System | ID | Private | Priority | Status | Summary | Last Updated
Launchpad | 1876646 | 0 | None | None | None | 2020-05-04 02:43:34 UTC
Launchpad | 1896505 | 0 | None | None | None | 2020-09-21 15:54:41 UTC
OpenStack gerrit | 726209 | 0 | None | MERGED | Add a 600s timeout when creating enable-ssh-admin workflow | 2020-12-10 18:33:02 UTC
OpenStack gerrit | 738726 | 0 | None | MERGED | Check for correct column name for execution show | 2020-12-10 18:33:02 UTC
OpenStack gerrit | 753095 | 0 | None | MERGED | enable-ssh-admin: allow to override plan name | 2020-12-10 18:33:00 UTC
OpenStack gerrit | 753097 | 0 | None | MERGED | Add a note about enable-ssh-admin and plan_name | 2020-12-10 18:33:02 UTC
OpenStack gerrit | 753224 | 0 | None | MERGED | [Train] Check for existence of stack | 2020-12-10 18:33:02 UTC
Red Hat Issue Tracker | OSP-28424 | 0 | None | None | None | 2023-09-07 23:05:39 UTC
Red Hat Product Errata | RHEA-2020:5413 | 0 | None | None | None | 2020-12-15 18:36:06 UTC

Description David Sedgmen 2020-05-01 01:04:32 UTC
Description of problem: If the workflow started by enable-ssh-admin.sh fails, the script loops forever, because it only checks whether the workflow state is SUCCESS and never checks for a failed state.


########################
function workflow_finished {
    local execution_id="$1"
    openstack workflow execution show -f shell $execution_id | grep 'state="SUCCESS"' > /dev/null
}
........................
echo -n "Waiting for the workflow execution to finish (id $EXECUTION_ID)."
while ! workflow_finished $EXECUTION_ID; do
    sleep $SLEEP_TIME
    echo -n .
done
########################

Actual results:

The script is stuck in a loop that never times out.

Expected results:

The script exits if the workflow fails or if it times out.
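
One way to get the expected behavior is sketched below. It is only an illustration, not the merged fix: the function names workflow_state and wait_for_workflow, the return codes, and the SLEEP_TIME default are assumptions, and the 600 second TIMEOUT default mirrors the value named in the linked gerrit change. The messages match those later reported from heat_agent.log in this bug.

```shell
# Sketch only: names, defaults, and return codes are illustrative assumptions.
TIMEOUT=${TIMEOUT:-600}
SLEEP_TIME=${SLEEP_TIME:-5}

# Print the state of a workflow execution, for example SUCCESS or ERROR.
workflow_state() {
    local execution_id="$1"
    # "-f shell" prints lines such as state="SUCCESS"; keep only the value.
    openstack workflow execution show -f shell "$execution_id" \
        | sed -n 's/^state="\(.*\)"$/\1/p'
}

# Return 0 on SUCCESS, 1 on ERROR, 2 if TIMEOUT seconds pass first.
wait_for_workflow() {
    local execution_id="$1"
    local waited=0
    local state
    while [ "$waited" -lt "$TIMEOUT" ]; do
        state=$(workflow_state "$execution_id")
        case "$state" in
            SUCCESS)
                return 0
                ;;
            ERROR)
                echo "Workflow $execution_id finished with error. Check mistral logs."
                return 1
                ;;
        esac
        sleep "$SLEEP_TIME"
        waited=$((waited + SLEEP_TIME))
        printf '.'
    done
    echo "Workflow $execution_id did not finish after $TIMEOUT seconds."
    return 2
}
```

A caller could then replace the original while loop with something like wait_for_workflow "$EXECUTION_ID" || exit 1, so that both a failed workflow and a timeout end the script.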

Comment 5 Alex McLeod 2020-06-16 12:34:04 UTC
If this bug requires doc text for errata release, please set the 'Doc Type' and provide draft text according to the template in the 'Doc Text' field. The documentation team will review, edit, and approve the text.

If this bug does not require doc text, please set the 'requires_doc_text' flag to '-'.

Comment 6 David Rosenfeld 2020-06-16 13:18:17 UTC
When I tried to verify this BZ, heat_agent.log contained this message:

No recognized column names in ['state']. Recognized columns are ['ID', 'Workflow ID', 'Workflow name', 'Workflow namespace', 'Description', 'Task Execution ID', 'Root Execution ID', 'State', 'State info', 'Created at', 'Updated at'].

enable-ssh-admin.sh is executing this command:

openstack workflow execution show -f value -c state $execution_id

I believe it needs to request the column as State (capital S) instead of state (lowercase s).
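
A minimal sketch of the corrected call, based only on the recognized column names listed in the error above. The wrapper name get_execution_state is hypothetical and is not taken from enable-ssh-admin.sh.

```shell
# Hypothetical helper: print only the State column of a workflow execution.
get_execution_state() {
    local execution_id="$1"
    # The error above lists 'State' as a recognized column and rejects 'state'.
    openstack workflow execution show -f value -c State "$execution_id"
}
```

With "-f value" the command prints just the column value, so the caller can compare the output directly against a state such as SUCCESS.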

Comment 10 David Rosenfeld 2020-09-21 14:47:41 UTC
Moving to ON_DEV. When trying to verify, this error message was seen in heat_agent.log:

Waiting for the workflow execution to finish (id e862c392-1470-4367-a0d1-50a1b43c6bd8).Workflow e862c392-1470-4367-a0d1-50a1b43c6bd8 finished with error. Check mistral logs.

Comment 27 David Rosenfeld 2020-11-16 14:16:08 UTC
This is seen in heat_agent.log when the timeout occurs: Workflow e4d6b815-880d-4aab-9df6-3fd6099b0221 did not finish after 600 seconds.

In addition, the errors:

No recognized column names in ['state']. Recognized columns are ['ID', 'Workflow ID', 'Workflow name', 'Workflow namespace', 'Description', 'Task Execution ID', 'Root Execution ID', 'State', 'State info', 'Created at', 'Updated at'].

and 

Waiting for the workflow execution to finish (id e862c392-1470-4367-a0d1-50a1b43c6bd8).Workflow e862c392-1470-4367-a0d1-50a1b43c6bd8 finished with error. Check mistral logs.

are no longer seen in heat_agent.log

Comment 35 errata-xmlrpc 2020-12-15 18:35:44 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Red Hat OpenStack Platform 16.1.3 bug fix and enhancement advisory), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHEA-2020:5413

