Description of problem:

After a successful stack update, "openstack orchestration service list" shows dead heat-engine workers. As noted in BZ 1730994, this has no impact on the environment, but the customer would like these processes, when backend operations leave them marked as dead, to be cleaned out of the list automatically without manual intervention.

Version-Release number of selected component (if applicable):
~~~
heat-cfntools-1.3.0-2.el7ost.noarch                          Fri Nov 27 18:06:42 2020
openstack-heat-api-10.0.3-13.el7ost.noarch                   Fri Nov 27 18:22:27 2020
openstack-heat-api-cfn-10.0.3-13.el7ost.noarch               Fri Nov 27 18:22:31 2020
openstack-heat-common-10.0.3-13.el7ost.noarch                Fri Nov 27 18:22:23 2020
openstack-heat-engine-10.0.3-13.el7ost.noarch                Fri Nov 27 18:22:36 2020
openstack-tripleo-heat-templates-8.4.1-58.el7ost.noarch      Fri Nov 27 18:14:37 2020
puppet-heat-12.4.1-0.20200413050249.d61d033.el7ost.noarch    Fri Nov 27 18:13:51 2020
python2-heatclient-1.14.1-1.el7ost.noarch                    Fri Nov 27 18:14:32 2020
python-heat-agent-1.5.4-1.el7ost.noarch                      Fri Nov 27 18:14:34 2020
~~~

How reproducible:
Yes

Steps to Reproduce:
1. Perform a minor update on the stack.
2. Check the output of "openstack orchestration service list".

Actual results:
Dead heat-engine workers/processes appear in the "openstack orchestration service list" output after a successful stack update.

Expected results:
No dead heat-engine workers/processes should be listed after a successful stack update.

Additional info:
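For reference, the symptom looks roughly like this (illustrative output only; hostnames, engine IDs and timestamps are placeholders, and the exact columns depend on the python-heatclient version):
~~~
$ openstack orchestration service list
+---------------------------+-------------+--------------------------------------+---------------------------+--------+----------------------------+--------+
| Hostname                  | Binary      | Engine ID                            | Host                      | Topic  | Updated At                 | Status |
+---------------------------+-------------+--------------------------------------+---------------------------+--------+----------------------------+--------+
| controller-0.localdomain  | heat-engine | 00000000-0000-0000-0000-000000000001 | controller-0.localdomain  | engine | 2020-11-27T18:30:00.000000 | up     |
| controller-0.localdomain  | heat-engine | 00000000-0000-0000-0000-000000000002 | controller-0.localdomain  | engine | 2020-11-27T18:05:00.000000 | down   |
+---------------------------+-------------+--------------------------------------+---------------------------+--------+----------------------------+--------+
~~~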
docker stop kills processes after the default 10s timeout when the heat-engines are restarted after an update. That is why some heat-engines show status 'down': they were not stopped gracefully. We've increased that grace period to 60s in OSP 14 and above [1], which should reduce how often this occurs. In the meantime, a cron job running 'heat-manage service clean' should clean out those dead engine workers.

[1] https://bugzilla.redhat.com/show_bug.cgi?id=1641667#c8

*** This bug has been marked as a duplicate of bug 1641667 ***
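A minimal sketch of such a cron job, assuming heat-manage and the heat configuration are available directly on the controller node (on containerized deployments the command would typically need to be run inside the heat_engine container instead); the file name and schedule below are only examples:
~~~
# /etc/cron.d/heat-service-clean  (hypothetical file name and schedule)
# Once a day, purge heat-engine records that are marked dead.
# Runs as the heat user so that heat.conf and the DB credentials are readable.
0 4 * * * heat heat-manage service clean
~~~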