1390313 – Post deployment heat service-list reports 4 heat-engine binaries associated to the first controller as down

Keywords:
Status:	CLOSED EOL
Alias:	None
Product:	Red Hat OpenStack
Classification:	Red Hat
Component:	rhosp-director
Sub Component:
Version:	10.0 (Newton)
Hardware:	Unspecified
OS:	Unspecified
Priority:	low
Severity:	low
Target Milestone:	---
Target Release:	11.0 (Ocata)
Assignee:	Thomas Hervé
QA Contact:	Omri Hochman
Docs Contact:
URL:
Whiteboard:
Depends On:
Blocks:	1641667

Reported:	2016-10-31 16:41 UTC by Marius Cornea
Modified:	2018-10-22 13:27 UTC
CC List:	7 users
Fixed In Version:
Doc Type:	If docs needed, set a value
Doc Text:
Clone Of:
Clones:	1641667
Environment:
Last Closed:	2018-06-22 12:45:19 UTC
Target Upstream Version:
Embargoed:

Description Marius Cornea 2016-10-31 16:41:02 UTC

Description of problem:
Post deployment heat service-list reports 4 heat-engine binaries associated to the first controller as down:

http://paste.openstack.org/show/587471/

The same issue shows up when running the Heat services on a different role:
http://paste.openstack.org/show/587472/

Version-Release number of selected component (if applicable):
openstack-tripleo-heat-templates-5.0.0-0.9.0rc3.el7ost.noarch

How reproducible:
100%

Steps to Reproduce:
1. Deploy HA overcloud 
2. Check heat service-list

Actual results:
Some of the heat-engine binaries associated with the first node running the Heat services show as down.

Expected results:
There are no heat-engine binaries reported as down. 

Additional info:

Comment 1 James Slagle 2016-11-01 17:34:16 UTC

Steve, I guess this is either related to composable roles or to configuring Heat in the overcloud. Can someone from DF-Heat take a look?

Comment 2 Steve Baker 2016-11-01 20:56:44 UTC

Something is restarting heat-engine on controller-0 so the old processes are marked as down.

Considering that the correct number of heat-engine processes on controller-0 are up (4), I don't really see this as an issue. Most likely puppet restarts heat-engine on controller-0 because something has changed, while controller-1 and controller-2 aren't up yet and so are never restarted.
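
To illustrate the reasoning above, here is a minimal Python sketch of how a restart could leave stale records behind. It is a simplified model, not Heat's actual code: the EngineRecord class, the 60-second report interval, the "two missed intervals means down" rule, and the engine ID format are all assumptions made for the example.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta

# Assumption: a record is shown as "down" once it has missed two report intervals.
REPORT_INTERVAL = timedelta(seconds=60)


@dataclass
class EngineRecord:
    host: str
    engine_id: str
    updated_at: datetime


def status(record, now):
    """Derive up/down from how recently the record was refreshed."""
    return "up" if now - record.updated_at <= 2 * REPORT_INTERVAL else "down"


def restart_engines(records, host, now, workers=4):
    """Simulate a (re)start: old records stay, fresh engine IDs are registered."""
    restart_no = sum(1 for r in records if r.host == host) // workers
    fresh = [
        EngineRecord(host, f"{host}-r{restart_no}-w{i}", now)
        for i in range(workers)
    ]
    return records + fresh


def summarize(records, now):
    """Count records per (host, status) pair."""
    counts = {}
    for r in records:
        key = (r.host, status(r, now))
        counts[key] = counts.get(key, 0) + 1
    return counts


def demo():
    t0 = datetime(2016, 10, 31, 16, 0, 0)
    records = restart_engines([], "controller-0", t0)        # initial start
    t1 = t0 + timedelta(minutes=10)
    records = restart_engines(records, "controller-0", t1)   # restart by puppet
    return summarize(records, t1)


if __name__ == "__main__":
    print(demo())
```

In this model, the four records from before the restart are never refreshed again, so they age into the down state while the four new workers report as up. That matches the symptom in the description.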

One thing we could do is have the heat puppet module create a cron job to run "heat-manage service clean" so that down engines stop showing up in the list, but operators could always run that manually.
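
As a rough sketch of the cron idea, the Python snippet below builds a single /etc/cron.d-style line that runs "heat-manage service clean". The hourly schedule, the "heat" user, and the cron_entry helper are hypothetical choices for illustration. The puppet module's real implementation, if one is added, may differ.

```python
def cron_entry(command, minute="0", hour="*", user="heat"):
    """Return one /etc/cron.d-style line: five time fields, a user, a command."""
    fields = [minute, hour, "*", "*", "*", user, command]
    return " ".join(fields)


# Assumption: cleaning once an hour, as the "heat" user, is frequent enough.
ENTRY = cron_entry("heat-manage service clean")

if __name__ == "__main__":
    print(ENTRY)
```

Writing the resulting line to a file under /etc/cron.d would have the same effect as an operator running the command manually, just on a schedule.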

Comment 3 Steve Baker 2016-11-02 21:39:55 UTC

I would suggest this is a low-priority OSP-11 bug.

Comment 6 Scott Lewis 2018-06-22 12:45:19 UTC

OSP11 is now retired, see details at https://access.redhat.com/errata/product/191/ver=11/rhel---7/x86_64/RHBA-2018:1828
