Description of problem:
Post deployment, heat service-list reports 4 heat-engine binaries associated with the first controller as down:
http://paste.openstack.org/show/587471/

The same issue shows up when running the Heat services on a different role:
http://paste.openstack.org/show/587472/

Version-Release number of selected component (if applicable):
openstack-tripleo-heat-templates-5.0.0-0.9.0rc3.el7ost.noarch

How reproducible:
100%

Steps to Reproduce:
1. Deploy HA overcloud
2. Check heat service-list

Actual results:
Some of the heat-engine binaries associated with the 1st node running the Heat services show as down.

Expected results:
There are no heat-engine binaries reported as down.

Additional info:
Steve, I guess this is related either to composable roles or to configuring Heat in the overcloud. Can someone from DF-Heat take a look?
Something is restarting heat-engine on controller-0, so the old processes are marked as down. Considering that the correct number of heat-engine processes on controller-0 are up (4), I don't really see this as an issue. Most likely Puppet restarts heat-engine on controller-0 because something has changed while controller-1 and controller-2 aren't up yet. One thing we could do is have the heat puppet module create a cron job to run "heat-manage service clean" so that down engines stop showing up in the list, but operators could always run that manually.
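For illustration, here is a rough sketch of what such a cron entry could look like, written as a small Python helper that builds the line. Only the "heat-manage service clean" command comes from the comment above; the hourly schedule, the "heat" user, the /etc/cron.d line format, and the helper name build_cron_entry are assumptions made for this example, not what the puppet module actually does.

```python
import shlex


def build_cron_entry(minute="0", hour="*", user="heat",
                     command=("heat-manage", "service", "clean")):
    """Build one /etc/cron.d-style line (schedule, user, command).

    The schedule and user defaults are assumptions for illustration;
    only the heat-manage command itself comes from the bug discussion.
    """
    # Five cron time fields: minute, hour, day of month, month, day of week.
    schedule = f"{minute} {hour} * * *"
    # shlex.join quotes arguments safely if any contain spaces.
    return f"{schedule} {user} {shlex.join(command)}"


if __name__ == "__main__":
    # Print the line so it can be reviewed before being placed in a cron file.
    print(build_cron_entry())
```

With the defaults this produces an hourly entry; the same "heat-manage service clean" command can instead be run by hand whenever the stale engines need to be cleared from the list.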
I would suggest this is a low-priority OSP-11 bug.
OSP11 is now retired, see details at https://access.redhat.com/errata/product/191/ver=11/rhel---7/x86_64/RHBA-2018:1828