Description of problem:
When scaling out an OverCloud using OSP-d (e.g. adding a compute node), there is a period during which the OverCloud becomes unavailable. This is far from ideal, as it affects the uptime of OpenStack. Handling the services in such a way that at least one node is always available to serve user requests would be much preferred.

Version-Release number of selected component (if applicable):
openstack-tripleo-heat-templates-0.8.6-121.el7ost.noarch

How reproducible:
Every time

Steps to Reproduce:
1. Add a compute node
2. Listen to your users asking why OpenStack is down

Additional info:
In some cases I've seen nova-compute time out its connection to RabbitMQ, causing the compute node to go offline.
This bug did not make the OSP 8.0 release. It is being deferred to OSP 10.
*** This bug has been marked as a duplicate of bug 1339559 ***
Actually, on closer look, I'm un-duplicating this bug. Bz 1339559 is about not restarting services when scaling out. This bug is related, but covers a broader topic: what I would like to see is that when services are restarted, the restart is orchestrated in a rolling way, such that there is always a running control node. In other words, even when a service restart is required, end users will not experience a total outage.
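To make the request concrete, here is a minimal sketch of the rolling behaviour described above. It is not how OSP-d or the tripleo templates implement restarts. The function name `rolling_restart` and the `restart` / `is_healthy` callbacks are hypothetical placeholders for whatever actually restarts a node's services and checks that it is serving requests.

```python
from typing import Callable, Iterable, List


def rolling_restart(
    nodes: Iterable[str],
    restart: Callable[[str], None],      # hypothetical: restarts services on one node
    is_healthy: Callable[[str], bool],   # hypothetical: True if the node serves requests
) -> List[str]:
    """Restart nodes one at a time and return them in restart order.

    Raises RuntimeError instead of taking down the last healthy node.
    """
    nodes = list(nodes)
    restarted = []
    for node in nodes:
        # Proceed only if at least one *other* node can keep serving users.
        peers_up = [n for n in nodes if n != node and is_healthy(n)]
        if not peers_up:
            raise RuntimeError(f"no healthy peer while restarting {node}")
        restart(node)
        # Do not move on until the restarted node is serving again.
        if not is_healthy(node):
            raise RuntimeError(f"{node} did not come back after restart")
        restarted.append(node)
    return restarted
```

A real orchestration would poll the health check with a timeout rather than test it once, but the ordering constraint is the point here: never restart a node unless another one is confirmed up.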
Have you tried https://access.redhat.com/solutions/2345231 ?
I don't have any immediate need for a workaround right now, but I am monitoring Bz 1421883, as the kbase suggests.
This has been addressed in newer versions. Please upgrade to OSP 10, where this should no longer be an issue. We won't be fixing this for OSP 7.