Description of problem: Overcloud deployment hangs. When looking at the service status on the compute and control nodes, the compute node will have nova-compute hung trying to connect to the controller, while the controller will have no OpenStack services running at all.
I am hitting this on a fairly regular basis with basic 1 control, 1 compute deployments. It may be mitigated by HA because if one controller fails the deployment can still continue.
Version-Release number of selected component (if applicable):
How reproducible: Intermittent
Steps to Reproduce:
1. Deploy cloud with director
2. On some percentage of deployments, it will hang with the described symptoms
Actual results: Hung deployment
Expected results: Successful deployment
Additional info: The current theory on this is that pacemaker is timing out starting the services on the controller. The current timeout is 20 seconds, and we were advised that 60 would be a better value.
Don't reproduce the issue.
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.
For information on the advisory, and where to find the updated
files, follow the link below.
If the solution does not work for you, open a new bug report.