Bug 1242052 - Deployment hangs because no controller services are running
Summary: Deployment hangs because no controller services are running
Alias: None
Product: Red Hat OpenStack
Classification: Red Hat
Component: rhosp-director
Version: Director
Hardware: Unspecified
OS: Unspecified
Target Milestone: ga
Target Release: Director
Assignee: Giulio Fidente
QA Contact: Alexander Chuzhoy
Depends On:
Reported: 2015-07-10 18:46 UTC by Ben Nemec
Modified: 2015-08-05 13:59 UTC

Fixed In Version: openstack-tripleo-heat-templates-0.8.6-42.el7ost
Doc Type: Bug Fix
Doc Text:
The timeout for Pacemaker service start-up was 20 seconds. Start-up sometimes exceeded this limit, which caused deployments to hang. This fix increases the timeout to 60 seconds. Pacemaker services now start correctly and the deployment completes.
Clone Of:
Last Closed: 2015-08-05 13:59:16 UTC
Target Upstream Version:

System | ID | Priority | Status | Summary | Last Updated
OpenStack gerrit | 202085 | None | None | None | Never
Red Hat Product Errata | RHEA-2015:1549 | normal | SHIPPED_LIVE | Red Hat Enterprise Linux OpenStack Platform director Release | 2015-08-05 17:49:10 UTC

Description Ben Nemec 2015-07-10 18:46:34 UTC
Description of problem: The overcloud deployment hangs. Checking the service status on the compute and control nodes shows nova-compute on the compute node hung trying to connect to the controller, while the controller has no OpenStack services running at all.

I am hitting this fairly regularly with basic deployments of 1 control node and 1 compute node. HA may mitigate it, because if one controller fails the deployment can still continue.

Version-Release number of selected component (if applicable): 

How reproducible: Intermittent

Steps to Reproduce:
1. Deploy cloud with director
2. On some percentage of deployments, it will hang with the described symptoms

Actual results: Hung deployment

Expected results: Successful deployment

Additional info: The current theory is that Pacemaker times out while starting the services on the controller. The current start timeout is 20 seconds, and we were advised that 60 seconds would be a better value.
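
To make the theory above concrete, here is a minimal Python sketch, not the actual fix (which shipped in openstack-tripleo-heat-templates). It models a start operation that succeeds only if the service comes up within the start timeout, and it builds, without running, a pcs command line that would set a 60-second start timeout on one resource. The 35-second start-up time and the resource name "example-service" are hypothetical values chosen for illustration, and the helper names are my own.

```python
def start_within_timeout(startup_seconds, timeout_seconds):
    """Return True if the service finishes starting before the start timeout expires."""
    return startup_seconds <= timeout_seconds


def pcs_start_timeout_command(resource_id, timeout_seconds):
    """Build (but do not run) a pcs command that sets a resource's start timeout."""
    if timeout_seconds <= 0:
        raise ValueError("timeout_seconds must be positive")
    return [
        "pcs", "resource", "update", resource_id,
        "op", "start", "timeout={}s".format(timeout_seconds),
    ]


if __name__ == "__main__":
    # Hypothetical: a controller service that needs 35 seconds to start.
    slow_start = 35
    for timeout in (20, 60):
        ok = start_within_timeout(slow_start, timeout)
        print("timeout={}s: {}".format(timeout, "started" if ok else "timed out"))
    # "example-service" is a placeholder, not a real resource name from this bug.
    print(" ".join(pcs_start_timeout_command("example-service", 60)))
```

With a single controller there is no second node to take over after a failed start, which would be consistent with the controller ending up with no OpenStack services running.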

Comment 5 Alexander Chuzhoy 2015-07-21 14:41:03 UTC


The issue does not reproduce.

Comment 7 errata-xmlrpc 2015-08-05 13:59:16 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

