Bugzilla will be upgraded to version 5.0. The upgrade date is tentatively scheduled for 2 December 2018, pending final testing and feedback.
Bug 1242052 - Deployment hangs because no controller services are running
Deployment hangs because no controller services are running
Status: CLOSED ERRATA
Product: Red Hat OpenStack
Classification: Red Hat
Component: rhosp-director (Show other bugs)
Director
Unspecified Unspecified
high Severity unspecified
: ga
: Director
Assigned To: Giulio Fidente
Alexander Chuzhoy
: Triaged
Depends On:
Blocks:
  Show dependency treegraph
 
Reported: 2015-07-10 14:46 EDT by Ben Nemec
Modified: 2015-08-05 09:59 EDT (History)
7 users (show)

See Also:
Fixed In Version: openstack-tripleo-heat-templates-0.8.6-42.el7ost
Doc Type: Bug Fix
Doc Text:
The timeout for Pacemaker service start-up was 20 seconds. Sometimes start-up exceeded this time limit and caused hung deployments. This fix increases the timeout to 60 second. Pacemaker services now start correctly and the deployment completes.
Story Points: ---
Clone Of:
Environment:
Last Closed: 2015-08-05 09:59:16 EDT
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---


Attachments (Terms of Use)


External Trackers
Tracker ID Priority Status Summary Last Updated
OpenStack gerrit 202085 None None None Never
Red Hat Product Errata RHEA-2015:1549 normal SHIPPED_LIVE Red Hat Enterprise Linux OpenStack Platform director Release 2015-08-05 13:49:10 EDT

  None (edit)
Description Ben Nemec 2015-07-10 14:46:34 EDT
Description of problem: Overcloud deployment hangs.  When looking at the service status on the compute and control nodes, the compute node will have nova-compute hung trying to connect to the controller, while the controller will have no OpenStack services running at all.

I am hitting this on a fairly regular basis with basic 1 control, 1 compute deployments.  It may be mitigated by HA because if one controller fails the deployment can still continue.


Version-Release number of selected component (if applicable): 


How reproducible: Intermittent


Steps to Reproduce:
1. Deploy cloud with director
2. On some percentage of deployments, it will hang with the described symptoms
3.

Actual results: Hung deployment


Expected results: Successful deployment


Additional info: The current theory on this is that pacemaker is timing out starting the services on the controller.  The current timeout is 20 seconds, and we were advised that 60 would be a better value.
Comment 5 Alexander Chuzhoy 2015-07-21 10:41:03 EDT
Verified:

Environment:
instack-undercloud-2.1.2-21.el7ost.noarch

Don't reproduce the issue.
Comment 7 errata-xmlrpc 2015-08-05 09:59:16 EDT
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHEA-2015:1549

Note You need to log in before you can comment on or make changes to this bug.