Bug 1285363 - Deployment failure "httpd never started after 200 seconds"
Summary: Deployment failure "httpd never started after 200 seconds"
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat OpenStack
Classification: Red Hat
Component: openstack-tripleo-heat-templates
Version: 7.0 (Kilo)
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: unspecified
Target Milestone: y2
Target Release: 7.0 (Kilo)
Assignee: Jiri Stransky
QA Contact: Alexander Chuzhoy
URL:
Whiteboard:
Duplicates: 1284121
Depends On:
Blocks:
Reported: 2015-11-25 13:15 UTC by Jiri Stransky
Modified: 2015-12-21 16:53 UTC (History)

Fixed In Version: openstack-tripleo-heat-templates-0.8.6-85.el7ost
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2015-12-21 16:53:00 UTC
Target Upstream Version:
Embargoed:


Links
System | ID | Private | Priority | Status | Summary | Last Updated
OpenStack gerrit | 249716 | 0 | None | None | None | Never
Red Hat Product Errata | RHSA-2015:2650 | 0 | normal | SHIPPED_LIVE | Moderate: Red Hat Enterprise Linux OpenStack Platform 7 director update | 2015-12-21 21:44:54 UTC

Description Jiri Stransky 2015-11-25 13:15:05 UTC
A deployment failed with this message in the os-collect-config log:

Nov 24 18:09:38 overcloud-controller-0.localdomain os-collect-config[2921]: httpd not yet started, sleeping 3 seconds.
Nov 24 18:09:38 overcloud-controller-0.localdomain os-collect-config[2921]: httpd not yet started, sleeping 3 seconds.
Nov 24 18:09:38 overcloud-controller-0.localdomain os-collect-config[2921]: httpd never started after 200 seconds

However, when the environment was investigated, all services were already up and running.

[root@overcloud-controller-0 ~]# pcs status | grep Stopped -C2
[root@overcloud-controller-0 ~]#

There were a few monitor action timeouts in Pacemaker, but no start/stop timeouts. On one of the controllers, httpd finished starting about 10 seconds after the 200-second wait had expired, which caused the deployment to fail:

Nov 24 18:09:31 overcloud-controller-0.localdomain crmd[29936]: notice: Operation httpd_start_0: ok (node=overcloud-controller-0, call=430, rc=0, cib-update=246, confirmed=true)

Nov 24 18:09:49 overcloud-controller-1.localdomain crmd[29784]: notice: Operation httpd_start_0: ok (node=overcloud-controller-1, call=425, rc=0, cib-update=403, confirmed=true)

^^ this one timed out (started at 18:09:49, after the wait had already given up at 18:09:38)

Nov 24 18:09:07 overcloud-controller-2.localdomain crmd[29500]: notice: Operation httpd_start_0: ok (node=overcloud-controller-2, call=422, rc=0, cib-update=270, confirmed=true)
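
For reference, the gap can be worked out from the timestamps quoted above. This is only a minimal Python sketch: it assumes the clocks on the three controllers are in sync, and the names (gave_up, httpd_started, seconds_late) are made up for illustration and do not come from any deployed script.

```python
from datetime import datetime

FMT = "%H:%M:%S"

# Time of the "httpd never started after 200 seconds" message (Nov 24).
gave_up = datetime.strptime("18:09:38", FMT)

# Times of the "Operation httpd_start_0: ok" messages (Nov 24), per node.
httpd_started = {
    "overcloud-controller-0": "18:09:31",
    "overcloud-controller-1": "18:09:49",
    "overcloud-controller-2": "18:09:07",
}


def seconds_late(started: str) -> float:
    """Seconds between the wait giving up and httpd reporting started.

    Positive means httpd came up only after the wait had already failed.
    """
    return (datetime.strptime(started, FMT) - gave_up).total_seconds()


late = {node: seconds_late(ts) for node, ts in httpd_started.items()}
for node, delta in sorted(late.items()):
    print(f"{node}: {delta:+.0f} s")
```

This gives -7 s for controller-0, -31 s for controller-2 and +11 s for controller-1, so only controller-1 reported httpd as started after the wait had given up.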


The current timeout values are probably too aggressive for slow virtualized environments and should be increased.
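
To illustrate the failure mode, here is a rough Python sketch of a poll-with-timeout loop of the kind the log messages suggest. The actual check script is not shown in this report, so wait_for and its parameters are hypothetical; only the 200-second limit and the 3-second sleep are taken from the log messages above.

```python
import time
from typing import Callable


def wait_for(
    check: Callable[[], bool],
    timeout: float = 200.0,   # from "never started after 200 seconds"
    interval: float = 3.0,    # from "sleeping 3 seconds"
    clock: Callable[[], float] = time.monotonic,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Poll check() until it returns True or the timeout has elapsed.

    Returns True if the service came up in time, False otherwise.
    clock and sleep are injectable so the loop can be exercised without
    actually waiting.
    """
    deadline = clock() + timeout
    while True:
        if check():
            return True
        if clock() >= deadline:
            return False
        sleep(interval)
```

With a loop like this, a service that needs roughly 210 seconds to come up fails the default 200-second wait even though it is healthy a few seconds later. Calling it with a larger timeout value would let the same start succeed.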

Comment 1 Jiri Stransky 2015-11-25 16:50:01 UTC
*** Bug 1284121 has been marked as a duplicate of this bug. ***

Comment 4 Alexander Chuzhoy 2015-12-03 16:07:41 UTC
Verified:

Environment:
openstack-tripleo-heat-templates-0.8.6-85.el7ost.noarch


The reported issue doesn't reproduce. Able to deploy HA.

Comment 8 errata-xmlrpc 2015-12-21 16:53:00 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2015:2650

