Bug 1285363

Summary:	Deployment failure "httpd never started after 200 seconds"
Product:	Red Hat OpenStack	Reporter:	Jiri Stransky <jstransk>
Component:	openstack-tripleo-heat-templates	Assignee:	Jiri Stransky <jstransk>
Status:	CLOSED ERRATA	QA Contact:	Alexander Chuzhoy <sasha>
Severity:	unspecified	Docs Contact:
Priority:	unspecified
Version:	7.0 (Kilo)	CC:	dnavale, jcoufal, jslagle, jstransk, mburns, rhel-osp-director-maint, sasha, yeylon
Target Milestone:	y2
Target Release:	7.0 (Kilo)
Hardware:	Unspecified
OS:	Unspecified
Whiteboard:
Fixed In Version:	openstack-tripleo-heat-templates-0.8.6-85.el7ost	Doc Type:	Bug Fix
Doc Text:		Story Points:	---
Clone Of:		Environment:
Last Closed:	2015-12-21 16:53:00 UTC	Type:	Bug

Description Jiri Stransky 2015-11-25 13:15:05 UTC

A deployment failed with this message in the os-collect-config log:

Nov 24 18:09:38 overcloud-controller-0.localdomain os-collect-config[2921]: httpd not yet started, sleeping 3 seconds.
Nov 24 18:09:38 overcloud-controller-0.localdomain os-collect-config[2921]: httpd not yet started, sleeping 3 seconds.
Nov 24 18:09:38 overcloud-controller-0.localdomain os-collect-config[2921]: httpd never started after 200 seconds

However, when the environment was investigated, all services were already up and running.

[root@overcloud-controller-0 ~]# pcs status | grep Stopped -C2
[root@overcloud-controller-0 ~]#

There were a few monitor action timeouts in Pacemaker (pcmk), but no start/stop timeouts. The actual httpd start on one of the controllers completed about 10 seconds after the deployment's 200-second wait had expired, causing the deployment to fail:

Nov 24 18:09:31 overcloud-controller-0.localdomain crmd[29936]: notice: Operation httpd_start_0: ok (node=overcloud-controller-0, call=430, rc=0, cib-update=246, confirmed=true)

Nov 24 18:09:49 overcloud-controller-1.localdomain crmd[29784]: notice: Operation httpd_start_0: ok (node=overcloud-controller-1, call=425, rc=0, cib-update=403, confirmed=true)

^^ this start succeeded, but only after the 200-second wait had already given up at 18:09:38

Nov 24 18:09:07 overcloud-controller-2.localdomain crmd[29500]: notice: Operation httpd_start_0: ok (node=overcloud-controller-2, call=422, rc=0, cib-update=270, confirmed=true)
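
The size of the overshoot can be checked from the timestamps above. The following is only a rough sketch: the helper names to_seconds and seconds_between are made up for illustration, both timestamps are assumed to fall on the same day with two-digit fields, and the two log lines come from different nodes, so any clock skew between controllers is ignored.

```shell
# Convert an HH:MM:SS timestamp to seconds since midnight.
# (Hypothetical helper, assumes two-digit fields.)
to_seconds() {
    # Split the argument on ":" into positional parameters.
    old_ifs=$IFS
    IFS=:
    set -- $1
    IFS=$old_ifs
    # Strip one leading zero so "09" is not parsed as an invalid octal number.
    echo $(( ${1#0} * 3600 + ${2#0} * 60 + ${3#0} ))
}

# Seconds elapsed from the first timestamp to the second (same day assumed).
seconds_between() {
    echo $(( $(to_seconds "$2") - $(to_seconds "$1") ))
}

# The wait gave up at 18:09:38; httpd_start_0 on controller-1 was confirmed at 18:09:49.
seconds_between 18:09:38 18:09:49
```

This prints 11, which matches the "about 10 seconds" estimate.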


The current timeout values are probably too aggressive for slow virtualized environments, and should be bumped up.
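
For illustration, the log messages are consistent with a polling loop of roughly the following shape. This is a minimal sketch and not the actual code from openstack-tripleo-heat-templates: the function name wait_for_service and its argument order are assumptions, and the check command is supplied by the caller.

```shell
# Poll a check command until it succeeds or the timeout expires.
# Hypothetical helper; usage:
#   wait_for_service NAME TIMEOUT_SECONDS INTERVAL_SECONDS CHECK_CMD [ARGS...]
wait_for_service() {
    name=$1
    timeout=$2
    interval=$3
    shift 3
    waited=0
    # "$@" is now the check command; loop for as long as it fails.
    while ! "$@"; do
        if [ "$waited" -ge "$timeout" ]; then
            echo "$name never started after $timeout seconds" >&2
            return 1
        fi
        echo "$name not yet started, sleeping $interval seconds."
        sleep "$interval"
        waited=$(( waited + interval ))
    done
    return 0
}
```

With the values seen in the log, a call would look like `wait_for_service httpd 200 3 some_check_command`, where some_check_command stands in for whatever test reports that httpd is up. Raising the second argument gives a slow environment more headroom without changing behaviour on fast ones, since the loop exits as soon as the check succeeds.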

Comment 1 Jiri Stransky 2015-11-25 16:50:01 UTC

*** Bug 1284121 has been marked as a duplicate of this bug. ***

Comment 4 Alexander Chuzhoy 2015-12-03 16:07:41 UTC

Verified:

Environment:
openstack-tripleo-heat-templates-0.8.6-85.el7ost.noarch


The reported issue doesn't reproduce. Able to deploy HA.

Comment 8 errata-xmlrpc 2015-12-21 16:53:00 UTC

Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2015:2650