Bug 1243611 - Ceph osd build commands time out
Summary: Ceph osd build commands time out
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat OpenStack
Classification: Red Hat
Component: openstack-puppet-modules
Version: 7.0 (Kilo)
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: unspecified
Target Milestone: ga
Target Release: 8.0 (Liberty)
Assignee: Gilles Dubreuil
QA Contact: Yogev Rabl
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2015-07-15 23:19 UTC by Graeme Gillies
Modified: 2016-04-07 21:02 UTC (History)

Fixed In Version: openstack-puppet-modules-7.0.3-1.el7ost
Doc Type: Bug Fix
Doc Text:
Previously, the time out for Ceph set-up commands could not be configured, so some stages of Ceph cluster set-up that took longer than the default 5 minutes (300 seconds) failed. With this update, a time out parameter is added for the relevant operations. Its default value is 600 seconds, and you can modify it if necessary. As a result, the installation is more resilient, especially when some of the Ceph set-up operations take longer than average.
Clone Of:
Environment:
Last Closed: 2016-04-07 21:02:26 UTC
Target Upstream Version:
Embargoed:




Links
System | ID | Private | Priority | Status | Summary | Last Updated
OpenStack gerrit | 253317 | 0 | None | None | None | Never
Red Hat Product Errata | RHEA-2016:0603 | 0 | normal | SHIPPED_LIVE | Red Hat OpenStack Platform 8 Enhancement Advisory | 2016-04-08 00:53:53 UTC

Description Graeme Gillies 2015-07-15 23:19:24 UTC
Sometimes, when doing multinode deployments on slow hosts or on many hosts at once, you see errors similar to the following:

http://fpaste.org/244826/14370016/

Basically one of the exec stanzas in the ceph puppet module times out after 5 minutes. This causes the deployment to fail.

Changing this value to something higher (600 seconds) or disabling timeout (setting it to 0) causes the deployment to succeed.
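
For readers unfamiliar with the setting under discussion, here is a minimal sketch of a Puppet exec resource with an explicit timeout. The resource title and the script path are hypothetical placeholders and are not taken from the puppet-ceph module. The `timeout` attribute itself is standard Puppet: it is given in seconds, defaults to 300, and a value of 0 disables the timeout.

```puppet
# Illustrative only: the title and command below are hypothetical,
# not the actual exec stanzas from the puppet-ceph module.
exec { 'example-slow-setup-step':
  command   => '/usr/local/bin/slow-setup-step.sh',
  logoutput => true,
  # Puppet's default is 300 seconds; 600 gives slow hosts more headroom.
  # Setting this to 0 would disable the timeout entirely.
  timeout   => 600,
}
```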

Can we please review the timeout setting on this and all other exec stanzas in the ceph module, to ensure they are long enough for slower environments?

Regards,

Graeme

Comment 3 Gilles Dubreuil 2015-12-04 04:37:21 UTC
For the record, the provided link no longer exists. Please attach the error output or use a longer-lived paste.

That said, the error and the fix are straightforward.

Added a timeout value of 600 seconds to all relevant exec resources in the puppet-ceph module (see external trackers).
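
As a rough sketch of how such a timeout can be exposed as an overridable value, the hypothetical class below takes a parameter with a default of 600 and passes it to its exec resource. The class name, parameter name, and script path are assumptions made for illustration; the actual change is the one referenced in the external trackers.

```puppet
# Hypothetical class for illustration; names are not from puppet-ceph.
class example::slow_step (
  $exec_timeout = 600,  # seconds; assumed default, matching the value above
) {
  exec { 'example-slow-step':
    command => '/usr/local/bin/slow-setup-step.sh',
    timeout => $exec_timeout,
  }
}

# A slower environment could then raise the value when declaring the class:
class { 'example::slow_step':
  exec_timeout => 900,
}
```

Keeping the value in one parameter means a single override applies wherever the class uses it, rather than editing each exec resource by hand.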

Comment 5 Graeme Gillies 2016-01-03 23:20:23 UTC
Hi,

Apologies for not using a longer-lived pastebin. I've been trying to reproduce the problem to give you some useful output, but so far I have been unable to reproduce it at all.

You mention that you have modified all relevant execs in puppet. Do you still need the output from when the problem occurs? If so, I'll keep trying to get hold of it.

Regards,

Graeme

Comment 7 Yogev Rabl 2016-02-03 13:13:32 UTC
Verified installation of Ceph OSDs on multiple nodes with no timeouts.

openstack-puppet-modules-7.0.3-1.el7ost.noarch

Comment 9 errata-xmlrpc 2016-04-07 21:02:26 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://rhn.redhat.com/errata/RHEA-2016-0603.html

