Bug 1243611 - Ceph osd build commands time out
Status: CLOSED ERRATA
Product: Red Hat OpenStack
Classification: Red Hat
Component: openstack-puppet-modules
Version: 7.0 (Kilo)
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: unspecified
Target Milestone: ga
Target Release: 8.0 (Liberty)
Assigned To: Gilles Dubreuil
QA Contact: Yogev Rabl
Reported: 2015-07-15 19:19 EDT by Graeme Gillies
Modified: 2016-04-07 17:02 EDT
Fixed In Version: openstack-puppet-modules-7.0.3-1.el7ost
Doc Type: Bug Fix
Doc Text:
Previously, no timeout was set explicitly for the relevant operations, so stages of the Ceph cluster setup that took longer than the default 5 minutes (300 seconds) timed out. With this update, a timeout parameter is added for the relevant operations. Its default value is 600 seconds, and you can modify it if necessary. As a result, the installation is more resilient, especially when some of the Ceph setup operations take longer than average.
Last Closed: 2016-04-07 17:02:26 EDT
Type: Bug


External Trackers
Tracker | ID | Priority | Status | Summary | Last Updated
OpenStack gerrit | 253317 | None | None | None | Never
Red Hat Product Errata | RHEA-2016:0603 | normal | SHIPPED_LIVE | Red Hat OpenStack Platform 8 Enhancement Advisory | 2016-04-07 20:53:53 EDT

Description Graeme Gillies 2015-07-15 19:19:24 EDT
Sometimes, when doing multinode deployments on slow hosts or on many hosts at once, you see errors similar to the following:

http://fpaste.org/244826/14370016/

Basically, one of the exec stanzas in the ceph puppet module times out after 5 minutes, which causes the deployment to fail.

Raising the timeout (for example, to 600 seconds) or disabling it (setting it to 0) allows the deployment to succeed.
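
As a minimal sketch of that workaround: Puppet's `exec` resource type accepts a `timeout` attribute in seconds, which defaults to 300 and disables the limit when set to 0. The resource title and command below are hypothetical placeholders, not the actual stanzas from the ceph module.

```puppet
# Hypothetical exec resource; the title and command are illustrative only.
exec { 'ceph-osd-prepare-example':
  command => '/usr/sbin/ceph-disk prepare /dev/sdb',
  path    => ['/usr/sbin', '/usr/bin', '/sbin', '/bin'],
  # Puppet's built-in default is 300 seconds; 0 disables the timeout entirely.
  timeout => 600,
}
```

Setting the attribute to 0 removes the safety net against a command that hangs, so a larger finite value is usually the more cautious choice.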

Can we please review the timeout setting on this and all other exec stanzas in the ceph module, to ensure they are long enough for slower environments?

Regards,

Graeme
Comment 3 Gilles Dubreuil 2015-12-03 23:37:21 EST
For the record, the provided link no longer exists. Please attach the error output or provide a long-lived paste.

That said, the error and the fix are straightforward.

I added a timeout value of 600 seconds to all relevant exec resources in the puppet-ceph module (see External Trackers).
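
A change of this kind could look roughly like the following sketch, which passes one configurable timeout through to an exec resource. The defined type `example::osd`, the parameter name `exec_timeout`, and the command are assumptions for illustration only; they are not taken from the actual patch.

```puppet
# Hypothetical defined type; the names are illustrative, not the module's real API.
define example::osd (
  $exec_timeout = 600,
) {
  exec { "example-osd-prepare-${name}":
    command   => "/usr/sbin/ceph-disk prepare ${name}",
    path      => ['/usr/sbin', '/usr/bin', '/sbin', '/bin'],
    logoutput => true,
    # Overrides Puppet's 300-second default with the configurable value.
    timeout   => $exec_timeout,
  }
}

# Example override for a slow environment.
example::osd { '/dev/sdb':
  exec_timeout => 1200,
}
```

Exposing the value as a parameter keeps the 600-second default while letting slower environments raise it without patching the module.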
Comment 5 Graeme Gillies 2016-01-03 18:20:23 EST
Hi,

Apologies for not using a longer-lived pastebin. I've been trying to reproduce the problem to give you useful output, but so far I have been unable to reproduce it at all.

You mention that you have modified all relevant execs in puppet. Do you still need the output from when the problem occurs? If so, I'll keep trying to get hold of it.

Regards,

Graeme
Comment 7 Yogev Rabl 2016-02-03 08:13:32 EST
Verified installation of Ceph OSDs on multiple nodes with no timeouts.

openstack-puppet-modules-7.0.3-1.el7ost.noarch
Comment 9 errata-xmlrpc 2016-04-07 17:02:26 EDT
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://rhn.redhat.com/errata/RHEA-2016-0603.html
