Bug 701445 - Oz 5 second timeout can cause EC2 build failure in some cases
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: CloudForms Cloud Engine
Classification: Retired
Component: imagefactory
Version: 0.3.1
Hardware: Unspecified
OS: Linux
unspecified
unspecified
Target Milestone: rc
Assignee: Chris Lalancette
QA Contact: wes hayutin
Blocks: ce-beta ce-ami
Reported: 2011-05-02 21:07 UTC by Justin Clift
Modified: 2015-07-13 04:35 UTC (History)

Description Justin Clift 2011-05-02 21:07:48 UTC
Description of problem:

  Oz has (at time of writing) a hard-coded 5 second connection timeout for ssh connections.

  This can occasionally lead to failures when pushing an image to EC2 (i.e. when general network slowness is occurring).


Workaround:

  An effective workaround is to increase the hard-coded 5 second timeout to a higher value.

  The value appears in two places in the Oz RedHat.py source file:

    /usr/lib/python2.7/site-packages/oz/RedHat.py

  Searching the file for "ConnectTimeout=" will locate both.  Make the desired change
  (e.g. from 5 to 120), then restart the Image Factory service for it to take effect:

  $ sudo service imagefactory restart
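
  For anyone who would rather script the edit than make it by hand, the search-and-replace step could be sketched roughly as below. This is only an illustration: the function name bump_connect_timeout and the sample text are made up, and the exact form of the lines in RedHat.py is assumed (only the "ConnectTimeout=" string comes from this report).

```python
import re

# Matches the ssh option as it would appear in a command string,
# e.g. "ConnectTimeout=5". The surrounding code in RedHat.py is not
# known here, so only this substring is relied upon.
_TIMEOUT_RE = re.compile(r"(ConnectTimeout=)\d+")


def bump_connect_timeout(source_text, new_timeout):
    """Return (updated_text, count) with every ConnectTimeout=N set to new_timeout."""
    replacement = r"\g<1>" + str(int(new_timeout))
    return _TIMEOUT_RE.subn(replacement, source_text)


# Hypothetical sample standing in for the two occurrences in RedHat.py.
sample = (
    'cmd = ["ssh", "-o", "ConnectTimeout=5", host]\n'
    'scp = "scp -o ConnectTimeout=5 src dst"\n'
)
updated, count = bump_connect_timeout(sample, 120)
print(count)
print(updated)
```

  The function works on text only, so the file would still need to be read, backed up and written back separately, followed by the service restart shown above.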

Comment 1 Chris Lalancette 2011-05-03 13:49:50 UTC
On Oz git HEAD, I've increased the default timeout to 10 seconds.  Unfortunately I haven't yet made it configurable, since I haven't figured out a good way to expose it to the user.  That's a future work item.

Chris Lalancette
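
As a rough illustration of what a configurable timeout might look like, here is a minimal sketch. It is not Oz's actual code: the function ssh_command and the environment variable OZ_SSH_CONNECT_TIMEOUT are hypothetical names invented for this example. Only the ssh "-o ConnectTimeout=N" option and the 10 second default come from the discussion.

```python
import os

# Default taken from the comment above (10 seconds on Oz git HEAD).
DEFAULT_CONNECT_TIMEOUT = 10


def ssh_command(host, remote_command, connect_timeout=None, environ=None):
    """Build an ssh argument list with a caller-configurable connect timeout.

    Precedence: explicit argument, then the (hypothetical) environment
    variable OZ_SSH_CONNECT_TIMEOUT, then the built-in default.
    """
    env = os.environ if environ is None else environ
    if connect_timeout is None:
        connect_timeout = int(
            env.get("OZ_SSH_CONNECT_TIMEOUT", DEFAULT_CONNECT_TIMEOUT)
        )
    if connect_timeout <= 0:
        raise ValueError("connect_timeout must be a positive number of seconds")
    return ["ssh", "-o", "ConnectTimeout=%d" % connect_timeout, host, remote_command]


print(ssh_command("guest.example.com", "true", connect_timeout=120))
```

An environment variable is just one possible way to expose the setting; a config file entry or a command-line flag would follow the same pattern of falling back to the default.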

Comment 2 wes hayutin 2011-06-16 19:01:04 UTC
Is this issue ready for QE?

Comment 3 Chris Lalancette 2011-06-16 19:53:07 UTC
Not yet.  I don't think the change made it into a package yet; once we update the oz package to 0.4.0 (or 0.5.0, once it is out), it should be ready to test.

Chris Lalancette

Comment 4 Chris Lalancette 2011-06-23 14:19:12 UTC
We updated the testing repository to oz 0.4.0, so we should be good to test this out now.

Chris Lalancette

Comment 5 wes hayutin 2011-09-28 14:51:11 UTC
Built and started EC2 and vSphere instances. Have not hit this issue in a very long time.
Verified.


[root@unused ~]# rpm -qa | grep aeolus
aeolus-conductor-0.4.0-0.20110926211730git1cc372b.fc15.noarch
aeolus-conductor-daemons-0.4.0-0.20110926211730git1cc372b.fc15.noarch
aeolus-all-0.4.0-0.20110926211730git1cc372b.fc15.noarch
aeolus-conductor-doc-0.4.0-0.20110926211730git1cc372b.fc15.noarch
aeolus-conductor-devel-0.4.0-0.20110926211730git1cc372b.fc15.noarch
rubygem-aeolus-image-0.1.0-3.20110919115936gitd1d24b4.fc15.noarch
aeolus-configure-2.0.2-4.20110926142838git5044e56.fc15.noarch

Comment 7 wes hayutin 2011-12-08 14:03:28 UTC
closing out old bugs

Comment 8 wes hayutin 2011-12-08 14:15:13 UTC
perm close

