Bug 832072

Summary: vds_bootstrap.py Failed to add vdsm host to engine if connection to repository server is slow
Product: [Retired] oVirt Reporter: Mark Wu <wudxw>
Component: vdsm Assignee: Dan Kenigsberg <danken>
Status: CLOSED WONTFIX QA Contact:
Severity: low Docs Contact:
Priority: unspecified    
Version: 3.1 RC CC: abaron, acathrow, bazulay, dyasny, iheim, mgoldboi, ykaul
Target Milestone: --- Keywords: Reopened
Target Release: ---   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard: infra
Fixed In Version: Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2013-01-30 22:50:49 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:

Description Mark Wu 2012-06-14 12:51:33 UTC
Description of problem:
When I tried to add a vdsm host to ovirt-engine, it sometimes failed because of an ssh timeout. The default ssh timeout on the ovirt-engine side appears to be 600s:

engine-config -g SSHInactivityTimoutSeconds
SSHInactivityTimoutSeconds: 600 version: general

After checking the log of vds_bootstrap.py, I found that almost all of the time was spent downloading yum metadata, because vds_bootstrap.py cleans up the yum cache before installing the required packages. If the network connection to the yum repository/RHN server is slow, the download time exceeds the ssh timeout. The engine then closes the ssh connection and fails to add the vdsm host.

So perhaps we need to add an ssh keepalive mechanism to vds_bootstrap.py to make it more robust.
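A minimal sketch of what such a keepalive could look like, not the actual vds_bootstrap.py code. It assumes the engine's timeout is reset by any output arriving over the ssh channel, which has not been verified here. The function name `run_with_heartbeat`, the `interval` parameter, and the heartbeat text are hypothetical.

```python
import subprocess
import sys
import threading


def run_with_heartbeat(cmd, interval=30.0, out=sys.stdout):
    """Run cmd and write a heartbeat line to `out` every `interval` seconds
    until the command exits. Returns the command's exit code.

    Hypothetical helper: it only helps if the remote side treats any
    output on the channel as activity (an assumption).
    """
    done = threading.Event()

    def beat():
        # Event.wait() returns False on timeout and True once `done` is set,
        # so the loop ends as soon as the command has finished.
        while not done.wait(interval):
            out.write("heartbeat: command still running\n")
            out.flush()

    worker = threading.Thread(target=beat)
    worker.daemon = True
    worker.start()
    try:
        # The child inherits our stdout/stderr, so its own output still
        # reaches the ssh channel unchanged.
        return subprocess.call(cmd)
    finally:
        done.set()
        worker.join()
```

A slow step could then be wrapped as, for example, run_with_heartbeat(["yum", "makecache"]). If the engine enforces a limit on total run time rather than on inactivity, a heartbeat like this would not help.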

Version-Release number of selected component (if applicable):
vdsm-4.10.0-0.15.git64f4c1f.fc17

How reproducible:
It depends on the speed of the network connection to the repository/RHN server.

Steps to Reproduce:
1.
2.
3.
  
Actual results:


Expected results:


Additional info:

Comment 1 Yaniv Kaul 2012-06-14 12:53:28 UTC
Note that if the connection is slow, we'll time out the installation anyway (because we suspect 'yum install' is stuck).
However, re-installing should hopefully succeed.

Comment 2 Dan Kenigsberg 2012-06-14 12:57:19 UTC
A 10-minute timeout seems quite long, and a repeated installation would succeed, so I'm inclined to say this is NOTABUG.

Note that there's probably a lot of yum noise going over ssh, so this is not an ssh keepalive issue.

Please reopen if you think otherwise.

Comment 3 Mark Wu 2012-06-15 06:47:48 UTC
The problem is that re-installing always gets the same result, because vds_bootstrap.py always cleans up the yum cache before installing packages. So it has to download the yum metadata every time. If it fails once, it will probably fail again and again. Do you still think it's not a problem?
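To illustrate the point, here is a rough sketch of one way a retry could avoid the repeated download: skip the cache cleanup when metadata from a recent attempt is still on disk. This is not what vds_bootstrap.py does. The function name `cache_is_fresh`, the one-hour threshold, and the idea of judging freshness by file modification time are all assumptions made for illustration.

```python
import os
import time


def cache_is_fresh(cache_dir, max_age_seconds=3600, now=None):
    """Return True if any file under cache_dir was modified within the
    last max_age_seconds. An empty or missing directory is never fresh.

    Hypothetical helper; the freshness rule is an assumption.
    """
    now = time.time() if now is None else now
    newest = None
    for root, _dirs, files in os.walk(cache_dir):
        for name in files:
            try:
                mtime = os.path.getmtime(os.path.join(root, name))
            except OSError:
                # The file vanished or is unreadable; ignore it.
                continue
            if newest is None or mtime > newest:
                newest = mtime
    return newest is not None and (now - newest) <= max_age_seconds
```

A bootstrap script could call this on the yum cache directory (by default /var/cache/yum) and only clean the cache when it returns False, so that a retry shortly after a timed-out attempt reuses whatever metadata was already fetched.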

Comment 4 Yaniv Kaul 2012-06-15 08:02:15 UTC
(In reply to comment #3)
> The problem is that re-installing always gets the same result, because
> vds_bootstrap.py always cleans up the yum cache before installing packages.
> So it has to download the yum metadata every time. If it fails once, it will
> probably fail again and again. Do you still think it's not a problem?

It takes > 10 minutes just to download the metadata? How limited is the bandwidth? I thought it might take more than 10 minutes to install all the required packages, and that's not that bad, because it will install some of them, the next retry will install the remaining ones, and so on.

Comment 5 Mark Wu 2012-06-18 00:50:23 UTC
I believe the network connection in an enterprise environment shouldn't be that bad.
So I'm changing this to low priority.

Comment 6 Itamar Heim 2013-01-30 22:50:49 UTC
Closing old bugs. If this issue is still relevant/important in the current version, please re-open the bug.