Red Hat Bugzilla – Bug 832072
vds_bootstrap.py Failed to add vdsm host to engine if connection to repository server is slow
Last modified: 2013-01-30 17:50:49 EST
Description of problem:
When I tried to add a vdsm host to ovirt-engine, I found that it sometimes fails because of an ssh timeout. The default ssh timeout on the ovirt-engine side appears to be 600s:
engine-config -g SSHInactivityTimoutSeconds
SSHInactivityTimoutSeconds: 600 version: general
After checking the vds_bootstrap.py log, I found that almost all of the time was spent downloading yum metadata, because vds_bootstrap.py cleans the yum cache before installing the required packages. If the network connection to the yum repository/RHN server is slow, the download time exceeds the ssh timeout. The engine then closes the ssh connection and fails to add the vdsm host.
So perhaps we need to add an ssh keepalive mechanism to vds_bootstrap.py to make it more robust.
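The keepalive idea could be sketched roughly as follows. This is a minimal illustration, not code from vds_bootstrap.py: the helper name run_with_heartbeat, the heartbeat text, and the 30-second default interval are all assumptions. It runs a long command while a background thread writes a periodic line to the output stream, so that a peer watching for inactivity keeps seeing traffic.

```python
import subprocess
import sys
import threading


def run_with_heartbeat(cmd, interval=30.0, out=sys.stdout):
    """Run cmd, writing a heartbeat line to `out` every `interval` seconds.

    Hypothetical helper for illustration; not part of vds_bootstrap.py.
    Returns the command's exit code.
    """
    done = threading.Event()

    def beat():
        # Event.wait() returns False on timeout and True once `done` is set,
        # so the loop ends promptly when the command finishes.
        while not done.wait(interval):
            out.write("heartbeat: still running\n")
            out.flush()

    worker = threading.Thread(target=beat, daemon=True)
    worker.start()
    try:
        return subprocess.call(cmd)
    finally:
        done.set()
        worker.join()


if __name__ == "__main__":
    # Stand-in for a slow 'yum' invocation: a child process that sleeps.
    slow = [sys.executable, "-c", "import time; time.sleep(1)"]
    print("exit code:", run_with_heartbeat(slow, interval=0.25))
```

Whether such output would actually reset the engine-side inactivity timer is not established in this report, so this only shows the general shape of the approach.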
Version-Release number of selected component (if applicable):
How reproducible:
It depends on the speed of the network connection to the repository/RHN server.
Steps to Reproduce:
Note that if the connection is slow, we'll time out the installation anyway (because we suspect 'yum install' is stuck).
However, re-installing should hopefully succeed.
A 10-minute timeout seems quite long, and a repeated installation would succeed, so I'm inclined to say this is NOTABUG.
Note that there's probably a lot of yum noise going over ssh, so this is not an ssh keepalive issue.
Please reopen if you think otherwise.
The problem is that re-installing always gives the same result, because vds_bootstrap.py always cleans the yum cache before installing packages. So it has to download the yum metadata every time. If it fails once, it will probably fail again and again. Do you still think it's not a problem?
(In reply to comment #3)
> The problem is that "re-installing" always get the same result because
> vds_bootstrap.py always cleanup yum cache before install packages. So every
> time it needs download yum metadata. If it fails one time, probably it will
> fail again and again. Do you still think it's not a problem?
It takes > 10 minutes just to download the metadata? How limited is the bandwidth? I thought it might take more than 10 minutes to install all the required packages, and that's not that bad, because it'll install some, the next retry would install the remaining ones, and so on.
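To put the bandwidth question in numbers: the minimum average throughput needed to finish a download within the timeout is the download size divided by the timeout. The sketch below assumes a hypothetical 50 MiB of repository metadata; the real size is not stated in this report and varies by repository. The 600-second figure is the timeout quoted in the description.

```python
def min_throughput_kib_per_s(size_mib, timeout_s):
    """Average KiB/s needed to download size_mib MiB within timeout_s seconds."""
    return size_mib * 1024.0 / timeout_s


if __name__ == "__main__":
    # 50 MiB is an assumed metadata size, chosen only for illustration.
    rate = min_throughput_kib_per_s(50, 600)
    print("%.1f KiB/s" % rate)
```

Under that assumption, a link averaging below roughly 85 KiB/s would not finish the metadata download before the timeout.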
I believe the network connection in an enterprise environment shouldn't be that bad.
So changing it to low priority.
Closing old bugs. If this issue is still relevant/important in current version, please re-open the bug.