Bug 1034777

Summary: SSL connection times out on read operation when installing from disk
Product: Red Hat Enterprise Virtualization Manager Reporter: Lukas Svaty <lsvaty>
Component: ovirt-hosted-engine-setupAssignee: Sandro Bonazzola <sbonazzo>
Status: CLOSED ERRATA QA Contact: movciari
Severity: high Docs Contact:
Priority: high    
Version: 3.3.0CC: adingman, asegurap, danken, dfediuck, fsimonce, iheim, msivak, pstehlik, scohen, ybronhei
Target Milestone: ---Keywords: Triaged
Target Release: 3.3.0   
Hardware: All   
OS: Linux   
Whiteboard: integration
Fixed In Version: Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2014-01-21 16:55:42 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Bug Depends On: 1037584    
Bug Blocks:    
Attachments:
Description Flags
setup log
none
vdsm log might be also useful none

Description Lukas Svaty 2013-11-26 13:42:14 UTC
Created attachment 829270 [details]
setup log

Description of problem:
When installing engine vm from disk SSL connection times out 
[ INFO  ] Uploading volume to data domain (could take a few minutes depending on archive size)
[ ERROR ] Failed to execute stage 'Misc configuration': The read operation timed out


Version-Release number of selected component (if applicable):
is24.1
ovirt-hosted-engine-setup-1.0.0-0.9.beta4.el6ev 

How reproducible:
100%

Steps to Reproduce:
1. Run ovirt-hosted-engine-setup
2. create engine VM from .ovf disk

Actual results:
SSL times out on read operation

Expected results:
Timout for SSL should be raised to suit installation from disk

Additional info:
adding vdsm.log and ovirt-engine-setup.log

Comment 1 Lukas Svaty 2013-11-26 13:42:55 UTC
Created attachment 829271 [details]
vdsm log might be also useful

Comment 2 Sandro Bonazzola 2013-11-29 15:13:56 UTC
This may be a VDSM bug. Federico can you look into it?

Comment 3 Federico Simoncelli 2013-11-29 16:07:08 UTC
You shouldn't be using uploadVolume (never used and tested). It has been removed recently:

http://gerrit.ovirt.org/21425

What are you using it for? The supported api is downloadImage (that is capable of downloading an image using http).

Comment 4 Sandro Bonazzola 2013-12-02 16:06:13 UTC
(In reply to Federico Simoncelli from comment #3)

> What are you using it for? The supported api is downloadImage (that is
> capable of downloading an image using http).

Well, actually using curl... so also file:// is a supported URL (luckily)

Comment 6 Federico Simoncelli 2013-12-05 14:51:53 UTC
The issue with uploadVolume is entirely on the client side (ovirt-hosted-engine-setup) as when VDSM tries to answer gets the following error (Broken pipe):

Thread-61::DEBUG::2013-11-26 14:15:23,998::sp::2009::Storage.Misc.excCmd::(uploadVolume) SUCCESS: <err> = ''; <rc> = 0
Thread-61::INFO::2013-11-26 14:15:23,998::logUtils::47::dispatcher::(wrapper) Run and protect: uploadVolume, Return response: None
...
Thread-61::ERROR::2013-11-26 14:15:24,019::SecureXMLRPCServer::168::root::(handle_error) client ('127.0.0.1', 36387)
Traceback (most recent call last):
  File "/usr/lib64/python2.6/SocketServer.py", line 570, in process_request_thread
    self.finish_request(request, client_address)
  File "/usr/lib64/python2.6/SocketServer.py", line 332, in finish_request
    self.RequestHandlerClass(request, client_address, self)
  File "/usr/lib64/python2.6/SocketServer.py", line 627, in __init__
    self.handle()
  File "/usr/lib64/python2.6/BaseHTTPServer.py", line 329, in handle
    self.handle_one_request()
  File "/usr/lib64/python2.6/BaseHTTPServer.py", line 323, in handle_one_request
    method()
  File "/usr/lib64/python2.6/SimpleXMLRPCServer.py", line 490, in do_POST
    self.send_response(200)
  File "/usr/lib64/python2.6/BaseHTTPServer.py", line 384, in send_response
    self.send_header('Server', self.version_string())
  File "/usr/lib64/python2.6/BaseHTTPServer.py", line 390, in send_header
    self.wfile.write("%s: %s\r\n" % (keyword, value))
  File "/usr/lib64/python2.6/socket.py", line 324, in write
    self.flush()
  File "/usr/lib64/python2.6/socket.py", line 303, in flush
    self._sock.sendall(buffer(data, write_offset, buffer_size))
  File "/usr/lib64/python2.6/site-packages/M2Crypto/SSL/Connection.py", line 222, in write
    return self._write_bio(data)
  File "/usr/lib64/python2.6/site-packages/M2Crypto/SSL/Connection.py", line 205, in _write_bio
    return m2.ssl_write(self.ssl, data, self._timeout)
SSLError: (32, 'Broken pipe')


This is because the client (ovirt-hosted-engine-setup) closed the connection in:

2013-11-26 14:10:47 INFO otopi.plugins.ovirt_hosted_engine_setup.vm.boot_disk boot_disk.commit:96 Uploading volume to data domain (could take a few minutes depending on archive size)
2013-11-26 14:11:48 DEBUG otopi.context context._executeMethod:137 method exception
Traceback (most recent call last):
  File "/usr/lib/python2.6/site-packages/otopi/context.py", line 127, in _executeMethod
    method['method']()
  File "/usr/share/ovirt-hosted-engine-setup/scripts/../plugins/ovirt-hosted-engine-setup/vm/boot_disk.py", line 403, in _misc
    dst=self._image_path,
  File "/usr/lib/python2.6/site-packages/otopi/transaction.py", line 168, in __exit__
    self.commit()
  File "/usr/lib/python2.6/site-packages/otopi/transaction.py", line 160, in commit
    element.commit()
  File "/usr/share/ovirt-hosted-engine-setup/scripts/../plugins/ovirt-hosted-engine-setup/vm/boot_disk.py", line 109, in commit
    ohostedcons.StorageEnv.IMAGE_SIZE_GB
  File "/usr/share/vdsm/vdsClient.py", line 1000, in uploadVolume
    volUUID, srcPath, size, *args[6:])
  File "/usr/lib64/python2.6/xmlrpclib.py", line 1199, in __call__
    return self.__send(self.__name, args)
  File "/usr/lib64/python2.6/xmlrpclib.py", line 1489, in __request
    verbose=self.__verbose
  File "/usr/lib64/python2.6/xmlrpclib.py", line 1237, in request
    errcode, errmsg, headers = h.getreply()
  File "/usr/lib64/python2.6/httplib.py", line 1064, in getreply
    response = self._conn.getresponse()
  File "/usr/lib64/python2.6/httplib.py", line 990, in getresponse
    response.begin()
  File "/usr/lib64/python2.6/httplib.py", line 391, in begin
    version, status, reason = self._read_status()
  File "/usr/lib64/python2.6/httplib.py", line 349, in _read_status
    line = self.fp.readline()
  File "/usr/lib64/python2.6/socket.py", line 433, in readline
    data = recv(1)
  File "/usr/lib64/python2.6/ssl.py", line 215, in recv
    return self.read(buflen)
  File "/usr/lib64/python2.6/ssl.py", line 136, in read
    return self._sslobj.read(len)
SSLError: The read operation timed out
2013-11-26 14:11:48 ERROR otopi.context context._executeMethod:146 Failed to execute stage 'Misc configuration': The read operation timed out


I suppose you have to adjust the socket timeout somehow.

Comment 7 Sandro Bonazzola 2013-12-05 15:58:01 UTC
(In reply to Federico Simoncelli from comment #6)
> The issue with uploadVolume is entirely on the client side
> (ovirt-hosted-engine-setup) as when VDSM tries to answer gets the following
> error (Broken pipe):

[cut]

> I suppose you have to adjust the socket timeout somehow.

Any idea on how to access underlying socket for adjusting timeout there?
I think I can try to use socket.settimeout with something like 1 hour before timeout (just to be sure it fails if something goes really bad, shouldn't take that much).

However, 

> You shouldn't be using uploadVolume (never used and tested). It has been
> removed recently:
> 
> http://gerrit.ovirt.org/21425
> 
> What are you using it for? The supported api is downloadImage (that is
> capable of downloading an image using http).

So I'm not sure it's really a good idea using a deprecated api.

Comment 8 Sandro Bonazzola 2013-12-06 11:07:31 UTC
(In reply to Sandro Bonazzola from comment #7)

> > I suppose you have to adjust the socket timeout somehow.
> 
> Any idea on how to access underlying socket for adjusting timeout there?
> I think I can try to use socket.settimeout with something like 1 hour before
> timeout (just to be sure it fails if something goes really bad, shouldn't
> take that much).

It doesn't seem easy at all to access the underlying socket here.
xmlrpclib doesn't allow to access it through attributes or method calls.

Comment 9 Sandro Bonazzola 2013-12-09 14:36:16 UTC
(In reply to Federico Simoncelli from comment #6)
> SSLError: The read operation timed out
> 2013-11-26 14:11:48 ERROR otopi.context context._executeMethod:146 Failed to
> execute stage 'Misc configuration': The read operation timed out
> 
> 
> I suppose you have to adjust the socket timeout somehow.


It seems that a timeout is set server side to 160 seconds in vdsm/kaxmlrpclib.py [1]

I can suggest to set it to something higher like 1800 (30 minutes) or allow to specify it in a configuration file.
I don't think this can be solved on client side.

[1] http://gerrit.ovirt.org/gitweb?p=vdsm.git;a=blob;f=vdsm/kaxmlrpclib.py;h=735a9e913f86cb8add6a2d12d029c7c1b81dd465;hb=HEAD

Comment 11 Sandro Bonazzola 2013-12-17 15:50:55 UTC
Solved importing the OVA / OVF image using qemu-img for ensuring
it's in RAW format, without using VDSM API for the copy.

Comment 12 Sandro Bonazzola 2013-12-17 15:53:11 UTC
Dropping oVirt gerrit 21923 from this bug, it will be used if / when downloadImage will be implemented.

Comment 13 Sandro Bonazzola 2013-12-17 15:53:50 UTC
patch merged on upstream master and 1.0 branch.

Comment 15 errata-xmlrpc 2014-01-21 16:55:42 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

http://rhn.redhat.com/errata/RHBA-2014-0083.html