Description of problem:
In my tests and in manual deployments, I sometimes hit cases where a request to the Python SDK gets stuck and the code has to wait half an hour for the timeout.

2017-03-19 13:24:26 DEBUG otopi.plugins.gr_he_setup.engine.add_host add_host._wait_host_ready:92 VDSM host in installing state
2017-03-19 13:24:27 DEBUG otopi.plugins.gr_he_setup.engine.add_host add_host._wait_host_ready:92 VDSM host in installing state
2017-03-19 13:24:29 DEBUG otopi.plugins.gr_he_setup.engine.add_host add_host._wait_host_ready:92 VDSM host in installing state
2017-03-19 13:24:30 DEBUG otopi.plugins.gr_he_setup.engine.add_host add_host._wait_host_ready:92 VDSM host in installing state
2017-03-19 13:24:31 DEBUG otopi.plugins.gr_he_setup.engine.add_host add_host._wait_host_ready:92 VDSM host in installing state
2017-03-19 13:24:32 DEBUG otopi.plugins.gr_he_setup.engine.add_host add_host._wait_host_ready:92 VDSM host in installing state
2017-03-19 13:54:34 DEBUG otopi.plugins.gr_he_setup.engine.add_host add_host._wait_host_ready:86 Error fetching host state: [ERROR]::oVirt API connection failure, (28, 'Operation timed out after 1800000 milliseconds with 0 out of -1 bytes received')
2017-03-19 13:54:34 DEBUG otopi.plugins.gr_he_setup.engine.add_host add_host._wait_host_ready:92 VDSM host in state
2017-03-19 13:54:35 DEBUG otopi.plugins.gr_he_setup.engine.add_host add_host._wait_host_ready:92 VDSM host in up state
2017-03-19 13:54:35 INFO otopi.plugins.gr_he_setup.engine.add_host add_host._wait_host_ready:103 The VDSM Host is now operational
2017-03-19 13:54:35 DEBUG otopi.plugins.gr_he_setup.engine.add_host add_host._closeup:672 Setting CPU for the cluster

Version-Release number of selected component (if applicable):
ovirt-hosted-engine-setup-2.1.0.4-1.el7ev.noarch

How reproducible:
5-10%

Steps to Reproduce:
1. Run hosted engine deploy

Actual results:
Deployment can be stuck on the Python SDK command for half an hour.

Expected results:
If the Python SDK command gets stuck, a reasonable timeout should apply (I believe one minute would be more than enough).

Additional info:
I checked the Python SDK documentation, and a timeout parameter can be passed when a new instance is initialized.
Unfortunately, it may take 15 minutes to install a host if the RPM download is very slow.
In the hosted-engine host deploy we sample the host status every second, so that will not be a problem. Also, the HE host already has most of the relevant packages installed. So maybe we can reduce the timeout. What do you think?
(In reply to Artyom from comment #2)
> In the case of the hosted-engine host deploy we sample host status each
> second, so it will not be a problem, also HE host has already the most part
> of the relevant packages. So maybe we can reduce the timeout time.
> What do you think?

I see no reason to query every second actually, but I also don't see the real value in reducing this timeout.
The real value is saving time when a Python API request gets stuck, as you can see in the log I provided. My host deployment finished after 2 minutes, but because of some problem with the Python SDK I had to wait half an hour. Also, I am not asking for a general timeout reduction; I am asking for it only in this specific case, when we deploy the first HE host to the engine.
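To illustrate the point, here is a minimal sketch of a polling loop in which a single stuck request costs at most one short per-request timeout rather than the whole wait. It is not the actual ovirt-hosted-engine-setup code: wait_host_ready, fetch_state, and all parameter names and defaults are hypothetical, and fetch_state stands in for whatever SDK call returns the host status.

```python
import time


def wait_host_ready(fetch_state, request_timeout=60.0, poll_interval=1.0,
                    overall_timeout=1800.0, sleep=time.sleep,
                    clock=time.monotonic):
    """Poll a host until it reports the 'up' state.

    fetch_state is a hypothetical callable taking a timeout in seconds and
    returning the host state as a string, or raising OSError (TimeoutError
    is a subclass) when the request fails or times out.
    Returns True once the host is up, False if overall_timeout expires.
    """
    deadline = clock() + overall_timeout
    while clock() < deadline:
        try:
            state = fetch_state(request_timeout)
        except OSError:
            # A stuck request is abandoned after request_timeout seconds
            # and simply retried on the next iteration.
            state = None
        if state == "up":
            return True
        sleep(poll_interval)
    return False
```

With a 60-second per-request timeout, one stuck request delays detection of the "up" state by about a minute, while the overall deadline still leaves room for a slow host installation.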
(In reply to Artyom from comment #4)
> The real value is to save some time in the case when the python API request
> is stuck like you can see from the log that I provided. My host deployment
> finished after 2 minutes, but because some problem with the python SDK I
> needed to wait the half hour. Also, I do not ask general timeout reduce, I
> ask timeout reduce only in this specific case when we deploy first HE host
> to the engine.

I suggest finding the real problem in the SDK.
Artyom is right: we were passing the wrong value to the API constructor of the Python SDK, which resulted in an unacceptably long timeout.
Verified on ovirt-hosted-engine-setup-2.1.0.6-1.el7ev.noarch