Bug 1399236

Summary: Installation of host stuck in Stage: Termination
Product: [oVirt] ovirt-host-deploy
Reporter: Petr Kubica <pkubica>
Component: Core
Assignee: Piotr Kliczewski <pkliczew>
Status: CLOSED DUPLICATE
QA Contact: Pavel Stehlik <pstehlik>
Severity: urgent
Docs Contact:
Priority: unspecified
Version: 1.4.1
CC: bugs, oourfali, pkliczew, pkubica
Target Milestone: ovirt-3.6.10
Target Release: ---
Hardware: Unspecified
OS: Unspecified
Whiteboard:
Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2016-11-29 08:05:20 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: Infra
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Attachments:
Description        Flags
engine logs        none
host1 vdsm logs    none
host2 vdsm logs    none

Description Petr Kubica 2016-11-28 15:41:39 UTC
Created attachment 1225321 [details]
engine logs

Description of problem:
I tried to add a host with vdsm-4.17.36-1.el7ev.noarch to rhevm-3.6.10-0.1.el6.noarch, and the installation of that host got stuck in Stage: Termination.
In vdsm.log I see only one error, immediately after the last entry in the host-deploy log:

JsonRpc (StompReactor)::ERROR::2016-11-28 14:57:32,770::betterAsyncore::124::vds.dispatcher::(recv) SSL error during reading data: unexpected eof
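
For scanning larger vdsm.log files for entries like the one above, here is a minimal Python sketch. It assumes the `::`-separated field layout (thread, level, timestamp, module, line number, logger, message) inferred from this single sample line; other vdsm versions or multi-line entries may not match. The names `VdsmLogRecord`, `parse_vdsm_line`, and `errors_only` are hypothetical helpers, not part of vdsm.

```python
from datetime import datetime
from typing import Iterable, Iterator, NamedTuple


class VdsmLogRecord(NamedTuple):
    thread: str
    level: str
    timestamp: datetime
    module: str
    lineno: int
    logger: str
    message: str


def parse_vdsm_line(line: str) -> VdsmLogRecord:
    """Split one vdsm.log line into fields (layout assumed from the sample)."""
    # Limit to 6 splits so any "::" inside the message text is preserved.
    thread, level, stamp, module, lineno, logger, message = (
        line.rstrip("\n").split("::", 6)
    )
    return VdsmLogRecord(
        thread=thread,
        level=level,
        timestamp=datetime.strptime(stamp, "%Y-%m-%d %H:%M:%S,%f"),
        module=module,
        lineno=int(lineno),
        logger=logger,
        message=message,
    )


def errors_only(lines: Iterable[str]) -> Iterator[VdsmLogRecord]:
    """Yield ERROR records, skipping lines that do not fit the assumed layout."""
    for line in lines:
        try:
            record = parse_vdsm_line(line)
        except ValueError:
            continue  # continuation lines such as tracebacks
        if record.level == "ERROR":
            yield record


# The error line quoted in this report.
SAMPLE = (
    "JsonRpc (StompReactor)::ERROR::2016-11-28 14:57:32,770::betterAsyncore"
    "::124::vds.dispatcher::(recv) SSL error during reading data: unexpected eof"
)

record = parse_vdsm_line(SAMPLE)
found = list(errors_only([SAMPLE, "not a log line"]))
```

To run it against a real file, pass an open file object to `errors_only`, for example `errors_only(open("vdsm.log"))`.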

After restarting the ovirt-engine service, there are many Java exceptions in engine.log and the host being installed is non-operational (I restarted vdsmd as well).

Version-Release number of selected component (if applicable):
vdsm-4.17.36-1.el7ev.noarch (RHEL7.3)
rhevm-3.6.10-0.1.el6.noarch

How reproducible:
Always

Steps to Reproduce:
1. Install engine
2. Add host

Actual results:
The host is stuck in the Installing status.

Additional info:
Added logs from the two tested hosts and from the engine.

Comment 1 Petr Kubica 2016-11-28 15:42:21 UTC
Created attachment 1225322 [details]
host1 vdsm logs

Comment 2 Petr Kubica 2016-11-28 15:42:43 UTC
Created attachment 1225323 [details]
host2 vdsm logs

Comment 3 Oved Ourfali 2016-11-28 18:27:53 UTC
It complains about an invalid fingerprint. Did you try it with a fresh 3.6 engine install on another physical host? On another host?
Maybe there is some configuration issue causing the invalid fingerprint error. 

Some other failures I saw were due to restarting the engine while operations are running.

Comment 4 Piotr Kliczewski 2016-11-28 19:02:21 UTC
I checked your engine and you are using a very old jsonrpc, 1.1.14. Please make sure that you are using the latest build.

I also noticed that there was a 1-hour time difference between the hosts and the engine, which was fixed after some time.

2016-11-28 15:27:13,877 INFO  [org.ovirt.engine.core.bll.hostdeploy.InstallVdsInternalCommand] (org.ovirt.thread.pool-6-thread-11) [1649fa88] Lock freed to object 'EngineLock:{exclusiveLocks='[ed91cf48-1085-4f5a-a5e2-ea9839f66cd1=<VDS, ACTION_TYPE_FAILED_OBJECT_LOCKED>]', sharedLocks='null'}'
2016-11-28 14:27:20,647 INFO  [org.ovirt.engine.core.uutils.config.ShellLikeConfd] (ServerService Thread Pool -- 45) [] Loaded file '/usr/share/ovirt-engine/services/ovirt-engine/ovirt-engine.conf'.
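
The two engine.log entries above are consecutive in the comment, yet the second timestamp is about an hour earlier than the first. A minimal Python sketch for spotting such backward jumps in a log is shown below. It assumes every entry starts with a 23-character `YYYY-MM-DD HH:MM:SS,mmm` timestamp, as in these two lines; `engine_timestamp` and `find_backward_jumps` are hypothetical helper names, and the message text of the first sample line is shortened because only the timestamps matter here. The sketch only flags the jump; it cannot tell whether the cause was a clock correction, a timezone change, or something else.

```python
from datetime import datetime, timedelta
from typing import Iterable, List, Tuple


def engine_timestamp(line: str) -> datetime:
    """Parse the leading timestamp of a log line (format assumed from the samples)."""
    return datetime.strptime(line[:23], "%Y-%m-%d %H:%M:%S,%f")


def find_backward_jumps(
    lines: Iterable[str],
    tolerance: timedelta = timedelta(seconds=1),
) -> List[Tuple[int, timedelta]]:
    """Return (line number, size of jump) for entries older than their predecessor."""
    jumps = []
    previous = None
    for number, line in enumerate(lines, start=1):
        try:
            current = engine_timestamp(line)
        except ValueError:
            continue  # lines without a timestamp, e.g. stack traces
        if previous is not None and previous - current > tolerance:
            jumps.append((number, previous - current))
        previous = current
    return jumps


# Timestamps and loggers taken from the two lines quoted in this comment.
SAMPLE_LINES = [
    "2016-11-28 15:27:13,877 INFO  "
    "[org.ovirt.engine.core.bll.hostdeploy.InstallVdsInternalCommand] "
    "(org.ovirt.thread.pool-6-thread-11) [1649fa88] Lock freed to object",
    "2016-11-28 14:27:20,647 INFO  "
    "[org.ovirt.engine.core.uutils.config.ShellLikeConfd] "
    "(ServerService Thread Pool -- 45) [] Loaded file "
    "'/usr/share/ovirt-engine/services/ovirt-engine/ovirt-engine.conf'.",
]

jumps = find_backward_jumps(SAMPLE_LINES)
for line_number, size in jumps:
    print(f"line {line_number}: timestamp went back by {size}")
```

The `tolerance` parameter is there so that small reorderings between threads are not reported; the one-second default is an arbitrary choice.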

Comment 5 Oved Ourfali 2016-11-29 08:05:20 UTC
Please verify host deployment in the next 3.6 build.

*** This bug has been marked as a duplicate of bug 1388778 ***

Comment 6 Petr Kubica 2016-12-01 11:28:58 UTC
(In reply to Oved Ourfali from comment #3)
> It complains about invalid fingerprint. Did you try it with a fresh 3.6
> engine install on another physical host? On another host? 
> Maybe there is some configuration issue causing the invalid fingerprint
> error. 
> 
> Some other failures I saw were due to restarting the engine while operations
> are running.

The first time I tried to install the host it did not work, so I assumed there was an issue with the host and reinstalled it. The second time I clicked Reinstall for that host, but it had an invalid fingerprint, so I had to remove the host and add it again, and I saw the same issue (stuck).


(In reply to Piotr Kliczewski from comment #4)
> I checked your engine and you are using very old jsonrpc - 1.1.14. Please
> make sure that you are using the latest build.
> 
> I also noticed that there was 1 hr difference between the hosts and the
> engine and was fixed after some time.
> 
> 2016-11-28 15:27:13,877 INFO 
> [org.ovirt.engine.core.bll.hostdeploy.InstallVdsInternalCommand]
> (org.ovirt.thread.pool-6-thread-11) [1649fa88] Lock freed to object
> 'EngineLock:{exclusiveLocks='[ed91cf48-1085-4f5a-a5e2-ea9839f66cd1=<VDS,
> ACTION_TYPE_FAILED_OBJECT_LOCKED>]', sharedLocks='null'}'
> 2016-11-28 14:27:20,647 INFO 
> [org.ovirt.engine.core.uutils.config.ShellLikeConfd] (ServerService Thread
> Pool -- 45) [] Loaded file
> '/usr/share/ovirt-engine/services/ovirt-engine/ovirt-engine.conf'.

I'll check it