Bug 1348091

Summary: Host deploy fails on Engine 3.6.z with otopi/ovirt-host-deploy ver 1.5.0
Product: [oVirt] ovirt-engine
Reporter: Ala Hino <ahino>
Component: Host-Deploy
Assignee: Sandro Bonazzola <sbonazzo>
Status: CLOSED EOL
QA Contact: Jiri Belka <jbelka>
Severity: unspecified
Docs Contact:
Priority: unspecified
Version: 3.6.0
CC: ahino, amureini, bugs, didi, ylavi
Target Milestone: ---
Keywords: TestOnly
Target Release: ---
Hardware: Unspecified
OS: Unspecified
Whiteboard:
Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2016-07-05 07:18:24 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: Integration
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:

Description Ala Hino 2016-06-20 07:02:24 UTC
Description of problem:
Adding a host fails on engine 3.6 when using otopi/ovirt-host-deploy version 1.5.0.

How reproducible:
100%

Steps to Reproduce:
1. Add host

Actual results:
Fails

Expected results:
Succeeds

Comment 1 Ala Hino 2016-06-20 07:58:43 UTC
Additional info:

RPMs
====
ovirt-setup-lib-1.0.2-1.el7ev.noarch
ovirt-release36-3.6.6-1.noarch
ovirt-engine-wildfly-10.0.0-1.fc23.x86_64
ovirt-engine-jboss-as-7.1.1-1.fc20.x86_64
ovirt-release-master-4.0.0-0.3.master.20160527093215.gitcbc61db.noarch
ovirt-engine-wildfly-overlay-10.0.0-1.fc23.noarch
ovirt-host-deploy-1.5.0-1.el7ev.noarch
libgovirt-0.3.3-1.fc22.x86_64
otopi-1.5.0-3.el7ev.noarch

Exception (when adding a host)
=========
2016-06-20 10:51:12,078 ERROR [org.ovirt.otopi.dialog.MachineDialogParser] (VdsDeploy) [7b79c0ff] Cannot parse input: java.lang.RuntimeException: Invalid data recieved during bootstrap
	at org.ovirt.otopi.dialog.MachineDialogParser.nextEvent(MachineDialogParser.java:370) [otopi.jar:]
	at org.ovirt.otopi.dialog.MachineDialogParser.nextEvent(MachineDialogParser.java:405) [otopi.jar:]
	at org.ovirt.engine.core.bll.hostdeploy.VdsDeployBase._threadMain(VdsDeployBase.java:304) [bll.jar:]
	at org.ovirt.engine.core.bll.hostdeploy.VdsDeployBase.access$800(VdsDeployBase.java:45) [bll.jar:]
	at org.ovirt.engine.core.bll.hostdeploy.VdsDeployBase$12.run(VdsDeployBase.java:386) [bll.jar:]
	at java.lang.Thread.run(Thread.java:745) [rt.jar:1.8.0_91]

2016-06-20 10:51:12,079 ERROR [org.ovirt.engine.core.bll.hostdeploy.VdsDeployBase] (VdsDeploy) [7b79c0ff] Error during deploy dialog: java.lang.RuntimeException: Invalid data recieved during bootstrap
	at org.ovirt.otopi.dialog.MachineDialogParser.nextEvent(MachineDialogParser.java:370) [otopi.jar:]
	at org.ovirt.otopi.dialog.MachineDialogParser.nextEvent(MachineDialogParser.java:405) [otopi.jar:]
	at org.ovirt.engine.core.bll.hostdeploy.VdsDeployBase._threadMain(VdsDeployBase.java:304) [bll.jar:]
	at org.ovirt.engine.core.bll.hostdeploy.VdsDeployBase.access$800(VdsDeployBase.java:45) [bll.jar:]
	at org.ovirt.engine.core.bll.hostdeploy.VdsDeployBase$12.run(VdsDeployBase.java:386) [bll.jar:]
	at java.lang.Thread.run(Thread.java:745) [rt.jar:1.8.0_91]

2016-06-20 10:51:12,084 ERROR [org.ovirt.engine.core.bll.hostdeploy.VdsDeployBase] (org.ovirt.thread.pool-9-thread-16) [7b79c0ff] Error during host rich-el72-host01.usersys.redhat.com install: java.lang.RuntimeException: Invalid data recieved during bootstrap
	at org.ovirt.otopi.dialog.MachineDialogParser.nextEvent(MachineDialogParser.java:370) [otopi.jar:]
	at org.ovirt.otopi.dialog.MachineDialogParser.nextEvent(MachineDialogParser.java:405) [otopi.jar:]
	at org.ovirt.engine.core.bll.hostdeploy.VdsDeployBase._threadMain(VdsDeployBase.java:304) [bll.jar:]
	at org.ovirt.engine.core.bll.hostdeploy.VdsDeployBase.access$800(VdsDeployBase.java:45) [bll.jar:]
	at org.ovirt.engine.core.bll.hostdeploy.VdsDeployBase$12.run(VdsDeployBase.java:386) [bll.jar:]
	at java.lang.Thread.run(Thread.java:745) [rt.jar:1.8.0_91]

Comment 2 Yedidyah Bar David 2016-06-20 08:38:41 UTC
Can you please 'rm /var/cache/ovirt-engine/ovirt-host-deploy.tar' on engine machine and try again to add a host?

Comment 3 Yedidyah Bar David 2016-06-20 09:40:15 UTC
otopi-java-1.4 and otopi-1.5 are not compatible.

Engine 3.6 should be compatible with both, but I didn't test this.

If you have engine installed from rpm, it should use otopi from rpm too. So if you get a similar failure on a 3.6 rpm engine after updating otopi to 1.5, you might fix it by restarting the engine (so that it uses the updated otopi-java).

If you install engine 3.6 from a dev env, you use whichever otopi version is required by its pom.xml, which in 3.6 is 1.2 iirc, and that is incompatible with 1.5. So to fix your dev env, you can apply a patch like [1], or keep otopi 1.4.

[1] https://gerrit.ovirt.org/58300

For now, moving to QE to test the following flow. If it fails, we'll consider what to do. Otherwise, it means it only affects dev env, and above explanation applies.

1. Install and setup engine 3.6. Deploy a host, to make sure this works.
2. Update otopi and otopi-java to 1.5 (e.g. from 4.0 repos)
3. Deploy a host. Does it work?
4. Remove /var/cache/ovirt-engine/ovirt-host-deploy.tar and then deploy a host. Does it work?
5. Restart engine and deploy a host. Does it work?
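
A minimal sketch of the incompatibilities described in this comment, written as a lookup table. It records only the two pairs named above (otopi-java 1.4 or 1.2 on the engine side against otopi 1.5); the names KNOWN_INCOMPATIBLE, major_minor and is_known_incompatible are hypothetical and are not part of otopi or ovirt-engine. A pair missing from the table is not thereby known to be compatible.

```python
# Pairs of (otopi-java major.minor used by the engine, otopi major.minor)
# that this comment states are incompatible. Nothing else is asserted here.
KNOWN_INCOMPATIBLE = {
    ("1.4", "1.5"),
    ("1.2", "1.5"),
}


def major_minor(version):
    """Reduce a version string such as '1.5.0' to its 'major.minor' part."""
    return ".".join(version.split(".")[:2])


def is_known_incompatible(otopi_java_version, otopi_version):
    """Return True only if the pair is one this comment names as incompatible."""
    pair = (major_minor(otopi_java_version), major_minor(otopi_version))
    return pair in KNOWN_INCOMPATIBLE
```

For example, is_known_incompatible("1.4.1", "1.5.0") is true, while a matched 1.5/1.5 pair is simply not listed.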

Comment 4 Sandro Bonazzola 2016-06-20 09:44:39 UTC
Also, I don't think that mixing 3.6 and 4.0 rpms is a good thing in general.
Note that this situation won't happen downstream since 3.6 is on el6 only and 4.0 is on el7 only.

Comment 5 Ala Hino 2016-06-20 10:25:51 UTC
(In reply to Yedidyah Bar David from comment #2)
> Can you please 'rm /var/cache/ovirt-engine/ovirt-host-deploy.tar' on engine
> machine and try again to add a host?

Got the same result, i.e. adding a host still failed. However, I'd definitely verify the scenario suggested in comment #3.

Comment 6 Jiri Belka 2016-06-28 14:18:44 UTC
> For now, moving to QE to test the following flow. If it fails, we'll
> consider what to do. Otherwise, it means it only affects dev env, and above
> explanation applies.
> 
> 1. Install and setup engine 3.6. Deploy a host, to make sure this works.

Yes.

> 2. Update otopi and otopi-java to 1.5 (e.g. from 4.0 repos)
> 3. Deploy a host. Does it work?

No.

> 4. Remove /var/cache/ovirt-engine/ovirt-host-deploy.tar and then deploy a
> host. Does it work?

No.

> 5. Restart engine and deploy a host. Does it work?

No.

Comment #1 clearly shows that the reporter is mixing rpms from various OSes/distributions. I tested on a clean CentOS 7.2.1511 with the 3.6 oVirt repos and the 4.0 repos.
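
As a rough illustration of the mixing visible in comment #1, the package names listed there can be grouped by the distribution tag in their release field. This is a sketch only: dist_tag and group_by_dist are hypothetical helpers, and the regular expression assumes tags of the form elN... or fcN, which covers the names in this report but not every RPM naming scheme.

```python
import re
from collections import defaultdict

# Matches a dist tag such as ".el7ev" or ".fc23" inside an RPM release field.
DIST_TAG = re.compile(r"\.(el\d+[a-z]*|fc\d+)(?:\.|$)")

# Package names copied from comment #1.
PACKAGES = [
    "ovirt-setup-lib-1.0.2-1.el7ev.noarch",
    "ovirt-release36-3.6.6-1.noarch",
    "ovirt-engine-wildfly-10.0.0-1.fc23.x86_64",
    "ovirt-engine-jboss-as-7.1.1-1.fc20.x86_64",
    "ovirt-release-master-4.0.0-0.3.master.20160527093215.gitcbc61db.noarch",
    "ovirt-engine-wildfly-overlay-10.0.0-1.fc23.noarch",
    "ovirt-host-deploy-1.5.0-1.el7ev.noarch",
    "libgovirt-0.3.3-1.fc22.x86_64",
    "otopi-1.5.0-3.el7ev.noarch",
]


def dist_tag(nvra):
    """Return the dist tag of a name-version-release.arch string, or None."""
    nvr, _, _arch = nvra.rpartition(".")   # drop the trailing architecture
    release = nvr.rsplit("-", 1)[-1]       # release is the last dash-separated field
    match = DIST_TAG.search(release)
    return match.group(1) if match else None


def group_by_dist(packages):
    """Group package names by dist tag; untagged packages go under None."""
    groups = defaultdict(list)
    for package in packages:
        groups[dist_tag(package)].append(package)
    return dict(groups)


if __name__ == "__main__":
    for tag, names in group_by_dist(PACKAGES).items():
        print(tag, len(names))
```

More than one non-None tag in the result means packages built for different distributions are installed side by side, which is the situation described above.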

Comment 7 Red Hat Bugzilla Rules Engine 2016-06-28 14:18:49 UTC
Target release should be set once a package build is known to fix an issue. Since this bug is not modified, the target version has been reset. Please use target milestone to plan a fix for an oVirt release.

Comment 9 Yedidyah Bar David 2016-06-28 15:07:24 UTC
The first attempts failed as expected and reported above.

The last attempt failed differently:

2016-06-28 17:55:44,813 ERROR [org.ovirt.engine.core.bll.hostdeploy.VdsDeployBase] (VdsDeploy) [6994e2ae] Error during deploy dialog: java.lang.NullPointerException
        at org.ovirt.engine.core.bll.hostdeploy.VdsDeployMiscUnit$6.call(VdsDeployMiscUnit.java:73) [bll.jar:]
        at org.ovirt.engine.core.bll.hostdeploy.VdsDeployMiscUnit$6.call(VdsDeployMiscUnit.java:68) [bll.jar:]
        at org.ovirt.engine.core.bll.hostdeploy.VdsDeployBase._nextCustomizationEntry(VdsDeployBase.java:251) [bll.jar:]
        at org.ovirt.engine.core.bll.hostdeploy.VdsDeployBase.processEvent(VdsDeployBase.java:639) [bll.jar:]
        at org.ovirt.engine.core.bll.hostdeploy.VdsDeploy.processEvent(VdsDeploy.java:35) [bll.jar:]
        at org.ovirt.engine.core.bll.hostdeploy.VdsDeployBase._threadMain(VdsDeployBase.java:319) [bll.jar:]
        at org.ovirt.engine.core.bll.hostdeploy.VdsDeployBase.access$800(VdsDeployBase.java:45) [bll.jar:]
        at org.ovirt.engine.core.bll.hostdeploy.VdsDeployBase$12.run(VdsDeployBase.java:386) [bll.jar:]
        at java.lang.Thread.run(Thread.java:745) [rt.jar:1.8.0_91]

This is due to an incompatible change in host-deploy, not otopi.

Since there is no attempt to keep compatibility (a normal user must update everything at once), I think we can simply close this as notabug.

Two other things we might want to do:

1. Also version-lock host-deploy, to prevent this from happening to normal users
2. Test and document dev env of 3.6 and 4.0 engines on a single machine, each with its own otopi and host-deploy (not from rpm). Not that important, as developers can use different VMs for this, but still useful.

Comment 10 Yaniv Lavi 2016-06-30 08:53:30 UTC
Does running yum update on the host resolve this issue?

Comment 11 Yaniv Lavi 2016-06-30 08:54:07 UTC
If this is a dev env, does updating the packages/sources help?

Comment 12 Ala Hino 2016-06-30 09:07:38 UTC
This is a dev issue. In my case, where I moved from master to the 3.6 branch, downgrading the packages helped.

Comment 13 Sandro Bonazzola 2016-07-05 07:18:24 UTC
oVirt 3.6 has reached end of life.
Please upgrade to 4.0; host-deploy 1.5.0 will work there.