Bug 987891

Summary: rhevh upgrade - iso is sent but upgrade is not executed
Product: Red Hat Enterprise Virtualization Manager
Reporter: Lukas Svaty <lsvaty>
Component: ovirt-engine
Assignee: Alon Bar-Lev <alonbl>
Status: CLOSED CURRENTRELEASE
QA Contact: Lukas Svaty <lsvaty>
Severity: unspecified
Priority: unspecified
Version: 3.3.0
CC: acathrow, alukiano, bazulay, bdagan, dougsland, iheim, lpeer, lsvaty, Rhev-m-bugs, yeylon
Target Milestone: ---
Target Release: 3.3.0
Hardware: Unspecified
OS: Unspecified
Whiteboard: infra
Fixed In Version: is8
Doc Type: Bug Fix
Story Points: ---
Last Closed: 2014-01-21 22:18:25 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
Category: ---
oVirt Team: Infra
Cloudforms Team: ---
Attachments:
engine.log (flags: none)
full engine.log (flags: none)

Description Lukas Svaty 2013-07-24 11:32:54 UTC
Created attachment 777742 [details]
engine.log

Description of problem:
When trying to update RHEV-H (RHEV Hypervisor - 6.4 - 20130528.0.el6_4) to rhev-hypervisor6-6.4-20130709, the installation fails. The following event is displayed in the webadmin portal:
"Host 10.34.62.204 installation failed. Please refer to engine.log and log files under /var/log/ovirt-engine/host-deploy/ on the engine for further details.."

The engine log shows the failure:
2013-07-24 13:24:09,474 INFO  [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector] (pool-5-thread-50) [5eed22de] Correlation ID: 5eed22de, Call Stack: null, Custom Event ID: -1, Message: Host 10.34.62.204 installation failed. Please refer to engine.log and log files under /var/log/ovirt-engine/host-deploy/ on the engine for further details..

However, no host-deploy log was created in /var/log/ovirt-engine/host-deploy/.

The host stays in Installing status.

Version-Release number of selected component (if applicable):
is6

How reproducible:
100%

Steps to Reproduce:
1. Add a host to the setup.
2. On the setup, install the image:
yum install http://download.devel.redhat.com/brewroot/packages/rhev-hypervisor6/6.4/20130709.0.el6_4/noarch/rhev-hypervisor6-6.4-20130709.0.el6_4.noarch.rpm
3. Set the host to maintenance.
4. Try to install the new ISO.

Actual results:
Installation failed and the host stayed in 'Installing' status.

Expected results:
Installation should succeed and the host should be set to 'Up' status,
or
installation should fail and the host should be set back to 'Maintenance' status.

Additional info:

Comment 1 Alon Bar-Lev 2013-07-24 12:02:00 UTC
Please attach the full engine log.
Please attach /tmp/*.log from the ovirt-node side.

Thanks!

Comment 2 Lukas Svaty 2013-07-24 12:41:49 UTC
Created attachment 777761 [details]
full engine.log

full engine.log

Comment 4 Alon Bar-Lev 2013-07-24 14:00:39 UTC
It took forever to download logs...

Oh!

You are trying to upgrade...

You are right: during an upgrade we won't get logs under /var/log/ovirt-engine/host-deploy. I will fix that message.

I do see that it tries to:
mkdir -p '/data/updates'

But beyond that I don't see anything helpful about why this command failed. Could it be that there is not enough free space on the root filesystem?

But I see that a new installation is working.

Can you reproduce this every time you try to upgrade/reinstall that node?

Comment 5 Lukas Svaty 2013-07-24 14:11:10 UTC
Sorry for the big logs :)

I can reproduce it every time I try to update from the portal.

The first issue is with the hypervisor / host-deploy: the update fails.

The second issue is with the webadmin portal: the host actually stays in 'Installing' status (this should be cloned to product ovirt-engine-webadmin-portal).

The hypervisor should have enough space:

[root@localhost ~]# df -h
Filesystem            Size  Used Avail Use% Mounted on
/dev/mapper/live-rw   1.3G  369M  879M  30% /
/dev/mapper/HostVG-Config
                      7.8M  1.7M  5.7M  23% /config
tmpfs                 3.9G   12K  3.9G   1% /dev/shm
/dev/mapper/35000cca39cc68e8ap3
                      237M  152M   73M  68% /dev/.initramfs/live
10.34.63.202:/home/iso/shared
                      442G  166G  254G  40% /rhev/data-center/mnt/10.34.63.202:_home_iso_shared
df: `/rhev/data-center/mnt/10.35.64.106:_fastpass_ls-rhevm33__nfs__2013__07__10__11__13__40__167293': Stale file handle
10.34.63.202:/mnt/export/nfs/lv3/lsvaty/nfs01
                      256G  218G   26G  90% /rhev/data-center/mnt/10.34.63.202:_mnt_export_nfs_lv3_lsvaty_nfs01

Comment 6 Alon Bar-Lev 2013-07-29 09:53:23 UTC
2013-07-29 05:46:59,170 INFO  [org.ovirt.engine.core.bll.OVirtNodeUpgrade] (pool-5-thread-15) [5612509a] E001: java.lang.NoClassDefFoundError: javax/xml/bind/DatatypeConverter
        at org.ovirt.engine.core.utils.ssh.SSHClient._validateDigest(SSHClient.java:97) [utils.jar:]

So we are using a standard[1] J2SE class that is missing on RHEL?!?!

[1] http://docs.oracle.com/javase/7/docs/api/javax/xml/bind/DatatypeConverter.html#parseHexBinary%28java.lang.String%29

Comment 7 Alon Bar-Lev 2013-07-29 10:13:55 UTC
BTW: this is something new in 3.2, this function worked at 3.1.

Comment 8 Alon Bar-Lev 2013-07-29 10:40:01 UTC
*** Bug 989216 has been marked as a duplicate of this bug. ***

Comment 9 Alon Bar-Lev 2013-07-29 10:49:57 UTC
host-deploy: ssh: use apache commons Hex instead of j2se


although javax.xml.bind.DatatypeConverter is part of j2se[1], it is missing
from the jboss rhev environment.

java.lang.NoClassDefFoundError: javax/xml/bind/DatatypeConverter
        at org.ovirt.engine.core.utils.ssh.SSHClient._validateDigest(SSHClient.java:97) [utils.jar:]
        at org.ovirt.engine.core.utils.ssh.SSHClient.sendFile(SSHClient.java:625) [utils.jar:]
        at org.ovirt.engine.core.utils.ssh.SSHDialog.sendFile(SSHDialog.java:361) [utils.jar:]

resolution is to use apache commons Hex to decode hex instead.

[1] http://docs.oracle.com/javase/7/docs/api/javax/xml/bind/DatatypeConverter.html#parseHexBinary%28java.lang.String%29

Bug-Url: https://bugzilla.redhat.com/show_bug.cgi?id=987891
Change-Id: Ie57b334d152da9e20ce556f801d48b836378fa7b
Signed-off-by: Alon Bar-Lev <alonbl>
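
For context, a minimal sketch of the substitution described in this commit message. This is illustrative, not the actual SSHClient code: the class and method names below are assumptions; only the two decoding calls come from the report.

import org.apache.commons.codec.DecoderException;
import org.apache.commons.codec.binary.Hex;

public class HexDigestSketch {
    // Decode an expected hex digest string into bytes.
    public static byte[] parseHex(String hex) throws DecoderException {
        // Before the fix (throws NoClassDefFoundError under the jboss
        // rhev environment, where javax.xml.bind is not available):
        // byte[] digest = javax.xml.bind.DatatypeConverter.parseHexBinary(hex);

        // After the fix: apache commons Hex does the same job.
        return Hex.decodeHex(hex.toCharArray());
    }
}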

Comment 10 Alon Bar-Lev 2013-07-29 10:51:37 UTC
host-deploy: sync node upgrade to vds deploy


during the move to network processing within the engine, the node upgrade
and vds deploy flows became out of sync. it could easily have been solved
by adding handleError(e, VDSStatus.InstallFailed) in the exception handler,
however it is better to sync the two.

Change-Id: Ie1c2733597f0d7408dd0b863b42412b479dfcd15
Signed-off-by: Alon Bar-Lev <alonbl>
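
A rough sketch of the handler-level part of this fix. Only the handleError(e, VDSStatus.InstallFailed) call comes from the commit message above; the surrounding class, method names, and stubs are assumptions for illustration.

// Stub standing in for the real engine enum.
enum VDSStatus { Installing, InstallFailed, Maintenance, Up }

public class NodeUpgradeSketch {

    public void upgrade() {
        try {
            runUpgrade(); // hypothetical upgrade entry point
        } catch (Exception e) {
            // The missing piece in the out-of-sync flow: without this
            // call a failed upgrade left the host stuck in 'Installing'.
            // Moving it to InstallFailed lets the admin retry or put the
            // host back into maintenance.
            handleError(e, VDSStatus.InstallFailed);
        }
    }

    private void runUpgrade() throws Exception {
        throw new Exception("simulated upgrade failure");
    }

    private void handleError(Exception e, VDSStatus status) {
        // In the real engine this would update the host status; here we
        // just report it.
        System.err.println("Upgrade failed, setting host to " + status
                + ": " + e.getMessage());
    }
}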

Comment 11 Alon Bar-Lev 2013-07-29 13:40:32 UTC
(In reply to Alon Bar-Lev from comment #7)
> BTW: this is something new in 3.2, this function worked at 3.1.

Sorry, I was completely out of sync!

This is something new in 3.3; it worked in 3.2 and 3.1.

Comment 12 Lukas Svaty 2013-08-06 11:04:05 UTC
verified in is8

Comment 13 Itamar Heim 2014-01-21 22:18:25 UTC
Closing - RHEV 3.3 Released
