Created attachment 777742
engine.log

Description of problem:
When trying to update RHEV-H (RHEV Hypervisor - 6.4 - 20130528.0.el6_4) to hypervisor6-6.4-20130709, the installation fails. The following event is displayed in the webadmin portal:
"Host 10.34.62.204 installation failed. Please refer to engine.log and log files under /var/log/ovirt-engine/host-deploy/ on the engine for further details.."

engine.log shows the failure at:
2013-07-24 13:24:09,474 INFO [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector] (pool-5-thread-50) [5eed22de] Correlation ID: 5eed22de, Call Stack: null, Custom Event ID: -1, Message: Host 10.34.62.204 installation failed. Please refer to engine.log and log files under /var/log/ovirt-engine/host-deploy/ on the engine for further details..

However, no host-deploy log was created in /var/log/ovirt-engine/host-deploy/, and the host stays in 'Installing' status.

Version-Release number of selected component (if applicable):
is6

How reproducible:
100%

Steps to Reproduce:
1. Add the host to the setup.
2. On the setup, install the image:
   yum install http://download.devel.redhat.com/brewroot/packages/rhev-hypervisor6/6.4/20130709.0.el6_4/noarch/rhev-hypervisor6-6.4-20130709.0.el6_4.noarch.rpm
3. Set the host to maintenance.
4. Try to install the new ISO.

Actual results:
The installation failed and the host stayed in 'Installing' status.

Expected results:
Either the installation succeeds and the host is set to 'up' status, or the installation fails and the host is set back to 'maintenance' status.

Additional info:
Please attach the full engine log, and /tmp/*.log from the ovirt-node side. Thanks!
Created attachment 777761
full engine.log
It took forever to download the logs...

Oh! You are trying to upgrade... You are right, on upgrade we do not get logs in /var/log/ovirt-engine/host-deploy; I will fix that message.

I do see that it tries to run:
mkdir -p '/data/updates'
But after that I don't see anything helpful about why this command failed. Could it be that there is not enough free space on the root filesystem?

I also see that a new installation is working. Can you reproduce this every time you try to upgrade/reinstall that node?
Sorry for such big logs :) I can reproduce it every time I try to update from the portal...

The first issue is with the hypervisor / host-deploy: the update fails.
The second issue is with the webadmin portal: the host stays in 'Installing' status (this should be duplicated to product: ovirt-engine-webadmin-portal).

The hypervisor should have enough space:
[root@localhost ~]# df -h
Filesystem                                     Size  Used Avail Use% Mounted on
/dev/mapper/live-rw                            1.3G  369M  879M  30% /
/dev/mapper/HostVG-Config                      7.8M  1.7M  5.7M  23% /config
tmpfs                                          3.9G   12K  3.9G   1% /dev/shm
/dev/mapper/35000cca39cc68e8ap3                237M  152M   73M  68% /dev/.initramfs/live
10.34.63.202:/home/iso/shared                  442G  166G  254G  40% /rhev/data-center/mnt/10.34.63.202:_home_iso_shared
df: `/rhev/data-center/mnt/10.35.64.106:_fastpass_ls-rhevm33__nfs__2013__07__10__11__13__40__167293': Stale file handle
10.34.63.202:/mnt/export/nfs/lv3/lsvaty/nfs01  256G  218G   26G  90% /rhev/data-center/mnt/10.34.63.202:_mnt_export_nfs_lv3_lsvaty_nfs01
2013-07-29 05:46:59,170 INFO [org.ovirt.engine.core.bll.OVirtNodeUpgrade] (pool-5-thread-15) [5612509a] E001: java.lang.NoClassDefFoundError: javax/xml/bind/DatatypeConverter
        at org.ovirt.engine.core.utils.ssh.SSHClient._validateDigest(SSHClient.java:97) [utils.jar:]

So we use a standard[1] J2SE class that is missing on RHEL?!?!

[1] http://docs.oracle.com/javase/7/docs/api/javax/xml/bind/DatatypeConverter.html#parseHexBinary%28java.lang.String%29
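For anyone who wants to confirm this kind of failure, a class's visibility can be probed at runtime. This is only a minimal sketch: the class name ClassProbe and the method isAvailable are made up for illustration and are not part of the engine. The result depends on the class loader that runs the probe, so it only says something about the engine if it runs inside the same deployment.

```java
// Hypothetical helper, not part of ovirt-engine: checks whether a class
// is visible to the class loader that loaded this probe.
class ClassProbe {

    /** Returns true if the named class can be loaded, false otherwise. */
    public static boolean isAvailable(String className) {
        try {
            // initialize=false: only test for presence, do not run static initializers
            Class.forName(className, false, ClassProbe.class.getClassLoader());
            return true;
        } catch (ClassNotFoundException | LinkageError e) {
            // LinkageError covers NoClassDefFoundError and related loading failures
            return false;
        }
    }

    public static void main(String[] args) {
        // Default to the class named in the stack trace above
        String name = args.length > 0 ? args[0] : "javax.xml.bind.DatatypeConverter";
        System.out.println(name + " available: " + isAvailable(name));
    }
}
```

Run from a plain command line, this only reflects the standalone JVM class path, which may differ from what the application server exposes to the engine.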
BTW: this is something new in 3.2, this function worked at 3.1.
*** Bug 989216 has been marked as a duplicate of this bug. ***
host-deploy: ssh: use apache commons Hex instead of j2se

Although javax.xml.bind.DatatypeConverter is part of j2se[1], it is missing from the jboss rhev environment.

java.lang.NoClassDefFoundError: javax/xml/bind/DatatypeConverter
        at org.ovirt.engine.core.utils.ssh.SSHClient._validateDigest(SSHClient.java:97) [utils.jar:]
        at org.ovirt.engine.core.utils.ssh.SSHClient.sendFile(SSHClient.java:625) [utils.jar:]
        at org.ovirt.engine.core.utils.ssh.SSHDialog.sendFile(SSHDialog.java:361) [utils.jar:]

The resolution is to use apache commons Hex to decode hex instead.

[1] http://docs.oracle.com/javase/7/docs/api/javax/xml/bind/DatatypeConverter.html#parseHexBinary%28java.lang.String%29

Bug-Url: https://bugzilla.redhat.com/show_bug.cgi?id=987891
Change-Id: Ie57b334d152da9e20ce556f801d48b836378fa7b
Signed-off-by: Alon Bar-Lev <alonbl>
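To illustrate what the hex decoding is used for, here is a rough sketch of a digest check that does not depend on javax.xml.bind. It is not the actual SSHClient code: the class DigestCheckSketch, its method names, and the choice of MD5 in the usage are assumptions for illustration. The hand-written decodeHex below stands in for org.apache.commons.codec.binary.Hex.decodeHex(char[]) so that the example runs without extra jars; the real patch uses the commons-codec class.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

// Illustrative sketch only; names are hypothetical, not taken from ovirt-engine.
class DigestCheckSketch {

    /** Dependency-free stand-in for commons-codec Hex.decodeHex(). */
    public static byte[] decodeHex(String hex) {
        if (hex.length() % 2 != 0) {
            throw new IllegalArgumentException("odd number of hex characters");
        }
        byte[] out = new byte[hex.length() / 2];
        for (int i = 0; i < out.length; i++) {
            int hi = Character.digit(hex.charAt(2 * i), 16);
            int lo = Character.digit(hex.charAt(2 * i + 1), 16);
            if (hi < 0 || lo < 0) {
                throw new IllegalArgumentException("invalid hex character near index " + (2 * i));
            }
            out[i] = (byte) ((hi << 4) | lo);
        }
        return out;
    }

    /** Compares the digest of local data with a hex digest reported by the remote side. */
    public static boolean digestMatches(byte[] data, String algorithm, String expectedHex) {
        try {
            byte[] actual = MessageDigest.getInstance(algorithm).digest(data);
            return MessageDigest.isEqual(actual, decodeHex(expectedHex.trim()));
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("unsupported digest algorithm: " + algorithm, e);
        }
    }

    public static void main(String[] args) {
        byte[] payload = "abc".getBytes(StandardCharsets.UTF_8);
        // Well-known MD5 test vector for the ASCII string "abc"
        System.out.println(digestMatches(payload, "MD5", "900150983cd24fb0d6963f7d28e17f72"));
    }
}
```

Comparing decoded bytes rather than hex strings avoids false mismatches caused by upper/lower case differences in the remote output.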
host-deploy: sync node upgrade to vds deploy

During the move to network processing within the engine, the node upgrade and vds deploy went out of sync. This could easily have been solved by adding handleError(e, VDSStatus.InstallFailed) to the exception handler, however it is better to sync the two.

Change-Id: Ie1c2733597f0d7408dd0b863b42412b479dfcd15
Signed-off-by: Alon Bar-Lev <alonbl>
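The simpler alternative mentioned in this commit message (calling handleError from the exception handler) can be sketched roughly as follows. This is not the engine code: HostStatus is a hypothetical stand-in for VDSStatus, and Host, UpgradeStep, and the handleError signature are invented for illustration. Setting the status to Up on success is a simplification taken from the expected results in the report, not from the engine's real flow.

```java
// Illustrative sketch only; all types here are hypothetical stand-ins.
class UpgradeStatusSketch {

    /** Stand-in for the engine's VDSStatus enum. */
    enum HostStatus { Maintenance, Installing, InstallFailed, Up }

    static class Host {
        HostStatus status = HostStatus.Maintenance;
    }

    /** A unit of upgrade work that may fail. */
    interface UpgradeStep {
        void run() throws Exception;
    }

    /** Records the failure and moves the host out of Installing. */
    static void handleError(Host host, Exception e, HostStatus failedStatus) {
        System.err.println("upgrade failed: " + e.getMessage());
        host.status = failedStatus;
    }

    static void upgrade(Host host, UpgradeStep step) {
        host.status = HostStatus.Installing;
        try {
            step.run();
            host.status = HostStatus.Up; // simplification, see the note above
        } catch (Exception e) {
            // Without this call the host would remain in Installing after a failure
            handleError(host, e, HostStatus.InstallFailed);
        }
    }

    public static void main(String[] args) {
        Host host = new Host();
        upgrade(host, () -> { throw new java.io.IOException("simulated transfer failure"); });
        System.out.println(host.status);
    }
}
```

The point of the sketch is only that every failure path has to end in a status update; the actual patch achieves this by bringing the node upgrade flow in line with vds deploy.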
(In reply to Alon Bar-Lev from comment #7)
> BTW: this is something new in 3.2, this function worked at 3.1.

Sorry, I was completely out of sync! This is something new in 3.3; it worked in 3.2 and 3.1.
verified in is8
Closing - RHEV 3.3 Released