Bug 1129238 - [engine-backend] bad handling with OSError
Summary: [engine-backend] bad handling with OSError
Keywords:
Status: CLOSED WONTFIX
Alias: None
Product: Red Hat Enterprise Virtualization Manager
Classification: Red Hat
Component: ovirt-engine
Version: 3.4.1-1
Hardware: x86_64
OS: Unspecified
Priority: unspecified
Severity: medium
Target Milestone: ---
Target Release: 3.5.0
Assignee: Yaniv Bronhaim
QA Contact:
URL:
Whiteboard: infra
Depends On:
Blocks:
 
Reported: 2014-08-12 11:12 UTC by Elad
Modified: 2016-02-10 19:16 UTC (History)

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2014-08-25 05:50:26 UTC
oVirt Team: Infra


Attachments
logs from engine and host (109.36 KB, application/x-gzip)
2014-08-12 11:12 UTC, Elad

Description Elad 2014-08-12 11:12:29 UTC
Created attachment 926039
logs from engine and host

Description of problem:
Added a rhel7 host to my setup. The installation failed due to an error in vdsm:
OSError: [Errno 2] No such file or directory: '/var/run/vdsm/client.log'

The error wasn't caught properly by engine, as seen in engine.log:

2014-08-12 12:05:26,862 ERROR [org.ovirt.engine.core.utils.ssh.SSHDialog] (org.ovirt.thread.pool-4-thread-49) SSH error running command root@10.35.102.11:'umask 0077; MYTMP="$(mktemp -t ovirt-XXXXXXXXXX)"; trap "chmod -R u+rwX \"${MYTMP}\" > /dev/null 2>&1; rm -fr \"${MYTMP}\" > /dev/null 2>&1" 0; rm -fr "${MYTMP}" && mkdir "${MYTMP}" && tar --warning=no-timestamp -C "${MYTMP}" -x &&  "${MYTMP}"/setup DIALOG/dialect=str:machine DIALOG/customization=bool:True': java.io.IOException: Command returned failure code 1 during SSH session 'root@10.35.102.11'
        at org.ovirt.engine.core.utils.ssh.SSHClient.executeCommand(SSHClient.java:527) [utils.jar:]
        at org.ovirt.engine.core.utils.ssh.SSHDialog.executeCommand(SSHDialog.java:318) [utils.jar:]


Version-Release number of selected component (if applicable):
rhev-3.4.2-av11

engine:
rhevm-3.4.2-0.1.el6ev.noarch

host:
Red Hat Enterprise Linux Server release 7.0 (Maipo)
vdsm-4.14.13-1.el7ev.x86_64


How reproducible:
Whenever https://bugzilla.redhat.com/show_bug.cgi?id=1129232 is reproduced

Steps to Reproduce:
1. Create a DC and cluster in rhevm
2. Attach a rhel7 host to the setup

Actual results:
vdsm fails with:
OSError: [Errno 2] No such file or directory: '/var/run/vdsm/client.log'

and the engine doesn't handle the error properly. As far as I understand, this is only a logging issue; I didn't see any other undesirable behavior. The host becomes non-operational and the following message is shown in webadmin:

Failed to install Host green-b. Failed to execute stage 'Closing up': Command '/bin/systemctl' failed to execute.

Expected results:
The error from vdsm should be handled and reported clearly in the logs.
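
To illustrate what "reported nicely" could look like on the Python side, here is a minimal sketch. It is not vdsm code: the function names `describe_os_error` and `open_existing_log` and the logger name are hypothetical, and the only assumption is that the failing code opens a file whose parent directory may be missing. The idea is to catch the OSError, check its errno, and log one readable line instead of a raw traceback.

```python
import errno
import logging
import os

# Hypothetical logger name, for illustration only.
log = logging.getLogger("example.client")


def describe_os_error(exc):
    """Turn an OSError into a single human-readable line."""
    if exc.errno == errno.ENOENT:
        return "required path is missing: %s" % exc.filename
    return "OS error %s on %s: %s" % (exc.errno, exc.filename, exc.strerror)


def open_existing_log(path):
    """Try to open an existing log file for appending.

    Returns None on success, or the message that was logged on failure.
    """
    try:
        # No O_CREAT: a missing file or directory raises OSError (ENOENT).
        fd = os.open(path, os.O_WRONLY | os.O_APPEND)
    except OSError as exc:
        message = describe_os_error(exc)
        log.error(message)
        return message
    os.close(fd)
    return None
```

With something like this, a missing '/var/run/vdsm/client.log' would produce one error line naming the path, which the caller can then pass up to whoever reports the installation failure.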


Additional info: logs from engine and host

Comment 1 Elad 2014-08-12 11:17:46 UTC
The bug was filed against version 3.4.1, but it occurs in 3.4.2; there is no such option in the Version field.

Comment 2 Yaniv Bronhaim 2014-08-20 00:30:46 UTC
I'm not sure the error you mention is related to the same exception in vdsm.
In any case, vdsm stopped responding after this exception, IIUC, so the SSH communication might have dropped, and the engine reports an installation failure or that the service could not start properly. What else do we expect to see if vdsm doesn't respond or cannot start? I think the current behavior and reporting are fine in such cases from the engine's perspective.

Comment 3 Elad 2014-08-20 07:39:33 UTC
As I see it, the issue here is the ugly message in the log

Comment 4 Yaniv Bronhaim 2014-08-22 13:44:56 UTC
But in the end you got:
2014-08-12 12:05:26,870 ERROR [org.ovirt.engine.core.bll.InstallVdsCommand] (org.ovirt.thread.pool-4-thread-49) [7db12cab] Host installation failed for host db0b69b9-5b0e-4e88-9e18-934e24580492, green-b.: java.io.IOException: Command returned failure code 1 during SSH session 'root@10.35.102.11'
        at org.ovirt.engine.core.utils.ssh.SSHClient.executeCommand(SSHClient.java:527) [utils.jar:]


which is the exact problem. It leads you to check vdsm.log or the host-deploy log to understand what went wrong.

I don't see how else we can handle that.

Oved, what do you think?

Comment 5 Oved Ourfali 2014-08-25 05:50:26 UTC
I agree. The message here seems good to me under these circumstances. We are explaining exactly what happened. Closing it as WONTFIX, although I think it is actually NOTABUG; still, I can't argue about the niceness of error messages...

