1129238 – [engine-backend] bad handling with OSError

Summary: [engine-backend] bad handling with OSError

Keywords:
Status:	CLOSED WONTFIX
Alias:	None
Product:	Red Hat Enterprise Virtualization Manager
Classification:	Red Hat
Component:	ovirt-engine
Sub Component:
Version:	3.4.1-1
Hardware:	x86_64
OS:	Unspecified
Priority:	unspecified
Severity:	medium
Target Milestone:	---
Target Release:	3.5.0
Assignee:	Yaniv Bronhaim
QA Contact:
Docs Contact:
URL:
Whiteboard:	infra
Depends On:
Blocks:

Reported:	2014-08-12 11:12 UTC by Elad
Modified:	2016-02-10 19:16 UTC
CC List:	11 users
Fixed In Version:
Doc Type:	Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed:	2014-08-25 05:50:26 UTC
oVirt Team:	Infra
Target Upstream Version:
Embargoed:

Attachments
logs from engine and host (109.36 KB, application/x-gzip) 2014-08-12 11:12 UTC, Elad

Description Elad 2014-08-12 11:12:29 UTC

Created attachment 926039
logs from engine and host

Description of problem:
Added a rhel7 host to my setup. The installation failed due to an error in vdsm:
OSError: [Errno 2] No such file or directory: '/var/run/vdsm/client.log'

The error wasn't caught properly by engine, as seen in engine.log:

2014-08-12 12:05:26,862 ERROR [org.ovirt.engine.core.utils.ssh.SSHDialog] (org.ovirt.thread.pool-4-thread-49) SSH error running command root.102.11:'umask 0077; MYTMP="$(mktemp -t ovirt-XXXXXXXXXX)"; trap "chmod -R u+rwX \"${MYTMP}\" > /dev/null 2>&1; rm -fr \"${MYTMP}\" > /dev/null 2>&1" 0; rm -fr "${MYTMP}" && mkdir "${MYTMP}" && tar --warning=no-timestamp -C "${MYTMP}" -x &&  "${MYTMP}"/setup DIALOG/dialect=str:machine DIALOG/customization=bool:True': java.io.IOException: Command returned failure code 1 during SSH session 'root.102.11'
        at org.ovirt.engine.core.utils.ssh.SSHClient.executeCommand(SSHClient.java:527) [utils.jar:]
        at org.ovirt.engine.core.utils.ssh.SSHDialog.executeCommand(SSHDialog.java:318) [utils.jar:]


Version-Release number of selected component (if applicable):
rhev-3.4.2-av11

engine:
rhevm-3.4.2-0.1.el6ev.noarch

host:
Red Hat Enterprise Linux Server release 7.0 (Maipo)
vdsm-4.14.13-1.el7ev.x86_64


How reproducible:
Whenever https://bugzilla.redhat.com/show_bug.cgi?id=1129232 is reproduced

Steps to Reproduce:
1. Create a DC and cluster in rhevm
2. Attach a rhel7 host to the setup

Actual results:
vdsm fails with:
OSError: [Errno 2] No such file or directory: '/var/run/vdsm/client.log'

and the error isn't handled properly by the engine. As far as I understand, this is only a logging issue; I didn't see any other undesirable behavior. The host becomes non-operational and the following message is shown in webadmin:

Failed to install Host green-b. Failed to execute stage 'Closing up': Command '/bin/systemctl' failed to execute.

Expected results:
The error from vdsm should be treated and reported nicely in the logs
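
One possible reading of "reported nicely" on the vdsm side is a single-line summary of the OSError instead of a raw traceback. The following is a minimal Python sketch under that assumption; `describe_os_error` and `open_log` are hypothetical helpers written for illustration and are not part of vdsm.

```python
import errno
import logging

log = logging.getLogger("example")  # hypothetical logger name


def describe_os_error(exc):
    """Return a one-line summary of an OSError (hypothetical helper)."""
    # errno.errorcode maps numeric codes to symbolic names such as 'ENOENT'.
    name = errno.errorcode.get(exc.errno, "UNKNOWN")
    return "%s (errno %s) on %r: %s" % (
        name, exc.errno, exc.filename, exc.strerror)


def open_log(path):
    """Open a log file for appending; log a concise error if that fails."""
    try:
        return open(path, "a")
    except OSError as exc:
        # Report the failure on one line, then let the caller decide.
        log.error("Cannot open log file: %s", describe_os_error(exc))
        raise
```

With a missing parent directory, as with '/var/run/vdsm/client.log' in this report, the logged line would name the errno symbol and the path, while the exception still propagates to the caller.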


Additional info: logs from engine and host

Comment 1 Elad 2014-08-12 11:17:46 UTC

The bug was filed against version 3.4.1, but it occurs in 3.4.2; there is no such option in the Version field.

Comment 2 Yaniv Bronhaim 2014-08-20 00:30:46 UTC

I'm not sure the error you mention is related to the same exception in vdsm.
Anyhow, vdsm stopped responding after this exception iiuc, so the SSH communication might have dropped, and the engine reports an installation failure or that the service could not start properly. What else do we expect to see if vdsm doesn't respond or cannot start? I think the current behavior and report are fine in such cases from the engine's perspective.

Comment 3 Elad 2014-08-20 07:39:33 UTC

As I see it, the issue here is the ugly message in the log.
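
To make the "ugly message" concern concrete: the log entry embeds the entire multi-line shell command. A rough sketch of a shorter form follows, written in Python purely for illustration (the engine itself is Java); `summarize_failure` and its parameters are hypothetical and do not correspond to any ovirt-engine API.

```python
def summarize_failure(host, command, exit_code, max_len=60):
    """Build a one-line failure message with a truncated command.

    Hypothetical helper: collapses whitespace and newlines in the command
    and cuts it to at most max_len characters.
    """
    one_line = " ".join(command.split())
    if len(one_line) > max_len:
        one_line = one_line[:max_len - 3] + "..."
    return "Command on %s exited with code %d: %s" % (
        host, exit_code, one_line)
```

The full command could still be written at debug level, so nothing is lost for troubleshooting while the error line stays readable.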

Comment 4 Yaniv Bronhaim 2014-08-22 13:44:56 UTC

But in the end you got:
2014-08-12 12:05:26,870 ERROR [org.ovirt.engine.core.bll.InstallVdsCommand] (org.ovirt.thread.pool-4-thread-49) [7db12cab] Host installation failed for host db0b69b9-5b0e-4e88-9e18-934e24580492, green-b.: java.io.IOException: Command returned failure code 1 during SSH session 'root.102.11'
        at org.ovirt.engine.core.utils.ssh.SSHClient.executeCommand(SSHClient.java:527) [utils.jar:]


which is the exact problem. It leads you to check vdsm.log or the host-deploy log to understand what went wrong.

I don't see how else we can handle that.

Oved, what do you think?

Comment 5 Oved Ourfali 2014-08-25 05:50:26 UTC

I agree. The message here seems good to me under these circumstances. We are explaining exactly what happened. Closing it as WONTFIX, although I think it is really NOTABUG, but I can't argue with the niceness of error messages...
