Bug 1306535 - Hosted engine fails with no ERROR displayed
Status: CLOSED NEXTRELEASE
Product: Red Hat Enterprise Virtualization Manager
Classification: Red Hat
Component: ovirt-node-plugin-hosted-engine
Version: 3.5.7
Hardware: Unspecified OS: Unspecified
Priority: medium Severity: medium
Target Milestone: ovirt-4.0.0-rc
Target Release: ---
Assigned To: Ryan Barry
QA Contact: cshao
Depends On:
Blocks:
Reported: 2016-02-11 03:23 EST by Roman Hodain
Modified: 2016-08-23 07:08 EDT

See Also:
Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2016-08-23 07:08:19 EDT
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: Node
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---


Attachments: None
Description Roman Hodain 2016-02-11 03:23:41 EST
Description of problem:
When an ERROR occurs during the hosted engine deployment from the Node TUI, the error message is removed and replaced by:

    [screen is terminating]
    Hit <Return> to return to the TUI

It is not possible to determine what the error is without looking into the logs, and the user has no indication of where to look when this happens.

Version-Release number of selected component (if applicable):
    Red Hat Enterprise Virtualization Hypervisor release 7.2 (20160105.1.el7ev)

How reproducible:
100%

Steps to Reproduce:
1. Install the hypervisor with low disk space
      Filesystem                  Size  Used Avail Use% Mounted on
      /dev/mapper/HostVG-Data     2.3G  1.9G  308M  86% /data

2. Try to deploy the hosted engine appliance

Actual results:
No information on the screen apart from the two lines above.

Expected results:
The error message remains on the screen, or the tool points the user to the right place.

Additional info:
This issue happened due to:

2016-02-11 08:09:33 DEBUG otopi.plugins.ovirt_hosted_engine_setup.vm.boot_disk boot_disk._customization:460 Error checking TMPDIR space
Traceback (most recent call last):
  File "/usr/share/ovirt-hosted-engine-setup/plugins/ovirt-hosted-engine-setup/vm/boot_disk.py", line 454, in _customization
  File "/usr/lib/python2.7/site-packages/ovirt_hosted_engine_setup/domains.py", line 127, in check_available_space
InsufficientSpaceError: Error: mount point /data/ovirt-hosted-engine-setup/tmp-setup contains only 307Mb of available space while a minimum of 51200Mb is required
2016-02-11 08:09:33 DEBUG otopi.plugins.ovirt_hosted_engine_setup.vm.boot_disk boot_disk._customization:462 Error: mount point /data/ovirt-hosted-engine-setup/tmp-setup contains only 307Mb of available space while a minimum of 51200Mb is required
2016-02-11 08:09:33 DEBUG otopi.context context._executeMethod:152 method exception
Traceback (most recent call last):
  File "/usr/lib/python2.7/site-packages/otopi/context.py", line 142, in _executeMethod
  File "/usr/share/ovirt-hosted-engine-setup/plugins/ovirt-hosted-engine-setup/vm/boot_disk.py", line 466, in _customization
RuntimeError: Not enough space in the temporary directory
2016-02-11 08:09:33 ERROR otopi.context context._executeMethod:161 Failed to execute stage 'Environment customization': Not enough space in the temporary directory

The specific problem is irrelevant here, but without knowing where to look, the investigation is impossible.
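
For reference, the kind of free-space check that fails in the log above can be sketched roughly as follows. This is only an illustration in Python 3, not the actual code from domains.py: the names check_available_space and InsufficientSpaceError are borrowed from the traceback, while available_mb and the exact message layout are assumptions.

```python
import os


class InsufficientSpaceError(Exception):
    """Raised when a mount point has less free space than required."""


def available_mb(path):
    """Return the space available to unprivileged users under path, in MB."""
    st = os.statvfs(path)
    return (st.f_bavail * st.f_frsize) // (1024 * 1024)


def check_available_space(path, minimum_mb):
    """Return the free space in MB, or raise if it is below minimum_mb."""
    free_mb = available_mb(path)
    if free_mb < minimum_mb:
        # Message layout modelled on the log above (an assumption).
        raise InsufficientSpaceError(
            "mount point %s contains only %dMb of available space while "
            "a minimum of %dMb is required" % (path, free_mb, minimum_mb)
        )
    return free_mb
```

With the values from this report, a call such as check_available_space("/data/ovirt-hosted-engine-setup/tmp-setup", 51200) would raise, since only about 307 MB were available.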
Comment 1 Doron Fediuck 2016-02-17 03:39:12 EST
Fabian,
in 3.6 this should become irrelevant as we're using the appliance.
Can you please verify?
Comment 2 Roman Hodain 2016-02-17 03:50:48 EST
(In reply to Doron Fediuck from comment #1)
> Fabian,
> in 3.6 this should become irrelevant as we're using the appliance.
> Can you please verify?

Well this issue happened during the appliance deployment.
Comment 3 Fabian Deutsch 2016-02-17 14:40:47 EST
As I understand it, the problem is that on Node, any error that occurs during the HE setup is not shown on screen (it is only written to the logs). This can be irritating for users, because they need to look into the logs to find the cause.

There are a few questions:

a) Does the HE setup display the error on screen?

b) If a)==yes - Why is the screen erased in the Node case - this should not happen

If a)==no then it's something on the HE setup side
If the problem is b), then it's something on the Node side

Simone, can you tell me the answer to a)?
Comment 4 Simone Tiraboschi 2016-02-18 04:03:51 EST
(In reply to Fabian Deutsch from comment #3)
> According to my understanding the problem is that on Node, any error which
> is happening during the he setup is not shown on screen (only written to the
> logs). This can be irritating to the user, because he needs to look into the
> logs to find the cause.
> 
> There are a few questions:
> 
> a) Does the HE setup display the error on screen?

Yes, it does. It should also return a non-zero exit code.

> b) If a)==yes - Why is the screen erased in the Node case - this should not
> happen
> 
> If a)==no then it's something on the HE setup side
> If the problem is b), then it's something on the Node side
> 
> Simone, can you tell me the answer to a)?
Comment 5 Fabian Deutsch 2016-02-18 04:37:50 EST
Thanks Simone.

So it is a Node issue.
Comment 6 Fabian Deutsch 2016-05-17 05:34:19 EDT
Ying, can this still be reproduced on NGN?
Comment 7 Ryan Barry 2016-05-26 13:20:17 EDT
(In reply to Fabian Deutsch from comment #6)
> Ying, can this still be reproduced on NGN?

No, this cannot be reproduced in NGN. cockpit-ovirt-dashboard properly handles ERROR from otopi's machine dialog.

There was a bug (bz#1334696) about a lack of feedback if HE setup immediately crashed (I could only reproduce this by manually removing python deps), but that's in POST, and will be merged soon.

We have this problem in legacy node because the setup is called directly with "screen hosted-engine --deploy", and we don't call console.wait_for_keypress() on an exception.

However, to really fix this, we'll probably need a small wrapper script (or to find and dump the last messages from the hosted engine log, which may be too verbose); otherwise we'll just see [screen is terminating].
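
A rough sketch of what such a wrapper could look like, in Python 3. This is not the actual fix: the function names run_and_hold and tail are made up for illustration, the log path is left to the caller, and it assumes the wrapped command returns a non-zero exit code on failure (as noted in comment 4).

```python
import subprocess
from collections import deque


def tail(path, count=20):
    """Return the last `count` lines of a text file, or [] if unreadable."""
    try:
        with open(path, errors="replace") as handle:
            return [line.rstrip("\n") for line in deque(handle, maxlen=count)]
    except OSError:
        return []


def run_and_hold(command, log_path, wait=input, out=print):
    """Run command; on failure show the log tail and wait for <Return>.

    `wait` and `out` are injectable so the behaviour can be tested
    without a terminal. Both names are hypothetical.
    """
    returncode = subprocess.call(command)
    if returncode != 0:
        out("Command failed with exit code %d" % returncode)
        out("Last lines of %s:" % log_path)
        for line in tail(log_path):
            out("  " + line)
        # Keep the messages on screen until the user acknowledges them.
        wait("Hit <Return> to return to the TUI ")
    return returncode
```

The TUI would then start something like run_and_hold(["hosted-engine", "--deploy"], "<path to the setup log>") inside screen instead of the bare command, so the session does not terminate before the user has seen the failure.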
Comment 8 Ying Cui 2016-06-08 02:13:37 EDT
(In reply to Fabian Deutsch from comment #6)
> Ying, can this still be reproduced on NGN?

Following the reproduction steps in the bug description, this cannot be reproduced in rhevh-ng (rhev-hypervisor7-ng-4.0-20160607.1, cockpit-ovirt-dashboard-0.10.3-0.0.1.el7ev). Currently, in the rhevh-ng Cockpit UI, if there is not enough space for the rhevm-appliance OVA, a friendly warning message is displayed that lets the user specify a path for the OVA, instead of setup quitting without any information in the UI.

For this bug, we also need to consider how to display the last messages from the ovirt-hosted-engine-setup log in the UI, as comment 7 said, so that the user learns the failure reason without dropping to a shell to check the log directly.
Comment 9 Fabian Deutsch 2016-06-08 04:28:57 EDT
According to comment 8 it sounds as if this bug can go back to ON_QA or even VERIFIED.

However, I understand that we need to make sure (in future bugs) that all relevant errors are propagated to the UI.

We could consider modifying he-setup to also log to the journal; Cockpit's log viewing facilities could then be used for debugging.
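
As a very rough illustration of the journal idea (Python 3, not a proposal for the actual he-setup code), a logger could get an extra handler along these lines. The function name add_journal_handler and the identifier "hosted-engine-setup" are made up; JournalHandler comes from the optional python-systemd bindings, and the fallback assumes journald is reading the /dev/log socket.

```python
import logging
import logging.handlers
import os


def add_journal_handler(logger, identifier="hosted-engine-setup"):
    """Attach a handler that forwards records to the journal, if possible.

    Returns the attached handler, or None when neither the python-systemd
    bindings nor a usable /dev/log socket are available.
    """
    handler = None
    try:
        # Optional dependency: the python-systemd package.
        from systemd.journal import JournalHandler
        handler = JournalHandler(SYSLOG_IDENTIFIER=identifier)
    except ImportError:
        # Fall back to the syslog socket, which journald normally reads.
        if os.path.exists("/dev/log"):
            try:
                handler = logging.handlers.SysLogHandler(address="/dev/log")
                handler.ident = identifier + ": "
            except OSError:
                handler = None
    if handler is not None:
        logger.addHandler(handler)
    return handler
```

Existing file logging would stay untouched; the journal handler is only added next to it, so an ERROR record would show up in both places.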
Comment 10 wanghui 2016-06-16 01:59:28 EDT
Since this issue can no longer be reproduced in rhev-hypervisor7-ng-4.0-20160609.0.x86_64, and bug#1334696 has also been fixed, this bug can be verified now.