Bug 1338511 - HE does not work when /var is too small (default case)
Summary: HE does not work when /var is too small (default case)
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: ovirt-node
Classification: oVirt
Component: Installation & Update
Version: 4.0
Hardware: Unspecified
OS: Unspecified
Priority: urgent
Severity: high
Target Milestone: ovirt-4.0.1
Target Release: 4.0
Assignee: Fabian Deutsch
QA Contact: cshao
URL:
Whiteboard:
Depends On: 1344900
Blocks: ovirt-node-ng
Reported: 2016-05-22 09:26 UTC by Fabian Deutsch
Modified: 2016-08-04 13:33 UTC (History)

Fixed In Version: ovirt-node-ng-installer-ovirt-4.0-snapshot-2016061419.iso
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2016-08-04 13:33:22 UTC
oVirt Team: Node
Embargoed:
rule-engine: ovirt-4.0.0+
ycui: testing_plan_complete?
rule-engine: planning_ack+
fdeutsch: devel_ack+
ycui: testing_ack+


Attachments
all log info (6.18 MB, application/x-gzip)
2016-06-16 07:07 UTC, cshao

Description Fabian Deutsch 2016-05-22 09:26:42 UTC
Description of problem:
HE setup fails if /var is smaller than 10 GB. The default size is currently 4 GB, so setting up HE fails on a default installation.

Version-Release number of selected component (if applicable):


How reproducible:


Steps to Reproduce:
1.
2.
3.

Actual results:


Expected results:


Additional info:
This should probably be solved by custom partitioning.

Comment 1 cshao 2016-05-23 07:35:37 UTC
I can reproduce this issue.

Test version:
RHEVH-7.2-20160520.t.0-RHEVH-x86_64-dvd1.iso
imgbased-0.6-0.1.el7ev.noarch
ovirt-hosted-engine-setup-1.3.6.1-1.el7ev.noarch
ovirt-hosted-engine-ha-1.3.5.5-1.el7ev.noarch
rhevm-appliance-20160515.0-1.el7ev.3.6.ova

Test steps:
1. Install RHEVH-ng-7.2-20160520 with the default storage layout.
2. Disable NetworkManager and start hosted-engine setup from Cockpit.
3. Run "rpm -ivh 20160515.0-1.el7ev.3.6.rpm" on NGN.
4. Set up Hosted Engine step by step.
5. Please specify the device to boot the VM from (choose disk for the oVirt engine appliance) (cdrom, disk, pxe) [disk]: [disk]
6. The following appliance have been found on your system:
[1] - The RHEV-M Appliance image (OVA) - 20160515.0-1.el7ev
[2] - Directly select an OVA file
Please select an appliance (1, 2) [1]: 1
7. Press Enter to accept the default directory [/var/tmp].

Test result:
It reports "Not enough space in the temporary directory".

Please specify path to a temporary directory with at least 50 GB [/var/tmp]:
Not enough space in the temporary directory

NOTE:
If we specify a different path, the HE deployment can continue.
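
The free-space condition behind the "Not enough space in the temporary directory" message can be checked by hand before starting the deployment. The following is only a minimal sketch in Python, not the code used by ovirt-hosted-engine-setup: the function name has_enough_space is hypothetical, the threshold is passed in by the caller, and GiB units are an assumption.

```python
import shutil

GIB = 1024 ** 3  # assumption: thresholds are treated as GiB


def has_enough_space(path, required_gib):
    """Return True if the filesystem holding `path` has at least
    `required_gib` GiB free (hypothetical helper, not the setup tool's check)."""
    free_bytes = shutil.disk_usage(path).free
    return free_bytes >= required_gib * GIB


if __name__ == "__main__":
    # /var/tmp is the default from the prompt above; 50 is the size it asks for.
    for candidate in ("/var/tmp", "/home"):
        try:
            ok = has_enough_space(candidate, 50)
        except OSError:
            # The directory does not exist or cannot be inspected.
            ok = False
        print(candidate, "ok" if ok else "too small or unavailable")
```

Running it on the host shows which of the candidate directories could be entered at the temporary directory prompt.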

Comment 2 cshao 2016-06-16 07:06:26 UTC
Test version:
ovirt-node-ng-installer-ovirt-4.0-snapshot-2016061504.iso
imgbased-0.7.0-0.201606141357gitbd2220e.el7.centos.noarch

Test steps:
1. Install NGN.
2. Log in to Cockpit -> oVirt page -> Hosted Engine.
3. Start deploying HE.


Test result:
Failed to execute stage 'Environment setup': Couldnt connect to VDSM within 240 seconds
Hosted Engine deployment failed: this system is not reliable, please check the issue, fix and redeploy


# systemctl status vdsmd
● vdsmd.service - Virtual Desktop Server Manager
   Loaded: loaded (/usr/lib/systemd/system/vdsmd.service; enabled; vendor preset: enabled)
   Active: active (running) since Thu 2016-06-16 18:32:48 CST; 15min ago
  Process: 18681 ExecStartPre=/usr/libexec/vdsm/vdsmd_init_common.sh --pre-start (code=exited, status=0/SUCCESS)
 Main PID: 18766 (vdsm)
   CGroup: /system.slice/vdsmd.service
           └─18766 /usr/bin/python /usr/share/vdsm/vdsm

Jun 16 18:36:26 cshaoh.redhat.com vdsm[18766]: vdsm ProtocolDetector.SSLHandshakeDispatcher ERROR Error during handshake: unexpected eof
Jun 16 18:39:35 cshaoh.redhat.com vdsm[18766]: vdsm ProtocolDetector.SSLHandshakeDispatcher ERROR Error during handshake: unexpected eof
Jun 16 18:39:42 cshaoh.redhat.com vdsm[18766]: vdsm ProtocolDetector.SSLHandshakeDispatcher ERROR Error during handshake: unexpected eof
Jun 16 18:39:47 cshaoh.redhat.com vdsm[18766]: vdsm ProtocolDetector.SSLHandshakeDispatcher ERROR Error during handshake: unexpected eof
Jun 16 18:42:56 cshaoh.redhat.com vdsm[18766]: vdsm ProtocolDetector.SSLHandshakeDispatcher ERROR Error during handshake: unexpected eof
Jun 16 18:42:59 cshaoh.redhat.com vdsm[18766]: vdsm ProtocolDetector.SSLHandshakeDispatcher ERROR Error during handshake: unexpected eof
Jun 16 18:43:10 cshaoh.redhat.com vdsm[18766]: vdsm ProtocolDetector.SSLHandshakeDispatcher ERROR Error during handshake: unexpected eof
Jun 16 18:46:19 cshaoh.redhat.com vdsm[18766]: vdsm ProtocolDetector.SSLHandshakeDispatcher ERROR Error during handshake: unexpected eof
Jun 16 18:46:26 cshaoh.redhat.com vdsm[18766]: vdsm ProtocolDetector.SSLHandshakeDispatcher ERROR Error during handshake: unexpected eof
Jun 16 18:46:31 cshaoh.redhat.com vdsm[18766]: vdsm ProtocolDetector.SSLHandshakeDispatcher ERROR Error during handshake: unexpected eof


So I am changing the bug status to ASSIGNED.

Comment 3 Red Hat Bugzilla Rules Engine 2016-06-16 07:06:31 UTC
Target release should be set once a package build is known to fix an issue. Since this bug is not modified, the target version has been reset. Please use target milestone to plan a fix for an oVirt release.

Comment 4 cshao 2016-06-16 07:07:10 UTC
Created attachment 1168569
all log info

Comment 5 cshao 2016-06-16 08:13:22 UTC
(In reply to shaochen from comment #2)

The failure in comment 2 occurred because the hostname could not be resolved. After adding the hostname to /etc/hosts, the issue was gone, so please ignore comment 2.

Changing the bug status back to ON_QA.

Comment 6 cshao 2016-06-16 08:17:59 UTC
A custom-sized /var partition should resolve this issue, but deployment then hits Bug 1344900 - Hosted Engine deployment failed. So I will verify this bug after bug 1344900 is fixed.

Comment 7 Fabian Deutsch 2016-06-16 09:42:42 UTC
Chen, in comment 5 you wrote:
"
The failure in comment 2 occurred because the hostname could not be resolved. After adding the hostname to /etc/hosts, the issue was gone, so please ignore comment 2.
"

What did you have to do with the host name?

I want to understand why this step was necessary.

Comment 8 cshao 2016-06-17 03:45:15 UTC
(In reply to Fabian Deutsch from comment #7)
> Chen, in comment 5 you wrote:
> "
> The failure in comment 2 occurred because the hostname could not be
> resolved. After adding the hostname to /etc/hosts, the issue was gone, so
> please ignore comment 2.
> "
> 
> What did you have to do with the host name?
> 
> I want to understand why this step was necessary.

I thought that editing the /etc/hosts file was needed for hostname resolution; without resolution, registering the host to the engine fails, so I did it every time.
But the resolution step is actually unnecessary here.

After re-checking comment 2, it seems to be a different error. I will report a new bug to track the new issue.
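
Whether a host name resolves at all, which is what the /etc/hosts edit discussed above was meant to ensure, can be checked independently of the HE setup. This is only a minimal Python sketch: the function name hostname_resolves is hypothetical, and it tests name resolution only, not whether VDSM or the engine is reachable.

```python
import socket


def hostname_resolves(hostname):
    """Return True if `hostname` resolves to at least one address
    (via /etc/hosts, DNS, or whatever the system resolver is configured to use)."""
    try:
        return len(socket.getaddrinfo(hostname, None)) > 0
    except socket.gaierror:
        # Resolution failed: unknown name or resolver unavailable.
        return False


if __name__ == "__main__":
    # Check the name this machine reports for itself.
    name = socket.gethostname()
    print(name, "resolves" if hostname_resolves(name) else "does not resolve")
```

If the host's own name does not resolve, adding it to /etc/hosts is one way to make this check pass.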

Comment 9 cshao 2016-07-20 09:09:29 UTC
Test version:
redhat-virtualization-host-4.0-20160714.3
imgbased-0.7.2-0.1.el7ev.noarch
redhat-release-virtualization-host-4.0-0.20.el7.x86_64
ovirt-hosted-engine-ha-2.0.1-1.el7ev.noarch
ovirt-hosted-engine-setup-2.0.1-1.el7ev.noarch
rhevm-appliance-20160714.0-1.el7ev.4.0.ova 

Test step:

While deploying HE via Cockpit (with the default size of /var), the message "Not enough space in the temporary directory [/var/tmp]" pops up. We can manually specify a directory on a larger partition (e.g. /home), and the HE deployment then succeeds with the new path.

So the bug is fixed; changing the bug status to VERIFIED.

