Bug 1245143 - [RHEV-H] Failed to deploy Hosted Engine on a storage domain as an additional host
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Enterprise Virtualization Manager
Classification: Red Hat
Component: ovirt-node-plugin-hosted-engine
Version: 3.5.4
Hardware: Unspecified
OS: Unspecified
Priority: urgent
Severity: high
Target Milestone: ovirt-3.6.0-rc
Target Release: 3.6.0
Assignee: Fabian Deutsch
QA Contact: cshao
URL:
Whiteboard:
Keywords: Reopened, ZStream
Depends On: 1246863 1249118 1259247 1260470 1267437 1271976 1280268
Blocks: 1059435 1250199 1250400
 
Reported: 2015-07-21 10:05 UTC by cshao
Modified: 2016-03-09 14:32 UTC
Clone Of:
: 1250400 (view as bug list)
Last Closed: 2016-03-09 14:32:54 UTC


Attachments
he-iscsi.tar.gz (5.22 MB, application/x-gzip), 2015-07-21 10:07 UTC, cshao
Layout of the RHEV-M page (23.28 KB, image/png), 2015-07-31 10:57 UTC, Fabian Deutsch
Layout of the Hosted Engine page (23.34 KB, image/png), 2015-07-31 10:58 UTC, Fabian Deutsch
he6.7-software-iscsi-failed (6.75 MB, application/x-gzip), 2015-08-03 08:48 UTC, cshao


External Trackers
Tracker ID Priority Status Summary Last Updated
Red Hat Product Errata RHBA-2016:0378 normal SHIPPED_LIVE ovirt-node bug fix and enhancement update for RHEV 3.6 2016-03-09 19:06:36 UTC
oVirt gerrit 44186 master ABANDONED login: Add get-hosted-engine-setup-answers Never
oVirt gerrit 44235 ovirt-3.5 MERGED page: Add additional host flow support Never
oVirt gerrit 44236 master MERGED page: Be more specififc about the Engine password Never
oVirt gerrit 44248 master ABANDONED page: Add additional host flow support Never
oVirt gerrit 44252 ovirt-3.5 MERGED page: Be more specififc about the Engine password Never

Comment 1 cshao 2015-07-21 10:07:57 UTC
Created attachment 1054273
he-iscsi.tar.gz

/var/log/*.*
sosreport

Comment 3 Fabian Deutsch 2015-07-28 09:54:55 UTC
Can this bug also be reproduced by running ovirt-hosted-engine-setup on a plain RHEL host?

Comment 4 Sandro Bonazzola 2015-07-28 10:00:08 UTC
I think you have a configuration issue:

2015-07-21 08:56:32 DEBUG otopi.plugins.ovirt_hosted_engine_setup.vdsmd.cpu cpu._customization:137 Compatible CPU models are: [u'model_athlon', u'model_Opteron_G3', u'model_Opteron_G1', u'model_phenom', u'model_Opteron_G2']
2015-07-21 08:56:32 DEBUG otopi.plugins.otopi.dialog.human dialog.__logString:215 DIALOG:SEND                 The following CPU types are supported by this host:
2015-07-21 08:56:32 DEBUG otopi.plugins.otopi.dialog.human dialog.__logString:215 DIALOG:SEND                 	 - model_Opteron_G3: AMD Opteron G3
2015-07-21 08:56:32 DEBUG otopi.plugins.otopi.dialog.human dialog.__logString:215 DIALOG:SEND                 	 - model_Opteron_G2: AMD Opteron G2
2015-07-21 08:56:32 DEBUG otopi.plugins.otopi.dialog.human dialog.__logString:215 DIALOG:SEND                 	 - model_Opteron_G1: AMD Opteron G1
2015-07-21 08:56:32 DEBUG otopi.context context._executeMethod:152 method exception
Traceback (most recent call last):
  File "/usr/lib/python2.7/site-packages/otopi/context.py", line 142, in _executeMethod
  File "/usr/share/ovirt-hosted-engine-setup/plugins/ovirt-hosted-engine-setup/vdsmd/cpu.py", line 194, in _customization
RuntimeError: Invalid CPU type specified: None

You're failing because no CPU type has been specified in the answer file or the hardware is not compatible with the answer file provided by rhev-h.

Moving to node
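
As an illustration of the failure in the log above, here is a minimal sketch of a check that produces the same error. It assumes the setup validates the configured CPU type against the models the host reports; validate_cpu_type and its parameters are hypothetical names and are not taken from cpu.py.

```python
def validate_cpu_type(configured, supported):
    """Return the configured CPU type if the host supports it.

    Hypothetical helper: it mirrors the error seen in the log,
    not the real implementation in vdsmd/cpu.py.
    """
    if configured not in supported:
        raise RuntimeError(
            "Invalid CPU type specified: {0}".format(configured)
        )
    return configured


# Models offered to the user in the log excerpt above.
supported = ["model_Opteron_G3", "model_Opteron_G2", "model_Opteron_G1"]

try:
    # No CPU type in the answer file and no interactive answer recorded.
    validate_cpu_type(None, supported)
except RuntimeError as exc:
    message = str(exc)
```

With a value such as "model_Opteron_G3" the same call would simply return the value, which is why a missing entry rather than the hardware looks like the likely trigger here.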

Comment 5 Fabian Deutsch 2015-07-28 10:46:08 UTC
It doesn't look like we specify the CPU type, thus it is probably the answer to an interactive question:

[fabiand@tee ovirt-node-plugin-hosted-engine (ovirt-3.5)]$ git grep -i cpu
[fabiand@tee ovirt-node-plugin-hosted-engine (ovirt-3.5)]$ 


Chen, can you please provide the answer file, or tell us if you answered the CPU type question?

Comment 9 cshao 2015-07-29 08:44:49 UTC
(In reply to Fabian Deutsch from comment #5)
> It doesn't look like we specify the CPU type, thus it is probably the answer
> to an interactive question:
> 
> [fabiand@tee ovirt-node-plugin-hosted-engine (ovirt-3.5)]$ git grep -i cpu
> [fabiand@tee ovirt-node-plugin-hosted-engine (ovirt-3.5)]$ 
> 
> 
> Chen, can you please provide the answer file, or tell us if you answered the
> CPU type question?

I used the default CPU type: model_Opteron_G3: AMD Opteron G3

Comment 10 Fabian Deutsch 2015-07-29 09:42:30 UTC
Did you just hit <Return> or did you enter anything?

Comment 11 cshao 2015-07-29 09:49:15 UTC
(In reply to Fabian Deutsch from comment #10)
> Did you just hit <Return> or did you enter anything?

Yes, just hit <Return>.

Comment 12 Simone Tiraboschi 2015-07-29 15:45:54 UTC
The issue is here:

2015-07-21 09:09:04 DEBUG otopi.plugins.otopi.dialog.human dialog.__logString:215 DIALOG:SEND                 The specified storage location already contains a data domain. Is this an additional host setup (Yes, No)[Yes]? 
2015-07-21 09:09:12 INFO otopi.plugins.ovirt_hosted_engine_setup.storage.storage storage._handleHostId:123 Installing on additional host

So this is an additional host, and some values (for instance the CPU type) are no longer asked for because they have to match the configuration of the other hosts (otherwise live migration will not work and HA will be faulty).

Normally on additional hosts we ask to download the answer file from one of the other hosts, but here you had already appended one:
2015-07-21 09:09:23 DEBUG otopi.context context.dumpEnvironment:500 ENV CORE/configFileAppend=str:':/tmp/tmpM3gp2K'
So we assumed (maybe we could improve here!) that it was correctly and completely generated on the first host, while yours is missing the CPU type, hence the issue.
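
A rough sketch of what a completeness check on an appended answer file could look like. This is only an illustration under assumptions: the section name [environment:default], the key OVEHOSTED_VDSM/cpu, the sample bridgeName entry and the helper missing_answers are assumed for the example and should be checked against a real answer file.

```python
import configparser

# Assumed key name; verify against a real answer file before relying on it.
REQUIRED_KEYS = ("OVEHOSTED_VDSM/cpu",)


def missing_answers(answer_text, required=REQUIRED_KEYS):
    """Return the required keys that are absent or set to None."""
    # Only "=" separates key and value, since values look like "str:...".
    parser = configparser.ConfigParser(delimiters=("=",), interpolation=None)
    parser.optionxform = str  # keep the key case as written in the file
    parser.read_string(answer_text)
    section = "environment:default"  # assumed section name
    answers = dict(parser[section]) if parser.has_section(section) else {}
    missing = []
    for key in required:
        value = answers.get(key)
        if value is None or value.endswith(":None"):
            missing.append(key)
    return missing


# Illustrative file contents, not taken from the attached logs.
incomplete = """\
[environment:default]
OVEHOSTED_NETWORK/bridgeName=str:rhevm
"""

complete = incomplete + "OVEHOSTED_VDSM/cpu=str:model_Opteron_G3\n"
```

If such a check reported missing keys, the setup could fall back to asking for the answer file from the first host instead of failing later in the CPU stage.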

Comment 13 Simone Tiraboschi 2015-07-29 15:47:05 UTC
Sorry, not sure about closing.
Probably we could still ask to download the answerfile if the one you passed is not complete.

Comment 14 Fabian Deutsch 2015-07-29 16:08:20 UTC
According to the offline discussion, a possible fix is to introduce another question asking the user whether the attached file (using configappend) is the answer file from the other host.

The complete fix will also require a change on the node side to set this answer to "no", because in the RHEV-H flow we cannot attach the answer file from that other host.
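
The proposed question can be read as a small decision, sketched below. This is an assumption about how the pieces would fit together, not the merged patch; answer_source and both parameter names are hypothetical.

```python
def answer_source(is_additional_host, appended_is_from_first_host):
    """Decide where the setup should take its answers from.

    Hypothetical model of the proposed question; the return values
    are labels for this sketch only.
    """
    if not is_additional_host:
        # First host: values are collected interactively.
        return "interactive"
    if appended_is_from_first_host:
        # The appended file is trusted to be complete.
        return "appended-file"
    # Otherwise ask to download the answer file from the first host.
    return "fetch-from-first-host"


# RHEV-H would answer "no" to the new question.
rhevh_choice = answer_source(True, False)
```

In this reading, RHEV-H always ends up in the last branch, because its appended file does not come from the first host.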

Comment 17 Ryan Barry 2015-07-30 14:51:29 UTC
(In reply to Fabian Deutsch from comment #14)
> According to the offline discussion a possible fix is to introduce another
> question to ask the user if the attached file (using configappend) is the
> answer file from the other host.
> 
> The complete fix will also require a change on the node side to set this
> answer to "no", because in the RHEV-H flow we can not attach the answer file
> from that other host.

Just a comment --

In the past, we discussed presenting users with a checkbox (or some other UI element) asking whether it's an additional host. That never got added, but the option to pull the answer file from the other host and use it is potentially there.

Comment 18 Fabian Deutsch 2015-07-31 10:56:45 UTC
The most recent patches are preparing the following solution:

A new button is added to the RHEV-H TUI to start the deployment of an additional host.
That button will trigger the deployment as described here:
[0] https://access.redhat.com/documentation/en-US/Red_Hat_Enterprise_Virtualization/3.5/html/Installation_Guide/Installing_Additional_Hosts_to_a_Self-Hosted_Environment.html

Note: The TUI layout will also be changed a bit to clarify which items are relevant for the first host deployment and which for the additional host deployment.

This approach assumes that SSH is enabled and a root password is set on the first deployed RHEV-H host with self-hosted engine.
This can be achieved by going to the RHEV-M page, setting a password in the shown fields and saving that page.

Assuming the patches are applied, the flow looks as follows:

1. Install first RHEV-H host
2. Configure networking and setup HE on the first host
3. After the HE setup: Go to the RHEV-M/Engine page, set a password and hit "<Save / Register>"

4. Install an additional RHEV-H host
5. Go to the Hosted Engine page and select "Start additional host setup"
6. Proceed as described in the docs for RHEL hosts [0]

The benefit is that the flow is the same on RHEL and RHEV-H.

It still has to be verified whether this works as expected.

Comment 19 Fabian Deutsch 2015-07-31 10:57:47 UTC
Created attachment 1058029
Layout of the RHEV-M page

This screenshot shows the page where the password needs to be set to enable this flow.

Comment 20 Fabian Deutsch 2015-07-31 10:58:19 UTC
Created attachment 1058030
Layout of the Hosted Engine page

This screenshot shows the new layout of the hosted engine page to support both flows.

Comment 22 Fabian Deutsch 2015-07-31 19:07:24 UTC
The flow laid out in comment 18 was tested and it seems to work.

Comment 28 cshao 2015-08-03 08:48:45 UTC
Created attachment 1058670
he6.7-software-iscsi-failed

Comment 29 Fabian Deutsch 2015-08-03 12:25:47 UTC
Let me note that the flow for setting up the initial HE host is not changed or directly affected by this change.

Comment 34 cshao 2015-08-10 09:20:33 UTC
Opened a separate bug to track the software iSCSI issue:
Bug 1251901 - [RHEV-H] Failed to deploy Hosted Engine through specify software iscsi as storage.

Comment 35 Fabian Deutsch 2015-08-10 11:20:12 UTC
This bug is covering the basic flow to add an additional host to the HE setup.

The bug in comment 34 is covering the still non-working case related to the CPU types.

According to comment 27 and comment 31 and comment 33 this bug can be moved to verified.

Comment 39 cshao 2015-10-09 07:33:26 UTC
This bug is blocked by bug 1267437, which blocks HE setup. I will verify this bug after 1267437 is fixed.

Comment 40 cshao 2015-10-26 07:45:26 UTC
This bug is still blocked by bug 1271976, which blocks HE setup. I will verify this bug after 1271976 is fixed.

Comment 41 cshao 2015-11-23 06:16:59 UTC
This bug is blocked by new bug 1280268, as the HE VM cannot start up automatically after HE is configured successfully. I will verify this bug after 1280268 is fixed.

Comment 42 cshao 2015-12-15 07:41:44 UTC
Test version:
rhev-hypervisor7-7.2-20151210.1
ovirt-node-3.6.0-0.24.20151209gitc0fa931.el7ev.noarch
ovirt-node-plugin-hosted-engine-0.3.0-4.el7ev.noarch
ovirt-hosted-engine-setup-1.3.1.2-1.el7ev.noarch
ovirt-hosted-engine-ha-1.3.3.3-1.el7ev.noarch


Test steps:
1. Install first RHEV-H host
2. Configure networking and setup HE on the first host
3. Specify NFS storage during storage configuration.
4. After the HE setup: Go to the RHEV-M/Engine page, set a password and hit "<Save / Register>"
5. Install an additional RHEV-H host
6. Go to the Hosted Engine page and select "Add this host to an existing group"
7. Proceed as described with correct steps.
8. Reboot host.

Test result:
Both of the above hosts can still auto-register to HE, and persistence works correctly.

So the bug is fixed; changing the bug status to VERIFIED.

Comment 44 errata-xmlrpc 2016-03-09 14:32:54 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://rhn.redhat.com/errata/RHBA-2016-0378.html

