Bug 1215437 - hosted-engine --deploy will fail when re-deploying the first host
Summary: hosted-engine --deploy will fail when re-deploying the first host
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Enterprise Virtualization Manager
Classification: Red Hat
Component: ovirt-hosted-engine-setup
Version: 3.5.0
Hardware: Unspecified
OS: Unspecified
Priority: medium
Severity: high
Target Milestone: ovirt-3.6.2
Target Release: 3.6.2
Assignee: Simone Tiraboschi
QA Contact: Nikolai Sednev
URL:
Whiteboard:
Depends On:
Blocks:
 
Reported: 2015-04-26 15:01 UTC by Vladimir Vasilev
Modified: 2021-08-30 13:58 UTC
CC: 11 users

Fixed In Version: ovirt-hosted-engine-setup-1.3.2.2-2.el7ev
Doc Type: Bug Fix
Doc Text:
Previously, hosted-engine-setup had the capability to redeploy a host reusing the same host ID it was using before, but this was not allowed for host 1. With this release, it is now possible to redeploy host 1 reusing the same host ID.
Clone Of:
Environment:
Last Closed: 2016-03-09 19:11:56 UTC
oVirt Team: Integration
Target Upstream Version:
Embargoed:


Attachments
engine's sosreport (7.30 MB, application/x-xz)
2016-01-13 17:35 UTC, Nikolai Sednev
first host's sosreport (5.81 MB, application/x-xz)
2016-01-13 17:36 UTC, Nikolai Sednev
second host's sosreport (8.66 MB, application/x-xz)
2016-01-13 17:37 UTC, Nikolai Sednev
hosted engine deployment log from first host during redeployment (147.35 KB, text/plain)
2016-01-13 17:38 UTC, Nikolai Sednev


Links
System ID Private Priority Status Summary Last Updated
Red Hat Issue Tracker RHV-43327 0 None None None 2021-08-30 13:58:09 UTC
Red Hat Knowledge Base (Solution) 2607881 0 None None None 2016-09-07 22:21:21 UTC
Red Hat Product Errata RHEA-2016:0375 0 normal SHIPPED_LIVE ovirt-hosted-engine-setup bug fix and enhancement update 2016-03-09 23:48:34 UTC
oVirt gerrit 48836 0 master MERGED packaging: setup: let the user redeploy an host as host 1 2015-12-23 09:30:37 UTC
oVirt gerrit 50879 0 ovirt-hosted-engine-setup-1.3 MERGED packaging: setup: let the user redeploy an host as host 1 2015-12-29 07:00:43 UTC

Description Vladimir Vasilev 2015-04-26 15:01:54 UTC
Description of problem:
I have issues with the first host and now I can't re-deploy it. hosted-engine --deploy fails with "Cannot use the same ID used by first host".

Version-Release number of selected component (if applicable):
ovirt-hosted-engine-setup-1.2.1-9.el6ev.noarch
ovirt-hosted-engine-ha-1.2.4-5.el6ev.noarch

How reproducible:
100%

Steps to Reproduce:
1. Re-deploy host 1 with "hosted-engine --deploy"
2. On "Please specify the Host ID [Must be integer, default: 2]:" put 1

Actual results:
[ ERROR ] Cannot use the same ID used by first host


Expected results:
The Host ID is already known. Is this a re-deployment on an additional host that was previously set up (Yes, No)

Additional info:
This was working fine with RHEV 3.4

Comment 1 Martin Sivák 2015-04-27 10:03:16 UTC
I see this was changed in https://gerrit.ovirt.org/#/c/17083/8/src/plugins/ovirt-hosted-engine-setup/storage/storage.py but the commit message does not say anything about the reason.

Comment 2 Vladimir Vasilev 2015-04-28 16:04:25 UTC
Just tested one workaround that works (all done on host ID 1); a consolidated shell sketch of the service-stop and unmount steps follows the numbered list:

1. Enable RHEV 3.4 channel
2. Make sure all engine-related services are stopped in the correct order:
service vdsmd stop
service supervdsmd stop
service sanlock stop
service wdmd stop
service ovirt-ha-agent stop
service ovirt-ha-broker stop
(just in case, use ps and kill anything that is still running)
3. Unmount the NFS share if it's mounted
4. Downgrade both ovirt-hosted-engine-ha and ovirt-hosted-engine-setup
5. Run again 'hosted-engine --deploy'
This time the host ID step passes:
[ INFO  ] The specified storage location already contains a data domain. Is this an additional host setup (Yes, No)[Yes]?
[ INFO  ] Installing on additional host
          Please specify the Host ID [Must be integer, default: 2]: 1
[ ERROR ] Cannot use the same ID used by first host
          The Host ID is already known. Is this a re-deployment on an additional host that was previously set up (Yes, No)[Yes]?
(deploy continues)
scp answers.conf from another working hypervisor.
At the end RHEV-M will put the newly added host into "Non Operational" and the deploy script will time out and exit.
6. Stop all engine-related services again in the correct order and unmount the NFS share (double-check with ps)
7. Copy manually /etc/ovirt-hosted-engine/answers.conf from working hypervisor to host 1
8. Upgrade both ovirt-hosted-engine-ha and ovirt-hosted-engine-setup from the RHEV 3.5 channel
9. Start ovirt-ha-broker and ovirt-ha-agent
10. Wait a few minutes and 'hosted-engine --vm-status' will show the correct status on all hosts, including host 1
11. Activate host 1 in RHEVM
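
For convenience, a rough consolidation of the stop/unmount steps (2, 3 and 6) as a shell snippet; the mount point below is only an assumption and must be replaced with the NFS path actually used by the deployment:

# Stop all engine-related services in order, then check that nothing is left
# running and unmount the hosted-engine storage. The mount point is an
# example only; substitute the real NFS share path.
for svc in vdsmd supervdsmd sanlock wdmd ovirt-ha-agent ovirt-ha-broker; do
    service "$svc" stop
done
ps aux | grep -E 'vdsm|sanlock|ovirt-ha' | grep -v grep    # kill leftovers manually if any
umount /rhev/data-center/mnt/example-nfs-server:_vol_he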

Comment 3 Mario Rial 2015-06-04 16:48:21 UTC
@Vladimir Vasilev

The workaround you describe doesn't work on CentOS 7 because it is incompatible with oVirt Release 3.4.

I think this is a common case that should be fixed in this release...

I worked around it by installing a new hosted engine in a new path and, in the middle of the process, copying the disk of the old hosted engine.
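
For illustration only, a very rough sketch of that mid-deployment disk copy; every path here is hypothetical and the real storage-domain/image/volume UUIDs have to be looked up on the storage first:

# Hypothetical paths -- copy the old engine disk image over the newly created
# one while the new deployment is paused; look up the real
# <sd_uuid>/images/<img_uuid>/<vol_uuid> under /rhev/data-center/mnt/ first.
OLD=/rhev/data-center/mnt/old-server:_old__he__path/<sd_uuid>/images/<img_uuid>/<vol_uuid>
NEW=/rhev/data-center/mnt/new-server:_new__he__path/<sd_uuid>/images/<img_uuid>/<vol_uuid>
dd if="$OLD" of="$NEW" bs=4M conv=notrunc,fsync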

Comment 4 Yaniv Lavi 2015-11-02 14:42:31 UTC
When fixing this we should consider whether it should happen only in rollback.
I'm not sure we want this on reinstall after losing hosts.

Comment 5 Yaniv Lavi 2015-11-26 11:26:57 UTC
Why was this moved back to 3.6?

Comment 6 Doron Fediuck 2015-11-26 15:44:02 UTC
(In reply to Yaniv Dary from comment #5)
> Why was this moved back to 3.6?

There's a negative impact caused by this issue, which is trashing the metadata file by blocking host IDs.
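
As a side note, one way to check which host IDs are already registered in the shared metadata before attempting a re-deployment is to query an active HE host (the grep pattern assumes the 3.6-era --vm-status output format):

# Run on any active hosted-engine host; lists the hostname and host ID of
# every host recorded in the shared metadata.
hosted-engine --vm-status | grep -E 'Hostname|Host ID'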

Comment 7 Nikolai Sednev 2016-01-13 17:19:59 UTC
Tested on these components:
mom-0.5.1-1.el7ev.noarch
ovirt-vmconsole-1.0.0-1.el7ev.noarch
ovirt-hosted-engine-ha-1.3.3.6-1.el7ev.noarch
qemu-kvm-rhev-2.3.0-31.el7_2.5.x86_64
ovirt-vmconsole-host-1.0.0-1.el7ev.noarch
ovirt-host-deploy-1.4.1-1.el7ev.noarch
libvirt-client-1.2.17-13.el7_2.2.x86_64
sanlock-3.2.4-2.el7_2.x86_64
ovirt-setup-lib-1.0.1-1.el7ev.noarch
vdsm-4.17.15-0.el7ev.noarch
ovirt-hosted-engine-setup-1.3.2.1-1.el7ev.noarch

1. Deployed 2 hosts with HE.
2. Reprovisioned the first host to a clean OS to reproduce this bug.
3. Installed hosted-engine-setup on the first host.
4. Tried deploying the hosted engine on the first host and failed:
# hosted-engine --deploy
[ INFO  ] Stage: Initializing
[ INFO  ] Generating a temporary VNC password.
[ INFO  ] Stage: Environment setup
          Continuing will configure this host for serving as hypervisor and create a VM where you have to install oVirt Engine afterwards.
          Are you sure you want to continue? (Yes, No)[Yes]: 
          Configuration files: []
          Log file: /var/log/ovirt-hosted-engine-setup/ovirt-hosted-engine-setup-20160113191158-tbgk31.log
          Version: otopi-1.4.0 (otopi-1.4.0-1.el7ev)
          It has been detected that this program is executed through an SSH connection without using screen.
          Continuing with the installation may lead to broken installation if the network connection fails.
          It is highly recommended to abort the installation and run it inside a screen session using command "screen".
          Do you want to continue anyway? (Yes, No)[No]: yes
[ INFO  ] Hardware supports virtualization
[ INFO  ] Stage: Environment packages setup
[ INFO  ] Stage: Programs detection
[ INFO  ] Stage: Environment setup
[ INFO  ] Waiting for VDSM hardware info
[ INFO  ] Generating libvirt-spice certificates
[ INFO  ] Stage: Environment customization
         
          --== STORAGE CONFIGURATION ==--
         
          During customization use CTRL-D to abort.
          Please specify the storage you would like to use (glusterfs, iscsi, fc, nfs3, nfs4)[nfs3]: 
          Please specify the full shared storage connection path to use (example: host:/path): 10.35.64.11:/vol/RHEV/Virt/nsednev_upgrade_he_3_5_6_to_3_5_7_el_7_2
          The specified storage location already contains a data domain. Is this an additional host setup (Yes, No)[Yes]? 
[ INFO  ] Installing on additional host
          Please specify the Host ID [Must be integer, default: 2]: 1
[ ERROR ] Cannot use the same ID used by first host
          Please specify the Host ID [Must be integer, default: 2]:

Comment 8 Nikolai Sednev 2016-01-13 17:22:46 UTC
HE-VM is running with these components:
rhevm-dwh-setup-3.6.2-1.el6ev.noarch
ovirt-vmconsole-1.0.0-1.el6ev.noarch
rhevm-dwh-3.6.2-1.el6ev.noarch
ovirt-engine-extension-aaa-jdbc-1.0.4-1.el6ev.noarch
rhevm-3.6.2-0.1.el6.noarch
ovirt-setup-lib-1.0.1-1.el6ev.noarch
ovirt-vmconsole-proxy-1.0.0-1.el6ev.noarch
ovirt-host-deploy-1.4.1-1.el6ev.noarch
ovirt-host-deploy-java-1.4.1-1.el6ev.noarch

Comment 9 Nikolai Sednev 2016-01-13 17:35:25 UTC
Created attachment 1114476 [details]
engine's sosreport

Comment 10 Nikolai Sednev 2016-01-13 17:36:32 UTC
Created attachment 1114477 [details]
first host's sosreport

Comment 11 Nikolai Sednev 2016-01-13 17:37:37 UTC
Created attachment 1114478 [details]
second host's sosreport

Comment 12 Nikolai Sednev 2016-01-13 17:38:46 UTC
Created attachment 1114479 [details]
hosted engine deployment log from first host during redeployment

Comment 13 Simone Tiraboschi 2016-01-18 14:48:37 UTC
(In reply to Nikolai Sednev from comment #7)
> ovirt-hosted-engine-setup-1.3.2.1-1.el7ev.noarch

Can you please verify with ovirt-hosted-engine-setup-1.3.2.2?
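
For reference, the exact installed build can be checked on the host with:

# Print the ovirt-hosted-engine-setup build installed on this host.
rpm -q ovirt-hosted-engine-setup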

Comment 14 Nikolai Sednev 2016-01-21 09:58:01 UTC
Successfully got the HE storage domain and HE VM auto-imported on a cleanly installed NFS deployment after the NFS data SD was added. The engine was installed using PXE. Then I added a second HE host and migrated the HE VM to it. Then I reprovisioned the first host to a clean OS and installed the ovirt-hosted-engine-setup package on it. I set the first host into maintenance in the engine's web UI and removed it from there. I re-deployed the first host successfully after it was removed from the engine's web UI.
Works for me on these components:
Host:
ovirt-vmconsole-1.0.0-1.el7ev.noarch
ovirt-hosted-engine-ha-1.3.3.7-1.el7ev.noarch
mom-0.5.1-1.el7ev.noarch
qemu-kvm-rhev-2.3.0-31.el7_2.6.x86_64
ovirt-host-deploy-1.4.1-1.el7ev.noarch
libvirt-client-1.2.17-13.el7_2.2.x86_64
ovirt-setup-lib-1.0.1-1.el7ev.noarch
vdsm-4.17.18-0.el7ev.noarch
ovirt-vmconsole-host-1.0.0-1.el7ev.noarch
ovirt-hosted-engine-setup-1.3.2.3-1.el7ev.noarch
sanlock-3.2.4-2.el7_2.x86_64
Linux version 3.10.0-327.8.1.el7.x86_64 (mockbuild.eng.bos.redhat.com) (gcc version 4.8.5 20150623 (Red Hat 4.8.5-4) (GCC) ) #1 SMP Mon Jan 11 05:03:18 EST 2016

Engine:
ovirt-vmconsole-1.0.0-1.el6ev.noarch
ovirt-host-deploy-1.4.1-1.el6ev.noarch
ovirt-setup-lib-1.0.1-1.el6ev.noarch
ovirt-vmconsole-proxy-1.0.0-1.el6ev.noarch
ovirt-host-deploy-java-1.4.1-1.el6ev.noarch
ovirt-engine-extension-aaa-jdbc-1.0.5-1.el6ev.noarch
rhevm-3.6.2.6-0.1.el6.noarch
rhevm-dwh-setup-3.6.2-1.el6ev.noarch
rhevm-dwh-3.6.2-1.el6ev.noarch
rhevm-reports-setup-3.6.2.4-1.el6ev.noarch
rhevm-reports-3.6.2.4-1.el6ev.noarch
rhevm-guest-agent-common-1.0.11-2.el6ev.noarch
Linux version 2.6.32-573.8.1.el6.x86_64 
(mockbuild.eng.bos.redhat.com) (gcc version 4.4.7 20120313 (Red Hat 4.4.7-16) (GCC) ) #1 SMP Fri Sep 25 19:24:22 EDT 2015


# hosted-engine --deploy
[ INFO  ] Stage: Initializing
[ INFO  ] Generating a temporary VNC password.
[ INFO  ] Stage: Environment setup
          Continuing will configure this host for serving as hypervisor and create a VM where you have to install the engine afterwards.
          Are you sure you want to continue? (Yes, No)[Yes]: 
          Configuration files: []
          Log file: /var/log/ovirt-hosted-engine-setup/ovirt-hosted-engine-setup-20160121114943-n0od5d.log
          Version: otopi-1.4.0 (otopi-1.4.0-1.el7ev)
          It has been detected that this program is executed through an SSH connection without using screen.
          Continuing with the installation may lead to broken installation if the network connection fails.
          It is highly recommended to abort the installation and run it inside a screen session using command "screen".
          Do you want to continue anyway? (Yes, No)[No]: yes
[ INFO  ] Hardware supports virtualization
[ INFO  ] Stage: Environment packages setup
[ INFO  ] Stage: Programs detection
[ INFO  ] Stage: Environment setup
[ INFO  ] Stage: Environment customization
         
          --== STORAGE CONFIGURATION ==--
         
          During customization use CTRL-D to abort.
          Please specify the storage you would like to use (glusterfs, iscsi, fc, nfs3, nfs4)[nfs3]: 
          Please specify the full shared storage connection path to use (example: host:/path): 10.35.64.11:/vol/RHEV/Virt/nsednev_upgrade_he_3_5_6_to_3_5_7_el_7_2/
          The specified storage location already contains a data domain. Is this an additional host setup (Yes, No)[Yes]? 
[ INFO  ] Installing on additional host
          Please specify the Host ID [Must be integer, default: 2]: 1
          Local storage datacenter name is an internal name
          and currently will not be shown in engine's admin UI.
          Please enter local datacenter name [hosted_datacenter]: 
         
          --== SYSTEM CONFIGURATION ==--
         
[WARNING] A configuration file must be supplied to deploy Hosted Engine on an additional host.
          The answer file may be fetched from an active HE host using scp.
          If you do not want to download it automatically you can abort the setup answering no to the following question.
          Do you want to scp the answer file from another HE host? (Yes, No)[Yes]: 
          Please provide the FQDN or IP of an active HE host: alma04.qa.lab.tlv.redhat.com
          Enter 'root' user password for host alma04.qa.lab.tlv.redhat.com: 
[ INFO  ] Answer file successfully downloaded
         
          --== NETWORK CONFIGURATION ==--
         
[ INFO  ] Additional host deployment, firewall manager is 'iptables'
          The following CPU types are supported by this host:
                 - model_SandyBridge: Intel SandyBridge Family
                 - model_Westmere: Intel Westmere Family
                 - model_Nehalem: Intel Nehalem Family
                 - model_Penryn: Intel Penryn Family
                 - model_Conroe: Intel Conroe Family
         
          --== HOSTED ENGINE CONFIGURATION ==--
         
          Enter the name which will be used to identify this host inside the Administrator Portal [hosted_engine_1]: 
          Enter 'admin@internal' user password that will be used for accessing the Administrator Portal: 
          Confirm 'admin@internal' user password: 
[ INFO  ] Stage: Setup validation
          The Host ID is already known. Is this a re-deployment on an additional host that was previously set up (Yes, No)[Yes]? 
         
          --== CONFIGURATION PREVIEW ==--
         
          Engine FQDN                        : nsednev-he-1.qa.lab.tlv.redhat.com
          Bridge name                        : ovirtmgmt
          SSH daemon port                    : 22
          Firewall manager                   : iptables
          Gateway address                    : 10.35.117.254
          Host name for web application      : hosted_engine_1
          Host ID                            : 1
          Image size GB                      : 25
          GlusterFS Share Name               : hosted_engine_glusterfs
          GlusterFS Brick Provisioning       : False
          Storage connection                 : 10.35.64.11:/vol/RHEV/Virt/nsednev_upgrade_he_3_5_6_to_3_5_7_el_7_2/
          Console type                       : vnc
          Memory size MB                     : 4096
          MAC address                        : 00:16:3E:7B:B8:53
          Boot type                          : disk
          Number of CPUs                     : 4
          CPU Type                           : model_SandyBridge
[ INFO  ] Stage: Transaction setup
[ INFO  ] Stage: Misc configuration
[ INFO  ] Stage: Package installation
[ INFO  ] Stage: Misc configuration
[ INFO  ] Configuring libvirt
[ INFO  ] Configuring VDSM
[ INFO  ] Starting vdsmd
[ INFO  ] Waiting for VDSM hardware info
[ INFO  ] Waiting for VDSM hardware info
[ INFO  ] Configuring VM
[ INFO  ] Updating hosted-engine configuration
[ INFO  ] Stage: Transaction commit
[ INFO  ] Stage: Closing up
[ INFO  ] Acquiring internal CA cert from the engine
[ INFO  ] The following CA certificate is going to be used, please immediately interrupt if not correct:
[ INFO  ] Issuer: C=US, O=qa.lab.tlv.redhat.com, CN=nsednev-he-1.qa.lab.tlv.redhat.com.64061, Subject: C=US, O=qa.lab.tlv.redhat.com, CN=nsednev-he-1.qa.lab.tlv.redhat.com.64061, Fingerprint (SHA-1): 2A0B2E22DBED530BF442D5467932F7FA84A94722
[ INFO  ] Connecting to the Engine
[ ERROR ] Cannot automatically add the host to cluster Default: Cannot add Host. The Host name is already in use, please choose a unique name and try again. 
         
         
          Please check Engine VM configuration.
         
          Make a selection from the options below:
          (1) Continue setup - Engine VM configuration has been fixed
          (2) Abort setup
         
          (1, 2)[1]: 
         
          Checking for oVirt-Engine status at nsednev-he-1.qa.lab.tlv.redhat.com...
[ INFO  ] Engine replied: DB Up!Welcome to Health Status!
[ INFO  ] Waiting for the host to become operational in the engine. This may take several minutes...
[ INFO  ] Still waiting for VDSM host to become operational...
[ INFO  ] The VDSM Host is now operational
[ INFO  ] Enabling and starting HA services
[ INFO  ] Stage: Clean up
[ INFO  ] Generating answer file '/var/lib/ovirt-hosted-engine-setup/answers/answers-20160121115252.conf'
[ INFO  ] Generating answer file '/etc/ovirt-hosted-engine/answers.conf'
[ INFO  ] Stage: Pre-termination
[ INFO  ] Stage: Termination
[ INFO  ] Hosted Engine successfully set up

Comment 16 errata-xmlrpc 2016-03-09 19:11:56 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://rhn.redhat.com/errata/RHEA-2016-0375.html

