1238709 – Redeploy of hosted-engine on NFS storage failed

Summary: Redeploy of hosted-engine on NFS storage failed

Keywords:
Status:	CLOSED WONTFIX
Alias:	None
Product:	Red Hat Enterprise Virtualization Manager
Classification:	Red Hat
Component:	ovirt-hosted-engine-setup
Sub Component:
Version:	3.6.0
Hardware:	x86_64
OS:	Linux
Priority:	medium
Severity:	high
Target Milestone:	---
Target Release:	3.6.0
Assignee:	Simone Tiraboschi
QA Contact:	Artyom
Docs Contact:
URL:
Whiteboard:	integration
Depends On:
Blocks:

Reported:	2015-07-02 12:54 UTC by Artyom
Modified:	2015-09-01 11:41 UTC
CC List:	7 users
Fixed In Version:
Doc Type:	Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed:	2015-09-01 11:40:51 UTC
oVirt Team:	---
Target Upstream Version:
Embargoed:

Attachments
sos report (6.21 MB, application/x-xz) 2015-07-02 12:54 UTC, Artyom

Links
System	ID	Priority	Status	Summary	Last Updated
Red Hat Product Errata	RHEA-2016:0375	normal	SHIPPED_LIVE	ovirt-hosted-engine-setup bug fix and enhancement update	2016-03-09 23:48:34 UTC
oVirt gerrit	43749	master	ABANDONED	packaging: setup: avoid re-creating SP on re-deploy	Never
oVirt gerrit	44674	master	MERGED	packaging: setup: disallow redeploy on dirty storage	Never
oVirt gerrit	44675	ovirt-hosted-engine-setup-1.3	MERGED	packaging: setup: disallow redeploy on dirty storage	Never

Description Artyom 2015-07-02 12:54:48 UTC

Created attachment 1045504 [details]
sos report

Description of problem:
It is possible to continue a deployment with an existing VM, as described in bug https://bugzilla.redhat.com/show_bug.cgi?id=1002454.
So I ran the deployment a first time and stopped it (CTRL+D) after the OS installation.
Then I killed the VM (vdsClient -s 0 destroy vm_id) and re-ran the deployment (hosted-engine --config-append=answer_file_from_first_deployment), but the deployment failed with the error:
[ ERROR ] Failed to execute stage 'Misc configuration': Wrong Master domain or its version: 'SD=829741c6-b01d-44c7-8bf7-3b75ed5fb2f2, pool=00000000-0000-0000-0000-000000000000'


Version-Release number of selected component (if applicable):
ovirt-hosted-engine-setup-1.3.0-0.0.master.20150623153111.git68138d4.el7.noarch

How reproducible:
Always

Steps to Reproduce:
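1. Run the hosted-engine deployment on NFS storage and stop it (CTRL+D) after the OS installation.
2. Kill the VM (vdsClient -s 0 destroy vm_id).
3. Re-run the deployment (hosted-engine --config-append=answer_file_from_first_deployment).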

Actual results:
Re-deploy of hosted engine fails.

Expected results:
Re-deploy must succeed and continue the deployment with the VM from the first deployment.

Additional info:
See sos report for logs

Comment 1 Simone Tiraboschi 2015-07-29 08:17:10 UTC

The feature was quite interesting when we didn't have the appliance, because the deployment process was really long (hours), so being able to restart from a partially deployed system was a useful way to save time in case of deployment errors.
Now deploying hosted-engine with the engine appliance takes just a few minutes, so it's no longer that relevant.

The trick was booting from the CDROM image and immediately exiting without touching the previously installed OS.
With the appliance this is no longer applicable, because re-deploying will always overwrite the previously written appliance, so the process is not that different from deploying from scratch on a clean storage domain.

At the same time, re-deploying on dirty storage can be risky and error prone, because we don't have full control over the previously, partially created structures.

So, no longer seeing any benefit and seeing it as risky, I'm proposing to stop supporting redeploy on partially deployed storage and simply ask the user to clean up the storage and try again.
Yaniv, what do you think about that?

Comment 2 Yaniv Lavi 2015-08-09 16:24:11 UTC

(In reply to Simone Tiraboschi from comment #1)

In the appliance flow there is no point in this, and since there is no user request, I would not allow partial redeploy for the manual flow either. If requested, we can consider this.

Comment 3 Simone Tiraboschi 2015-08-11 08:37:30 UTC

I completely agree: now, with the appliance, it can be really dangerous because you are going to overwrite the previously deployed engine VM, completely destroying your system. So I prefer to explicitly disallow it, to prevent any possible mistake by someone who runs the setup script again when they need to update.

Comment 4 Artyom 2015-09-01 11:40:51 UTC

Support for the redeploy feature is dropped as of 3.6, according to https://bugzilla.redhat.com/show_bug.cgi?id=1238709#c2
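
For illustration only, the "disallow redeploy on dirty storage" idea from the merged patches could be sketched roughly as below. This is not the code from those patches: the function names storage_is_dirty and check_before_deploy are hypothetical, and the sketch assumes that "dirty" simply means the mounted NFS export already contains any entry, which is likely cruder than what the real setup checks.

```python
import os


def storage_is_dirty(mount_point):
    """Return True if the directory at mount_point contains any entry.

    Assumption: any existing file or directory counts as leftover state
    from a previous (possibly partial) deployment.
    """
    with os.scandir(mount_point) as entries:
        return any(True for _ in entries)


def check_before_deploy(mount_point):
    """Refuse to continue when the target storage is not empty."""
    if storage_is_dirty(mount_point):
        raise RuntimeError(
            "Storage at %s is not empty; clean it up and try again"
            % mount_point
        )
```

With a check like this, re-running the setup against the storage used by a partial deployment stops early with a clear message instead of failing later in the 'Misc configuration' stage.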
