Bug 1290073 - engine-setup should warn users running within hosted engine to set to maintenance
Summary: engine-setup should warn users running within hosted engine to set to mainten...
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: ovirt-engine
Classification: oVirt
Component: Setup.Engine
Version: 3.6.1.2
Hardware: Unspecified
OS: Unspecified
Priority: high
Severity: high
Target Milestone: ovirt-4.0.2
Target Release: 4.0.2.1
Assignee: Lev Veyde
QA Contact: Jiri Belka
URL:
Whiteboard:
Duplicates: 1333166
Depends On:
Blocks: 902971 1359844
Reported: 2015-12-09 15:40 UTC by Sandro Bonazzola
Modified: 2016-08-12 14:26 UTC (History)

Fixed In Version:
Clone Of:
Clones: 1359844
Environment:
Last Closed: 2016-08-12 14:26:29 UTC
oVirt Team: Integration
Embargoed:
rule-engine: ovirt-4.0.z+
rule-engine: exception+
ylavi: planning_ack+
dfediuck: devel_ack+
mavital: testing_ack+


Links
System | ID | Private | Priority | Status | Summary | Last Updated
Red Hat Bugzilla | 1356743 | 0 | unspecified | CLOSED | HE: Reports Engine Fails to Start - ovirt-engine-reportsd: ERROR run:532 Error: Directory '@OVIRT_REPORTS_JBOSS_HOME@' ... | 2021-02-22 00:41:40 UTC
oVirt gerrit | 57270 | 0 | master | MERGED | packaging: Add Hosted Engine VM detection | 2020-06-18 05:50:39 UTC
oVirt gerrit | 60264 | 0 | ovirt-engine-4.0 | ABANDONED | packaging: Add Hosted Engine VM detection | 2020-06-18 05:50:38 UTC
oVirt gerrit | 60265 | 0 | ovirt-engine-4.0.1 | ABANDONED | packaging: Add Hosted Engine VM detection | 2020-06-18 05:50:38 UTC
oVirt gerrit | 60268 | 0 | None | MERGED | Revert "packaging: Add Hosted Engine VM detection" | 2020-06-18 05:50:38 UTC
oVirt gerrit | 60275 | 0 | master | MERGED | packaging: Add Hosted Engine VM detection | 2020-06-18 05:50:38 UTC
oVirt gerrit | 61339 | 0 | ovirt-engine-3.6 | MERGED | packaging: Add Hosted Engine VM detection | 2020-06-18 05:50:38 UTC
oVirt gerrit | 61340 | 0 | ovirt-engine-4.0 | MERGED | packaging: Add Hosted Engine VM detection | 2020-06-18 05:50:37 UTC
oVirt gerrit | 61341 | 0 | ovirt-engine-4.0.2 | MERGED | packaging: Add Hosted Engine VM detection | 2020-06-18 05:50:37 UTC

Internal Links: 1356743

Description Sandro Bonazzola 2015-12-09 15:40:57 UTC
Description of problem:
With a Hosted Engine running ovirt-engine 3.6.1.1, I started an upgrade, forgetting to move the cluster to global maintenance.
engine-setup started the upgrade to 3.6.1.2 and while handling the DB upgrade the VM got fenced.

Running engine-setup again shows:

[ ERROR ] Failed to execute stage 'Misc configuration': function getdwhhistorytimekeepingbyvarname(unknown) does not exist
LINE 2:             select * from GetDwhHistoryTimekeepingByVarName(
                                  ^
HINT:  No function matches the given name and argument types. You might need to add explicit type casts.
[ INFO  ] Yum Performing yum transaction rollback

so the engine instance got corrupted.

engine-setup should avoid entering the misc stage and changing data on disk when it is running within a hosted engine that is not in maintenance.


Version-Release number of selected component (if applicable):
ovirt-engine-3.6.1.1
ovirt-engine-setup-3.6.1.2

How reproducible:

Steps to Reproduce:
1. install hosted engine
2. update the engine without moving to maintenance


Actual results:
The hosted engine VM gets fenced, causing data loss.

Expected results:
engine-setup should exit if not in global maintenance


Additional info:

Comment 1 Sandro Bonazzola 2015-12-23 12:18:57 UTC
Let's add a warning that points users to the documentation and asks them to read it before starting the upgrade, and let's make sure the documentation says to move the hosts to global maintenance.

Comment 2 Yedidyah Bar David 2016-03-08 15:52:39 UTC
We added logic to test whether we are running on hosted-engine in bug 1311027. Parts of it can be reused for the current bug.

Comment 3 Yaniv Kaul 2016-04-10 13:15:06 UTC
(Should be proposed for 4.0 before being backported to 3.6.x).

Comment 4 Sandro Bonazzola 2016-04-28 11:26:05 UTC
Do you think QE should check this warning in both 4.0 and 3.6?
Every oVirt bug targeted to 3.6.x must go through master first if not clearly stated that it affects 3.6.x only.

Comment 5 Yaniv Kaul 2016-05-01 07:24:23 UTC
(In reply to Sandro Bonazzola from comment #4)
> Do you think QE should check this warning in both 4.0 and 3.6?

They do, that's part of the reason for the cloning process.

> Every oVirt bug targeted to 3.6.x must go through master first if not
> clearly stated that it affects 3.6.x only.

Comment 6 Simone Tiraboschi 2016-05-05 12:17:40 UTC
*** Bug 1333166 has been marked as a duplicate of this bug. ***

Comment 7 Marina Kalinin 2016-05-05 14:57:39 UTC
If possible, I vote for the engine to check maintenance itself and quit, if not enabled. Users may just skip it.
Or, at least, make the default answer [Abort] rather than [Continue].
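
For illustration only, a minimal Python sketch of a prompt whose default answer is Abort. The function name `ask_continue` and the prompt wording are hypothetical and are not taken from engine-setup; the sketch only shows that an empty or unrecognized answer is treated as Abort.

```python
def ask_continue(read_answer=input):
    """Return True only if the user explicitly answers 'Continue'.

    Hypothetical helper: an empty reply (just pressing Enter) or any
    other input falls back to the safe default, Abort.
    """
    prompt = "Global Maintenance is not set. (Continue, Abort) [Abort]: "
    answer = read_answer(prompt).strip().lower()
    return answer == "continue"
```

Passing the reader function as a parameter keeps the sketch testable without a terminal, for example `ask_continue(lambda prompt: "")` returns `False`.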

Comment 8 Yaniv Lavi 2016-05-05 15:19:18 UTC
(In reply to Marina from comment #7)
> If possible, I vote for the engine to check maintenance itself and quit, if
> not enabled. Users may just skip it.
> Or, at least, make the default answer [Abort] rather than [Continue].

We do not want to add hosted engine specific tests in the setup, since it will make things much more complicated.

Comment 9 Yaniv Lavi 2016-07-05 08:32:40 UTC
Why do we need to test in two streams?

Comment 10 Lev Veyde 2016-07-05 13:22:28 UTC
(In reply to Yaniv Dary from comment #8)
> (In reply to Marina from comment #7)
> > If possible, I vote for the engine to check maintenance itself and quit, if
> > not enabled. Users may just skip it.
> > Or, at least, make the default answer [Abort] rather than [Continue].
> 
> We do not want to add hosted engine specific tests in the setup, since it
> will make things much more complicated.

In this particular case we don't have too many options, as we need to stop the engine-setup itself.

After internal discussions we decided on the approach that Marina suggested, as it is the safest. If the user wants to override the check and continue, he/she is required to add a flag to the answer file.
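
A minimal Python sketch of the check described above, under stated assumptions. The names `SetupValidationError`, `validate_hosted_engine_maintenance` and `override` are hypothetical and are not taken from the merged patch. This bug does not show how engine-setup detects the hosted-engine VM, queries the maintenance state, or names the answer-file flag, so all three are passed in as plain booleans.

```python
class SetupValidationError(RuntimeError):
    """Raised to stop setup before any data on disk is changed."""


def validate_hosted_engine_maintenance(is_hosted_engine,
                                       in_global_maintenance,
                                       override=False):
    """Return True if setup may proceed, otherwise raise.

    Assumptions of this sketch: the caller has already detected whether
    it runs inside the hosted-engine VM, has queried the maintenance
    state, and has read the override flag (e.g. from an answer file).
    """
    if not is_hosted_engine:
        return True  # not a hosted-engine VM, nothing to check
    if in_global_maintenance:
        return True  # HA agents will not fence the VM during setup
    if override:
        return True  # user explicitly accepted the risk
    raise SetupValidationError(
        "Hosted Engine setup detected, but Global Maintenance is not set."
    )
```

Raising during validation, rather than warning later, keeps the failure ahead of any stage that modifies the database or files on disk.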

Comment 11 Sandro Bonazzola 2016-07-15 14:44:16 UTC
As Lev mentioned, we discussed the issue with GSS. Not checking whether we're running on HE without global maintenance will mean data corruption when the hosted engine VM is fenced.

Comment 13 Yaniv Lavi 2016-07-28 12:32:19 UTC
Should this be on qa?

Comment 14 Lev Veyde 2016-07-28 13:19:28 UTC
(In reply to Yaniv Dary from comment #13)
> Should this be on qa?

We just recently merged it, so we need to check if there is a build with the patch available.

Comment 15 Jiri Belka 2016-08-01 11:17:55 UTC
ok, ovirt-engine-setup.noarch 0:4.0.2.3-0.1.el7ev

...
[ ERROR ] It seems that you are running your engine inside of the hosted-engine VM and are not in "Global Maintenance" mode. In that case you should put the system into the "Global Maintenance" mode before running engine-setup, or the hosted-engine HA agent might kill the machine, which might corrupt your data.
[ ERROR ] Failed to execute stage 'Setup validation': Hosted Engine setup detected, but Global Maintenance is not set.
...

