Bug 1881119 - [RFE] Make engine-backup refuse to restore a backup from a version earlier than 4.3.10
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: ovirt-engine
Classification: oVirt
Component: Backup-Restore.Engine
Version: 4.4.1
Hardware: All
OS: Linux
Priority: medium
Severity: high
Target Milestone: ovirt-4.4.3
Target Release: ---
Assignee: Yedidyah Bar David
QA Contact: Petr Matyáš
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2020-09-21 14:56 UTC by Greg Scott
Modified: 2022-08-10 13:07 UTC (History)

Fixed In Version:
Doc Type: Enhancement
Doc Text:
engine-backup now refuses to restore a backup taken with a version earlier than 4.3.10. This prevents a restore that is missing the cinderlib database, because cinderlib database backup was added in 4.3.10.
Clone Of:
Environment:
Last Closed: 2020-11-11 06:41:24 UTC
oVirt Team: Integration
Embargoed:
pm-rhel: ovirt-4.4?
mavital: testing_plan_complete-
pm-rhel: planning_ack?
pm-rhel: devel_ack+
lleistne: testing_ack+


Links
System | ID | Private | Priority | Status | Summary | Last Updated
Red Hat Knowledge Base (Article) | 5398981 | 0 | None | None | None | 2020-09-21 14:56:29 UTC
oVirt gerrit | 111394 | 0 | master | MERGED | packaging: engine-backup: Refuse restoring < 4.3.10 | 2020-11-11 08:17:27 UTC

Description Greg Scott 2020-09-21 14:56:30 UTC
Description of problem:
During an upgrade, after using an old RHVM 4.3 z-stream to backup and restore to 4.4, engine-setup fails because the 4.3.9 and earlier backup did not capture the Cinderlib database. Fair enough; the upgrade should start with 4.3.latest. But after the engine-setup failure, engine-cleanup also blows up, and the only recovery is, wipe and rebuild the 4.4 system from bare metal.

Version-Release number of selected component (if applicable):
4.4

How reproducible:
At will

Steps to Reproduce:
1. Do an engine-backup from a 4.3.9 or earlier RHVM.
2. Restore the backup above to a fresh 4.4.1 RHVM.
3. Run engine-setup. It will fail as expected with a message about unable to login to the Cinderlib database.
4. Run engine-cleanup.

Actual results:
Engine-cleanup fails with an obscure error, leaving the system a mess. The only feasible recovery - wipe and rebuild from bare metal.

Expected results:
Engine-cleanup should put RHVM back to its original state, as if engine-setup had never run.

Additional info:

Comment 1 Michal Skrivanek 2020-09-22 07:21:46 UTC
Didi, as we discussed offline, if it is easier to just add a check for restores from backups done in 4.3.11 then let's do that instead

Comment 3 Yedidyah Bar David 2020-09-22 07:44:12 UTC
(In reply to Greg Scott from comment #0)
> Description of problem:
> During an upgrade, after using an old RHVM 4.3 z-stream to backup and
> restore to 4.4, engine-setup fails because the 4.3.9 and earlier backup did
> not capture the Cinderlib database. Fair enough; the upgrade should start
> with 4.3.latest. But after the engine-setup failure, engine-cleanup also
> blows up, and the only recovery is, wipe and rebuild the 4.4 system from
> bare metal.
> 
> Version-Release number of selected component (if applicable):
> 4.4
> 
> How reproducible:
> At will
> 
> Steps to Reproduce:
> 1. Do an engine-backup from a 4.3.9 or earlier RHVM.
> 2. Restore the backup above to a fresh 4.4.1 RHVM.
> 3. Run engine-setup. It will fail as expected with a message about unable to
> login to the Cinderlib database.
> 4. Run engine-cleanup.
> 
> Actual results:
> Engine-cleanup fails with an obscure error, leaving the system a mess. The
> only feasible recovery - wipe and rebuild from bare metal.

Which obscure error?

IMO it should be saying "Cleanup utility and installed version mismatch", no?

If so, that's by design - the fact that we decided that it's safe for
4.4's 'engine-backup --mode=restore' to accept a backup taken in 4.3,
does not automatically mean that 4.4 engine-cleanup also accepts the same -
to cleanup a (basically) 4.3 system.

> 
> Expected results:
> Engine-cleanup should put RHVM back to its original state, as if
> engine-setup had never run.

See above. I tend to close notabug.

It should be very easy to fix it, if it's indeed the above error -
just allow, like engine-backup, to cleanup a 4.3 setup. But it adds
quite a significant complexity to our support matrix, and I do not
think it's worth it.

I also think it's not that bad, to have to reinstall the machine.
It's a new machine, you do not have to consider very hard if you need
to backup anything or something like that. If you have (even partial)
automation for the installation, it should take ~ 20 minutes and you
can retry.

That said, if you do not really care about fully cleaning it, and only
want to try again - say, take another backup on 4.3 without cinderlib
and restore that one, or whatever other plan you want to try - it's usually
enough, assuming the two backups are "similar enough", to:

1. Stop PG, rm -rf /var/lib/pgsql/data/*
2. Try again to restore

(1.) should be enough to make restoring DBs work, and the previously-restored
files should simply be overwritten by (2.).

Comment 4 Yedidyah Bar David 2020-09-22 08:36:22 UTC
(In reply to Michal Skrivanek from comment #1)
> Didi, as we discussed offline, if it is easier to just add a check for
> restores from backups done in 4.3.11 then let's do that instead

I think this should be enough, didn't test:

https://gerrit.ovirt.org/111394

Comment 5 Greg Scott 2020-09-22 15:20:14 UTC
I don't remember the error from engine-cleanup. I made it worse by trying to clean it up myself by hand. It took me a while to figure out that recovering from the mess I made was a can of worms. I ended up wiping my messed up 4.4 system and rebuilding it from scratch. The upgrade went smoothly after updating my old 4.3 RHVM to 4.3.10.

If it's easy to make 4.4 engine-backup refuse to restore a backup taken earlier than 4.3.10, that solution seems fine to me. Best way to deal with a can of worms is, avoid opening it in the first place.

Want me to change the BZ title to "Make engine-backup refuse to restore a backup from a version earlier than 4.3.10"?

- Greg

Comment 6 Yedidyah Bar David 2020-09-23 05:43:24 UTC
Very well. Changing summary.

Reproduction/Verification:

1. Setup 4.3.0 <= engine <= 4.3.9
2. Backup
3. Restore on 4.4

In a broken build, it will succeed.

In a fixed build, it will emit something like:

FATAL: Backup was created by version '4.3.9' and can not be restored using the installed version 4.4.3

Please note that it will fail the same way, and at the same step, also in HE restore/upgrade - meaning, only after the local VM is created etc. and we try to restore. Feel free to open another bug on ovirt-hosted-engine-setup to make it check this by itself and fail earlier, if you think we should. Thanks.
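
For illustration only, the version check described in this comment could be sketched roughly as below. This is a Python sketch under stated assumptions, not the engine-backup code itself: the names MIN_RESTORABLE, parse_version and check_restorable are hypothetical, and only the 4.3.10 threshold and the FATAL message wording come from this bug.

```python
# Illustrative sketch only: the names and structure here are hypothetical
# and are not taken from engine-backup.
MIN_RESTORABLE = (4, 3, 10)  # cinderlib database backup was added in 4.3.10


def parse_version(version):
    """Turn '4.3.6.7' into (4, 3, 6, 7) so versions compare numerically."""
    return tuple(int(part) for part in version.split("."))


def check_restorable(backup_version, installed_version):
    """Return None if the backup may be restored, else a FATAL message."""
    if parse_version(backup_version) < MIN_RESTORABLE:
        return (
            f"FATAL: Backup was created by version '{backup_version}' "
            f"and can not be restored using the installed version "
            f"{installed_version}"
        )
    return None


if __name__ == "__main__":
    for backup in ("4.3.9", "4.3.10"):
        print(backup, "->", check_restorable(backup, "4.4.3") or "OK")
```

The versions are compared as tuples of integers rather than as strings, because a plain string comparison would sort '4.3.10' before '4.3.9'.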

Comment 7 Petr Matyáš 2020-10-15 10:12:39 UTC
Verified on ovirt-engine-tools-backup-4.4.3.6-0.13.el8ev.noarch

FATAL: Backup was created by version '3.6.6.2' and can not be restored using the installed version 4.4.3.6

Comment 8 Yedidyah Bar David 2020-10-15 10:23:25 UTC
(In reply to Petr Matyáš from comment #7)
> Verified on ovirt-engine-tools-backup-4.4.3.6-0.13.el8ev.noarch
> 
> FATAL: Backup was created by version '3.6.6.2' and can not be restored using
> the installed version 4.4.3.6

That's not real verification - you'd have gotten a very similar error also in e.g. 4.4.1, when trying to restore 3.6.6.2 backups.
But, it _would_ allow you to restore any 4.3, since bug 1812906.

The change in this bug is to require 4.3.10. So in order to verify, you should try to restore e.g. a 4.3.9 backup and see that it fails (and 4.3.10 and see that it succeeds).
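
To make the point above concrete, here is a small Python sketch of which backup versions are treated differently before and after the change. The two thresholds come from this bug (any 4.3 backup was accepted before, 4.3.10 is required now); the function names are hypothetical and this is not engine-backup code.

```python
# Hypothetical sketch: which backup versions actually exercise the new check?
OLD_MINIMUM = (4, 3)      # any 4.3 backup was accepted (bug 1812906)
NEW_MINIMUM = (4, 3, 10)  # minimum required by this bug


def parse_version(version):
    """Turn '4.3.9' into (4, 3, 9) so versions compare numerically."""
    return tuple(int(part) for part in version.split("."))


def accepted(version, minimum):
    """True if a backup from `version` passes the given minimum."""
    return parse_version(version) >= minimum


def verifies_fix(version):
    """True only if old and new builds treat this backup differently."""
    return accepted(version, OLD_MINIMUM) != accepted(version, NEW_MINIMUM)


for v in ("3.6.6.2", "4.3.9", "4.3.10"):
    print(v, verifies_fix(v))
```

A 3.6.6.2 backup is rejected under both thresholds, so it cannot tell a fixed build from a broken one. A 4.3.9 backup can, and a 4.3.10 backup is still worth restoring as a positive control.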

Comment 9 Petr Matyáš 2020-10-15 10:51:11 UTC
Right, I missed that it isn't 4.3.6 but 3.6.6. Let's do this again some time later this year when I manage to deploy 4.3.6 or whichever 4.3.[0-9].

Comment 10 Petr Matyáš 2020-10-15 13:03:53 UTC
FATAL: Backup was created by version '4.3.6.7' and can not be restored using the installed version 4.4.3.6

works with backup from version 4.3.10.4

Comment 11 Sandro Bonazzola 2020-11-11 06:41:24 UTC
This bugzilla is included in oVirt 4.4.3 release, published on November 10th 2020.

Since the problem described in this bug report should be resolved in oVirt 4.4.3 release, it has been closed with a resolution of CURRENT RELEASE.

If the solution does not work for you, please open a new bug report.

Comment 12 meital avital 2022-08-10 13:07:59 UTC
Due to QE capacity, we are not going to cover this issue in our automation

