Bug 1932392

Summary: engine-setup fails after 'engine-backup --mode=restore' if the backup was taken on a newer version

Product: [oVirt] ovirt-engine
Component: Backup-Restore.Engine
Version: 4.4.5.6
Hardware: Unspecified
OS: Unspecified
Status: CLOSED CURRENTRELEASE
Severity: low
Priority: medium
Keywords: Regression, ZStream
Target Milestone: ovirt-4.4.8
Target Release: ---
Fixed In Version: ovirt-engine-4.4.7
oVirt Team: Integration
Type: Bug
Last Closed: 2021-08-19 06:23:01 UTC

Reporter: Miguel Martin <mmartinv>
Assignee: Yedidyah Bar David <didi>
QA Contact: Nikolai Sednev <nsednev>
CC: aoconnor, bugs, didi, michal.skrivanek, nsednev
Flags: pm-rhel: ovirt-4.4+, aoconnor: blocker-, pm-rhel: planning_ack+, sbonazzo: devel_ack+, pm-rhel: testing_ack+

Doc Type: Bug Fix
Doc Text: engine-backup now refuses to restore a backup taken by a version newer than the installed one.

Description Miguel Martin 2021-02-24 14:48:38 UTC
Description of problem:

It's not possible to restore a backup: engine-setup fails during the restore flow.

Version-Release number of selected component (if applicable):
4.4.5.6

How reproducible:
Always

Steps to Reproduce:
1. Make a backup of a SHE (self-hosted engine) on version 4.4.5.6 (see the example command below)
2. Follow [1] to restore the backup
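
For reference, a backup of the kind referenced in step 1 is taken with engine-backup on the engine VM. A minimal example; the file names are illustrative, and the command form matches the one used later in comment 24:

~~~
# On the engine VM (file names are examples):
engine-backup --mode=backup --file=engine_backup --log=engine_backup.log
~~~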

Actual results:
The restoration fails


Expected results:
The restoration succeeds

Additional info:

[ INFO  ] Creating/refreshing Engine database schema
[ ERROR ] schema.sh: FATAL: Cannot execute sql command: --file=/usr/share/ovirt-engine/dbscripts/create_views.sql
[ ERROR ] Failed to execute stage 'Misc configuration': Engine schema refresh failed

Comment 3 Miguel Martin 2021-02-24 14:59:26 UTC
From ovirt-engine-setup-20210224151953-m680u8.log:

CREATE TYPE
********* QUERY **********
...skipping...
psql:/usr/share/ovirt-engine/dbscripts/create_views.sql:998: ERROR:  column vm_templates.single_qxl_pci does not exist
LINE 18:     vm_templates.single_qxl_pci AS single_qxl_pci,
             ^
FATAL: Cannot execute sql command: --file=/usr/share/ovirt-engine/dbscripts/create_views.sql

2021-02-24 15:20:37,903+0100 ERROR otopi.plugins.ovirt_engine_setup.ovirt_engine.db.schema schema._misc:530 schema.sh: FATAL: Cannot execute sql command: --file=/usr/share/ovirt-engine/dbscripts/create_views.sql
2021-02-24 15:20:37,904+0100 DEBUG otopi.context context._executeMethod:145 method exception
Traceback (most recent call last):
  File "/usr/lib/python3.6/site-packages/otopi/context.py", line 132, in _executeMethod
    method['method']()
  File "/usr/share/ovirt-engine/setup/bin/../plugins/ovirt-engine-setup/ovirt-engine/db/schema.py", line 532, in _misc
    raise RuntimeError(_('Engine schema refresh failed'))
RuntimeError: Engine schema refresh failed
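
The mismatch behind this failure can be checked directly: the restored (newer) database has already dropped the column that the installed (older) create_views.sql still references. A minimal diagnostic sketch, assuming local access to a database named 'engine' (the name is an assumption):

~~~
# Does the column referenced by the installed create_views.sql still exist?
sudo -u postgres psql -d engine -c "SELECT column_name
  FROM information_schema.columns
  WHERE table_name = 'vm_templates' AND column_name = 'single_qxl_pci';"
# Zero rows: the newer schema from the backup dropped the column, so the
# older create_views.sql fails exactly as in the log above.
~~~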

Comment 4 Yedidyah Bar David 2021-02-25 08:36:46 UTC
Please clarify the flow.

The attached setup log has:

2021-02-24 15:19:54,752+0100 DEBUG otopi.plugins.ovirt_engine_common.base.core.misc misc._init:105 Package: ovirt-engine-4.4.4.7 (4.4.4.7-1.el8)

But also:

2021-02-24 15:20:12,012+0100 DEBUG otopi.plugins.ovirt_engine_common.base.core.uninstall uninstall.getLines:201 getLines /etc/ovirt-engine/uninstall.d/20210219115906-uninstall.conf file_group_versionlock: aggregated_lines = {'/etc/dnf/plugins/versionlock.list': [{'added': 'ovirt-engine-wildfly-22.0.0-1.el8.x86_64'}, {'added': 'ovirt-engine-restapi-4.4.5.6-1.el8.noarch'}, {'added': 'ovirt-engine-webadmin-portal-4.4.5.6-1.el8.noarch'}, {'added': 'ovirt-engine-backend-4.4.5.6-1.el8.noarch'}, {'added': 'ovirt-engine-wildfly-overlay-22.0.0-1.el8.noarch'}, {'added': 'ovirt-engine-dwh-4.4.5.4-1.el8.noarch'}, {'added': 'ovirt-engine-dbscripts-4.4.5.6-1.el8.noarch'}, {'added': 'ovirt-engine-4.4.5.6-1.el8.noarch'}, {'added': 'ovirt-engine-tools-4.4.5.6-1.el8.noarch'}, {'added': 'ovirt-engine-ui-extensions-1.2.4-1.el8.noarch'}, {'added': 'ovirt-engine-tools-backup-4.4.5.6-1.el8.noarch'}, {'added': 'ovirt-engine-extension-aaa-jdbc-1.2.0-1.el8.noarch'}], '/etc/yum/pluginconf.d/versionlock.list': [{'added': 'ovirt-engine-wildfly-22.0.0-1.el8.x86_64'}, {'added': 'ovirt-engine-restapi-4.4.5.6-1.el8.noarch'}, {'added': 'ovirt-engine-webadmin-portal-4.4.5.6-1.el8.noarch'}, {'added': 'ovirt-engine-backend-4.4.5.6-1.el8.noarch'}, {'added': 'ovirt-engine-wildfly-overlay-22.0.0-1.el8.noarch'}, {'added': 'ovirt-engine-dwh-4.4.5.4-1.el8.noarch'}, {'added': 'ovirt-engine-dbscripts-4.4.5.6-1.el8.noarch'}, {'added': 'ovirt-engine-4.4.5.6-1.el8.noarch'}, {'added': 'ovirt-engine-tools-4.4.5.6-1.el8.noarch'}, {'added': 'ovirt-engine-ui-extensions-1.2.4-1.el8.noarch'}, {'added': 'ovirt-engine-tools-backup-4.4.5.6-1.el8.noarch'}, {'added': 'ovirt-engine-extension-aaa-jdbc-1.2.0-1.el8.noarch'}]}

Perhaps you downgraded stuff?

Comment 5 Miguel Martin 2021-02-25 10:08:34 UTC
Maybe the appliance package has not been built for 4.4.5.6?

The hypervisor where I am restoring the backup has the following version:
~~~
rpm -qa |grep ovirt-engine-appliance
ovirt-engine-appliance-4.4-20201221110111.1.el8.x86_64
~~~

Comment 6 Yedidyah Bar David 2021-02-25 10:34:55 UTC
(In reply to Miguel Martin from comment #5)
> Maybe the appliance package has not been built for 4.4.5.6?

Maybe.

> 
> In the hypervisor where I am restoring the backup has the following version:
> ~~~
> rpm -qa |grep ovirt-engine-appliance
> ovirt-engine-appliance-4.4-20201221110111.1.el8.x86_64
> ~~~

And the backup you restore was taken on 4.4.5? That's not supported. You can only restore to a version >= the version used to do the backup. Otherwise, it's essentially a downgrade, which is not supported.

That said, perhaps we should try to fail earlier, and with a clearer error message.

Doing this in engine-backup is rather easy and will also benefit standalone (non-hosted-engine) setups, but it will still happen rather late.

Doing it in hosted-engine, before even starting to create the VM, is more work; I'm not sure it's worth it.
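
A minimal sketch of what such a check in engine-backup could look like. This is not the actual implementation; it assumes the archive carries its version in a 'version' file (as comment 13 below suggests) and that the installed version is read from the ovirt-engine package:

~~~
backup_version=$(tar -xOf "$backup_file" version)
installed_version=$(rpm -q --qf '%{version}' ovirt-engine)
# sort -V orders version strings; refuse if the backup is newer than installed.
newest=$(printf '%s\n%s\n' "$backup_version" "$installed_version" | sort -V | tail -n1)
if [ "$newest" != "$installed_version" ]; then
    # Error wording follows the message later quoted in comment 27.
    echo "FATAL: Backup was created by version '$backup_version' and can not be restored using the installed version $installed_version" >&2
    exit 1
fi
~~~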

Comment 7 Yedidyah Bar David 2021-02-25 10:37:38 UTC
Please make sure that "legit" restore (e.g. from 4.4.4.7 to itself, or from an earlier version) works and then either close this bug, or decide what you want to do, and update severity/milestone accordingly. Thanks.

Comment 8 Yedidyah Bar David 2021-03-03 10:33:41 UTC
(In reply to Yedidyah Bar David from comment #6)
> Doing this in engine-backup is rather easy, will benefit also standalone
> (non-hosted-engine), but will still be rather late.

That's what 113757 does. Changing subject and component accordingly. If you want something else, please open another bug. Thanks!

Comment 9 Miguel Martin 2021-03-03 10:37:20 UTC
(In reply to Yedidyah Bar David from comment #7)
> Please make sure that "legit" restore (e.g. from 4.4.4.7 to itself, or from
> an earlier version) works and then either close this bug, or decide what you
> want to do, and update severity/milestone accordingly. Thanks.

It looks like the hypervisor used a different ovirt repo and that's why it downloaded an appliance version which was lower than the required one.

Either way, it would be nice to fail earlier, with a clearer error message.

Comment 10 Yedidyah Bar David 2021-03-03 10:44:24 UTC
(In reply to Miguel Martin from comment #9)
> Either way, it would be nice to fail earlier, with a clearer error message.

With the current patch, it will fail only slightly earlier, and with a better error message (I hope).

If you want it to fail much earlier (before creating the local VM), please open a bug on ovirt-hosted-engine-setup. I'm not sure it's worth the effort, though: it would require extracting the backed-up version from the backup file (should be easy) and the version inside the appliance (somewhat harder), and comparing them. Also, the "normal" flow runs 'dnf update' inside the engine before restore/engine-setup, which can raise the engine version and make the restore legitimate, so a naive early check would wrongly block it. Accounting for that beforehand as well sounds too hard to me to be worth it (and it is even harder in RHV, where the host and the engine might be subscribed to different channels, etc.).

Comment 11 RHEL Program Management 2021-04-07 08:47:24 UTC
This bug report has Keywords: Regression or TestBlocker.
Since no regressions or test blockers are allowed between releases, it is also being identified as a blocker for this release. Please resolve ASAP.

Comment 12 Nikolai Sednev 2021-06-16 23:39:28 UTC
I suppose that verification should be a general backup and restore on the latest 4.4.7.3-0.3.el8ev?

Comment 13 Yedidyah Bar David 2021-06-17 06:09:24 UTC
(In reply to Nikolai Sednev from comment #12)
> I suppose that verification should be a general backup and restore on the
> latest 4.4.7.3-0.3.el8ev?

For sanity, yes.

Proper verification is something like:

1. Install, set up, and back up an engine with some "new" version.

2. Try to restore this backup with an older (than "new") version that has this bug fixed.

Right now, this means e.g. backing up with 4.4.7.3 and restoring with 4.4.7.1, if you have it.

Alternatively, you can try manually patching the backup file (untar it, edit "version" inside it, tar again) to fake a backup from e.g. 4.4.8 or 4.4.7.5, and then you can verify restoring it with 4.4.7.3.
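
A hedged sketch of that manual patching; the 'version' entry is per the text above, while the paths, archive layout, and faked version string are assumptions:

~~~
mkdir /tmp/fake && cd /tmp/fake
tar -xf /root/engine_backup         # unpack the backup archive
echo '4.4.8.0' > version            # pretend it was taken on 4.4.8
tar -cf /root/engine_backup_fake *  # repack
# Restoring engine_backup_fake with a fixed 4.4.7.3 should now fail with the
# FATAL version-mismatch error.
~~~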

In any case:

With a broken version (prior to 4.4.7), restore will succeed, as long as it's the same minor version (e.g. a 4.3 engine-backup will still refuse to restore a backup taken on 4.4).

With a fixed version, it should fail.

Comment 14 Nikolai Sednev 2021-06-17 07:33:15 UTC
What's the reason for taking a backup on a higher version and trying to restore it on a lower one? It's possible, but the customer should restore on the same or a newer version, always moving forward with the versions...

Comment 15 Yedidyah Bar David 2021-06-17 07:36:39 UTC
(In reply to Nikolai Sednev from comment #14)
> What's the reason for taking a backup on a higher version and trying to restore
> it on a lower one?

Not sure, but apparently people did this - likely by mistake.

> It's possible,

It _was_ possible. This bug is about making sure it's not possible anymore.

> but the customer should restore on the same or a newer version,
> always moving forward with the versions...

Exactly.

Comment 19 RHEL Program Management 2021-06-23 21:40:13 UTC
Target release should be placed once a package build is known to fix an issue. Since this bug is not modified, the target version has been reset. Please use the target milestone to plan a fix for an oVirt release.

Comment 20 Nikolai Sednev 2021-06-23 21:45:10 UTC
(In reply to Nikolai Sednev from comment #18)
> Tried to restore from ovirt-engine-setup-4.4.7.4-0.9.el8ev.noarch with
> ansible-2.9.21-1.el8ae.noarch, to
> ovirt-engine-setup-4.4.7.1-0.9.el8ev.noarch with
> ansible-2.9.21-1.el8ae.noarch.
> I've used an old appliance rhvm-appliance-4.4-20210527.0.el8ev.x86_64, which
> is ovirt-engine-setup-4.4.6.8-0.1.el8ev.noarch with
> ansible-2.9.18-1.el8ae.noarch and
> ovirt-ansible-collection-1.4.2-1.el8ev.noarch, and I upgraded it to
> ovirt-engine-setup-4.4.7.4-0.9.el8ev.noarch with
> ansible-2.9.21-1.el8ae.noarch during restore.
> 
> I did not see anything preventing me from restoring a backup from a new
> version to an older version.
> 
> Moving back to assigned.

Please disregard comment #18 and read this one instead (there I wrote the wrong engine version after upgrade during deployment by mistake; it should have been ovirt-engine-setup-4.4.7.1-0.9.el8ev.noarch instead of ovirt-engine-setup-4.4.7.4-0.9.el8ev.noarch):


Tried to restore from ovirt-engine-setup-4.4.7.4-0.9.el8ev.noarch with ansible-2.9.21-1.el8ae.noarch, to ovirt-engine-setup-4.4.7.1-0.9.el8ev.noarch with ansible-2.9.21-1.el8ae.noarch.
I've used an old appliance, rhvm-appliance-4.4-20210527.0.el8ev.x86_64, which is ovirt-engine-setup-4.4.6.8-0.1.el8ev.noarch with ansible-2.9.18-1.el8ae.noarch and ovirt-ansible-collection-1.4.2-1.el8ev.noarch, and I upgraded it to ovirt-engine-setup-4.4.7.1-0.9.el8ev.noarch with ansible-2.9.21-1.el8ae.noarch during restore.

I did not see anything preventing me from restoring a backup from a new version to an older version.

Moving back to assigned.

Comment 21 Yedidyah Bar David 2021-06-24 05:21:07 UTC
(In reply to Nikolai Sednev from comment #20)

Please clarify your exact steps and attach all relevant logs:

> Tried to restore from ovirt-engine-setup-4.4.7.4-0.9.el8ev.noarch

Not sure what this means.

If you had some engine set up and then upgraded ovirt-engine-setup to 4.4.7.4, that's not enough.

> with
> ansible-2.9.21-1.el8ae.noarch, to

ansible is irrelevant. The only thing relevant is the version of ovirt-engine-tools-backup.
Normally this is version-locked and is upgraded only via engine-setup.
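
A quick hedged way to check both the installed version and its lock, using the versionlock path that appears in the log in comment 4:

~~~
rpm -q ovirt-engine-tools-backup
grep ovirt-engine-tools-backup /etc/dnf/plugins/versionlock.list
~~~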

> ovirt-engine-setup-4.4.7.1-0.9.el8ev.noarch with
> ansible-2.9.21-1.el8ae.noarch.
> I've used an old appliance rhvm-appliance-4.4-20210527.0.el8ev.x86_64, which
> is ovirt-engine-setup-4.4.6.8-0.1.el8ev.noarch with
> ansible-2.9.18-1.el8ae.noarch and
> ovirt-ansible-collection-1.4.2-1.el8ev.noarch, and I upgraded it to
> ovirt-engine-setup-4.4.7.1-0.9.el8ev.noarch with
> ansible-2.9.21-1.el8ae.noarch during restore.

Did you then run engine-setup? And then engine-backup? And then try to restore that backup?

> 
> I did not see anything preventing me from restoring a backup from a new
> version to an older version.

What did you try to do? Please attach logs.

Re hosted-engine:

The current bug is on engine-backup, not on hosted-engine.

Please verify it with a standalone engine.

If you want to verify with a hosted-engine, it will still fail, hopefully with
the expected error message somewhere (please check this), but quite late in
the process.

See also comment 6 and comment 10. If you disagree, please open another bug, on
ovirt-hosted-engine-setup. Thanks.

Comment 22 Nikolai Sednev 2021-06-24 06:51:51 UTC
Steps were described in https://bugzilla.redhat.com/show_bug.cgi?id=1932392#c16.
1. Deploy HE over NFS and get it upgraded to the latest bits, ovirt-engine-setup-4.4.7.4-0.9.el8ev.noarch.
2. Back up the engine while the environment is in global maintenance mode.
3. Reprovision the host that runs the engine and run the restore on that host, with the backup copied to it.

During restore I was using the old appliance rhvm-appliance-4.4-20210527.0.el8ev.x86_64 (our latest appliance available for QA), which is ovirt-engine-setup-4.4.6.8-0.1.el8ev.noarch, so during either the initial deployment or the restore I had to upgrade it to the relevant engine version using an engine upgrade, then continued with the deployment/restore. So I indeed ran engine-setup during deployment/restore; it's the only way to upgrade.

Engine backup and restore on a hosted engine is the same as on a standalone engine; the engine is just the same.
Please also read the description of the bug:
"
Steps to Reproduce:
1. Make a backup of a SHE in 4.4.5.6 version
2. Follow [1] to restore the backup
"
Step 1 clearly says to run this on a SHE (self-hosted engine), so this is what I did.
I can't attach any logs at the moment due to capacity; the environment is already running a different verification.

Comment 23 Yedidyah Bar David 2021-06-24 07:24:49 UTC
(In reply to Nikolai Sednev from comment #22)
> Steps were described in
> https://bugzilla.redhat.com/show_bug.cgi?id=1932392#c16.
> 1. Deploy HE over NFS and get it upgraded to the latest bits,
> ovirt-engine-setup-4.4.7.4-0.9.el8ev.noarch.
> 2. Back up the engine while the environment is in global maintenance mode.
> 3. Reprovision the host that runs the engine and run the restore on that host,
> with the backup copied to it.

That's ok. Good.

> 
> During restore I was using old appliance
> rhvm-appliance-4.4-20210527.0.el8ev.x86_64 (our latest appliance available
> for QA), which is ovirt-engine-setup-4.4.6.8-0.1.el8ev.noarch, so during
> either initial deployment or restore, I had to upgrade it to relevant engine
> version using engine upgrade,

How? This matters.

In comment 16 (not sure why it's hidden), you said you did this by replying 'Yes'
to "Pause the execution after adding this host to the engine". This is pausing the
deployment _after_ restoring the backup, so restore was done with the old version.

If you want to make it pause before restore, you can set ansible var
'he_pause_before_engine_setup' to true, e.g. with:

hosted-engine --deploy --ansible-extra-vars=he_pause_before_engine_setup=true

(I didn't try this).

This is documented in:

https://github.com/oVirt/ovirt-ansible-collection/blob/master/roles/hosted_engine_setup/README.md

> then continued with the deployment/restore. So
> I indeed ran engine-setup during deployment/restore; it's the only way to
> upgrade.
> 
> Engine backup and restore on a hosted engine is the same as on a standalone
> engine; the engine is just the same.

Indeed, but hosted-engine restore takes longer, and the failure happens later.

> Please also read the description of the bug:
> "
> Steps to Reproduce:
> 1. Make a backup of a SHE in 4.4.5.6 version
> 2. Follow [1] to restore the backup
> "
> Step 1 clearly says to run this on a SHE (self-hosted engine),

Indeed. When I decide to change the scope of a bug, I usually do not edit
comment 0 (although recent bugzilla versions allow that), but only change
other fields: product/component/subject. I often also add a new comment
with 'QE:'. In the current bug I didn't; you asked, and I replied; see
comment 13. That one, from my side, is the authoritative definition of the
scope of this bug.

> so this is what
> I did.

Understood.

The current bug is not on hosted-engine, though.

I also find it extremely important to clarify, not only to you, that the
bug is fixed only in _future_ versions. Meaning, if you back up with a
fixed version and restore with a broken version, the restore will not fail.

See above if you still want to verify it on hosted-engine; no problem with
that, but it will take longer and be more complicated, as you already see.

> I can't attach any logs at the moment due to capacity; the environment is
> already running a different verification.

Please always attach the relevant logs right after verification, especially,
but not only, if it fails. Thanks. This simply prevents future questions and
guessing.

Comment 24 Nikolai Sednev 2021-06-24 15:50:43 UTC
Further to my conversation with Yedidyah, I now tried backup and restore with the following steps:
1. I deployed HE ovirt-engine-setup-4.4.7.4-0.9.el8ev.noarch with ovirt-engine-tools-backup-4.4.7.4-0.9.el8ev.noarch, over iSCSI storage on 2 ha-hosts.
2. Then I moved the environment to global maintenance on the alma03 ha-host, on which the engine was running and which was also the SPM.
3. I ran a backup on the engine, "engine-backup --mode=backup --file=nsednev_from_alma03_rhevm_4_4_7_4 --log=Log_nsednev_from_alma03_rhevm_4_4_7_4", and copied the files to my laptop.
4. Reprovisioned alma03 to a clean RHEL 8.4.
5. Copied the "nsednev_from_alma03_rhevm_4_4_7_4" backup file from my laptop to "/root" on alma03.
6. Ran the restore on alma03, "hosted-engine --deploy --ansible-extra-vars=he_pause_before_engine_setup=true --restore-from-file=/root/nsednev_from_alma03_rhevm_4_4_7_4".
7. It doesn't seem like it was stopped before running engine-setup:
[ INFO  ] skipping: [localhost]
[ INFO  ] TASK [redhat.rhv.engine_setup : Run engine-setup with answerfile]
[ INFO  ] changed: [localhost -> nsednev-he-1.qa.lab.tlv.redhat.com]
[ INFO  ] TASK [redhat.rhv.engine_setup : Make sure `ovirt-engine` service is running]
[ INFO  ] ok: [localhost -> nsednev-he-1.qa.lab.tlv.redhat.com]
[ INFO  ] TASK [redhat.rhv.engine_setup : Check if Engine health page is up]
[ INFO  ] ok: [localhost -> nsednev-he-1.qa.lab.tlv.redhat.com]
[ INFO  ] TASK [redhat.rhv.engine_setup : Run engine-config]
[ INFO  ] TASK [redhat.rhv.engine_setup : Restart engine after engine-config]
[ INFO  ] skipping: [localhost]
[ INFO  ] TASK [redhat.rhv.engine_setup : Clean temporary files]
[ INFO  ] changed: [localhost -> nsednev-he-1.qa.lab.tlv.redhat.com]
[ INFO  ] TASK [ovirt.ovirt.hosted_engine_setup : Include after engine-setup custom tasks files for the engine VM]
[ INFO  ] TASK [ovirt.ovirt.hosted_engine_setup : Wait for the engine to reach a stable condition]
nsednev-he-1 ~]# rpm -qa | grep ovirt-engine-setup
ovirt-engine-setup-4.4.6.8-0.1.el8ev.noarch
ovirt-engine-setup-base-4.4.6.8-0.1.el8ev.noarch
ovirt-engine-setup-plugin-ovirt-engine-common-4.4.6.8-0.1.el8ev.noarch
ovirt-engine-setup-plugin-ovirt-engine-4.4.6.8-0.1.el8ev.noarch
ovirt-engine-setup-plugin-websocket-proxy-4.4.6.8-0.1.el8ev.noarch
ovirt-engine-setup-plugin-imageio-4.4.6.8-0.1.el8ev.noarch
ovirt-engine-setup-plugin-cinderlib-4.4.6.8-0.1.el8ev.noarch
ovirt-engine-setup-plugin-vmconsole-proxy-helper-4.4.6.8-0.1.el8ev.noarch
[root@nsednev-he-1 ~]# systemctl status ovirt-engine
● ovirt-engine.service - oVirt Engine
   Loaded: loaded (/usr/lib/systemd/system/ovirt-engine.service; enabled; vendor preset: disabled)
   Active: active (running) since Thu 2021-06-24 18:44:19 IDT; 5min ago
 Main PID: 19990 (ovirt-engine.py)
    Tasks: 209 (limit: 100880)
   Memory: 1.4G
   CGroup: /system.slice/ovirt-engine.service
           ├─19990 /usr/libexec/platform-python /usr/share/ovirt-engine/services/ovirt-engine/ovirt-engine.py --redir>
           └─21028 ovirt-engine --add-modules java.se -server -XX:+TieredCompilation -Xms3958M -Xmx3958M -Xss1M -Djav>

Jun 24 18:44:19 nsednev-he-1.qa.lab.tlv.redhat.com systemd[1]: Starting oVirt Engine...
Jun 24 18:44:19 nsednev-he-1.qa.lab.tlv.redhat.com systemd[1]: Started oVirt Engine.
Jun 24 18:44:20 nsednev-he-1.qa.lab.tlv.redhat.com ovirt-engine.py[19990]: 2021-06-24 18:44:20,799+0300 ovirt-engine:>
Jun 24 18:44:23 nsednev-he-1.qa.lab.tlv.redhat.com ovirt-engine.py[19990]: 2021-06-24 18:44:23,672+0300 ovirt-engine:>

I think that, as discussed, we should move this bug back to ON_QA once we have newer appliance versions with 4.4.7 and 4.4.8.

Comment 25 Yedidyah Bar David 2021-06-27 06:39:38 UTC
Moving to 4.4.8 to be verified then. Verification steps:

1. Deploy HE 4.4.8 and take a backup
2. On a clean/new host, start a deployment with a 4.4.7 appliance (we should have one released by then), using --restore-from-file and the backup from step 1.
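
For step 2, a restore attempt of the form used in comment 24 (the backup file name is a placeholder):

~~~
hosted-engine --deploy --restore-from-file=/root/<backup-taken-on-4.4.8>
~~~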

It should fail with a proper error message from engine-backup (which might not be the final error from --deploy, but should be visible in its logs).

Comment 26 Nikolai Sednev 2021-07-05 10:58:45 UTC
Restore from 4.4.7.6 to 4.4.7.5 fails as expected with: ["FATAL: Backup was created by version '4.4.7.6' and can not be restored using the installed version 4.4.7.5"].
Tested by automation and also manually.
I've tested with rhvm-appliance-4.4-20210628.0.el8ev.x86_64, which contains ovirt-engine-setup-4.4.7.5-0.9.el8ev.noarch.
Normal restore from 4.4.7.5 to 4.4.7.5 works just fine.
I think we may move the bug to verified.

Comment 27 Nikolai Sednev 2021-07-05 15:13:09 UTC
The exact "FATAL: Backup was created by version '4.4.7.6' and can not be restored using the installed version 4.4.7.5" error message looks like this in context:
[ ERROR ] fatal: [localhost -> 192.168.222.167]: FAILED! => {"changed": true, "cmd": "engine-backup --mode=restore --log=/var/log/ovirt-engine/setup/restore-backup-$(date -u +%Y%m%d%H%M%S).log --file=/root/engine_backup --provision-all-databases --restore-permissions", "delta": "0:00:00.347268", "end": "2021-07-05 18:09:15.441939", "msg": "non-zero return code", "rc": 1, "start": "2021-07-05 18:09:15.094671", "stderr": "FATAL: Backup was created by version '4.4.7.6' and can not be restored using the installed version 4.4.7.5", "stderr_lines": ["FATAL: Backup was created by version '4.4.7.6' and can not be restored using the installed version 4.4.7.5"], "stdout": "Start of engine-backup with mode 'restore'\nscope: all\narchive file: /root/engine_backup\nlog file: /var/log/ovirt-engine/setup/restore-backup-20210705150915.log\nPreparing to restore:\n- Unpacking file '/root/engine_backup'", "stdout_lines": ["Start of engine-backup with mode 'restore'", "scope: all", "archive file: /root/engine_backup", "log file: /var/log/ovirt-engine/setup/restore-backup-20210705150915.log", "Preparing to restore:", "- Unpacking file '/root/engine_backup'"]}

Comment 28 Michal Skrivanek 2021-07-08 13:30:01 UTC
> 7. It doesn't seem like it was stopped before running engine-setup:

you claim "--ansible-extra-vars=he_pause_before_engine_setup=true" doesn't actually pause; can you please add logs?
Also, does the hook mechanism documented in the install guide not work either? (https://access.redhat.com/documentation/en-us/red_hat_virtualization/4.4/html-single/installing_red_hat_virtualization_as_a_self-hosted_engine_using_the_command_line/index#Deploying_the_Self-Hosted_Engine_Using_the_CLI_install_RHVM - the enginevm_before_engine_setup hook)

Comment 29 Nikolai Sednev 2021-07-08 16:08:26 UTC
I ran without any hooks and will try to reproduce https://bugzilla.redhat.com/show_bug.cgi?id=1932392#c24 next week.
If it reproduces, I'll open a separate bug.

Comment 30 Nikolai Sednev 2021-07-12 18:03:18 UTC
Retested on rhvm-appliance-4.4-20210705.0.el8ev.x86_64.rpm, which contains ovirt-engine-setup-4.4.7.6-0.11.el8ev.noarch; this time "hosted-engine --deploy --ansible-extra-vars=he_pause_before_engine_setup=true --restore-from-file=/root/nsednev_from_alma03_rhevm_4_4_7_6_0_11" successfully paused the deployment before running engine-setup:

[ INFO  ] TASK [ovirt.ovirt.hosted_engine_setup : Pause execution until /tmp/ansible.s_l_weuy_he_setup_lock is removed, delete it once ready to proceed]

nsednev-he-1 ~]# systemctl status ovirt-engine
● ovirt-engine.service - oVirt Engine
   Loaded: loaded (/usr/lib/systemd/system/ovirt-engine.service; disabled; vendor preset: disabled)
   Active: inactive (dead)

[ INFO  ] ok: [localhost]
[ INFO  ] TASK [ovirt.ovirt.hosted_engine_setup : include_tasks]
[ INFO  ] ok: [localhost]
[ INFO  ] TASK [ovirt.ovirt.hosted_engine_setup : Copy the backup file to the engine VM for restore]
[ INFO  ] changed: [localhost -> 192.168.222.167]
[ INFO  ] TASK [ovirt.ovirt.hosted_engine_setup : Run engine-backup]
.
.
.
[ INFO  ] Hosted Engine successfully deployed
[ INFO  ] Other hosted-engine hosts have to be reinstalled in order to update their storage configuration. From the engine, host by host, please set maintenance mode and then click on reinstall button ensuring you choose DEPLOY in hosted engine tab.
[ INFO  ] Please note that the engine VM ssh keys have changed. Please remove the engine VM entry in ssh known_hosts on your clients.


Tested with:
ovirt-hosted-engine-setup-2.5.1-1.el8ev.noarch
ovirt-hosted-engine-ha-2.4.7-1.el8ev.noarch
ansible-2.9.21-1.el8ae.noarch
ovirt-ansible-collection-1.5.3-1.el8ev.noarch

This time everything worked as designed.

Comment 31 Sandro Bonazzola 2021-08-19 06:23:01 UTC
This bugzilla is included in the oVirt 4.4.8 release, published on August 19th 2021.

Since the problem described in this bug report should be resolved in the oVirt 4.4.8 release, it has been closed with a resolution of CURRENT RELEASE.

If the solution does not work for you, please open a new bug report.