The easiest fix would be to flip the default answer to that question.
With this hook we could continue and HE was deployed successfully:

/usr/share/ansible/roles/ovirt.hosted_engine_setup/hooks/enginevm_before_engine_setup/fixOldSnapshot.yml
~~~
- name: Adding env variable to accept old snapshots
  lineinfile:
    path: /root/ovirt-engine-answers
    line: "OVESETUP_IGNORE_SNAPSHOTS_WITH_OLD_COMPAT_LEVEL=str:yes"
    state: present
- name: Change default answer to accept old snapshots
  lineinfile:
    path: '/usr/share/ovirt-engine/setup/plugins/ovirt-engine-setup/ovirt-engine/db/schema.py'
    backrefs: yes
    regexp: '(\s+)default=False(.*)'
    line: '\1default=True\2'
    state: present
    backup: yes
    owner: root
    group: root
    mode: 0644
~~~
Some notes:

1. One might claim it's not a bug in hosted-engine restore, but in the fact that we repeatedly (I think, didn't try) ask about the same snapshots on each upgrade - that we should mark the ones we prompted about as "confirmed", and not ask about them again.

2. Another way to think about this is "Please remove these snapshots as soon as possible". Do we allow that? Not sure. If so, we can update the prompt like this.

3. We can also (unrelated) allow passing a custom answer file, to ease the workaround.

4. If we do not do any of the above, I'd personally still prefer CLOSE WONTFIX, because I think that changing the default to Yes is risky for users doing an actual upgrade and not noticing it, then breaking their snapshots.
(In reply to Yedidyah Bar David from comment #8)
> Some notes:
>
> 1. One might claim it's not a bug in hosted-engine restore, but in the fact
> that we repeatedly (I think, didn't try) ask about the same snapshots on
> each upgrade - that we should mark the ones we prompted about as
> "confirmed", and do not ask again about them.
>
> 2. Another way to think about this is "Please remove these snapshots as soon
> as possible". Do we allow that? Not sure. If so, we can update the prompt
> like this.

With respect to both of the above, are you referring to the prompt we give in engine-setup when we go to the next version?

> 3. We can also (unrelated) allow passing a custom answer file, to ease the
> workaround.

Right, I tried inserting the answer to this option as OVESETUP_IGNORE_SNAPSHOTS_WITH_OLD_COMPAT_LEVEL via the hook mentioned in comment #3, but for some reason it did not work. Could you confirm whether OVESETUP_IGNORE_SNAPSHOTS_WITH_OLD_COMPAT_LEVEL is the correct parameter to pass in the answers file for resolving this?

> 4. If we do not do any of above, I'd personally still prefer CLOSE WONTFIX,
> because I think that changing the default to Yes is risky for users doing an
> actual upgrade and not noticing it, then breaking their snapshots.

Understood and agreed. However, I was suggesting that we prompt the user with this question if old snapshots are detected, _only_ when we run hosted-engine deploy with "--restore-from-file". IMO, we should at least give the user a choice here, because many times when they restore, users don't have the old manager server to go back to and delete the snapshots. During restore in such situations, we fail because of the default answer.

Let me know your views.
(In reply to Siddhant Rao from comment #9)
> (In reply to Yedidyah Bar David from comment #8)
> > Some notes:
> >
> > 1. One might claim it's not a bug in hosted-engine restore, but in the fact
> > that we repeatedly (I think, didn't try) ask about the same snapshots on
> > each upgrade - that we should mark the ones we prompted about as
> > "confirmed", and do not ask again about them.
> >
> > 2. Another way to think about this is "Please remove these snapshots as soon
> > as possible". Do we allow that? Not sure. If so, we can update the prompt
> > like this.
>
> With respect to both of the above, Are you referring to the prompt we give
> in engine-setup when we go to the next version?.

Yes. AFAIU this is the prompt that is breaking the restore, no?

(Also partially replying to your point below:)

I assume that normally, people do not have to restore to a different version than the one they backed up. So the flow is:

1. Install and set up an old engine, create VMs and snapshots
2. Upgrade to a newer one, be asked about the snapshots and confirm the upgrade
3. Take a backup
4. Try to restore it to the same version

If so, then if we make step (2.) mark somewhere that the user already confirmed the upgrade with these snapshots, there is no reason to ask about them again at (4.).

> > 3. We can also (unrelated) allow passing a custom answer file, to ease the
> > workaround.
>
> right, I tried inserting the answer to this option as
> OVESETUP_IGNORE_SNAPSHOTS_WITH_OLD_COMPAT_LEVEL via the hook mentioned in
> comment #3
> But for some reason it did not work, could you confirm if
> OVESETUP_IGNORE_SNAPSHOTS_WITH_OLD_COMPAT_LEVEL is the correct parameter to
> pass
> in the answers file for resolving this?.

Sorry, no. I do not have an engine right now to test this on. Please try this manually and check the generated answer file.
It should be something like:

QUESTION/1/OVESETUP_IGNORE_SNAPSHOTS_WITH_OLD_COMPAT_LEVEL=str:yes

I am sorry I didn't notice this when reading previously and wasted your time :-(.

> > 4. If we do not do any of above, I'd personally still prefer CLOSE WONTFIX,
> > because I think that changing the default to Yes is risky for users doing an
> > actual upgrade and not noticing it, then breaking their snapshots.
>
> Understood and agreed
> However, I was suggesting if we could prompt the user this question if old
> snapshots are detected, _only_ when we run hosted-engine deploy with
> "--restore-from-file"
> IMO, we should at least give the user a choice here,
> this because many a times many a times when they restore, users don't have
> the old manager server to go back and delete the snapshots.

See the start of my comment. If we do this well (mark these old snapshots as ACKed also for future upgrades), we should be ok. For now, I do not object to making restore-from-file add this option to the answer file. I see why it makes sense.

> During restore in such situations, we fail because of the default answer.
>
>
> Let me know your views.

Also:

1. In the past we did have specific interaction in deploy to affect engine-setup, see e.g. bug 1686445. So in principle we can do this again, although in practice it adds lots of duplication (in hosted-engine deploy and engine-setup), while engine-setup was really simply not designed to be used like that.

2. You can also try to reply 'Yes' to 'Pause the execution after adding this host to the engine?', see bug 1712667 comment 13. Looking at the code, I think it would not have helped, because it makes deploy wait after trying to add the host, while in your case you failed before that. Perhaps we should add another such pause after engine-setup, if the user replied Yes and it failed. Then the user can log in to the engine machine, run engine-setup interactively, and continue (by removing the lock file).
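For reference, applying the corrected key to the hook posted earlier in this thread might look roughly like the following. This is an untested sketch, not a confirmed fix: it assumes /root/ovirt-engine-answers (the path taken from the original hook) is the answer file that engine-setup reads in this flow, it takes the QUESTION/1/ form of the key from the suggestion above, and it drops the task that patched schema.py. The hook file name fixOldSnapshotAnswer.yml is made up for illustration.

/usr/share/ansible/roles/ovirt.hosted_engine_setup/hooks/enginevm_before_engine_setup/fixOldSnapshotAnswer.yml
~~~
# Assumption: /root/ovirt-engine-answers is the answer file used by
# engine-setup during restore (path copied from the original hook).
# Assumption: the QUESTION/1/ prefix is the form engine-setup writes to
# its generated answer file; verify against a generated file first.
- name: Answer yes to the old-snapshots question in the answer file
  lineinfile:
    path: /root/ovirt-engine-answers
    line: "QUESTION/1/OVESETUP_IGNORE_SNAPSHOTS_WITH_OLD_COMPAT_LEVEL=str:yes"
    state: present
~~~

Unlike the original hook, this leaves the shipped default in schema.py untouched, so interactive upgrades would still default to No.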
Did you try updating your workaround with the correct line?

Did it work?

If so, is that ok for you?

I'd also like someone from the storage team to comment. Nir - other than the warning/prompt telling people their old snapshots will not work, can/should we do anything else? Can/Should we mark each such snapshot as "confirmed"? I do not think this can be a single value for all snapshots, because future versions/upgrades might introduce new incompatibilities. E.g. if a future 4.5 version supports only snapshots created by 4.4 and later (just an example), we'll want to prompt again then. Currently we always prompt if we find such snapshots. Can a user remove them safely? If not, can they remove them safely before upgrade? If so, I guess we should simply do nothing, on the assumption that users who want to remove these snapshots must do that before upgrade. Then we need to decide what to do about hosted-engine restore/upgrade.
(In reply to Yedidyah Bar David from comment #11)
> Can/Should we mark each such snapshot as "confirmed"? I
> do not think this can be a single value for all snapshots, because future
> versions/upgrades might introduce new incompatibilities. E.g. if a future
> 4.5 version supports only snapshots created by 4.4 and later (just an
> example), we'll want to prompt again then. Currently we always prompt, if we
> find such snapshots. Can a user remove them, safely? If not, can they remove
> them safely before upgrade? If so, I guess we should simply do nothing, with
> the assumption that users that want to remove these snapshots must do that
> before upgrade. Then we need to decide what to do about hosted-engine
> restore/upgrade.

It doesn't have to be complicated. You can always ask, just default to ignoring them. You can remove them later on too; the only thing the check is telling you is that you won't be able to restore to them.
I'd suggest adding QUESTION/1/OVESETUP_IGNORE_SNAPSHOTS_WITH_OLD_COMPAT_LEVEL=str:yes to either https://github.com/oVirt/ovirt-ansible-engine-setup/blob/master/templates/basic_answerfile.txt.j2 or the upgrade one. Or both, as it makes sense to skip this question in every non-interactive engine-setup execution.
(In reply to Yedidyah Bar David from comment #11)
> Did you try updating your workaround with the correct line?
>
> Did it work?
>

Comment #6 did resolve the issue, but there we actually changed the default from False to True in ovirt-engine/db/schema.py, which amounts to literally changing the source code. I'm not sure that would be feasible every time.

(In reply to Michal Skrivanek from comment #12)
> (In reply to Yedidyah Bar David from comment #11)
> > Can/Should we mark each such snapshot as "confirmed"? I
> > do not think this can be a single value for all snapshots, because future
> > versions/upgrades might introduce new incompatibilities. E.g. if a future
> > 4.5 version supports only snapshots created by 4.4 and later (just an
> > example), we'll want to prompt again then. Currently we always prompt, if we
> > find such snapshots. Can a user remove them, safely? If not, can they remove
> > them safely before upgrade? If so, I guess we should simply do nothing, with
> > the assumption that users that want to remove these snapshots must do that
> > before upgrade. Then we need to decide what to do about hosted-engine
> > restore/upgrade.
>
> it doesn't have tot be complicated. You can always ask, just default to
> ignore them. You can remove them later on too, the only thing the check is
> telling you is that you won't be able to restore to them

Agreed.

(In reply to Michal Skrivanek from comment #13)
> I'd suggest to add that
> QUESTION/1/OVESETUP_IGNORE_SNAPSHOTS_WITH_OLD_COMPAT_LEVEL=str:yes to either
> https://github.com/oVirt/ovirt-ansible-engine-setup/blob/master/templates/
> basic_answerfile.txt.j2 or the upgrade one. Or both, as it makes sense to
> skip in every non-interactive engine-setup executions

Again, agreed.
Will do the build of ovirt-ansible-engine-setup as soon as possible.
(In reply to Yedidyah Bar David from comment #11)
> I'd like also someone from storage team to comment. Nir - other than the
> warning/prompt telling people their old snapshots will not work, can/should
> we do anything else?

I don't know what these snapshots are. Benny, can you help with this?
The fix was released on 12th May (https://github.com/oVirt/ovirt-ansible-engine-setup/releases/tag/1.2.4).
This still fails HE setup. Are there any specific arguments that need to be added when running the deploy from backup?
I think ignoring the old snapshots should be made the default.
Of course I'm running through the regular command-line tool, since as far as I know that is still the default way to install HE.

Using ovirt-hosted-engine-setup-2.4.4-1.el8ev.noarch
So the fix in ovirt-ansible-engine-setup did not help; maybe the issue is somewhere else.
(In reply to Petr Matyáš from comment #18)
> This still fails HE setup, are there any specific arguments that need to be
> added when running the deploy from backup?
> I think it should be made default to ignore the old snapshots.
> Of course I'm running through the regular cmd line tool as I far as I know
> that is still the default way to install HE.
>
> Using ovirt-hosted-engine-setup-2.4.4-1.el8ev.noarch

Please note that the fix was on ovirt-ansible-engine-setup, not ovirt-hosted-engine-setup. Did you use the correct version? If so, please upload relevant logs (probably a sosreport is enough).
I'm using ovirt-ansible-engine-setup-1.2.4-1.el8ev.noarch

I deployed HE, created a couple of VMs and added two snapshots to each, then changed the compat level in the DB for each of those snapshots and verified, by running engine-setup, that it does find snapshots with an old compat level.
After reinstalling the HE host I installed the HE packages, ran 'hosted-engine --deploy --restore-from-file=file.backup' and provided all the necessary information.

As the bug was reported on hosted-engine-setup, I guess verifying the regular flow is in place.
Ok, sorry for bothering Martin, we actually do not use the answer file(s) contained in ovirt-ansible-engine-setup, but rely on the default one provided in the appliance. Moving the bug there.

(In reply to Petr Matyáš from comment #21)
> I'm using ovirt-ansible-engine-setup-1.2.4-1.el8ev.noarch
>
> I deployed HE, created couple VMs and added two snapshots to each, then
> changed compat level in DB for each of those snapshots and verified by
> running engine-setup that it does find snapshots with old compat level.
> After reinstalling the HE host I installed HE packages and ran
> 'hosted-engine --deploy --restore-from-file=file.backup' and provided all
> necessary information.
>
> As the bug was reported on hosted-engine-setup I guess verifying regular
> flow is in place.

I agree, for flows using this ansible role.
Verified on rhvm-appliance-2:4.4-20200707.0.el8ev.x86_64
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA.

For information on the advisory (RHV Appliance (rhvm-appliance) 4.4), and where to find the updated files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHEA-2020:3315