Upgraded engine from 4.3.9 to 4.4.0, host still on 4.3.9, one VM running, everything normal.

Moved one host to maintenance and turned it off.

Reinstalled the host with oVirt Node 4.4.0 rc2.

Within the engine the host is not active. Moved it to maintenance.
Tried to reinstall the host and it failed, telling me to look at

/var/log/ovirt-engine/host-deploy/ovirt-host-deploy-ansible-20200515145913-node0.lab-4115d42e-604c-4ef1-a247-26fa21bafac3.log

Its content is:

2020-05-15 14:59:16 UTC - TASK [Gathering Facts] *********************************************************

2020-05-15 14:59:16 UTC - PLAY RECAP *********************************************************************

node0.lab : ok=0 changed=0 unreachable=1 failed=0 skipped=0 rescued=0 ignored=0

which doesn't really give any valuable information.

The real issue is found in the engine log:

2020-05-15 15:02:23,121Z ERROR [org.ovirt.engine.core.bll.hostdeploy.InstallVdsInternalCommand] (EE-ManagedThreadFactory-engine-Thread-780) [42aa371c-5223-416e-9266-47b105cd528e] ssh-copy-id command failed on host 'node0.lab': Invalid fingerprint SHA256:eWJXTDz5aWN2GZD/Y0RU6yrZSQyhvs4CwvW0Bm8uU0w, expected SHA256:Z8uzhgEqtTEZvYw/q9a/bs2mCNQDksvVh439VtgaPog

The host fingerprint changed because the host was fully reinstalled with a different operating system.

Just fetching the new fingerprint from the host solves the issue, and the system can then be re-deployed.

We need to improve the error handling for a changed fingerprint.
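For anyone who wants to check a host key against the two values in the engine log by hand, here is a minimal Python sketch of how an OpenSSH-style SHA256 fingerprint is derived from a public key line. The function names are made up for illustration, and this is not how the engine itself performs the check. The input is assumed to be a line in the usual "<type> <base64-blob> [comment]" form, for example from /etc/ssh/ssh_host_ed25519_key.pub on the host.

```python
import base64
import hashlib


def openssh_sha256_fingerprint(public_key_line: str) -> str:
    """Return the OpenSSH-style SHA256 fingerprint of a public key line.

    Hypothetical helper: expects "<key-type> <base64-blob> [comment]".
    """
    fields = public_key_line.split()
    if len(fields) < 2:
        raise ValueError("expected '<key-type> <base64-blob> [comment]'")
    # The fingerprint is computed over the decoded key blob, not the text.
    blob = base64.b64decode(fields[1], validate=True)
    digest = hashlib.sha256(blob).digest()
    # OpenSSH prints the digest base64-encoded with the '=' padding removed.
    return "SHA256:" + base64.b64encode(digest).decode("ascii").rstrip("=")


def fingerprint_matches(public_key_line: str, expected: str) -> bool:
    """Compare the computed fingerprint with the one the engine has stored."""
    return openssh_sha256_fingerprint(public_key_line) == expected
```

A reinstall generates new host keys, so the blob changes and the digest with it, which is why the stored value no longer matches.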
(In reply to Sandro Bonazzola from comment #0)
> Upgraded engine from 4.3.9 to 4.4.0, host still on 4.3.9, one VM running,
> everything normal.
>
> Moved one host to maintenance and turned off.
>
> Reinstalled the host with oVirt Node 4.4.0 rc2.
>
> Within the engine the host is not active. Moved to maintenance.
> Tried to reinstall host and it failed telling to look at

This is not a supported flow. The only supported flow for upgrading a host from 4.3 to 4.4 is:

1. Move host to Maintenance
2. Remove host from engine
3. Reinstall OS on the host
4. Add host to engine

>
> /var/log/ovirt-engine/host-deploy/ovirt-host-deploy-ansible-20200515145913-node0.lab-4115d42e-604c-4ef1-a247-26fa21bafac3.log
>
> its content is:
> 2020-05-15 14:59:16 UTC - TASK [Gathering Facts]
> *********************************************************
>
> 2020-05-15 14:59:16 UTC - PLAY RECAP
> *********************************************************************
>
> node0.lab : ok=0 changed=0 unreachable=1 failed=0
> skipped=0 rescued=0 ignored=0
>
> which doesn't really give any valuable information.
>
> The real issue is found in engine log:
>
> 2020-05-15 15:02:23,121Z ERROR
> [org.ovirt.engine.core.bll.hostdeploy.InstallVdsInternalCommand]
> (EE-ManagedThreadFactory-engine-Thread-780)
> [42aa371c-5223-416e-9266-47b105cd528e] ssh-copy-id command failed on
> host 'node0.lab': Invalid fingerprint
> SHA256:eWJXTDz5aWN2GZD/Y0RU6yrZSQyhvs4CwvW0Bm8uU0w, expected
> SHA256:Z8uzhgEqtTEZvYw/q9a/bs2mCNQDksvVh439VtgaPog
>
> host fingerprint changed due to host full reinstall with different operating
> system.
>
> Just fetching the new fingerprint from the host solves the issue and the
> system can then start to be re-deployed.
>
> We need to improve the error handling on fingerprint changed.

I agree, we need to show the above error as an error event in audit_log.
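As a rough illustration of the information such an error event could carry, here is a minimal Python sketch that pulls the host name and both fingerprints out of an engine log line and turns them into a readable message. It is not the engine's code, and the names FingerprintMismatch, find_fingerprint_mismatch and describe are hypothetical. The pattern is modeled only on the single log line quoted in this bug, so other message variants are not covered.

```python
import re
from typing import NamedTuple, Optional


class FingerprintMismatch(NamedTuple):
    host: str
    actual: str
    expected: str


# Assumption: the message always has the shape seen in the quoted log line,
# i.e. "... '<host>': Invalid fingerprint SHA256:<x>, expected SHA256:<y>".
_MISMATCH_RE = re.compile(
    r"'(?P<host>[^']+)': "
    r"Invalid fingerprint (?P<actual>SHA256:[A-Za-z0-9+/]+), "
    r"expected (?P<expected>SHA256:[A-Za-z0-9+/]+)"
)


def find_fingerprint_mismatch(log_line: str) -> Optional[FingerprintMismatch]:
    """Return the mismatch details, or None if the line does not report one."""
    match = _MISMATCH_RE.search(log_line)
    if match is None:
        return None
    return FingerprintMismatch(**match.groupdict())


def describe(mismatch: FingerprintMismatch) -> str:
    """Build a message a user could act on without reading engine.log."""
    return (
        f"Host {mismatch.host} presented SSH fingerprint {mismatch.actual}, "
        f"but the engine expected {mismatch.expected}. "
        "If the host was reinstalled, fetch the new fingerprint and retry."
    )
```

The point is only that the host name and both fingerprints are already in the log line, so they can be shown to the user directly instead of a pointer to the host-deploy log.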
(In reply to Martin Perina from comment #1)
> This is not supported flow, the only supported flow of upgrading host from
> 4.3 to 4.4 is:
>
> 1. Move host to Maintenance
> 2. Remove host from engine
> 3. Reinstall OS on the host
> 4. Add host to engine

Any particular reason for #2 besides the fingerprint? What would it take to fix this limitation? Host removal is... annoying.
(In reply to Michal Skrivanek from comment #2)
> (In reply to Martin Perina from comment #1)
> > This is not supported flow, the only supported flow of upgrading host from
> > 4.3 to 4.4 is:
> >
> > 1. Move host to Maintenance
> > 2. Remove host from engine
> > 3. Reinstall OS on the host
> > 4. Add host to engine
>
> Any particular reason for #2 besides fingerprint? What would it take to fix
> this limitation? Host removal is..annoying

Host removal is just one step. As for the consequences of skipping it, we would need to retest, but here are a few examples:

1. If the host is a hosted engine host and we didn't reinstall it with the hosted engine option set to deploy, we would have an inconsistency between the engine DB and the host.
2. If the host is part of an OVS cluster and we just reinstalled it, we would lose the OVS setup on the host itself, but the host would still be referenced in the OVS database on the engine.

And there might be other issues.

We have never supported changing the OS of a host in maintenance (for example switching from RHV-H to RHEL-H and vice versa), so I really don't see any reason why we should support the even more problematic change from EL7 to EL8.
Moving to MODIFIED: with the fix, the error about the invalid fingerprint is now shown.
Moving milestone due to dependent bug.
Verified on ovirt-engine-4.4.1.10-0.1.el8ev.noarch
This bugzilla is included in oVirt 4.4.1.1 Async release, published on July 13th 2020. Since the problem described in this bug report should be resolved in oVirt 4.4.1.1 release, it has been closed with a resolution of CURRENT RELEASE. If the solution does not work for you, please open a new bug report.