Created attachment 1880802 [details]
ovirt-hosted-engine-setup-20220517234200-cdzlvl.log.gz

Description of problem:
Restore from 4.3 to 4.5 fails.

Version-Release number of selected component (if applicable):
ovirt-hosted-engine-ha-2.5.0-1.el8ev.noarch
ovirt-hosted-engine-setup-2.6.3-1.el8ev.noarch

How reproducible:
Reproduced during the manual testing described below. Now repeating the test to see whether it is consistent.

Steps to Reproduce:
0. Install a 4.3 environment with 3 hosts (done in Jenkins). The HE SD is NFS, though there are other SDs in the setup: 3 iSCSI SDs and 3 Gluster SDs.
1. Migrate the HE VM to host_mixed_1 (the first host in the setup).
2. Set global maintenance.
3. Run 'engine-backup --mode=backup --file=backup_ge-8 --log=log_ge-8_backup' and put the backup file aside.
4. Reprovision host_mixed_1 to RHEL 8.6:
https://rhv-jenkins.rhev-ci-vms.eng.rdu2.redhat.com/job/infra_reprovision_job/3257/
5. Fix the initiator in /etc/iscsi/initiatorname.iscsi after reprovisioning.
6. On the host, add the latest 4.5 repos and run 'yum update -y', then 'yum install ovirt-hosted-engine-setup'.
7. Copy the backup file to the host and run 'hosted-engine --deploy --restore-from-file=backup_ge-8'.

Actual results:
During deployment I set the DC and cluster to the same names as in the backed-up environment. The deployment failed with the error:

"2022-05-18 00:09:54,923+0300 INFO otopi.ovirt_hosted_engine_setup.ansible_utils ansible_utils._process_output:115 TASK [ovirt.ovirt.hosted_engine_setup : Generate the error message from the engine events]
2022-05-18 00:09:55,627+0300 DEBUG otopi.ovirt_hosted_engine_setup.ansible_utils ansible_utils._process_output:109 {'msg': "The task includes an option with an undefined variable. The error was: 'dict object' has no attribute 'id'\n\nThe error appears to be in '/usr/share/ansible/collections/ansible_collections/ovirt/ovirt/roles/hosted_engine_setup/tasks/bootstrap_local_vm/05_add_host.yml': line 233, column 9, but may\nbe elsewhere in the file depending on the exact syntax problem.\n\nThe offending line appears to be:\n\n\n - name: Generate the error message from the engine events\n ^ here\n", '_ansible_no_log': False}
2022-05-18 00:09:55,728+0300 DEBUG otopi.ovirt_hosted_engine_setup.ansible_utils ansible_utils._process_output:109 ignored: [localhost]: FAILED! => {"msg": "The task includes an option with an undefined variable. The error was: 'dict object' has no attribute 'id'\n\nThe error appears to be in '/usr/share/ansible/collections/ansible_collections/ovirt/ovirt/roles/hosted_engine_setup/tasks/bootstrap_local_vm/05_add_host.yml': line 233, column 9, but may\nbe elsewhere in the file depending on the exact syntax problem.\n\nThe offending line appears to be:\n\n\n - name: Generate the error message from the engine events\n ^ here\n"}
2022-05-18 00:09:56,430+0300 INFO otopi.ovirt_hosted_engine_setup.ansible_utils ansible_utils._process_output:115 TASK [ovirt.ovirt.hosted_engine_setup : Fail with error description]
2022-05-18 00:09:57,133+0300 INFO otopi.ovirt_hosted_engine_setup.ansible_utils ansible_utils._process_output:115 skipping: [localhost]
2022-05-18 00:09:58,037+0300 INFO otopi.ovirt_hosted_engine_setup.ansible_utils ansible_utils._process_output:115 TASK [ovirt.ovirt.hosted_engine_setup : Fail with generic error]
2022-05-18 00:09:58,841+0300 DEBUG otopi.ovirt_hosted_engine_setup.ansible_utils ansible_utils._process_output:109 {'msg': 'The host has been set in non_operational status, please check engine logs, more info can be found in the engine logs, fix accordingly and re-deploy.', '_ansible_no_log': False, 'changed': False}
2022-05-18 00:09:58,942+0300 ERROR otopi.ovirt_hosted_engine_setup.ansible_utils ansible_utils._process_output:113 fatal: [localhost]: FAILED! => {"changed": false, "msg": "The host has been set in non_operational status, please check engine logs, more info can be found in the engine logs, fix accordingly and re-deploy."}"

I'm repeating the test and will provide more logs.

Expected results:
Restored with no errors.

Additional info:
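For reference, the backup/restore sequence from the Steps to Reproduce above can be outlined as follows. This is a procedure sketch using the host and file names from this report, not a runnable script; the commands are specific to this environment:

```shell
# On the 4.3 engine VM, with the cluster in global maintenance:
engine-backup --mode=backup --file=backup_ge-8 --log=log_ge-8_backup

# After reprovisioning host_mixed_1 to RHEL 8.6, fixing
# /etc/iscsi/initiatorname.iscsi, and enabling the 4.5 repos:
yum update -y
yum install -y ovirt-hosted-engine-setup

# Copy backup_ge-8 to the host, then restore-deploy:
hosted-engine --deploy --restore-from-file=backup_ge-8
```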
Hi Didi,
I'm repeating the test again; this time the 4.3 setup will be deployed with no iSCSI or Gluster at all. Could you please tell me which logs should be added besides ovirt-hosted-engine-setup-20220517234200-cdzlvl.log.gz and vdsm.log?
The whole /var/log directory on the host. That should include the HE's engine.log inside /var/log/ovirt-hosted-engine-setup/engine-logs-*/
The HE is being deployed with IPv6, but the hostname resolution is dual stack. Java prefers IPv4, so it tries to connect over IPv4, but the HE VM doesn't have a route for that; it only has IPv6. Either:
- explicitly force IPv4 (as I suppose that's what you want to use anyway),
- fix DNS so it returns IPv6-only records, or
- change the Java DNS resolution preference to java.net.preferIPv6Addresses=true in /etc/ovirt-engine/engine.conf.d/
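A minimal sketch of the third alternative, assuming the standard engine.conf.d drop-in mechanism where *.conf files are sourced by the ovirt-engine service; the file name 99-ipv6-preference.conf is hypothetical:

```shell
# /etc/ovirt-engine/engine.conf.d/99-ipv6-preference.conf  (hypothetical file name)
# Appends a JVM system property so Java prefers IPv6 addresses when
# resolving dual-stack hostnames. Restart ovirt-engine afterwards:
#   systemctl restart ovirt-engine
ENGINE_PROPERTIES="${ENGINE_PROPERTIES} java.net.preferIPv6Addresses=true"
```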
Closing for now. Please reopen if this re-appears after applying one of the three alternatives in comment #15.
Successfully upgraded from ovirt-engine-setup-4.3.11.4-0.1.el7.noarch to ovirt-engine-4.5.0.6-0.7.el8ev.noarch, NFS to NFS, using 'hosted-engine --deploy --4 --restore-from-file=/root/nsednev_from_serval14_SPM_rhevm_4_3'.

[ INFO ] Hosted Engine successfully deployed
[ INFO ] Other hosted-engine hosts have to be reinstalled in order to update their storage configuration. From the engine, host by host, please set maintenance mode and then click on the reinstall button, ensuring you choose DEPLOY in the hosted engine tab.
[ INFO ] Please note that the engine VM ssh keys have changed. Please remove the engine VM entry in ssh known_hosts on your clients.

4.3 components on hosts:
ansible-2.9.13-1.el7ae.noarch
ovirt-ansible-repositories-1.1.6-1.el7ev.noarch
ovirt-ansible-hosted-engine-setup-1.0.38-1.el7ev.noarch
ovirt-ansible-engine-setup-1.1.9-1.el7ev.noarch
ovirt-hosted-engine-ha-2.3.6-1.el7ev.noarch
ovirt-hosted-engine-setup-2.3.13-2.el7ev.noarch

Engine 4.3:
ovirt-ansible-engine-setup-1.1.9-1.el7ev.noarch
ovirt-ansible-hosted-engine-setup-1.0.38-1.el7ev.noarch
ovirt-engine-setup-4.3.11.4-0.1.el7.noarch

4.5 components on hosts:
ovirt-hosted-engine-setup-2.6.3-1.el8ev.noarch
ovirt-hosted-engine-ha-2.5.0-1.el8ev.noarch
ansible-collection-ansible-utils-2.3.0-2.2.el8ev.noarch
ansible-collection-ansible-posix-1.3.0-1.2.el8ev.noarch
ansible-core-2.12.2-3.1.el8.x86_64
ovirt-ansible-collection-2.0.3-1.el8ev.noarch
ansible-collection-ansible-netcommon-2.2.0-3.2.el8ev.noarch

Engine 4.5:
ovirt-engine-4.5.0.6-0.7.el8ev.noarch
ansible-collection-ansible-netcommon-2.2.0-3.2.el8ev.noarch
ansible-runner-2.1.3-1.el8ev.noarch
ansible-collection-ansible-utils-2.3.0-2.2.el8ev.noarch
python38-ansible-runner-2.1.3-1.el8ev.noarch
ansible-core-2.12.2-3.1.el8.x86_64
ovirt-ansible-collection-2.0.3-1.el8ev.noarch
ansible-collection-ansible-posix-1.3.0-1.2.el8ev.noarch

I've created bug https://bugzilla.redhat.com/show_bug.cgi?id=2088466 for a warning to be added to deployment to avoid such issues in the future.