Description of problem:

One of the tasks in hosted-engine deploy is to add the host to the engine, and then wait until it shows as 'Up' in the engine. If after several attempts it's not up, we fail the deploy. If the host's state is "non_operational", we try to get error messages from the engine vm and present these. Otherwise, we do not emit anything concrete. We do always log the result of the check. Before fixing bug 1787267, this looks like:

2019-12-31 14:37:23,095+0900 DEBUG var changed: host "localhost" var "host_result_up_check" type "<type 'dict'>" value: "{ "ansible_facts": { "ovirt_hosts": [] }, "attempts": 120, "changed": false, "deprecations": [ { "msg": "The 'ovirt_host_facts' module has been renamed to 'ovirt_host_info', and the renamed one no longer returns ansible_facts", "version": "2.13" } ], "failed": true }"

2019-12-31 14:37:23,096+0900 ERROR ansible failed {'status': 'FAILED', 'ansible_type': 'task', 'ansible_task': u'Wait for the host to be up', 'ansible_result': u'type: <type \'dict\'>\nstr: {u\'deprecations\': [{u\'msg\': u"The \'ovirt_host_facts\' module has been renamed to \'ovirt_host_info\', and the renamed one no longer returns ansible_facts", u\'version\': u\'2.13\'}], \'_ansible_no_log\': False, u\'changed\': False, \'attempts\': 120, u\'invocation\': {u\'module_args\': {u\'all_content\': False, u\'patt', 'task_duration': 671, 'ansible_host': u'localhost', 'ansible_playbook': u'/usr/share/ovirt-hosted-engine-setup/ansible/trigger_role.yml'}

However, the note about ovirt_host_facts is just a deprecation warning, and is not the reason for failure. The reason for failure, from the POV of this part of the code, is simply that the host did not come up.
The reason for that can usually be diagnosed by checking logs from inside the engine vm, which the script does try to fetch as the next task, which then looks like:

2019-12-31 14:37:23,729+0900 INFO ansible task start {'status': 'OK', 'ansible_task': u'ovirt.hosted_engine_setup : Fetch logs from the engine VM', 'ansible_playbook': u'/usr/share/ovirt-hosted-engine-setup/ansible/trigger_role.yml', 'ansible_type': 'task'}

So we should probably emit something like:

ERROR: host is not up, please check logs, perhaps also on the engine machine
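
To illustrate the idea, here is a rough Python sketch of turning the logged check result into one concise message while ignoring the deprecation notice. It is not the actual ovirt-hosted-engine-setup code: the function name summarize_host_up_check and the exact wording are made up for illustration, and the input is assumed to have the same shape as the host_result_up_check value in the log above.

```python
def summarize_host_up_check(result):
    """Build a short user-facing message from a failed host-up check.

    Hypothetical helper: `result` is assumed to look like the
    `host_result_up_check` dict from the log. The "deprecations" key is
    deliberately ignored, since it is a warning and not the failure cause.
    """
    if not result.get("failed"):
        return None  # the check succeeded, nothing to report

    hosts = result.get("ansible_facts", {}).get("ovirt_hosts", [])
    # Collect any statuses the engine reported, e.g. "non_operational".
    statuses = sorted({str(h["status"]) for h in hosts if h.get("status")})
    if statuses:
        detail = "host status: " + ", ".join(statuses)
    else:
        detail = "the host query returned no results"

    attempts = result.get("attempts", "?")
    return (
        "Host is not up after {} attempts ({}); please check logs, "
        "perhaps also on the engine machine".format(attempts, detail)
    )


# Input modelled on the log excerpt above.
example = {
    "ansible_facts": {"ovirt_hosts": []},
    "attempts": 120,
    "changed": False,
    "deprecations": [
        {"msg": "The 'ovirt_host_facts' module has been renamed to "
                "'ovirt_host_info', and the renamed one no longer returns "
                "ansible_facts", "version": "2.13"}
    ],
    "failed": True,
}
print(summarize_host_up_check(example))
```

The point of the sketch is only that the user-visible error should be derived from "failed" and the host list, not from whatever else happens to be in the result dict.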
Adding my 2 cents worth (after spending half an hour debugging a broken installation due to me being careless, see below).

1. If the user's DNS server cannot resolve the host address and the user (read: me) was stupid enough to miss the "Add lines... to /etc/hosts on the engine VM?" question (or simply answers no), the error message (see below) gives little indication of why the deployment has failed (unless you are driven enough to connect to the semi-dead hosted engine and check what's broken).
2. The "add lines" question should default to "Yes" if the ansible script fails to resolve the host address. (You only get a short warning that the host can only be resolved locally.)
3. If the user (read: me) was stupid enough to select "No" in the previous question and the engine VM fails to resolve the host address, it should show a big red sign saying "You have a broken DNS setup. Are you really, really, really sure you want to continue trying to deploy? 'cause if it breaks, and it will, you'll get to keep all the pieces..."
4. As above, the resulting error message is not very descriptive (to say the least).

Error message:
[ ERROR ] fatal: [localhost]: FAILED! 
=> {"ansible_facts": {"ovirt_vms": [{"affinity_labels": [], "applications": [], "bios": {"boot_menu": {"enabled": false}, "type": "cluster_default"}, "cdroms": [], "cluster": {"href": "/ovirt-engine/api/clusters/1ac7525a-b3d1-11ea-9c7a-00163e57d088", "id": "1ac7525a-b3d1-11ea-9c7a-00163e57d088"}, "comment": "", "cpu": {"architecture": "x86_64", "topology": {"cores": 1, "sockets": 4, "threads": 1}}, "cpu_profile": {"href": "/ovirt-engine/api/cpuprofiles/58ca604e-01a7-003f-01de-000000000250", "id": "58ca604e-01a7-003f-01de-000000000250"}, "cpu_shares": 0, "creation_time": "2020-06-21 11:15:08.207000-04:00", "delete_protected": false, "description": "", "disk_attachments": [], "display": {"address": "127.0.0.1", "allow_override": false, "certificate": {"content": "-----BEGIN CERTIFICATE-----\nMIID3jCCAsagAwIBAgICEAAwDQYJKoZIhvcNAQELBQAwUTELMAkGA1UEBhMCVVMxFDASBgNVBAoM\nC2xvY2FsZG9tYWluMSwwKgYDVQQDDCNnaWxib2Etd3gtdm1vdmlydC5sb2NhbGRvbWFpbi40MTE5\nMTAeFw0yMDA2MjAxNTA3MTFaFw0zMDA2MTkxNTA3MTFaMFExCzAJBgNVBAYTAlVTMRQwEgYDVQQK\nDAtsb2NhbGRvbWFpbjEsMCoGA1UEAwwjZ2lsYm9hLXd4LXZtb3ZpcnQubG9jYWxkb21haW4uNDEx\nOTEwggEiMA0GCSqGSIb3DQEBAQUAA4IBDwAwggEKAoIBAQCUNgcCn28BMlMcadFZPR9JAWjOWyh0\nWMQffOSKUlr7H+6K02IdjCR5K9bR9moAlMA4dNzF/NJa12BlCmDkwOSsgZl+NK/Ut3kqfPp4CqMl\nU3jkJzqRnh0rqOFnQ4Q1tsejziH1MSiH5/eb4A3g2s0awXF6K+JRMp2MB9wYQx//tZrvhTLprK+Y\n9jXdQFZby8j+/9pqIdN7uoYbuqESRNcfIJ0WigJ10/IOAwloT0MASwyVtCRTCCXNE4PRN+Lexlcc\nxXq2QZ0zG8u3leLT6/J87PCP/OEj976fZ19q83stWjygu4+UiWS+QStlrzc1U+aGVxa+sO+9mv3f\n6CwT0clvAgMBAAGjgb8wgbwwHQYDVR0OBBYEFOiEmL8+rz3I4j5rmL+ws47Jv5KiMHoGA1UdIwRz\nMHGAFOiEmL8+rz3I4j5rmL+ws47Jv5KioVWkUzBRMQswCQYDVQQGEwJVUzEUMBIGA1UECgwLbG9j\nYWxkb21haW4xLDAqBgNVBAMMI2dpbGJvYS13eC12bW92aXJ0LmxvY2FsZG9tYWluLjQxMTkxggIQ\nADAPBgNVHRMBAf8EBTADAQH/MA4GA1UdDwEB/wQEAwIBBjANBgkqhkiG9w0BAQsFAAOCAQEAStVI\nhHRrw5aa3YUNcwYh+kQfS47Es12nNRFeVVzbXj9CLS/TloYjyXEyZvFmYyyjNvuj4/3WcQDfeaG6\nTUGoFJ1sleOMT04WYWNJGyvsOfokT+I7yrBsVMg/7vip8UQV0ttmVoY/kMhZufwAUNlsZyh6F2o2\nNpAAcdLoguHo3UCGyaL8pF4G0NOAR
/eV1rpl4VikqehUsXZ1sYzYZfK98xXrmepI42Lt3B2L6f9t\ngzYJ99jsrOGFhgvgV0H+PclviIdz79Jj3ZpPhezHkNQyrp0GOM0rqW+9xy50tlCQJ4rjdrRxnr21\nGpD3ZaQ2KSwGU79pnnRT6m7MSQ8irci3/A==\n-----END CERTIFICATE-----\n", "organization": "localdomain", "subject": "O=localdomain,CN=gilboa-wx-ovirt.localdomain"}, "copy_paste_enabled": true, "disconnect_action": "LOCK_SCREEN", "file_transfer_enabled": true, "monitors": 1, "port": 5900, "single_qxl_pci": false, "smartcard_enabled": false, "type": "vnc"}, "fqdn": "gilboa-wx-vmovirt.localdomain", "graphics_consoles": [], "guest_operating_system": {"architecture": "x86_64", "codename": "", "distribution": "CentOS Linux", "family": "Linux", "kernel": {"version": {"build": 0, "full_version": "4.18.0-147.8.1.el8_1.x86_64", "major": 4, "minor": 18, "revision": 147}}, "version": {"full_version": "8", "major": 8}}, "guest_time_zone": {"name": "EDT", "utc_offset": "-04:00"}, "high_availability": {"enabled": false, "priority": 0}, "host": {"href": "/ovirt-engine/api/hosts/5ca55132-6d20-4a7f-81a8-717095ba8f78", "id": "5ca55132-6d20-4a7f-81a8-717095ba8f78"}, "host_devices": [], "href": "/ovirt-engine/api/vms/60ba9f1a-cdb1-406e-810d-187dbdd7775c", "id": "60ba9f1a-cdb1-406e-810d-187dbdd7775c", "io": {"threads": 1}, "katello_errata": [], "large_icon": {"href": "/ovirt-engine/api/icons/a753f77a-89a4-4b57-9c23-d23bd61ebdaf", "id": "a753f77a-89a4-4b57-9c23-d23bd61ebdaf"}, "memory": 8589934592, "memory_policy": {"guaranteed": 8589934592, "max": 8589934592}, "migration": {"auto_converge": "inherit", "compressed": "inherit", "encrypted": "inherit"}, "migration_downtime": -1, "multi_queues_enabled": true, "name": "external-HostedEngineLocal", "next_run_configuration_exists": false, "nics": [], "numa_nodes": [], "numa_tune_mode": "interleave", "origin": "external", "original_template": {"href": "/ovirt-engine/api/templates/00000000-0000-0000-0000-000000000000", "id": "00000000-0000-0000-0000-000000000000"}, "os": {"boot": {"devices": ["hd"]}, "type": "other"}, 
"permissions": [], "placement_policy": {"affinity": "migratable"}, "quota": {"id": "27d40902-b3d1-11ea-80f7-00163e57d088"}, "reported_devices": [], "run_once": false, "sessions": [], "small_icon": {"href": "/ovirt-engine/api/icons/0676b521-5b2b-4474-9394-8e9e8e3b426f", "id": "0676b521-5b2b-4474-9394-8e9e8e3b426f"}, "snapshots": [], "sso": {"methods": [{"id": "guest_agent"}]}, "start_paused": false, "stateless": false, "statistics": [], "status": "unknown", "storage_error_resume_behaviour": "auto_resume", "tags": [], "template": {"href": "/ovirt-engine/api/templates/00000000-0000-0000-0000-000000000000", "id": "00000000-0000-0000-0000-000000000000"}, "time_zone": {"name": "Etc/GMT"}, "type": "server", "usb": {"enabled": false}, "watchdogs": []}]}, "attempts": 24, "changed": false, "deprecations": [{"msg": "The 'ovirt_vm_facts' module has been renamed to 'ovirt_vm_info', and the renamed one no longer returns ansible_facts", "version": "2.13"}]} [ ERROR ] Failed to execute stage 'Closing up': Failed executing ansible-playbook
Failed to deploy HE with "Host is not up, please check logs, perhaps also on the engine machine":

[ ERROR ] fatal: [localhost]: FAILED! => {"changed": false, "msg": "Host is not up, please check logs, perhaps also on the engine machine"}

ovirt-ansible-hosted-engine-setup-1.1.7-1.el8ev.noarch
rhvm-appliance-4.4-20200722.0.el8ev.x86_64
ovirt-hosted-engine-ha-2.4.4-1.el8ev.noarch
ovirt-hosted-engine-setup-2.4.6-1.el8ev.noarch
Linux 4.18.0-193.14.3.el8_2.x86_64 #1 SMP Mon Jul 20 15:02:29 UTC 2020 x86_64 x86_64 x86_64 GNU/Linux
Red Hat Enterprise Linux release 8.2 (Ootpa)
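
Regarding the DNS suggestions in the earlier comment (default the "add lines to /etc/hosts" question to "Yes" when the host cannot be resolved, and warn loudly if the user still answers "No"), the decision logic could look roughly like the Python sketch below. Everything here is hypothetical: the function names, the "No" default for the working-DNS case, and the warning text are assumptions for illustration, not what the setup script currently does. How resolvable_via_dns gets determined is left out on purpose.

```python
def default_add_hosts_answer(resolvable_via_dns):
    """Suggested default for the 'add lines to /etc/hosts' question.

    Assumption: when DNS works, no entry is needed, so default to "No";
    when DNS cannot resolve the host, default to "Yes".
    """
    return "No" if resolvable_via_dns else "Yes"


def must_confirm(resolvable_via_dns, answer):
    """True if the user declined the /etc/hosts entry although DNS is broken."""
    declined = answer.strip().lower() in ("n", "no")
    return declined and not resolvable_via_dns


# Hypothetical warning text, paraphrasing the suggestion in the comment.
BROKEN_DNS_WARNING = (
    "The host address cannot be resolved via DNS and you chose not to add "
    "it to /etc/hosts on the engine VM. The deployment is likely to fail. "
    "Are you sure you want to continue?"
)

# Example: DNS cannot resolve the host and the user answers "No".
print(default_add_hosts_answer(False))
if must_confirm(False, "No"):
    print(BROKEN_DNS_WARNING)
```

Keeping the decision separate from the actual resolution check makes it easy to cover all four combinations (DNS working or not, user answering yes or no) in a unit test.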
This bugzilla is included in oVirt 4.4.2 release, published on September 17th 2020. Since the problem described in this bug report should be resolved in oVirt 4.4.2 release, it has been closed with a resolution of CURRENT RELEASE. If the solution does not work for you, please open a new bug report.