Created attachment 1769246 [details]
logs

Description of problem:
Hosted-engine fails to add the first host.

Version-Release number of selected component (if applicable):
4.4.6.1-0.11.el8ev

How reproducible:
100%

Steps to Reproduce:
1. Provision a host with rhel-8.4
2. Run the 'ovirt-ansible-collection' 'hosted_engine_setup' role

Actual results:
Failed to connect to the Engine FQDN: No route to host

Expected results:
Deployment should succeed

Additional info:
See in attached logs: /var/log/ovirt-hosted-engine-setup/engine-logs-2021-04-05T08:46:51Z/log/ovirt-engine/engine.log

2021-04-05 12:29:41,701+03 ERROR [org.ovirt.engine.core.common.utils.ansible.AnsibleExecutor] (EE-ManagedThreadFactory-engine-Thread-1) [21ae3b75] Exception: sleep interrupted
2021-04-05 12:29:41,717+03 ERROR [org.ovirt.engine.core.bll.hostdeploy.InstallVdsInternalCommand] (EE-ManagedThreadFactory-engine-Thread-1) [21ae3b75] Host installation failed for host '2d95e60b-e8af-4315-bdb3-0b9797a19061', 'host_mixed_1': Failed to execute call to ansible runner service: http://localhost:50001/api/v1/jobs/169a1fca-95f1-11eb-8198-001a4a168bfa/events Connect to localhost:50001 [localhost/127.0.0.1, localhost/0:0:0:0:0:0:0:1] failed: Connection refused (Connection refused)
2021-04-05 12:29:41,871+03 INFO [org.ovirt.engine.core.vdsbroker.SetVdsStatusVDSCommand] (EE-ManagedThreadFactory-engine-Thread-1) [21ae3b75] START, SetVdsStatusVDSCommand(HostName = host_mixed_1, SetVdsStatusVDSCommandParameters:{hostId='2d95e60b-e8af-4315-bdb3-0b9797a19061', status='InstallFailed', nonOperationalReason='NONE', stopSpmFailureLogged='false', maintenanceReason='null'}), log id: 2d72b719
2021-04-05 12:29:41,925+03 INFO [org.ovirt.engine.core.vdsbroker.SetVdsStatusVDSCommand] (EE-ManagedThreadFactory-engine-Thread-1) [21ae3b75] FINISH, SetVdsStatusVDSCommand, return: , log id: 2d72b719
2021-04-05 12:29:41,979+03 ERROR [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector]
(EE-ManagedThreadFactory-engine-Thread-1) [21ae3b75] EVENT_ID: VDS_INSTALL_FAILED(505), Host host_mixed_1 installation failed. Failed to execute call to ansible runner service: http://localhost:50001/api/v1/jobs/169a1fca-95f1-11eb-8198-001a4a168bfa/events Connect to localhost:50001 [localhost/127.0.0.1, localhost/0:0:0:0:0:0:0:1] failed: Connection refused (Connection refused).

Part of console output:

12:27:04 TASK [ovirt.ovirt.hosted_engine_setup : Obtain SSO token using username/password credentials] ***
12:27:05 ok: [lynx09.lab.eng.tlv2.redhat.com]

12:27:05 TASK [ovirt.ovirt.hosted_engine_setup : Wait for the host to be up] ************
12:27:06 FAILED - RETRYING: Wait for the host to be up (120 retries left).
12:27:17 FAILED - RETRYING: Wait for the host to be up (119 retries left).
[... identical retry lines elided: 118 down to 2 retries left, one attempt roughly every 11-14 seconds ...]
12:53:42 FAILED - RETRYING: Wait for the host to be up (1 retries left).
12:53:56 An exception occurred during task execution. To see the full traceback, use -vvv.
The error was: ovirtsdk4.Error: Failed to read response: [(<pycurl.Curl object at 0x5598c6c29b38>, 7, 'Failed to connect to hosted-engine-04.lab.eng.tlv2.redhat.com port 443: No route to host')]
12:53:56 fatal: [lynx09.lab.eng.tlv2.redhat.com]: FAILED! => {"attempts": 120, "changed": false, "msg": "Failed to read response: [(<pycurl.Curl object at 0x5598c6c29b38>, 7, 'Failed to connect to hosted-engine-04.lab.eng.tlv2.redhat.com port 443: No route to host')]"}
12:53:56 ...ignoring

12:53:56 TASK [ovirt.ovirt.hosted_engine_setup : debug] *********************************
12:53:56 ok: [lynx09.lab.eng.tlv2.redhat.com] => {
12:53:56     "host_result_up_check": {
12:53:56         "attempts": 120,
12:53:56         "changed": false,
12:53:56         "exception": "Traceback (most recent call last):\n  File \"/tmp/ansible_ovirt_host_info_payload_u1tz_twi/ansible_ovirt_host_info_payload.zip/ansible_collections/ovirt/ovirt/plugins/modules/ovirt_host_info.py\", line 112, in main\n  File \"/usr/lib64/python3.6/site-packages/ovirtsdk4/services.py\", line 13222, in list\n    return self._internal_get(headers, query, wait)\n  File \"/usr/lib64/python3.6/site-packages/ovirtsdk4/service.py\", line 211, in _internal_get\n    return future.wait() if wait else future\n  File \"/usr/lib64/python3.6/site-packages/ovirtsdk4/service.py\", line 54, in wait\n    response = self._connection.wait(self._context)\n  File \"/usr/lib64/python3.6/site-packages/ovirtsdk4/__init__.py\", line 497, in wait\n    return self.__wait(context, failed_auth)\n  File \"/usr/lib64/python3.6/site-packages/ovirtsdk4/__init__.py\", line 511, in __wait\n    raise Error(\"Failed to read response: {}\".format(err_list))\novirtsdk4.Error: Failed to read response: [(<pycurl.Curl object at 0x5598c6c29b38>, 7, 'Failed to connect to hosted-engine-04.lab.eng.tlv2.redhat.com port 443: No route to host')]\n",
12:53:56         "failed": true,
12:53:56         "msg": "Failed to read response: [(<pycurl.Curl object at 0x5598c6c29b38>, 7, 'Failed to connect to hosted-engine-04.lab.eng.tlv2.redhat.com port 443: No route to host')]"
12:53:56     }
12:53:56 }

12:53:56 TASK [ovirt.ovirt.hosted_engine_setup : Notify the user about a failure] *******
12:53:56 fatal: [lynx09.lab.eng.tlv2.redhat.com]: FAILED! => {"changed": false, "msg": "Host is not up, please check logs, perhaps also on the engine machine"}
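For reference, the failing task is a plain retry loop: the role polls the engine until the host reports "up", giving up after 120 attempts. A minimal shell sketch of that pattern (not the role's actual code - the real check is the ovirt_host_info Ansible module querying the engine REST API; the FQDN below is a placeholder):

```shell
# wait_for CHECK_CMD RETRIES DELAY
# Re-run CHECK_CMD until it succeeds, or give up after RETRIES attempts,
# sleeping DELAY seconds between attempts (mirrors Ansible's until/retries).
wait_for() {
    _cmd=$1
    _retries=$2
    _delay=${3:-10}
    _i=0
    while [ "$_i" -lt "$_retries" ]; do
        if $_cmd; then
            return 0
        fi
        _i=$((_i + 1))
        echo "FAILED - RETRYING ($((_retries - _i)) retries left)." >&2
        sleep "$_delay"
    done
    return 1
}

# In this bug the check can never succeed: every attempt fails with
# "No route to host" because the engine VM is down, so all 120 retries
# are exhausted. A stand-in check (hypothetical FQDN, for illustration):
check_engine() {
    curl -fsk --max-time 10 "https://engine.example.com/ovirt-engine/api" >/dev/null 2>&1
}
# wait_for check_engine 120 10
```

With a dead engine, curl exits with code 7 (failed to connect), which is the same `7` that shows up inside the pycurl error tuple in the traceback above.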
Does it always reproduce at the same exact point? How many times did it fail like that? Can you please get more logs (journal) from the engine being deployed?
Also, does it fail or succeed when deployed using the documented procedure?
(In reply to Michal Skrivanek from comment #2)
> Does it always reproduce at the same exact point? How many times did it fail
> like that?
> can you please get more logs (journal) from the engine being deployed?

Yes, it always reproduces. Please see the attached complete logs from the deployed host. Note that the Engine VM was not created, due to this failure.
(In reply to Roni from comment #4)
> Yes it always reproduces, please see attached complete logs from the
> deployed host,
> note that the Engine VM was not created, due to this failure

Thanks, but can you please answer my other questions: Does it fail or succeed when deployed using the documented procedure? Does it always reproduce at the same exact point (during host deploy, with the start of ovirt-vmconsole-host-sshd being the last message received in the engine)?

There is an engine - the one deploying the host that fails - and the logs included in the report are not sufficient, hence I'd like to ask for a journal, or a sosreport of that environment. It could be that it's removed at the end, after the timeout of 120 retries...
(In reply to Michal Skrivanek from comment #5)
> thanks, but can you please answer my other questions - Does it fail or
> succeed when deployed using the documented procedure? Does it always
> reproduce at the same exact point (during host deploy, start of
> ovirt-vmconsole-host-sshd being the last message received in engine)?
> There is an engine - the one deploying that host that fails - and the logs
> included in the report are not sufficient, hence I'd like to ask for a
> journal - or a sosreport of that environment.

Here is another run example; it fails at the same place:

20:51:14 TASK [ovirt.ovirt.hosted_engine_setup : Obtain SSO token using username/password credentials] ***
20:51:15 ok: [lynx12.lab.eng.tlv2.redhat.com]

20:51:15 TASK [ovirt.ovirt.hosted_engine_setup : Wait for the host to be up] ************
20:51:16 FAILED - RETRYING: Wait for the host to be up (120 retries left).
20:51:27 FAILED - RETRYING: Wait for the host to be up (119 retries left).
[... identical retry lines elided: 118 down to 2 retries left ...]
21:17:50 FAILED - RETRYING: Wait for the host to be up (1 retries left).
21:18:04 An exception occurred during task execution. To see the full traceback, use -vvv.
The error was: ovirtsdk4.Error: Failed to read response: [(<pycurl.Curl object at 0x55db31d56b38>, 7, 'Failed to connect to hosted-engine-02.lab.eng.tlv2.redhat.com port 443: No route to host')]
21:18:04 fatal: [lynx12.lab.eng.tlv2.redhat.com]: FAILED! => {"attempts": 120, "changed": false, "msg": "Failed to read response: [(<pycurl.Curl object at 0x55db31d56b38>, 7, 'Failed to connect to hosted-engine-02.lab.eng.tlv2.redhat.com port 443: No route to host')]"}
21:18:04 ...ignoring

21:18:04 TASK [ovirt.ovirt.hosted_engine_setup : debug] *********************************
21:18:04 ok: [lynx12.lab.eng.tlv2.redhat.com] => {
21:18:04     "host_result_up_check": {
21:18:04         "attempts": 120,
21:18:04         "changed": false,
21:18:04         "exception": "Traceback (most recent call last):\n  File \"/tmp/ansible_ovirt_host_info_payload_04wx5866/ansible_ovirt_host_info_payload.zip/ansible_collections/ovirt/ovirt/plugins/modules/ovirt_host_info.py\", line 112, in main\n  File \"/usr/lib64/python3.6/site-packages/ovirtsdk4/services.py\", line 13222, in list\n    return self._internal_get(headers, query, wait)\n  File \"/usr/lib64/python3.6/site-packages/ovirtsdk4/service.py\", line 211, in _internal_get\n    return future.wait() if wait else future\n  File \"/usr/lib64/python3.6/site-packages/ovirtsdk4/service.py\", line 54, in wait\n    response = self._connection.wait(self._context)\n  File \"/usr/lib64/python3.6/site-packages/ovirtsdk4/__init__.py\", line 497, in wait\n    return self.__wait(context, failed_auth)\n  File \"/usr/lib64/python3.6/site-packages/ovirtsdk4/__init__.py\", line 511, in __wait\n    raise Error(\"Failed to read response: {}\".format(err_list))\novirtsdk4.Error: Failed to read response: [(<pycurl.Curl object at 0x55db31d56b38>, 7, 'Failed to connect to hosted-engine-02.lab.eng.tlv2.redhat.com port 443: No route to host')]\n",
21:18:04         "failed": true,
21:18:04         "msg": "Failed to read response: [(<pycurl.Curl object at 0x55db31d56b38>, 7, 'Failed to connect to hosted-engine-02.lab.eng.tlv2.redhat.com port 443: No route to host')]"
21:18:04     }
21:18:04 }

21:18:04 TASK [ovirt.ovirt.hosted_engine_setup : Notify the user about a failure] *******
21:18:04 fatal: [lynx12.lab.eng.tlv2.redhat.com]: FAILED! => {"changed": false, "msg": "Host is not up, please check logs, perhaps also on the engine machine"}

21:18:04 TASK [ovirt.ovirt.hosted_engine_setup : Sync on engine machine] ****************
21:18:05 changed: [lynx12.lab.eng.tlv2.redhat.com]

21:18:05 TASK [ovirt.ovirt.hosted_engine_setup : Fetch logs from the engine VM] *********
The HE VM appears to be shut down during its host's deployment.
Seems like a change in behavior in RHEL 8.4. It doesn't reproduce on CentOS, not even on CentOS Stream with oVirt.
Actually, it seems to be libvirt services-related rather than an integration issue. It happens during a libvirtd restart (which is what the host deploy role does), and not during a vdsmd restart - where we depend on libvirt-guests, so I was assuming it happens there too, but it doesn't. Marcin, maybe you can take a look, being the last one who "enjoyed" dealing with libvirt systemd changes in RHEL 8 :)
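For background on the libvirt-guests dependency mentioned above: what happens to running guests when the libvirt-guests service stops is governed by /etc/sysconfig/libvirt-guests. A sketch of the relevant keys (values here are illustrative, not necessarily what the affected hosts had configured):

```sh
# /etc/sysconfig/libvirt-guests (illustrative values)
ON_BOOT=ignore          # don't auto-start saved guests at host boot
ON_SHUTDOWN=shutdown    # shut guests down when the libvirt-guests service stops
SHUTDOWN_TIMEOUT=300    # seconds to wait for a graceful guest shutdown
```

The surprising part in this bug is that the guest went down on a plain libvirtd restart, a path this mechanism is not normally expected to cover.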
Actually, the root cause is https://github.com/libvirt/libvirt/commit/f035f53baa2e5dc00b8e866e594672a90b4cea78
(Many thanks to danpb for the quick response.) ...and it was reverted in https://github.com/libvirt/libvirt/commit/32c5e432044689b6679cdedeb1026f27653449d8 after the problem reported in https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=955216
Keeping the bug open for the spec bump (not strictly required, but useful for tracking).
Regular HE deployment works just fine for me on these components:

ovirt-engine-setup-4.4.6.6-0.10.el8ev.noarch
ovirt-hosted-engine-ha-2.4.6-1.el8ev.noarch
ovirt-hosted-engine-setup-2.5.0-2.el8ev.noarch
vdsm-4.40.60.5-1.el8ev.x86_64
Linux 4.18.0-304.el8.x86_64 #1 SMP Tue Apr 6 05:19:59 EDT 2021 x86_64 x86_64 x86_64 GNU/Linux
Red Hat Enterprise Linux release 8.4 (Ootpa)
This bugzilla is included in the oVirt 4.4.6 release, published on May 4th 2021. Since the problem described in this bug report should be resolved in the oVirt 4.4.6 release, it has been closed with a resolution of CURRENT RELEASE. If the solution does not work for you, please open a new bug report.