Bug 1784010 - [rhv-4.4.0-9] Right after adding host to engine - Failed to execute Ansible host-deploy role: null with host unreachable
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: ovirt-engine
Classification: oVirt
Component: ovirt-host-deploy-ansible
Version: 4.4.0
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: urgent
Target Milestone: ovirt-4.4.0
Target Release: ---
Assignee: Dana
QA Contact: Petr Matyáš
URL:
Whiteboard:
Depends On:
Blocks: 1766595
 
Reported: 2019-12-16 13:46 UTC by Avihai
Modified: 2020-05-20 20:04 UTC

Fixed In Version:
Doc Type: No Doc Update
Doc Text:
Clone Of:
Environment:
Last Closed: 2020-05-20 20:04:03 UTC
oVirt Team: Infra
Embargoed:
pm-rhel: ovirt-4.4+


Attachments
engine vdsm and host deploy logs (374.64 KB, application/gzip)
2019-12-16 13:46 UTC, Avihai
ansible_runner_error_log (218.99 KB, text/plain)
2019-12-19 06:34 UTC, Avihai


Links
System ID Private Priority Status Summary Last Updated
oVirt gerrit 105850 0 master MERGED engine: add selinux rule 2020-08-17 14:37:07 UTC
oVirt gerrit 106063 0 master MERGED fix typo 2020-08-17 14:36:56 UTC

Description Avihai 2019-12-16 13:46:37 UTC
Created attachment 1645593 [details]
engine vdsm and host deploy logs

Description of problem:
Engine upgraded from rhv-4.3.8.1 to rhv-4.4.0.9.
Host reprovisioned to rhel8.1 and upgraded according to 4.4.0.9 build mail.

Adding a host to the engine fails at the ovirt-host-deploy stage: it fails immediately after host-deploy starts running, with "No route to host" errors.
I checked, and connectivity between the engine and the host is good (both ping and ssh).

I tried with 2 hosts and saw the same issue.

On both hosts ping/ssh connectivity from engine to host is OK:

[root@storage-ge-08 ~]# ping storage-ge8-vdsm2.scl.lab.tlv.redhat.com
PING storage-ge8-vdsm2.scl.lab.tlv.redhat.com (10.35.82.80) 56(84) bytes of data.
64 bytes from storage-ge8-vdsm2.scl.lab.tlv.redhat.com (10.35.82.80): icmp_seq=1 ttl=63 time=1.07 ms
64 bytes from storage-ge8-vdsm2.scl.lab.tlv.redhat.com (10.35.82.80): icmp_seq=2 ttl=63 time=0.772 ms
^C
--- storage-ge8-vdsm2.scl.lab.tlv.redhat.com ping statistics ---
2 packets transmitted, 2 received, 0% packet loss, time 1001ms
rtt min/avg/max/mdev = 0.772/0.925/1.079/0.156 ms
[root@storage-ge-08 ~]# ssh root.lab.tlv.redhat.com
Activate the web console with: systemctl enable --now cockpit.socket

Last login: Mon Dec 16 13:48:54 2019 from 10.35.162.7
[root@storage-ge8-vdsm2 ~]# 
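
A successful ping and an interactive ssh login do not by themselves show that every TCP connection the engine needs can be opened. The following is a minimal sketch of a per-port check, not something taken from the engine code. The helper name `check_tcp` is hypothetical, and the ports in the usage comment are assumptions: 22 for SSH and 54321 as the default VDSM management port (this report does not state which port was refused).

```python
import socket


def check_tcp(host, port, timeout=3.0):
    """Try to open a TCP connection and report whether it succeeded.

    Returns a (reachable, detail) tuple. Any OSError (connection refused,
    no route to host, timeout, name resolution failure) counts as a failure.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True, "connected"
    except OSError as exc:
        return False, f"{type(exc).__name__}: {exc}"


# Example usage (host name from this report; ports are assumptions):
#   for port in (22, 54321):
#       print(port, check_tcp("storage-ge8-vdsm2.scl.lab.tlv.redhat.com", port))
```

Since the linked fix is an SELinux rule on the engine side, a check like this run from an interactive root shell might still succeed while the deploying service itself is blocked, so a passing result here would not rule the problem out.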


Engine log:
2019-12-16 13:40:17,458+02 INFO  [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector] (EE-ManagedThreadFactory-engine-Thread-3737) [34f4b46a] EVENT_ID: VDS_ANSIBLE_INSTALL_STARTED(560), Ansible host-deploy playbook execution has started on host host_mixed_2.
2019-12-16 13:40:19,046+02 INFO  [org.ovirt.vdsm.jsonrpc.client.reactors.ReactorClient] (SSL Stomp Reactor) [5b933867] Connecting to storage-ge8-vdsm1.scl.lab.tlv.redhat.com/10.35.82.79
2019-12-16 13:40:19,048+02 ERROR [org.ovirt.engine.core.vdsbroker.monitoring.HostMonitoring] (EE-ManagedScheduledExecutorService-engineScheduledThreadPool-Thread-62) [] Unable to RefreshCapabilities: NoRouteToHostException: No route to host
2019-12-16 13:40:19,049+02 ERROR [org.ovirt.engine.core.vdsbroker.vdsbroker.GetCapabilitiesAsyncVDSCommand] (EE-ManagedScheduledExecutorService-engineScheduledThreadPool-Thread-62) [] Command 'GetCapabilitiesAsyncVDSCommand(HostName = host_mixed_11, VdsIdAndVdsVDSCommandParametersBase:{hostId='9d3b15b0-0723-4671-b88a-80de4c666ec0', vds='Host[host_mixed_11,9d3b15b0-0723-4671-b88a-80de4c666ec0]'})' execution failed: java.net.NoRouteToHostException: No route to host
2019-12-16 13:40:22,056+02 INFO  [org.ovirt.vdsm.jsonrpc.client.reactors.ReactorClient] (SSL Stomp Reactor) [5b933867] Connecting to storage-ge8-vdsm1.scl.lab.tlv.redhat.com/10.35.82.79
2019-12-16 13:40:22,058+02 ERROR [org.ovirt.engine.core.vdsbroker.monitoring.HostMonitoring] (EE-ManagedScheduledExecutorService-engineScheduledThreadPool-Thread-80) [] Unable to RefreshCapabilities: NoRouteToHostException: No route to host
2019-12-16 13:40:22,059+02 ERROR [org.ovirt.engine.core.vdsbroker.vdsbroker.GetCapabilitiesAsyncVDSCommand] (EE-ManagedScheduledExecutorService-engineScheduledThreadPool-Thread-80) [] Command 'GetCapabilitiesAsyncVDSCommand(HostName = host_mixed_11, VdsIdAndVdsVDSCommandParametersBase:{hostId='9d3b15b0-0723-4671-b88a-80de4c666ec0', vds='Host[host_mixed_11,9d3b15b0-0723-4671-b88a-80de4c666ec0]'})' execution failed: java.net.NoRouteToHostException: No route to host
2019-12-16 13:40:24,071+02 ERROR [org.ovirt.engine.core.bll.hostdeploy.InstallVdsInternalCommand] (EE-ManagedThreadFactory-engine-Thread-3737) [34f4b46a] Host installation failed for host '1e38ff69-3c73-4690-8eac-ea20300bdc12', 'host_mixed_2': Failed to execute Ansible host-deploy role: null. Please check logs for more details: /var/log/ovirt-engine/host-deploy/ovirt-host-deploy-ansible-20191216134017-storage-ge8-vdsm2.scl.lab.tlv.redhat.com-34f4b46a.log
2019-12-16 13:40:24,075+02 INFO  [org.ovirt.engine.core.vdsbroker.SetVdsStatusVDSCommand] (EE-ManagedThreadFactory-engine-Thread-3737) [34f4b46a] START, SetVdsStatusVDSCommand(HostName = host_mixed_2, SetVdsStatusVDSCommandParameters:{hostId='1e38ff69-3c73-4690-8eac-ea20300bdc12', status='InstallFailed', nonOperationalReason='NONE', stopSpmFailureLogged='false', maintenanceReason='null'}), log id: 7ad559c0
2019-12-16 13:40:24,082+02 INFO  [org.ovirt.engine.core.vdsbroker.SetVdsStatusVDSCommand] (EE-ManagedThreadFactory-engine-Thread-3737) [34f4b46a] FINISH, SetVdsStatusVDSCommand, return: , log id: 7ad559c0
2019-12-16 13:40:24,092+02 ERROR [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector] (EE-ManagedThreadFactory-engine-Thread-3737) [34f4b46a] EVENT_ID: VDS_INSTALL_FAILED(505), Host host_mixed_2 installation failed. Failed to execute Ansible host-deploy role: null. Please check logs for more details: /var/log/ovirt-engine/host-deploy/ovirt-host-deploy-ansible-20191216134017-storage-ge8-vdsm2.scl.lab.tlv.redhat.com-34f4b46a.log.

Host deploy log: 
cat /var/log/ovirt-engine/host-deploy/ovirt-host-deploy-ansible-20191216134017-storage-ge8-vdsm2.scl.lab.tlv.redhat.com-34f4b46a.log.
cat: /var/log/ovirt-engine/host-deploy/ovirt-host-deploy-ansible-20191216134017-storage-ge8-vdsm2.scl.lab.tlv.redhat.com-34f4b46a.log.: No such file or directory
[root@storage-ge-08 ~]# cat /var/log/ovirt-engine/host-deploy/ovirt-host-deploy-ansible-20191216134017-storage-ge8-vdsm2.scl.lab.tlv.redhat.com-34f4b46a.log
2019-12-16 13:40:24 IST - TASK [Gathering Facts] *********************************************************
2019-12-16 13:40:24 IST - PLAY RECAP *********************************************************************
storage-ge8-vdsm2.scl.lab.tlv.redhat.com : ok=0    changed=0    unreachable=1    failed=0    skipped=0    rescued=0    ignored=0
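
The recap line above is the only place in this log where the failure shows up (`unreachable=1`). As a rough illustration, the counters can be pulled out of a host-deploy log with a short script. This is a sketch that assumes recap lines have the `host : key=value ...` shape seen above; the function names `parse_recap_line` and `unreachable_hosts` are hypothetical and not part of Ansible or oVirt.

```python
import re

# Matches lines such as:
#   host.example.com : ok=0    changed=0    unreachable=1    failed=0 ...
RECAP_RE = re.compile(r"^(?P<host>\S+)\s*:\s*(?P<counters>(?:\w+=\d+\s*)+)$")


def parse_recap_line(line):
    """Return (host, {counter: int}) for a recap line, or None otherwise."""
    match = RECAP_RE.match(line.strip())
    if match is None:
        return None
    counters = {}
    for pair in match.group("counters").split():
        key, value = pair.split("=")
        counters[key] = int(value)
    return match.group("host"), counters


def unreachable_hosts(lines):
    """Collect hosts whose recap line reports unreachable > 0."""
    hosts = []
    for line in lines:
        parsed = parse_recap_line(line)
        if parsed is not None and parsed[1].get("unreachable", 0) > 0:
            hosts.append(parsed[0])
    return hosts
```

Passing the lines of the host-deploy log to `unreachable_hosts` would then list the hosts that Ansible could not reach at all, as opposed to hosts where a task ran and failed.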

Version-Release number of selected component (if applicable):
Engine:
ovirt-engine-4.4.0-0.9.master.el7.noarch
ansible-2.9.2-1.el7ae.noarch

Host:
vdsm-4.40.0-164.git38a19bb.el8ev.x86_64
libvirt-client-5.6.0-6.module+el8.1.0+4244+9aa4e6bb.x86_64
libvirt-lock-sanlock-5.6.0-6.module+el8.1.0+4244+9aa4e6bb.x86_64
qemu-img-4.1.0-14.module+el8.1.0+4548+ed1300f4.x86_64


How reproducible:
100% 

Steps to Reproduce:
1. Upgrade engine and hosts from rhv-4.3.8.1 to rhv-4.4.0.9
2. Add host to engine via webadmin


Actual results:
ovirt-host-deploy fails immediately after host-deploy starts running, with "No route to host" errors.

The host goes to unresponsive.

Expected results:


Additional info:

Comment 1 Dana 2019-12-16 14:56:31 UTC
Hi Avihai,
Can you please copy the content of /etc/ansible-runner-service/config.yaml and the output of journalctl -u ansible-runner-service?
Thanks!

Comment 2 Avihai 2019-12-17 11:12:10 UTC
(In reply to Dana from comment #1)
> Hi Avihai,
> Can you please copy the content of /etc/ansible-runner-service/config.yaml
> and the output of journalctl -u ansible-runner-service?
> Thanks!
Sure, here goes:

[root@storage-ge-08 ~]# cat /etc/ansible-runner-service/config.yaml

version: 1
playbooks_root_dir: '/usr/share/ovirt-engine/ansible-runner-service-project'
ssh_private_key: '/etc/pki/ovirt-engine/keys/engine_id_rsa'
port: 50001
target_user: root
 
[root@storage-ge-08 ~]# journalctl -u ansible-runner-service
-- No entries --

Comment 3 Dana 2019-12-18 15:59:38 UTC
Can you please attach the log from /var/log/httpd/ansible_runner_service_error_log?

Comment 4 Avihai 2019-12-19 06:34:41 UTC
Created attachment 1646320 [details]
ansible_runner_error_log

Comment 5 RHV bug bot 2020-01-08 14:47:41 UTC
INFO: Bug status wasn't changed from MODIFIED to ON_QA due to the following reason:

[Tag 'ovirt-engine-4.4.0' doesn't contain patch 'https://gerrit.ovirt.org/106063']
gitweb: https://gerrit.ovirt.org/gitweb?p=ovirt-engine.git;a=shortlog;h=refs/tags/ovirt-engine-4.4.0

For more info please contact: infra

Comment 6 RHV bug bot 2020-01-08 15:17:06 UTC
INFO: Bug status wasn't changed from MODIFIED to ON_QA due to the following reason:

[Tag 'ovirt-engine-4.4.0' doesn't contain patch 'https://gerrit.ovirt.org/106063']
gitweb: https://gerrit.ovirt.org/gitweb?p=ovirt-engine.git;a=shortlog;h=refs/tags/ovirt-engine-4.4.0

For more info please contact: infra

Comment 7 Petr Matyáš 2020-01-20 15:53:52 UTC
Verified on ovirt-host-deploy-common-1.9.0-0.0.master.20191128124417.gitd2b9fa5.el7ev.noarch

Comment 8 Sandro Bonazzola 2020-05-20 20:04:03 UTC
This bugzilla is included in oVirt 4.4.0 release, published on May 20th 2020.

Since the problem described in this bug report should be
resolved in oVirt 4.4.0 release, it has been closed with a resolution of CURRENT RELEASE.

If the solution does not work for you, please open a new bug report.

