Description of problem:
When using the ansible role to deploy the hosted engine, the deployment times out during the "Wait for the local VM" task.

How reproducible:
Always

Steps to Reproduce:
1. Follow the steps in the "Automating RHHI for Virtualization deployment" document.

Actual results:

TASK [ovirt.hosted_engine_setup : Wait for the local VM] ******************************************************************************************************************************************************
fatal: [localhost -> zplk1028.adm.siverek.enedis.fr]: FAILED! => {"changed": false, "elapsed": 186, "msg": "timed out waiting for ping module test success: Using a SSH password instead of a key is not possible because Host Key checking is enabled and sshpass does not support this. Please add this host's fingerprint to your known_hosts file to manage this host."}

Expected results:
Deployment succeeds.

Additional info:
This appears to be an issue where ssh-ing in to the VM (which was correctly started) was impossible. As a workaround I logged in to the hosted engine VM from a different window to make sure the host key was in known_hosts. This made the script progress, but it then ran into the next issue...

TASK [ovirt.hosted_engine_setup : Add an entry for this host on /etc/hosts on the local VM] *******************************************************************************************************************
fatal: [localhost]: FAILED! => {"msg": "The task includes an option with an undefined variable. The error was: u\"hostvars['zplk0023']\" is undefined\n\nThe error appears to be in '/usr/share/ansible/roles/ovirt.hosted_engine_setup/tasks/bootstrap_local_vm/03_engine_initial_tasks.yml': line 8, column 5, but may\nbe elsewhere in the file depending on the exact syntax problem.\n\nThe offending line appears to be:\n\n      timeout: 180\n  - name: Add an entry for this host on /etc/hosts on the local VM\n    ^ here\n"}
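The manual workaround described above (logging in once so the VM's host key lands in known_hosts) can also be scripted with ssh-keyscan. A minimal sketch, assuming you already know the local VM's address; `add_host_key` is a hypothetical helper, not part of the role, and the example FQDN is a placeholder:

```shell
# Hypothetical helper for the workaround: pre-populate known_hosts so
# sshpass-based login is not blocked by host key checking.
add_host_key() {
    mkdir -p ~/.ssh
    touch ~/.ssh/known_hosts
    # -H hashes the hostname in the resulting known_hosts entry;
    # ignore failures (e.g. the VM is not reachable yet).
    ssh-keyscan -H "$1" >> ~/.ssh/known_hosts 2>/dev/null || true
}

# Usage (placeholder address, replace with the local VM's FQDN):
add_host_key "hostedengine.example.com"
```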
Can you please provide the output of "rpm -qa | grep ovirt"? Trying to understand which version is affected.
Closing with resolution INSUFFICIENT_DATA. If you can provide the requested information, please reopen.
(In reply to Sandro Bonazzola from comment #1)
> Can you please provide "rpm -qa |grep ovirt" output? trying to understand
> which version is affected.

I see the exact same problem. Here are the details that were requested earlier:

[root@rhsqa-grafton7-nic2 defaults]# rpm -qa | grep ovirt
ovirt-ansible-repositories-1.1.5-1.el7ev.noarch
ovirt-hosted-engine-ha-2.3.5-1.el7ev.noarch
ovirt-vmconsole-1.0.7-3.el7ev.noarch
python-ovirt-engine-sdk4-4.3.2-1.el7ev.x86_64
ovirt-host-deploy-common-1.8.2-1.el7ev.noarch
ovirt-ansible-hosted-engine-setup-1.0.28-1.el7ev.noarch
cockpit-ovirt-dashboard-0.13.8-1.el7ev.noarch
ovirt-vmconsole-host-1.0.7-3.el7ev.noarch
cockpit-machines-ovirt-195-1.el7.noarch
python2-ovirt-node-ng-nodectl-4.3.6-0.20190820.0.el7ev.noarch
ovirt-provider-ovn-driver-1.2.22-1.el7ev.noarch
ovirt-hosted-engine-setup-2.3.12-1.el7ev.noarch
ovirt-node-ng-nodectl-4.3.6-0.20190820.0.el7ev.noarch
ovirt-imageio-daemon-1.5.2-0.el7ev.noarch
ovirt-host-4.3.4-1.el7ev.x86_64
python2-ovirt-setup-lib-1.2.0-1.el7ev.noarch
ovirt-ansible-engine-setup-1.1.9-1.el7ev.noarch
ovirt-host-dependencies-4.3.4-1.el7ev.x86_64
ovirt-imageio-common-1.5.2-0.el7ev.x86_64
python2-ovirt-host-deploy-1.8.2-1.el7ev.noarch
Hi,
In the extra vars the following vars were configured:

{
  "he_appliance_password": "****",
  "he_admin_password": "****",
  "he_domain_type": "glusterfs",
  "he_fqdn": "hostedenginesm3.lab.eng.blr.redhat.com",
  "he_vm_mac_addr": "00:47:55:20:49:01",
  "he_default_gateway": "10.70.37.254",
  "he_mgmt_network": "ovirtmgmt",
  "he_ansible_host_name": "rhsqa-grafton7-nic2.lab.eng.blr.redhat.com",
  "he_storage_domain_name": "HostedEngine",
  "he_storage_domain_path": "/engine",
  "he_storage_domain_addr": "rhsqa-grafton7.lab.eng.blr.redhat.com",
  "he_mount_options": "backup-volfile-servers=rhsqa-grafton8.lab.eng.blr.redhat.com:rhsqa-grafton9.lab.eng.blr.redhat.com",
  "he_bridge_if": "enp129s0f0",
  "he_enable_hc_gluster_service": true,
  "he_mem_size_MB": "16384",
  "he_cluster": "Default"
}

It appears that `he_ansible_host_name` cannot be changed to anything other than the default `localhost`, because the role relies on the Ansible facts, which are gathered for localhost. When the line `"he_ansible_host_name": "rhsqa-grafton7-nic2.lab.eng.blr.redhat.com",` was removed, the deployment progressed.

@simone can you confirm please?
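The workaround of deleting that line can also be done programmatically while keeping the file valid JSON. A hedged sketch: `strip_ansible_host_name` is a hypothetical helper, not part of any shipped tooling, and it uses python3's json module rather than hand-editing the text:

```shell
# Hypothetical helper: drop he_ansible_host_name from an extra-vars JSON
# file in place, leaving every other key untouched.
strip_ansible_host_name() {
    python3 - "$1" <<'EOF'
import json, sys

path = sys.argv[1]
with open(path) as f:
    data = json.load(f)
data.pop("he_ansible_host_name", None)  # harmless if the key is absent
with open(path, "w") as f:
    json.dump(data, f, indent=2)
EOF
}
```

Run it against the extra-vars file before starting the deployment.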
(In reply to Sandro Bonazzola from comment #1)
> Can you please provide "rpm -qa |grep ovirt" output? trying to understand
> which version is affected.

As I have provided the required information, I am removing the needinfo on Krist van Besien.
Removed 'he_ansible_host_name' from the extra vars and the deployment succeeded. The fix needs to be made in the gluster-ansible-roles package, which provides this extra vars file:

[root@ ~]# rpm -qf /etc/ansible/roles/gluster.ansible/playbooks/hc-ansible-deployment/he_gluster_vars.json
gluster-ansible-roles-1.0.5-7.el7rhgs.noarch

So I am moving this bug to the gluster-ansible component.
*** Bug 1755481 has been marked as a duplicate of this bug. ***
Fix is required in gluster-ansible-roles to remove the param 'he_ansible_host_name' from /etc/ansible/roles/gluster.ansible/playbooks/hc-ansible-deployment/he_gluster_vars.json:

[root@ ~]# cat /etc/ansible/roles/gluster.ansible/playbooks/hc-ansible-deployment/he_gluster_vars.json
{
  "he_appliance_password": "encrypt-password-using-ansible-vault",
  "he_admin_password": "UI-password-for-login",
  "he_domain_type": "glusterfs",
  "he_fqdn": "FQDN-for-Hosted-Engine",
  "he_vm_mac_addr": "Valid MAC address",
  "he_default_gateway": "Valid Gateway",
  "he_mgmt_network": "ovirtmgmt",
  "he_ansible_host_name": "host1",          <<------------------- This needs to be removed
  "he_storage_domain_name": "HostedEngine",
  "he_storage_domain_path": "/engine",
  "he_storage_domain_addr": "host1",
  "he_mount_options": "backup-volfile-servers=host2:host3",
  "he_bridge_if": "interface name for bridge creation",
  "he_enable_hc_gluster_service": true,
  "he_mem_size_MB": "4096",
  "he_cluster": "Default"
}
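Until the packaged vars file is fixed, a quick pre-flight check can catch the stale variable before a deployment is started. A minimal sketch; `check_he_vars` is a hypothetical helper, not part of gluster-ansible:

```shell
# Hypothetical pre-flight check: fail if the obsolete he_ansible_host_name
# key is still present in the given vars file.
check_he_vars() {
    if grep -q '"he_ansible_host_name"' "$1" 2>/dev/null; then
        echo "FAIL: remove he_ansible_host_name from $1"
        return 1
    fi
    echo "OK: he_ansible_host_name not set"
}
```

Run it against /etc/ansible/roles/gluster.ansible/playbooks/hc-ansible-deployment/he_gluster_vars.json before deploying.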
@Gobinda, also make sure that the HE VM is allocated 16 GB of RAM, with this change in the vars: "he_mem_size_MB": "16384"
Tested with gluster-ansible-roles-1.0.5-7.el8rhgs. Contents of the he_gluster_vars.json file below:

[root@ ~]# cat /etc/ansible/roles/gluster.ansible/playbooks/hc-ansible-deployment/he_gluster_vars.json
{
  "he_appliance_password": "encrypt-password-using-ansible-vault",
  "he_admin_password": "UI-password-for-login",
  "he_domain_type": "glusterfs",
  "he_fqdn": "FQDN-for-Hosted-Engine",
  "he_vm_mac_addr": "Valid MAC address",
  "he_default_gateway": "Valid Gateway",
  "he_mgmt_network": "ovirtmgmt",
  "he_storage_domain_name": "HostedEngine",
  "he_storage_domain_path": "/engine",
  "he_storage_domain_addr": "host1-backend-network-FQDN",
  "he_mount_options": "backup-volfile-servers=host2-backend-network-FQDN:host3-backend-network-FQDN",
  "he_bridge_if": "interface name for bridge creation",
  "he_enable_hc_gluster_service": true,
  "he_mem_size_MB": "16384",
  "he_cluster": "Default"
}

1. he_ansible_host_name is removed
2. he_mem_size_MB is updated to 16384

Ansible-based CLI deployment is successful.
Removing old needinfo
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHEA-2020:2575