Description of problem:
-----------------------
When using the Ansible role to deploy the hosted engine, the deployment times out during the "Wait for the local VM" task.

How reproducible:
-----------------
Always

Steps to Reproduce:
-------------------
1. Follow the steps in the "Automating RHHI for Virtualization deployment" document.

Actual results:
---------------
TASK [ovirt.hosted_engine_setup : Wait for the local VM] ******************************************************************************************************************************************************
fatal: [localhost -> zplk1028.adm.siverek.enedis.fr]: FAILED! => {"changed": false, "elapsed": 186, "msg": "timed out waiting for ping module test success: Using a SSH password instead of a key is not possible because Host Key checking is enabled and sshpass does not support this. Please add this host's fingerprint to your known_hosts file to manage this host."}

Expected results:
-----------------
Deployment succeeds

Additional info:
----------------
This appears to be an issue where SSH-ing into the VM (which was started correctly) was impossible. As a workaround, I logged in to the hosted engine VM from a different window to make sure the host key was in known_hosts. This made the script progress, but it then hit the next issue:

TASK [ovirt.hosted_engine_setup : Add an entry for this host on /etc/hosts on the local VM] *******************************************************************************************************************
fatal: [localhost]: FAILED! => {"msg": "The task includes an option with an undefined variable. The error was: u\"hostvars['zplk0023']\" is undefined\n\nThe error appears to be in '/usr/share/ansible/roles/ovirt.hosted_engine_setup/tasks/bootstrap_local_vm/03_engine_initial_tasks.yml': line 8, column 5, but may\nbe elsewhere in the file depending on the exact syntax problem.\n\nThe offending line appears to be:\n\n timeout: 180\n - name: Add an entry for this host on /etc/hosts on the local VM\n ^ here\n"}

A fix is required in gluster-ansible-roles to remove the param 'he_ansible_host_name' from /etc/ansible/roles/gluster.ansible/playbooks/hc-ansible-deployment/he_gluster_vars.json

[root@ ~]# cat /etc/ansible/roles/gluster.ansible/playbooks/hc-ansible-deployment/he_gluster_vars.json
{
  "he_appliance_password": "encrypt-password-using-ansible-vault",
  "he_admin_password": "UI-password-for-login",
  "he_domain_type": "glusterfs",
  "he_fqdn": "FQDN-for-Hosted-Engine",
  "he_vm_mac_addr": "Valid MAC address",
  "he_default_gateway": "Valid Gateway",
  "he_mgmt_network": "ovirtmgmt",
  "he_ansible_host_name": "host1",    <<------------------- This needs to be removed
  "he_storage_domain_name": "HostedEngine",
  "he_storage_domain_path": "/engine",
  "he_storage_domain_addr": "host1",
  "he_mount_options": "backup-volfile-servers=host2:host3",
  "he_bridge_if": "interface name for bridge creation",
  "he_enable_hc_gluster_service": true,
  "he_mem_size_MB": "4096",
  "he_cluster": "Default"
}
@Anjana, this needs to be documented as a known_issue for RHV 4.3.8 based RHHI-V 1.7.
Problem: RHHI-V deployment fails when using the Ansible playbook from the command line.
Workaround: Remove the entry 'he_ansible_host_name: host1' from /etc/ansible/roles/gluster.ansible/playbooks/hc-ansible-deployment/he_gluster_vars.json and proceed with the deployment.
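For anyone who prefers to script the workaround rather than edit the file by hand, here is a minimal sketch in Python. The helper name drop_key is hypothetical and not part of gluster-ansible-roles; it only assumes the vars file is plain JSON, as shown in the description (without the "<<---" annotation, which is not valid JSON).

```python
import json
from pathlib import Path


def drop_key(vars_path, key="he_ansible_host_name"):
    """Remove `key` from a JSON vars file; return True if it was present.

    Hypothetical helper for illustration only.
    """
    path = Path(vars_path)
    data = json.loads(path.read_text())
    if key not in data:
        # Nothing to do, leave the file untouched.
        return False
    del data[key]
    # Key order of the remaining entries is preserved.
    path.write_text(json.dumps(data, indent=2) + "\n")
    return True
```

Calling drop_key("/etc/ansible/roles/gluster.ansible/playbooks/hc-ansible-deployment/he_gluster_vars.json") as root should leave every other entry in place. Back up the file first, since the sketch rewrites it in place.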
@Gobinda, also make sure that the HE VM is allocated 16 GB of RAM by setting "he_mem_size_MB": "16384" in the vars file.
Clearing needinfo, as this is documented as a known_issue with the correct message.
Tested with gluster-ansible-roles-1.0.5-7.el8rhgs

Contents of the he_gluster_vars.json file below:

[root@ ~]# cat /etc/ansible/roles/gluster.ansible/playbooks/hc-ansible-deployment/he_gluster_vars.json
{
  "he_appliance_password": "encrypt-password-using-ansible-vault",
  "he_admin_password": "UI-password-for-login",
  "he_domain_type": "glusterfs",
  "he_fqdn": "FQDN-for-Hosted-Engine",
  "he_vm_mac_addr": "Valid MAC address",
  "he_default_gateway": "Valid Gateway",
  "he_mgmt_network": "ovirtmgmt",
  "he_storage_domain_name": "HostedEngine",
  "he_storage_domain_path": "/engine",
  "he_storage_domain_addr": "host1-backend-network-FQDN",
  "he_mount_options": "backup-volfile-servers=host2-backend-network-FQDN:host3-backend-network-FQDN",
  "he_bridge_if": "interface name for bridge creation",
  "he_enable_hc_gluster_service": true,
  "he_mem_size_MB": "16384",
  "he_cluster": "Default"
}

1. he_ansible_host_name is removed
2. he_mem_size_MB is updated to 16384

Ansible-based CLI deployment is successful.
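The two points verified above can also be checked mechanically before starting a deployment. The following is a rough sketch only: check_he_vars is a hypothetical helper, and treating 16384 as a minimum rather than an exact required value is an assumption, not something stated in this report.

```python
import json


def check_he_vars(text, min_mem_mb=16384):
    """Return a list of problems found in the JSON vars text (empty if none).

    Hypothetical helper; the 16384 MB minimum is an assumption.
    """
    data = json.loads(text)
    problems = []
    if "he_ansible_host_name" in data:
        problems.append("he_ansible_host_name should be removed")
    # The vars file stores the memory size as a string, so convert it.
    mem = int(data.get("he_mem_size_MB", 0))
    if mem < min_mem_mb:
        problems.append(
            f"he_mem_size_MB is {mem}, expected at least {min_mem_mb}"
        )
    return problems
```

Passing the contents of he_gluster_vars.json as text should return an empty list for the file shown above, and one message per problem for the original template.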
Doc text looks good.
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory (RHHI for Virtualization 1.8 bug fix and enhancement update), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHEA-2020:3314