Bug 1731871

Summary: Unable to install metrics VM on rhev - failed to connect to master VM
Product: [oVirt] ovirt-engine-metrics Reporter: Evgeny Slutsky <eslutsky>
Component: Generic Assignee: Evgeny Slutsky <eslutsky>
Status: CLOSED CURRENTRELEASE QA Contact: Ivana Saranova <isaranov>
Severity: high Docs Contact:
Priority: high    
Version: 1.3.2 CC: bugs, lleistne, sradco
Target Milestone: ovirt-4.3.5-1   
Target Release: ---   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version: Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2019-07-31 10:57:53 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: Metrics RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:

Description Evgeny Slutsky 2019-07-22 09:10:46 UTC
When running the metrics installation from the engine:
ANSIBLE_JINJA2_EXTENSIONS="jinja2.ext.do" ./configure_ovirt_machines_for_metrics.sh   --playbook=ovirt-metrics-store-installation.yml

Error:

TASK [oVirt.metrics/roles/oVirt.metrics-store-installation : Generate template files] *****************************************************************************************************************************
ok: [localhost] => (item=ovirt_metrics_curator_configmap.yaml)
ok: [localhost] => (item=metrics_store_post_installation.yaml)

TASK [oVirt.metrics/roles/oVirt.metrics-store-installation : Copy files to bastion machine] ***********************************************************************************************************************
ok: [localhost -> 192.168.200.166] => (item=ovirt_metrics_curator_configmap.yaml)
ok: [localhost -> 192.168.200.166] => (item=metrics_store_post_installation.yaml)

TASK [oVirt.metrics/roles/oVirt.metrics-store-installation : Delete engine_id_rsa.pub] ****************************************************************************************************************************
changed: [localhost]

PLAY [Add master host dns entry to /etc/hosts] ********************************************************************************************************************************************************************

TASK [Gathering Facts] ********************************************************************************************************************************************************************************************
fatal: [master0.lago.local]: UNREACHABLE! => {"changed": false, "msg": "Failed to connect to the host via ssh: ssh: connect to host 192.168.200.168 port 22: No route to host", "unreachable": true}

PLAY RECAP ********************************************************************************************************************************************************************************************************
localhost                  : ok=112  changed=10   unreachable=0    failed=0    skipped=70   rescued=0    ignored=0   
master0.lago.local         : ok=0    changed=0    unreachable=1    failed=0    skipped=0    rescued=0    ignored=0   





Version-Release number of selected component (if applicable):
ansible-2.8.1-1.el7.noarch
ovirt-ansible-vm-infra-1.1.20-1.el7.noarch
ovirt-ansible-roles-1.1.7-1.el7.noarch
ovirt-engine-metrics-1.3.4-0.0.master.20190714091448.git97f3dea.el7.noarch
python-ovirt-engine-sdk4-4.3.2-2.20190703git231d55c.el7.x86_64




How reproducible:


Steps to Reproduce:
1. Install Engine + Host 4.3.4
2. Run the metrics deployment from the engine


Actual results:
VM Installation failed

Expected results:
VM Installation completed

Additional info:

Comment 1 Evgeny Slutsky 2019-07-22 09:22:11 UTC
Running the deployment again can be used as a workaround (the second run passes).

Comment 2 Evgeny Slutsky 2019-07-22 13:00:33 UTC
The master VM is created with a reboot at the end of cloud-init (power_state mode: reboot):

2019-07-22 08:41:24,067 p=22837 u=root |  [DEPRECATION WARNING]: evaluating create_openshift_vms as a bare variable, this behaviour will go away and you might need to add |bool to the expression in the future. Also see CONDITIONAL_BARE_VARS
configuration toggle.. This feature will be removed in version 2.12. Deprecation warnings can be disabled by setting deprecation_warnings=False in ansible.cfg.
2019-07-22 08:41:24,068 p=22837 u=root |  [DEPRECATION WARNING]: evaluating wait_for_ip as a bare variable, this behaviour will go away and you might need to add |bool to the expression in the future. Also see CONDITIONAL_BARE_VARS configuration
toggle.. This feature will be removed in version 2.12. Deprecation warnings can be disabled by setting deprecation_warnings=False in ansible.cfg.
2019-07-22 08:41:24,827 p=22837 u=root |  [DEPRECATION WARNING]: evaluating ip_cond as a bare variable, this behaviour will go away and you might need to add |bool to the expression in the future. Also see CONDITIONAL_BARE_VARS configuration toggle..
This feature will be removed in version 2.12. Deprecation warnings can be disabled by setting deprecation_warnings=False in ansible.cfg.
2019-07-22 08:41:24,840 p=22837 u=root |  FAILED - RETRYING: Wait for VMs IP (30 retries left).
2019-07-22 08:41:45,560 p=22837 u=root |  ok: [localhost] => (item={'profile': {u'cloud_init': {u'authorized_ssh_keys': u'ssh-rsa AAAAB3NzaC1yc2EAAAADAQABAAABAQDJOHKcl/+lk18veWZNdq5ITGox1M/J8hFAxFGCtdzdd3etvUjnZfv7OqmkueDhthqZGnpqlp1LPJATSMy1Vm09fyY8xdPRHAWyVzyNLy7+hbA1sUMGGsHaLpOiEgLIXoXzmNIXX2m2IUGjtYlzXfSv9/x4iEojKXNm0TTxtFVsRFGu3nQq068mRw3bB/YF3PCu297esfzcHJMMRl3zZLfsZSVwAR/qoj/Fod2I8c6bcWIn+Ps600F+L5yKeglrWLu/73Fqm0nOXezRxtFrMxoMIOp66HQIimLDYB+uNyZBeN9jfjnDkES9oyF4K1Zi0ssj4JViR7nPUCrkHQvZ2VUD', u'custom_script': u'yum_repos:\n  centos-ovirt42:\n    baseurl: http://mirror.centos.org/centos/7/virt/x86_64/ovirt-4.2\n    enabled: true\n    gpgcheck: false\npackages:\n  - ovirt-guest-agent\n  - epel-release\n  - NetworkManager\n  - centos-release-openshift-origin311\nruncmd:\n  - sed -i \'s/# ignored_nics =.*/ignored_nics = docker0 tun0 /\' etc/ovirt-guest-agent.conf\n  - systemctl enable ovirt-guest-agent\n  - systemctl start ovirt-guest-agent\n  - systemctl enable NetworkManager\n  - systemctl start NetworkManager\n  - mkdir -p /var/lib/docker\n  - /usr/sbin/mkfs.xfs -L dockervo /dev/vdb\n  - echo "ssh-rsa AAAAB3NzaC1yc2EAAAADAQABAAABAQC1UqCmHfVOvm6IzFnlfw+rwR5zLWlt9NeYki+Y1MEFm38nEI0D3so8i3aHeA/GId7dS4Txu114ic7yV1AzYhYHUgq8j0EbHdb2HoydNmPdmJPmZY/5gp9JRCoaCpsB12sqpjvZzEm0JXuxOF1TPlJzHz8Nm3KVAVwApeIqEi5C9ucoBFG8gDCRURvjGeF+ArOk+yUp5dumBnhBwutxG+hFWj1OKX7ejZ/6oNbJguHQH5Qw+zotc8sXAgX4gVkZSppSoguxyP6uk6NGPe4PL3MeRDXACaxEJy5/eqDO1PL7gwu9MN85/IlpNo7trvoOJzWqxUBwAlmDskliaIKPbBL/" >> /root/.ssh/authorized_keys\n  - echo "Defaults !requiretty" > /etc/sudoers.d/999-cloud-init-requiretty\n  - mkdir -p \'/var/lib/elasticsearch\'\n  - /usr/sbin/mkfs.xfs -L elasticvo /dev/vdc\nmounts:\n  - [ \'/dev/vdb\', \'/var/lib/docker\', \'xfs\', \'defaults,gquota\' ]\n  - [ \'/dev/vdc\', \'/var/lib/elasticsearch\', \'xfs\', \'defaults,gquota\' ]\npower_state:\n  mode: reboot\n  message: cloud init finished - boot and install openshift\n  condition: true\n', u'root_password': u'******'}, 
u'disks': [{u'interface': u'virtio', u'storage_domain': u'nfs', u'name': u'docker_disk', u'size': u'10GiB'}, {u'interface': u'virtio', u'storage_domain': u'nfs', u'name': u'elasticsearch_disk', u'size': u'50GiB'}], u'cluster': u'test-cluster', u'state': u'running', u'template': u'centos76', u'memory': u'10GiB', u'cores': 2, u'high_availability': True}, 'tag': u'openshift_master_vm', 'description': u'', 'name': u'master0.lago.local', 'cloud_init': {u'authorized_ssh_keys': u'ssh-rsa AAAAB3NzaC1yc2EAAAADAQABAAABAQDJOHKcl/+lk18veWZNdq5ITGox1M/J8hFAxFGCtdzdd3etvUjnZfv7OqmkueDhthqZGnpqlp1LPJATSMy1Vm09fyY8xdPRHAWyVzyNLy7+hbA1sUMGGsHaLpOiEgLIXoXzmNIXX2m2IUGjtYlzXfSv9/x4iEojKXNm0TTxtFVsRFGu3nQq068mRw3bB/YF3PCu297esfzcHJMMRl3zZLfsZSVwAR/qoj/Fod2I8c6bcWIn+Ps600F+L5yKeglrWLu/73Fqm0nOXezRxtFrMxoMIOp66HQIimLDYB+uNyZBeN9jfjnDkES9oyF4K1Zi0ssj4JViR7nPUCrkHQvZ2VUD', u'custom_script': u'yum_repos:\n  centos-ovirt42:\n    baseurl: http://mirror.centos.org/centos/7/virt/x86_64/ovirt-4.2\n    enabled: true\n    gpgcheck: false\npackages:\n  - ovirt-guest-agent\n  - epel-release\n  - NetworkManager\n  - centos-release-openshift-origin311\nruncmd:\n  - sed -i \'s/# ignored_nics =.*/ignored_nics = docker0 tun0 /\' etc/ovirt-guest-agent.conf\n  - systemctl enable ovirt-guest-agent\n  - systemctl start ovirt-guest-agent\n  - systemctl enable NetworkManager\n  - systemctl start NetworkManager\n  - mkdir -p /var/lib/docker\n  - /usr/sbin/mkfs.xfs -L dockervo /dev/vdb\n  - echo "ssh-rsa AAAAB3NzaC1yc2EAAAADAQABAAABAQC1UqCmHfVOvm6IzFnlfw+rwR5zLWlt9NeYki+Y1MEFm38nEI0D3so8i3aHeA/GId7dS4Txu114ic7yV1AzYhYHUgq8j0EbHdb2HoydNmPdmJPmZY/5gp9JRCoaCpsB12sqpjvZzEm0JXuxOF1TPlJzHz8Nm3KVAVwApeIqEi5C9ucoBFG8gDCRURvjGeF+ArOk+yUp5dumBnhBwutxG+hFWj1OKX7ejZ/6oNbJguHQH5Qw+zotc8sXAgX4gVkZSppSoguxyP6uk6NGPe4PL3MeRDXACaxEJy5/eqDO1PL7gwu9MN85/IlpNo7trvoOJzWqxUBwAlmDskliaIKPbBL/" >> /root/.ssh/authorized_keys\n  - echo "Defaults !requiretty" > /etc/sudoers.d/999-cloud-init-requiretty\n  - mkdir -p 
\'/var/lib/elasticsearch\'\n  - /usr/sbin/mkfs.xfs -L elasticvo /dev/vdc\nmounts:\n  - [ \'/dev/vdb\', \'/var/lib/docker\', \'xfs\', \'defaults,gquota\' ]\n  - [ \'/dev/vdc\', \'/var/lib/elasticsearch\', \'xfs\', \'defaults,gquota\' ]\npower_state:\n  mode: reboot\n  message: cloud init finished - boot and install openshift\n  condition: true\n', u'root_password': u'******'}, 'host_name': u''})



The playbook does not wait for the reboot to complete and attempts to gather facts on the master VM:

2019-07-22 08:41:52,460 p=22837 u=root |  TASK [Gathering Facts] ********************************************************************************************************************************************************************************************
2019-07-22 08:41:55,516 p=22837 u=root |  fatal: [master0.lago.local]: UNREACHABLE! => {"changed": false, "msg": "Failed to connect to the host via ssh: ssh: connect to host 192.168.200.176 port 22: No route to host", "unreachable": true}
2019-07-22 08:41:55,517 p=22837 u=root |  PLAY RECAP ********************************************************************************************************************************************************************************************************
2019-07-22 08:41:55,518 p=22837 u=root |  localhost                  : ok=113  changed=25   unreachable=0    failed=0    skipped=69   rescued=0    ignored=0
2019-07-22 08:41:55,518 p=22837 u=root |  master0.lago.local         : ok=0    changed=0    unreachable=1    failed=0    skipped=0    rescued=0    ignored=0
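
The missing step is a wait between VM creation and the first SSH connection. As a rough illustration only (this is not the fix that shipped in the playbook), the idea can be sketched in Python as polling the SSH port until it accepts a connection. The function name `wait_for_ssh` and its parameters are hypothetical. The initial `delay` is an assumption about what a real wait would need, since the port may briefly be reachable before the cloud-init reboot begins.

```python
import socket
import time


def wait_for_ssh(host, port=22, delay=0.0, timeout=300.0, interval=5.0):
    """Return True once a TCP connection to host:port succeeds, False on timeout.

    Hypothetical helper: `delay` is slept once up front so that a VM which is
    about to reboot is not mistaken for one that has finished rebooting.
    """
    time.sleep(delay)
    deadline = time.monotonic() + timeout
    while True:
        try:
            # A successful connect means sshd (or whatever listens on the
            # port) is accepting connections again.
            with socket.create_connection((host, port), timeout=interval):
                return True
        except OSError:
            # Covers "No route to host", "Connection refused", and timeouts.
            pass
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval)
```

A caller would run something like `wait_for_ssh("master0.lago.local", delay=30)` before attempting to gather facts. A successful TCP connect only shows that the port is open, not that login works, so a stricter check may still be needed.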

Comment 3 Ivana Saranova 2019-07-26 14:08:31 UTC
Steps:
1) Install metrics store according to the documentation
2) Check that in the first playbook, there is a task that waits for the VM to be rebooted

Results:
``` TASK [Wait for the reboot to complete] ```
Playbook ends successfully, metrics store is installed without error.

Verified in:
ovirt-engine-4.3.5.4-0.1.el7.noarch
ovirt-engine-metrics-1.3.4-0.0.master.20190723115530.gitf329e5a.el7.noarch
ansible-2.8.2-1.201907241649git.42c2b3e496.el7.ans.noarch

Comment 4 Sandro Bonazzola 2019-07-31 10:57:53 UTC
This bugzilla is included in the oVirt 4.3.5 first async release, published on July 31st, 2019.

Since the problem described in this bug report should be resolved in the oVirt 4.3.5 first async release, it has been closed with a resolution of CURRENT RELEASE.

If the solution does not work for you, please open a new bug report.