Bug 1731871 - Unable to install metrics VM on rhev - failed to connect to master VM
Summary: Unable to install metrics VM on rhev - failed to connect to master VM
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: ovirt-engine-metrics
Classification: oVirt
Component: Generic
Version: 1.3.2
Hardware: Unspecified
OS: Unspecified
Priority: high
Severity: high
Target Milestone: ovirt-4.3.5-1
Target Release: ---
Assignee: Evgeny Slutsky
QA Contact: Ivana Saranova
URL:
Whiteboard:
Depends On:
Blocks:
 
Reported: 2019-07-22 09:10 UTC by Evgeny Slutsky
Modified: 2019-07-31 10:57 UTC (History)

Fixed In Version:
Clone Of:
Environment:
Last Closed: 2019-07-31 10:57:53 UTC
oVirt Team: Metrics
Embargoed:




Links
System ID Private Priority Status Summary Last Updated
oVirt gerrit 102051 0 master MERGED Wait for reboot to complete validation. 2020-07-14 06:43:22 UTC

Description Evgeny Slutsky 2019-07-22 09:10:46 UTC
When running the metrics installation from the engine:
ANSIBLE_JINJA2_EXTENSIONS="jinja2.ext.do" ./configure_ovirt_machines_for_metrics.sh   --playbook=ovirt-metrics-store-installation.yml

Error:

TASK [oVirt.metrics/roles/oVirt.metrics-store-installation : Generate template files] *****************************************************************************************************************************
ok: [localhost] => (item=ovirt_metrics_curator_configmap.yaml)
ok: [localhost] => (item=metrics_store_post_installation.yaml)

TASK [oVirt.metrics/roles/oVirt.metrics-store-installation : Copy files to bastion machine] ***********************************************************************************************************************
ok: [localhost -> 192.168.200.166] => (item=ovirt_metrics_curator_configmap.yaml)
ok: [localhost -> 192.168.200.166] => (item=metrics_store_post_installation.yaml)

TASK [oVirt.metrics/roles/oVirt.metrics-store-installation : Delete engine_id_rsa.pub] ****************************************************************************************************************************
changed: [localhost]

PLAY [Add master host dns entry to /etc/hosts] ********************************************************************************************************************************************************************

TASK [Gathering Facts] ********************************************************************************************************************************************************************************************
fatal: [master0.lago.local]: UNREACHABLE! => {"changed": false, "msg": "Failed to connect to the host via ssh: ssh: connect to host 192.168.200.168 port 22: No route to host", "unreachable": true}

PLAY RECAP ********************************************************************************************************************************************************************************************************
localhost                  : ok=112  changed=10   unreachable=0    failed=0    skipped=70   rescued=0    ignored=0   
master0.lago.local         : ok=0    changed=0    unreachable=1    failed=0    skipped=0    rescued=0    ignored=0   





Version-Release number of selected component (if applicable):
ansible-2.8.1-1.el7.noarch
ovirt-ansible-vm-infra-1.1.20-1.el7.noarch
ovirt-ansible-roles-1.1.7-1.el7.noarch
ovirt-engine-metrics-1.3.4-0.0.master.20190714091448.git97f3dea.el7.noarch
python-ovirt-engine-sdk4-4.3.2-2.20190703git231d55c.el7.x86_64




How reproducible:


Steps to Reproduce:
1. Install Engine + Host 4.3.4
2. Run the metrics deployment from the engine


Actual results:
VM Installation failed

Expected results:
VM Installation completed

Additional info:

Comment 1 Evgeny Slutsky 2019-07-22 09:22:11 UTC
Running the deployment again can be used as a workaround (the second run passes).

Comment 2 Evgeny Slutsky 2019-07-22 13:00:33 UTC
The master VM is created with a reboot at the end of cloud-init (power_state mode: reboot):

2019-07-22 08:41:24,067 p=22837 u=root |  [DEPRECATION WARNING]: evaluating create_openshift_vms as a bare variable, this behaviour will go away and you might need to add |bool to the expression in the future. Also see CONDITIONAL_BARE_VARS
configuration toggle.. This feature will be removed in version 2.12. Deprecation warnings can be disabled by setting deprecation_warnings=False in ansible.cfg.
2019-07-22 08:41:24,068 p=22837 u=root |  [DEPRECATION WARNING]: evaluating wait_for_ip as a bare variable, this behaviour will go away and you might need to add |bool to the expression in the future. Also see CONDITIONAL_BARE_VARS configuration
toggle.. This feature will be removed in version 2.12. Deprecation warnings can be disabled by setting deprecation_warnings=False in ansible.cfg.
2019-07-22 08:41:24,827 p=22837 u=root |  [DEPRECATION WARNING]: evaluating ip_cond as a bare variable, this behaviour will go away and you might need to add |bool to the expression in the future. Also see CONDITIONAL_BARE_VARS configuration toggle..
This feature will be removed in version 2.12. Deprecation warnings can be disabled by setting deprecation_warnings=False in ansible.cfg.
2019-07-22 08:41:24,840 p=22837 u=root |  FAILED - RETRYING: Wait for VMs IP (30 retries left).
2019-07-22 08:41:45,560 p=22837 u=root |  ok: [localhost] => (item={'profile': {u'cloud_init': {u'authorized_ssh_keys': u'ssh-rsa AAAAB3NzaC1yc2EAAAADAQABAAABAQDJOHKcl/+lk18veWZNdq5ITGox1M/J8hFAxFGCtdzdd3etvUjnZfv7OqmkueDhthqZGnpqlp1LPJATSMy1Vm09fyY8xdPRHAWyVzyNLy7+hbA1sUMGGsHaLpOiEgLIXoXzmNIXX2m2IUGjtYlzXfSv9/x4iEojKXNm0TTxtFVsRFGu3nQq068mRw3bB/YF3PCu297esfzcHJMMRl3zZLfsZSVwAR/qoj/Fod2I8c6bcWIn+Ps600F+L5yKeglrWLu/73Fqm0nOXezRxtFrMxoMIOp66HQIimLDYB+uNyZBeN9jfjnDkES9oyF4K1Zi0ssj4JViR7nPUCrkHQvZ2VUD', u'custom_script': u'yum_repos:\n  centos-ovirt42:\n    baseurl: http://mirror.centos.org/centos/7/virt/x86_64/ovirt-4.2\n    enabled: true\n    gpgcheck: false\npackages:\n  - ovirt-guest-agent\n  - epel-release\n  - NetworkManager\n  - centos-release-openshift-origin311\nruncmd:\n  - sed -i \'s/# ignored_nics =.*/ignored_nics = docker0 tun0 /\' etc/ovirt-guest-agent.conf\n  - systemctl enable ovirt-guest-agent\n  - systemctl start ovirt-guest-agent\n  - systemctl enable NetworkManager\n  - systemctl start NetworkManager\n  - mkdir -p /var/lib/docker\n  - /usr/sbin/mkfs.xfs -L dockervo /dev/vdb\n  - echo "ssh-rsa AAAAB3NzaC1yc2EAAAADAQABAAABAQC1UqCmHfVOvm6IzFnlfw+rwR5zLWlt9NeYki+Y1MEFm38nEI0D3so8i3aHeA/GId7dS4Txu114ic7yV1AzYhYHUgq8j0EbHdb2HoydNmPdmJPmZY/5gp9JRCoaCpsB12sqpjvZzEm0JXuxOF1TPlJzHz8Nm3KVAVwApeIqEi5C9ucoBFG8gDCRURvjGeF+ArOk+yUp5dumBnhBwutxG+hFWj1OKX7ejZ/6oNbJguHQH5Qw+zotc8sXAgX4gVkZSppSoguxyP6uk6NGPe4PL3MeRDXACaxEJy5/eqDO1PL7gwu9MN85/IlpNo7trvoOJzWqxUBwAlmDskliaIKPbBL/" >> /root/.ssh/authorized_keys\n  - echo "Defaults !requiretty" > /etc/sudoers.d/999-cloud-init-requiretty\n  - mkdir -p \'/var/lib/elasticsearch\'\n  - /usr/sbin/mkfs.xfs -L elasticvo /dev/vdc\nmounts:\n  - [ \'/dev/vdb\', \'/var/lib/docker\', \'xfs\', \'defaults,gquota\' ]\n  - [ \'/dev/vdc\', \'/var/lib/elasticsearch\', \'xfs\', \'defaults,gquota\' ]\npower_state:\n  mode: reboot\n  message: cloud init finished - boot and install openshift\n  condition: true\n', u'root_password': u'******'}, 
u'disks': [{u'interface': u'virtio', u'storage_domain': u'nfs', u'name': u'docker_disk', u'size': u'10GiB'}, {u'interface': u'virtio', u'storage_domain': u'nfs', u'name': u'elasticsearch_disk', u'size': u'50GiB'}], u'cluster': u'test-cluster', u'state': u'running', u'template': u'centos76', u'memory': u'10GiB', u'cores': 2, u'high_availability': True}, 'tag': u'openshift_master_vm', 'description': u'', 'name': u'master0.lago.local', 'cloud_init': {u'authorized_ssh_keys': u'ssh-rsa AAAAB3NzaC1yc2EAAAADAQABAAABAQDJOHKcl/+lk18veWZNdq5ITGox1M/J8hFAxFGCtdzdd3etvUjnZfv7OqmkueDhthqZGnpqlp1LPJATSMy1Vm09fyY8xdPRHAWyVzyNLy7+hbA1sUMGGsHaLpOiEgLIXoXzmNIXX2m2IUGjtYlzXfSv9/x4iEojKXNm0TTxtFVsRFGu3nQq068mRw3bB/YF3PCu297esfzcHJMMRl3zZLfsZSVwAR/qoj/Fod2I8c6bcWIn+Ps600F+L5yKeglrWLu/73Fqm0nOXezRxtFrMxoMIOp66HQIimLDYB+uNyZBeN9jfjnDkES9oyF4K1Zi0ssj4JViR7nPUCrkHQvZ2VUD', u'custom_script': u'yum_repos:\n  centos-ovirt42:\n    baseurl: http://mirror.centos.org/centos/7/virt/x86_64/ovirt-4.2\n    enabled: true\n    gpgcheck: false\npackages:\n  - ovirt-guest-agent\n  - epel-release\n  - NetworkManager\n  - centos-release-openshift-origin311\nruncmd:\n  - sed -i \'s/# ignored_nics =.*/ignored_nics = docker0 tun0 /\' etc/ovirt-guest-agent.conf\n  - systemctl enable ovirt-guest-agent\n  - systemctl start ovirt-guest-agent\n  - systemctl enable NetworkManager\n  - systemctl start NetworkManager\n  - mkdir -p /var/lib/docker\n  - /usr/sbin/mkfs.xfs -L dockervo /dev/vdb\n  - echo "ssh-rsa AAAAB3NzaC1yc2EAAAADAQABAAABAQC1UqCmHfVOvm6IzFnlfw+rwR5zLWlt9NeYki+Y1MEFm38nEI0D3so8i3aHeA/GId7dS4Txu114ic7yV1AzYhYHUgq8j0EbHdb2HoydNmPdmJPmZY/5gp9JRCoaCpsB12sqpjvZzEm0JXuxOF1TPlJzHz8Nm3KVAVwApeIqEi5C9ucoBFG8gDCRURvjGeF+ArOk+yUp5dumBnhBwutxG+hFWj1OKX7ejZ/6oNbJguHQH5Qw+zotc8sXAgX4gVkZSppSoguxyP6uk6NGPe4PL3MeRDXACaxEJy5/eqDO1PL7gwu9MN85/IlpNo7trvoOJzWqxUBwAlmDskliaIKPbBL/" >> /root/.ssh/authorized_keys\n  - echo "Defaults !requiretty" > /etc/sudoers.d/999-cloud-init-requiretty\n  - mkdir -p 
\'/var/lib/elasticsearch\'\n  - /usr/sbin/mkfs.xfs -L elasticvo /dev/vdc\nmounts:\n  - [ \'/dev/vdb\', \'/var/lib/docker\', \'xfs\', \'defaults,gquota\' ]\n  - [ \'/dev/vdc\', \'/var/lib/elasticsearch\', \'xfs\', \'defaults,gquota\' ]\npower_state:\n  mode: reboot\n  message: cloud init finished - boot and install openshift\n  condition: true\n', u'root_password': u'******'}, 'host_name': u''})



The playbook does not wait for the reboot to complete and attempts to gather facts on the master VM:

2019-07-22 08:41:52,460 p=22837 u=root |  TASK [Gathering Facts] ********************************************************************************************************************************************************************************************
2019-07-22 08:41:55,516 p=22837 u=root |  fatal: [master0.lago.local]: UNREACHABLE! => {"changed": false, "msg": "Failed to connect to the host via ssh: ssh: connect to host 192.168.200.176 port 22: No route to host", "unreachable": true}
2019-07-22 08:41:55,517 p=22837 u=root |  PLAY RECAP ********************************************************************************************************************************************************************************************************
2019-07-22 08:41:55,518 p=22837 u=root |  localhost                  : ok=113  changed=25   unreachable=0    failed=0    skipped=69   rescued=0    ignored=0
2019-07-22 08:41:55,518 p=22837 u=root |  master0.lago.local         : ok=0    changed=0    unreachable=1    failed=0    skipped=0    rescued=0    ignored=0
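
The race described above (the VM reports an IP, then cloud-init reboots it while the next play tries to connect) can be illustrated with a small polling helper. This is only a sketch of the general "wait until SSH is reachable" idea, not the code from the merged gerrit change; the function name `wait_for_ssh` and all of its parameters are hypothetical.

```python
import socket
import time


def wait_for_ssh(host, port=22, timeout=600.0, interval=5.0,
                 connect_timeout=3.0, required_successes=1):
    """Poll host:port until TCP connections succeed or `timeout` elapses.

    Hypothetical helper: returns True after `required_successes`
    consecutive successful connections, False if the deadline passes first.
    """
    deadline = time.monotonic() + timeout
    successes = 0
    while True:
        try:
            # Open and immediately close a TCP connection to the SSH port.
            with socket.create_connection((host, port), timeout=connect_timeout):
                successes += 1
        except OSError:
            # Covers "No route to host", "Connection refused", and timeouts.
            successes = 0
        if successes >= required_successes:
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval)
```

A caller might use `wait_for_ssh("master0.lago.local", required_successes=3)` before connecting to the VM. Requiring several consecutive successes only reduces the chance of catching the VM in the short window before the reboot starts; it does not eliminate it.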

Comment 3 Ivana Saranova 2019-07-26 14:08:31 UTC
Steps:
1) Install metrics store according to the documentation
2) Check that in the first playbook, there is a task that waits for the VM to be rebooted

Results:
``` TASK [Wait for the reboot to complete] ```
Playbook ends successfully, metrics store is installed without error.

Verified in:
ovirt-engine-4.3.5.4-0.1.el7.noarch
ovirt-engine-metrics-1.3.4-0.0.master.20190723115530.gitf329e5a.el7.noarch
ansible-2.8.2-1.201907241649git.42c2b3e496.el7.ans.noarch

Comment 4 Sandro Bonazzola 2019-07-31 10:57:53 UTC
This bugzilla is included in the oVirt 4.3.5 first async release, published on July 31st, 2019.

Since the problem described in this bug report should be
resolved in oVirt 4.3.5 first async release, it has been closed with a resolution of CURRENT RELEASE.

If the solution does not work for you, please open a new bug report.

