Bug 1944600
| Summary: | deploy using 'ovirt-ansible-collection' fails on RHVH | | |
|---|---|---|---|
| Product: | [oVirt] ovirt-engine-sdk-python | Reporter: | Roni <reliezer> |
| Component: | General | Assignee: | Ori Liel <oliel> |
| Status: | CLOSED CURRENTRELEASE | QA Contact: | Guilherme Santos <gdeolive> |
| Severity: | medium | Docs Contact: | |
| Priority: | unspecified | | |
| Version: | --- | CC: | arachman, bugs, cshao, gdeolive, jmacku, juan.hernandez, khakimi, lsvaty, lveyde, mavital, michal.skrivanek, mnecas, mperina, peyu, sbonazzo, shlei, weiwang, yaniwang |
| Target Milestone: | ovirt-4.4.5-1 | Keywords: | AutomationBlocker |
| Target Release: | 4.4.10 | Flags: | pm-rhel: ovirt-4.4+, pm-rhel: devel_ack+, pm-rhel: testing_ack+ |
| Hardware: | x86_64 | | |
| OS: | Linux | | |
| Whiteboard: | | | |
| Fixed In Version: | python3-ovirt-engine-sdk4-4.4.10-1.el8.x86_64 | Doc Type: | No Doc Update |
| Doc Text: | | Story Points: | --- |
| Clone Of: | | Environment: | |
| Last Closed: | 2021-04-15 07:41:55 UTC | Type: | Bug |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | | Category: | --- |
| oVirt Team: | Infra | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | | | |
| Bug Depends On: | | | |
| Bug Blocks: | 1857841 | | |
Tried to reproduce this bug with rhvh-4.4.4.2-0.20210307.0+1:
1. Clean install rhvh-4.4.4.2-0.20210307.0+1
2. yum install rhvm-appliance-4.4-20201117.0.el8ev.x86_64
3. cd /usr/share/ansible/collections/ansible_collections/redhat/rhv/roles/hosted_engine_setup/examples
4. Modify passwords.yml and nfs_deployment.json
5. ansible-playbook with hosted_engine_deploy_localhost.yml
Result:
Hosted engine deploy successful.
[root@hp-xxxxxx-xx examples]# hosted-engine --vm-status
--== Host hp-xxxxxx-xx.lab.eng.pek2.redhat.com (id: 1) status ==--
Host ID : 1
Host timestamp : 7201
Score : 3400
Engine status : {"vm": "up", "health": "good", "detail": "Up"}
Hostname : hp-xxxxxx-xx.lab.eng.pek2.redhat.com
Local maintenance : False
stopped : False
crc32 : 37b4d8e7
conf_on_shared_storage : True
local_conf_timestamp : 7201
Status up-to-date : True
Extra metadata (valid at timestamp):
metadata_parse_version=1
metadata_feature_version=1
timestamp=7201 (Fri Apr 2 12:45:06 2021)
host-id=1
score=3400
vm_conf_refresh_time=7201 (Fri Apr 2 12:45:06 2021)
conf_on_shared_storage=True
maintenance=False
state=EngineUp
stopped=False
QE cannot reproduce this bug. I am not sure whether my steps are right.
@Roni,
Could you please check my steps, or give me the detailed steps? Thanks!
Hi Wei,
The question is not whether the whole process succeeded, because that depends on the boot process of your host; you will need a host with a long boot process to see the issue. Instead, please check whether the host has been rebooted, although it is not expected to be, because the key 'reboot_after_installation' is set to 'false'.
Note that this bug is already fixed with python3-ovirt-engine-sdk4-4.4.10-1.el8.x86_64.
Thx,
Roni

I'm verifying it, as it has already been fixed in python3-ovirt-engine-sdk4-4.4.10-1.el8.x86_64, present in ovirt-engine-4.4.5.11.
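For illustration only, here is a minimal, hypothetical sketch of adding a host through the Python SDK while passing this flag. It is not the collection's actual code, and it assumes the fixed SDK (python3-ovirt-engine-sdk4 >= 4.4.10) exposes a `reboot_after_installation` keyword on `hosts_service.add()`; the URL, credentials, and host details are placeholders.

```python
# Hypothetical illustration, NOT the collection's actual code: add a host via
# the oVirt Python SDK and ask the engine not to reboot it after installation.
# Assumption: the fixed SDK (python3-ovirt-engine-sdk4 >= 4.4.10) exposes the
# 'reboot_after_installation' keyword on hosts_service.add(); with older builds
# (e.g. 4.4.7, per this bug) the host ends up being rebooted anyway.
import ovirtsdk4 as sdk
import ovirtsdk4.types as types

connection = sdk.Connection(
    url='https://engine.example.com/ovirt-engine/api',  # placeholder engine URL
    username='admin@internal',
    password='secret',      # placeholder password
    insecure=True,          # skip CA verification in this sketch
)

hosts_service = connection.system_service().hosts_service()
hosts_service.add(
    types.Host(
        name='host1',                       # placeholder host details
        address='host1.example.com',
        root_password='secret',
        cluster=types.Cluster(name='Default'),
    ),
    reboot_after_installation=False,  # assumed keyword added by the fixed SDK
)

connection.close()
```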
Created attachment 1767605 [details]
logs

Description of problem:
Deploy using 'ovirt-ansible-collection' fails on RHVH.

Version-Release number of selected component (if applicable):
rhvh-4.4.4.2-0.20210307.0+1

How reproducible:
100%

Steps to Reproduce:
1. Provision host with rhvh-4.4.4.2-0.20210307
2. Deploy hosted_engine using the ovirt-ansible-collection role 'hosted_engine_setup'
3.

Actual results:
Deploy fails when it tries to deploy HE on the first host (not when adding new hosts).
It seems that the host has been rebooted while it is not expected to be, because the parameter 'reboot_after_installation' is set to 'false'.

NOTE: Martin Necas found that the issue is fixed after upgrading the package python3-ovirt-engine-sdk4-4.4.7-1.el8.x86_64 to python3-ovirt-engine-sdk4-4.4.10-1.el8.x86_64, meaning that with 4.4.10 the host is not rebooted, as expected.

Expected results:
Deploy should pass; the host should not be rebooted.

Additional info:
Below are the Ansible console logs on failure; see the attached full oVirt logs.

18:55:07 TASK [ovirt.ovirt.hosted_engine_setup : Obtain SSO token using username/password credentials] ***
18:55:13 ok: [lynx25.lab.eng.tlv2.redhat.com]
18:55:13
18:55:13 TASK [ovirt.ovirt.hosted_engine_setup : Wait for the host to be up] ************
18:55:19 FAILED - RETRYING: Wait for the host to be up (120 retries left).
18:55:35 FAILED - RETRYING: Wait for the host to be up (119 retries left).
18:55:51 FAILED - RETRYING: Wait for the host to be up (118 retries left).
18:56:06 FAILED - RETRYING: Wait for the host to be up (117 retries left).
18:56:22 FAILED - RETRYING: Wait for the host to be up (116 retries left).
18:56:37 FAILED - RETRYING: Wait for the host to be up (115 retries left).
18:56:53 FAILED - RETRYING: Wait for the host to be up (114 retries left).
18:57:09 FAILED - RETRYING: Wait for the host to be up (113 retries left).
18:57:24 FAILED - RETRYING: Wait for the host to be up (112 retries left).
18:57:42 FAILED - RETRYING: Wait for the host to be up (111 retries left).
18:57:58 FAILED - RETRYING: Wait for the host to be up (110 retries left).
18:58:13 FAILED - RETRYING: Wait for the host to be up (109 retries left).
18:58:29 FAILED - RETRYING: Wait for the host to be up (108 retries left).
18:58:44 FAILED - RETRYING: Wait for the host to be up (107 retries left).
18:59:00 FAILED - RETRYING: Wait for the host to be up (106 retries left).
18:59:15 FAILED - RETRYING: Wait for the host to be up (105 retries left).
18:59:31 FAILED - RETRYING: Wait for the host to be up (104 retries left).
18:59:46 FAILED - RETRYING: Wait for the host to be up (103 retries left).
19:00:02 FAILED - RETRYING: Wait for the host to be up (102 retries left).
19:00:21 FAILED - RETRYING: Wait for the host to be up (101 retries left).
19:04:00 fatal: [lynx25.lab.eng.tlv2.redhat.com]: UNREACHABLE! => {"changed": false, "msg": "Failed to connect to the host via ssh: ssh: connect to host lynx25.lab.eng.tlv2.redhat.com port 22: No route to host", "unreachable": true}
19:04:00
19:04:00 RUNNING HANDLER [ci-map : yum-clean-all] ***************************************
19:08:43 fatal: [lynx25.lab.eng.tlv2.redhat.com]: UNREACHABLE! => {"changed": false, "msg": "Failed to connect to the host via ssh: Connection timed out during banner exchange", "unreachable": true}
19:08:43
19:08:43 PLAY RECAP *********************************************************************
19:08:43 hosted-engine-07.lab.eng.tlv2.redhat.com : ok=0 changed=0 unreachable=0 failed=0 skipped=1 rescued=0 ignored=0
19:08:43 localhost : ok=5 changed=2 unreachable=0 failed=0 skipped=1 rescued=0 ignored=0
19:08:43 lynx25.lab.eng.tlv2.redhat.com : ok=315 changed=109 unreachable=2 failed=0 skipped=146 rescued=0 ignored=4
19:08:43 lynx26.lab.eng.tlv2.redhat.com : ok=23 changed=11 unreachable=0 failed=0 skipped=12 rescued=0 ignored=1
19:08:43 lynx27.lab.eng.tlv2.redhat.com : ok=23 changed=11 unreachable=0 failed=0