Bug 1512472 - "hosted-engine --deploy --ansible" deployment fails.
Summary: "hosted-engine --deploy --ansible" deployment fails.
Alias: None
Product: ovirt-hosted-engine-setup
Classification: oVirt
Component: General
Version: 2.2.0
Hardware: x86_64
OS: Linux
high vote
Target Milestone: ovirt-4.2.0
Assignee: Simone Tiraboschi
QA Contact: Nikolai Sednev
Depends On:
Blocks: 1455169
Reported: 2017-11-13 10:20 UTC by Nikolai Sednev
Modified: 2017-12-20 11:31 UTC

Fixed In Version:
Doc Type: No Doc Update
Doc Text:
Clone Of:
Last Closed: 2017-12-20 11:31:03 UTC
oVirt Team: Integration
rule-engine: ovirt-4.2+

Attachments
sosreport from alma03 (9.27 MB, application/x-xz)
2017-11-13 10:22 UTC, Nikolai Sednev
sosreport from alma04 (9.24 MB, application/x-xz)
2017-11-14 16:17 UTC, Nikolai Sednev
sosreport from alma04 (9.20 MB, application/x-xz)
2017-11-15 14:51 UTC, Nikolai Sednev
Screenshot from 2017-11-16 17-10-29.png (94.99 KB, image/png)
2017-11-16 15:11 UTC, Nikolai Sednev

System ID Priority Status Summary Last Updated
oVirt gerrit 83989 'None' MERGED ansible: additional checks on ovirt_hosts_facts result structure 2020-02-13 04:09:33 UTC

Description Nikolai Sednev 2017-11-13 10:20:13 UTC
Description of problem:

[ ERROR ] fatal: [localhost]: FAILED! => {"failed": true, "msg": "The conditional check 'host_result.ansible_facts.ovirt_hosts|length >= 1 and (\"'non_operational' in host_result.ansible_facts.ovirt_hosts[0].status\" or \"'up' in host_result.ansible_facts.ovirt_hosts[0].status\")' failed. The error was: error while evaluating conditional (host_result.ansible_facts.ovirt_hosts|length >= 1 and (\"'non_operational' in host_result.ansible_facts.ovirt_hosts[0].status\" or \"'up' in host_result.ansible_facts.ovirt_hosts[0].status\")): 'dict object' has no attribute 'ansible_facts'"}
[ ERROR ] Failed to execute stage 'Closing up': Failed running ansible playbook
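
For readers tracing the error above, here is a minimal Python sketch of how it appears to arise. It is an illustration, not the playbook's actual code: `failed_result` and `naive_check` are hypothetical names, and the dictionary contents are made up. The sketch assumes the registered result of a failed module run carries no `ansible_facts` key, which matches the "'dict object' has no attribute 'ansible_facts'" message. It also shows that the double-quoted sub-expressions in the reported conditional look like string literals, which would be truthy regardless of the host status.

```python
# Hypothetical registered result of a failed module run: note that it has
# no 'ansible_facts' key (the field values are made up for illustration).
failed_result = {"failed": True, "msg": "module execution failed"}


def naive_check(result):
    # Like the reported conditional, read ansible_facts unconditionally.
    hosts = result["ansible_facts"]["ovirt_hosts"]
    # The double-quoted sub-expressions are non-empty string literals, so
    # this part is truthy whatever the host status is.
    status_ok = bool("'non_operational' in status" or "'up' in status")
    return len(hosts) >= 1 and status_ok


try:
    naive_check(failed_result)
    outcome = "evaluated"
except KeyError as exc:
    # Python's analogue of the "no attribute 'ansible_facts'" failure.
    outcome = "missing key: " + exc.args[0]
```

In this sketch a host in any state passes the status part of the check, so only the length test and the unguarded `ansible_facts` access actually matter.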

Version-Release number of selected component (if applicable):

How reproducible:

Steps to Reproduce:
1. Run "hosted-engine --deploy --ansible" on a clean RHEL7 host.

Actual results:
Deployment failed.

Expected results:
Deployment should succeed.

Additional info:
Sosreport from host is attached.

Comment 1 Nikolai Sednev 2017-11-13 10:22:28 UTC
Created attachment 1351498
sosreport from alma03

Comment 2 Martin Sivák 2017-11-13 11:28:13 UTC
What version of ovirt ansible modules is used?

Comment 3 Nikolai Sednev 2017-11-13 11:34:34 UTC

Comment 4 Ondra Machacek 2017-11-13 13:02:19 UTC
When the ovirt_hosts_facts module is executed, it should always return ansible_facts unless the module execution fails.

I would suggest simplifying the condition to the following, which is more readable:

 - name: Wait for the host to become non operational
   ovirt_hosts_facts:
     pattern: name={{ HOST_NAME }} status=nonresponsive or status=up
     auth:
       username: admin@internal
       password: "{{ ADMIN_PASSWORD }}"
       url: https://{{ FQDN }}/ovirt-engine/api
       insecure: true
   register: host_result
   until: host_result|succeeded and host_result.ansible_facts.ovirt_hosts|length >= 1
   retries: 50
   delay: 10
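
The logic of the `until` condition above can be sketched in Python as a defensive check that never assumes `ansible_facts` is present. This is only an illustration under stated assumptions: `host_is_ready` is a hypothetical helper, the result is modelled as a plain dictionary, and the explicit status test is an addition for illustration (the suggested task filters by status in `pattern` instead).

```python
def host_is_ready(host_result):
    """Return True once the result lists a host that is up or non_operational."""
    # A failed module run may not carry ansible_facts at all.
    if host_result.get("failed"):
        return False
    # Fall back to empty containers instead of raising on missing keys.
    hosts = host_result.get("ansible_facts", {}).get("ovirt_hosts", [])
    if not hosts:
        return False
    # Compare the actual status value rather than a quoted expression.
    return hosts[0].get("status") in ("up", "non_operational")
```

For example, `host_is_ready({"failed": True})` returns `False` instead of raising, so a retry loop built on it would keep polling rather than abort on the first failed query.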

One other note: when the task is executed like this, it re-authenticates on every run. I am not sure whether the ovirt-engine service is restarted somewhere in between, but I would suggest creating an SSO token[1] and reusing it, rather than logging in and out every time.

[1] http://docs.ansible.com/ansible/ovirt_auth_module.html

Comment 8 Nikolai Sednev 2017-11-14 16:17:33 UTC
Created attachment 1352063
sosreport from alma04

Comment 13 Nikolai Sednev 2017-11-15 14:51:54 UTC
Created attachment 1352677
sosreport from alma04

Comment 14 Simone Tiraboschi 2017-11-15 15:18:02 UTC
It's still SELinux on the engine VM:

engine-setup is still hanging on:
system_u:system_r:cloud_init_t:s0 root   10429  0.0  0.1 282300 24580 ?        S    16:34   0:01 /bin/python -B -m otopi.__main__  APPEND:BASE/pluginPath=str:/usr/share/ovirt-engine/setup/bin/../plugins APPEND:B
system_u:system_r:cloud_init_t:s0 root   10567  0.0  0.1 254608 22196 ?        S    16:34   0:00 /usr/bin/python -Es /usr/bin/firewall-cmd --list-all-zones

We reintroduced it only yesterday: https://gerrit.ovirt.org/#/c/84073/

Comment 16 Nikolai Sednev 2017-11-16 15:11:16 UTC
Created attachment 1353557
Screenshot from 2017-11-16 17-10-29.png

Comment 18 Nikolai Sednev 2017-12-12 14:33:18 UTC
Works for me on these components:
Using ovirt-hosted-engine-setup-2.2.1-0.0.master.20171206172737.gitd3001c8.el7.centos.noarch and ovirt-engine-appliance-4.2-20171210.1.el7.centos.noarch:
1. Deployed over NFS storage - success.
2. Deployed over iSCSI storage - success.
3. Deployed over Gluster storage - success.

The initial SHE-VM that was created during deployment and then powered off still appears, shown as "external-HostedEngineLocal".

Comment 19 Sandro Bonazzola 2017-12-20 11:31:03 UTC
This bugzilla is included in the oVirt 4.2.0 release, published on Dec 20th 2017.

Since the problem described in this bug report should be resolved in the oVirt 4.2.0 release, published on Dec 20th 2017, it has been closed with a resolution of CURRENT RELEASE.

If the solution does not work for you, please open a new bug report.
