Bug 1512472

Summary: "hosted-engine --deploy --ansible" deployment fails.
Product: [oVirt] ovirt-hosted-engine-setup Reporter: Nikolai Sednev <nsednev>
Component: General Assignee: Simone Tiraboschi <stirabos>
Status: CLOSED CURRENTRELEASE QA Contact: Nikolai Sednev <nsednev>
Severity: high Docs Contact:
Priority: unspecified    
Version: 2.2.0 CC: bugs, nsednev, omachace
Target Milestone: ovirt-4.2.0 Keywords: Triaged
Target Release: --- Flags: rule-engine: ovirt-4.2+
Hardware: x86_64   
OS: Linux   
Whiteboard:
Fixed In Version: Doc Type: No Doc Update
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2017-12-20 11:31:03 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: Integration RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Bug Depends On:    
Bug Blocks: 1455169    
Attachments:
Description Flags
sosreport from alma03 none
sosreport from alma04 none
sosreport from alma04 none
Screenshot from 2017-11-16 17-10-29.png none

Description Nikolai Sednev 2017-11-13 10:20:13 UTC
Description of problem:

[ ERROR ] fatal: [localhost]: FAILED! => {"failed": true, "msg": "The conditional check 'host_result.ansible_facts.ovirt_hosts|length >= 1 and (\"'non_operational' in host_result.ansible_facts.ovirt_hosts[0].status\" or \"'up' in host_result.ansible_facts.ovirt_hosts[0].status\")' failed. The error was: error while evaluating conditional (host_result.ansible_facts.ovirt_hosts|length >= 1 and (\"'non_operational' in host_result.ansible_facts.ovirt_hosts[0].status\" or \"'up' in host_result.ansible_facts.ovirt_hosts[0].status\")): 'dict object' has no attribute 'ansible_facts'"}
[ ERROR ] Failed to execute stage 'Closing up': Failed running ansible playbook

Version-Release number of selected component (if applicable):
ovirt-hosted-engine-setup-2.2.0-0.0.master.20171110120732.git35143a6.el7.centos.noarch
ovirt-hosted-engine-ha-2.2.0-0.0.master.20171110162946.20171110162942.git3717fac.el7.centos.noarch

How reproducible:
100%

Steps to Reproduce:
1. Run "hosted-engine --deploy --ansible" on a clean RHEL 7 host.

Actual results:
Deployment failed.

Expected results:
Deployment should succeed.

Additional info:
A sosreport from the host is attached.

Comment 1 Nikolai Sednev 2017-11-13 10:22:28 UTC
Created attachment 1351498 [details]
sosreport from alma03

Comment 2 Martin Sivák 2017-11-13 11:28:13 UTC
Which version of the oVirt Ansible modules is used?

Comment 3 Nikolai Sednev 2017-11-13 11:34:34 UTC
ansible-2.4.0.0-5.el7.noarch

Comment 4 Ondra Machacek 2017-11-13 13:02:19 UTC
When the ovirt_hosts_facts module is executed, it should always return ansible_facts, unless the execution of the module fails.


I would suggest simplifying the condition to the following, which is easier to read:

 - name: Wait for the host to become non operational
   ovirt_hosts_facts:
     pattern: name={{ HOST_NAME }} status=nonresponsive or status=up
     auth:
       username: admin@internal
       password: "{{ ADMIN_PASSWORD }}"
       url: https://{{ FQDN }}/ovirt-engine/api
       insecure: true
   register: host_result
   until: host_result|succeeded and host_result.ansible_facts.ovirt_hosts|length >= 1
   retries: 50
   delay: 10
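
Note that in the original conditional the two status checks are wrapped in escaped double quotes, so Jinja2 treats them as non-empty string literals, which are always true. If the status should be checked in the until expression rather than in the search pattern, a rough sketch could look like the following. The status values are taken from the original conditional and everything else mirrors the task above; it is untested and only meant as an illustration.

```yaml
- name: Wait for the host to become non operational or up
  ovirt_hosts_facts:
    pattern: name={{ HOST_NAME }}
    auth:
      username: admin@internal
      password: "{{ ADMIN_PASSWORD }}"
      url: https://{{ FQDN }}/ovirt-engine/api
      insecure: true
  register: host_result
  # Guard first: a failed attempt carries no ansible_facts key,
  # so the remaining checks are only evaluated after a successful call.
  until: >-
    host_result|succeeded and
    host_result.ansible_facts.ovirt_hosts|length >= 1 and
    host_result.ansible_facts.ovirt_hosts[0].status in ['non_operational', 'up']
  retries: 50
  delay: 10
```

Because the expression short-circuits on host_result|succeeded, a failed module run is simply retried instead of raising "'dict object' has no attribute 'ansible_facts'".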

One other note: when the task is executed like this, it re-authenticates on every attempt. I am not sure whether the ovirt-engine service is restarted somewhere in between, but I would suggest creating an SSO token[1] and reusing it, rather than logging in and out all the time.

[1] http://docs.ansible.com/ansible/ovirt_auth_module.html
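
A minimal sketch of the token reuse, assuming the same HOST_NAME, FQDN and ADMIN_PASSWORD variables as in the task above. The ovirt_auth module stores the token in the ovirt_auth fact, which the facts task then receives through its auth parameter. The task names are only illustrative, and this has not been tested against the hosted-engine playbooks.

```yaml
- name: Obtain an SSO token once
  ovirt_auth:
    url: https://{{ FQDN }}/ovirt-engine/api
    username: admin@internal
    password: "{{ ADMIN_PASSWORD }}"
    insecure: true

- name: Wait for the host, reusing the token
  ovirt_hosts_facts:
    pattern: name={{ HOST_NAME }} status=nonresponsive or status=up
    # Reuse the token instead of logging in on every retry.
    auth: "{{ ovirt_auth }}"
  register: host_result
  until: host_result|succeeded and host_result.ansible_facts.ovirt_hosts|length >= 1
  retries: 50
  delay: 10

- name: Revoke the SSO token
  ovirt_auth:
    state: absent
    ovirt_auth: "{{ ovirt_auth }}"
```

In a real playbook the revoke task would probably go into an always section of a block, so that the token is released even when the wait task runs out of retries.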

Comment 8 Nikolai Sednev 2017-11-14 16:17:33 UTC
Created attachment 1352063 [details]
sosreport from alma04

Comment 13 Nikolai Sednev 2017-11-15 14:51:54 UTC
Created attachment 1352677 [details]
sosreport from alma04

Comment 14 Simone Tiraboschi 2017-11-15 15:18:02 UTC
It's still SELinux on the engine VM.

engine-setup is still hung on:
system_u:system_r:cloud_init_t:s0 root   10429  0.0  0.1 282300 24580 ?        S    16:34   0:01 /bin/python -B -m otopi.__main__  APPEND:BASE/pluginPath=str:/usr/share/ovirt-engine/setup/bin/../plugins APPEND:B
system_u:system_r:cloud_init_t:s0 root   10567  0.0  0.1 254608 22196 ?        S    16:34   0:00 /usr/bin/python -Es /usr/bin/firewall-cmd --list-all-zones

We reintroduced it only yesterday: https://gerrit.ovirt.org/#/c/84073/

Comment 16 Nikolai Sednev 2017-11-16 15:11:16 UTC
Created attachment 1353557 [details]
Screenshot from 2017-11-16 17-10-29.png

Comment 18 Nikolai Sednev 2017-12-12 14:33:18 UTC
Works for me with ovirt-hosted-engine-setup-2.2.1-0.0.master.20171206172737.gitd3001c8.el7.centos.noarch and ovirt-engine-appliance-4.2-20171210.1.el7.centos.noarch:
1. Deployed over NFS storage - success.
2. Deployed over iSCSI storage - success.
3. Deployed over Gluster storage - success.

The initial SHE-VM that was created during deployment and then powered off still appears and is shown as "external-HostedEngineLocal".

Comment 19 Sandro Bonazzola 2017-12-20 11:31:03 UTC
This bug is included in the oVirt 4.2.0 release, published on December 20th, 2017.

Since the problem described in this bug report should be resolved in the oVirt 4.2.0 release, it has been closed with a resolution of CURRENT RELEASE.

If the solution does not work for you, please open a new bug report.