Bug 1386256

Summary: RHV self hosted fails to deploy.
Product: Red Hat Quickstart Cloud Installer
Reporter: James Olin Oden <joden>
Component: Installation - RHEV
Assignee: Fabian von Feilitzsch <fabian>
Status: CLOSED ERRATA
QA Contact: Tasos Papaioannou <tpapaioa>
Severity: unspecified
Docs Contact: Dan Macpherson <dmacpher>
Priority: high
Version: 1.1
CC: bthurber, jmatthew, joden, qci-bugzillas, tpapaioa
Target Milestone: ---
Keywords: Triaged
Target Release: 1.1
Hardware: Unspecified
OS: Unspecified
Whiteboard:
Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2017-02-28 01:40:12 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:
Bug Depends On: 1386293, 1411435
Bug Blocks:
Attachments:
Description: sosreport archive from hypervisor with failure.
Flags: none

Description James Olin Oden 2016-10-18 13:46:35 UTC
Created attachment 1211748 [details]
sosreport archive from hypervisor with failure.

Description of problem:
The RHV self-hosted deployment runs for quite a while, hangs at some percentage near the end, and eventually fails. The output for the task follows:

================================
ansible-ovirt returned a non-zero return code

PLAY [self_hosted_first_host] **************************************************

TASK [wait_for_host_up : wait for SSH to respond on host] **********************
ok: [hypervisor2.b.b -> localhost]

TASK [wait_for_host_up : Gather facts] *****************************************
ok: [hypervisor2.b.b]

TASK [override_tty : Override tty] *********************************************
changed: [hypervisor2.b.b]

TASK [subscription : print repositories] ***************************************
ok: [hypervisor2.b.b] => {
    "msg": [
        "rhel-7-server-beta-rpms", 
        "rhel-7-server-satellite-tools-6.2-rpms", 
        "rhel-7-server-rhv-4-mgmt-agent-rpms", 
        "rhel-7-server-supplementary-beta-rpms", 
        "rhel-7-server-optional-beta-rpms"
    ]
}

TASK [subscription : disable all] **********************************************
changed: [hypervisor2.b.b]

TASK [subscription : enable repos] *********************************************
changed: [hypervisor2.b.b]

TASK [self_hosted_first_host : Install dependences (~2GB)] *********************
changed: [hypervisor2.b.b] => (item=[u'genisoimage', u'rhevm-appliance', u'glusterfs-fuse', u'ovirt-hosted-engine-setup'])

TASK [self_hosted_first_host : Stop and disable NetworkManager] ****************
changed: [hypervisor2.b.b]

TASK [self_hosted_first_host : Create qemu group] ******************************
ok: [hypervisor2.b.b]

TASK [self_hosted_first_host : Create qemu user] *******************************
ok: [hypervisor2.b.b]

TASK [self_hosted_first_host : Find the path to the appliance image] ***********
changed: [hypervisor2.b.b]

TASK [self_hosted_first_host : get the provisioning nic for the machine] *******
changed: [hypervisor2.b.b]

TASK [self_hosted_first_host : create config directory] ************************
changed: [hypervisor2.b.b]

TASK [self_hosted_first_host : Get the answer file over there] *****************
changed: [hypervisor2.b.b]

TASK [self_hosted_first_host : Create cloud init temp directory] ***************
changed: [hypervisor2.b.b]

TASK [self_hosted_first_host : Copy over the cloud init data] ******************
changed: [hypervisor2.b.b] => (item={u'dest': u'/etc/qci//cloud_init/user-data', u'src': u'user-data.j2'})
changed: [hypervisor2.b.b] => (item={u'dest': u'/etc/qci//cloud_init/meta-data', u'src': u'meta-data.j2'})

TASK [self_hosted_first_host : Generate cloud-init iso] ************************
changed: [hypervisor2.b.b]

TASK [self_hosted_first_host : Fix permissions on iso] *************************
changed: [hypervisor2.b.b]

TASK [self_hosted_first_host : check if the setup has already run] *************
fatal: [hypervisor2.b.b]: FAILED! => {"changed": false, "cmd": ["systemctl", "status", "ovirt-ha-agent"], "delta": "0:00:00.014686", "end": "2016-10-14 15:19:25.162781", "failed": true, "rc": 3, "start": "2016-10-14 15:19:25.148095", "stderr": "", "stdout": "● ovirt-ha-agent.service - RHEV Hosted Engine High Availability Monitoring Agent
   Loaded: loaded (/usr/lib/systemd/system/ovirt-ha-agent.service; disabled; vendor preset: disabled)
   Active: inactive (dead)", "stdout_lines": ["● ovirt-ha-agent.service - RHEV Hosted Engine High Availability Monitoring Agent", "   Loaded: loaded (/usr/lib/systemd/system/ovirt-ha-agent.service; disabled; vendor preset: disabled)", "   Active: inactive (dead)"], "warnings": []}
...ignoring

TASK [self_hosted_first_host : Execute hosted-engine setup] ********************
fatal: [hypervisor2.b.b]: FAILED! => {"async_result": {"ansible_job_id": "170668647605.13966", "changed": false, "finished": 0, "invocation": {"module_args": {"jid": "170668647605.13966", "mode": "status"}, "module_name": "async_status"}, "started": 1}, "changed": false, "failed": true, "msg": "async task produced unparseable results"}

NO MORE HOSTS LEFT *************************************************************

PLAY RECAP *********************************************************************
hypervisor2.b.b            : ok=19   changed=13   unreachable=0    failed=1   
==================================
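
For triage, output like the above can be summarized mechanically. The following is a minimal sketch, not part of QCI or ansible-ovirt: the helper names `failed_tasks` and `parse_recap` are hypothetical, and the regular expressions assume the plain-text layout shown in this report (a `TASK [...]` header, `fatal:` result lines, an optional `...ignoring` line, and a `PLAY RECAP` host line). It separates failures that were ignored (such as the `ovirt-ha-agent` status check) from the one that actually stopped the play.

```python
import re

# Header Ansible prints before each task, e.g. "TASK [role : name] ****".
TASK_RE = re.compile(r"^TASK \[(?P<name>.+)\] \*+\s*$")
# One host line from the PLAY RECAP section (layout assumed from this report).
RECAP_RE = re.compile(
    r"^(?P<host>\S+)\s*:\s*ok=(?P<ok>\d+)\s+changed=(?P<changed>\d+)"
    r"\s+unreachable=(?P<unreachable>\d+)\s+failed=(?P<failed>\d+)"
)


def failed_tasks(output):
    """Return [(task_name, ignored), ...] for each 'fatal:' result line."""
    results = []
    current_task = None
    for line in output.splitlines():
        header = TASK_RE.match(line)
        if header:
            current_task = header.group("name")
        elif line.startswith("fatal:"):
            # Assume the failure is not ignored until "...ignoring" is seen.
            results.append([current_task, False])
        elif line.strip() == "...ignoring" and results:
            results[-1][1] = True
    return [tuple(r) for r in results]


def parse_recap(line):
    """Return (host, counts) for a PLAY RECAP host line, or None."""
    m = RECAP_RE.match(line.strip())
    if m is None:
        return None
    counts = {k: int(v) for k, v in m.groupdict().items() if k != "host"}
    return m.group("host"), counts
```

Applied to the output above, the only non-ignored failure would be the "Execute hosted-engine setup" task, which matches the single `failed=1` in the recap.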


Comment 2 James Olin Oden 2016-10-18 18:12:51 UTC
Forgot to enter some essential information:


Version-Release number of selected component (if applicable):
QCI-1.1-RHEL-7-20161013.t.0

How reproducible:
Very

Steps to Reproduce:
1. Do a self-hosted RHV deployment.

Actual results:
It fails with the error shown above.

Expected results:
No failures.

Comment 3 Fabian von Feilitzsch 2016-11-01 17:05:12 UTC
What version of ansible is installed on the Satellite?

Comment 4 James Olin Oden 2016-11-01 19:00:17 UTC
I don't have that version of QCI installed anywhere anymore.   Do you need me to install it again somewhere and see what version of ansible gets installed (I'm assuming ansible comes in from running the fusor-installer)?

Comment 5 James Olin Oden 2016-11-01 19:00:50 UTC
Oh, wait: are the versions of the RPMs on the system not in the sosreport?

Comment 6 Fabian von Feilitzsch 2016-11-01 19:06:35 UTC
No, the sosreport is for the hypervisor; ansible is only installed on the Satellite.

Comment 7 James Olin Oden 2016-11-02 13:28:56 UTC
It was ansible 2.1.1.0-2.el7

Comment 8 Fabian von Feilitzsch 2016-11-16 15:22:25 UTC
https://github.com/fusor/ansible-ovirt/pull/9

Comment 9 John Matthews 2016-11-22 13:38:33 UTC
Expected in 11/21 ISO

Comment 10 Tasos Papaioannou 2016-11-29 19:41:35 UTC
Verified on QCI-1.1-RHEL-7-20161128.t.0.

The older version of ebtables is installed on the RHV host:

# rpm -q ebtables
ebtables-2.0.10-13.el7.x86_64

and the engine VM starts up successfully:

# cat /var/log/ovirt-hosted-engine-setup/ovirt-hosted-engine-setup-20161129162356-56bko9.log
[...]
2016-11-29 16:47:01 DEBUG otopi.plugins.gr_he_setup.engine.add_host add_host._wait_host_ready:96 VDSM host in initializing state
2016-11-29 16:47:02 DEBUG otopi.plugins.gr_he_setup.engine.add_host add_host._wait_host_ready:96 VDSM host in initializing state
2016-11-29 16:47:03 DEBUG otopi.plugins.gr_he_setup.engine.add_host add_host._wait_host_ready:96 VDSM host in up state
2016-11-29 16:47:03 INFO otopi.plugins.gr_he_setup.engine.add_host add_host._wait_host_ready:107 The VDSM Host is now operational
[...]
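
The same check can be scripted when verifying several hosts. This is a minimal sketch under stated assumptions: the function names `host_states` and `first_up` are hypothetical, the line layout is taken only from the excerpt above, and state names are assumed to be single words (only `initializing` and `up` appear in this report).

```python
import re
from datetime import datetime

# Matches lines such as:
#   2016-11-29 16:47:03 DEBUG ... VDSM host in up state
STATE_RE = re.compile(
    r"^(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}) \S+ .*"
    r"VDSM host in (?P<state>\S+) state\s*$"
)


def host_states(lines):
    """Return [(timestamp, state), ...] for each host state line, in order."""
    states = []
    for line in lines:
        m = STATE_RE.match(line)
        if m:
            ts = datetime.strptime(m.group("ts"), "%Y-%m-%d %H:%M:%S")
            states.append((ts, m.group("state")))
    return states


def first_up(lines):
    """Return the timestamp of the first 'up' state, or None if never seen."""
    for ts, state in host_states(lines):
        if state == "up":
            return ts
    return None
```

Passing an open log file to `first_up` returns `None` when the host never reached the up state, which is the case a verification script would want to flag.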

Comment 13 errata-xmlrpc 2017-02-28 01:40:12 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHEA-2017:0335