Created attachment 1211748 [details]
sosreport archive from hypervisor with failure.

Description of problem:

The RHV self-hosted deployment runs for quite a while, then near the end it hangs at some percentage complete and eventually fails. The output for the failing task follows:

================================
ansible-ovirt returned a non-zero return code

PLAY [self_hosted_first_host] **************************************************

TASK [wait_for_host_up : wait for SSH to respond on host] **********************
ok: [hypervisor2.b.b -> localhost]

TASK [wait_for_host_up : Gather facts] *****************************************
ok: [hypervisor2.b.b]

TASK [override_tty : Override tty] *********************************************
changed: [hypervisor2.b.b]

TASK [subscription : print repositories] ***************************************
ok: [hypervisor2.b.b] => {
    "msg": [
        "rhel-7-server-beta-rpms",
        "rhel-7-server-satellite-tools-6.2-rpms",
        "rhel-7-server-rhv-4-mgmt-agent-rpms",
        "rhel-7-server-supplementary-beta-rpms",
        "rhel-7-server-optional-beta-rpms"
    ]
}

TASK [subscription : disable all] **********************************************
changed: [hypervisor2.b.b]

TASK [subscription : enable repos] *********************************************
changed: [hypervisor2.b.b]

TASK [self_hosted_first_host : Install dependences (~2GB)] *********************
changed: [hypervisor2.b.b] => (item=[u'genisoimage', u'rhevm-appliance', u'glusterfs-fuse', u'ovirt-hosted-engine-setup'])

TASK [self_hosted_first_host : Stop and disable NetworkManager] ****************
changed: [hypervisor2.b.b]

TASK [self_hosted_first_host : Create qemu group] ******************************
ok: [hypervisor2.b.b]

TASK [self_hosted_first_host : Create qemu user] *******************************
ok: [hypervisor2.b.b]

TASK [self_hosted_first_host : Find the path to the appliance image] ***********
changed: [hypervisor2.b.b]

TASK [self_hosted_first_host : get the provisioning nic for the machine] *******
changed: [hypervisor2.b.b]

TASK [self_hosted_first_host : create config directory] ************************
changed: [hypervisor2.b.b]

TASK [self_hosted_first_host : Get the answer file over there] *****************
changed: [hypervisor2.b.b]

TASK [self_hosted_first_host : Create cloud init temp directory] ***************
changed: [hypervisor2.b.b]

TASK [self_hosted_first_host : Copy over the cloud init data] ******************
changed: [hypervisor2.b.b] => (item={u'dest': u'/etc/qci//cloud_init/user-data', u'src': u'user-data.j2'})
changed: [hypervisor2.b.b] => (item={u'dest': u'/etc/qci//cloud_init/meta-data', u'src': u'meta-data.j2'})

TASK [self_hosted_first_host : Generate cloud-init iso] ************************
changed: [hypervisor2.b.b]

TASK [self_hosted_first_host : Fix permissions on iso] *************************
changed: [hypervisor2.b.b]

TASK [self_hosted_first_host : check if the setup has already run] *************
fatal: [hypervisor2.b.b]: FAILED! => {"changed": false, "cmd": ["systemctl", "status", "ovirt-ha-agent"], "delta": "0:00:00.014686", "end": "2016-10-14 15:19:25.162781", "failed": true, "rc": 3, "start": "2016-10-14 15:19:25.148095", "stderr": "", "stdout": "● ovirt-ha-agent.service - RHEV Hosted Engine High Availability Monitoring Agent Loaded: loaded (/usr/lib/systemd/system/ovirt-ha-agent.service; disabled; vendor preset: disabled) Active: inactive (dead)", "stdout_lines": ["● ovirt-ha-agent.service - RHEV Hosted Engine High Availability Monitoring Agent", " Loaded: loaded (/usr/lib/systemd/system/ovirt-ha-agent.service; disabled; vendor preset: disabled)", " Active: inactive (dead)"], "warnings": []}
...ignoring

TASK [self_hosted_first_host : Execute hosted-engine setup] ********************
fatal: [hypervisor2.b.b]: FAILED! => {"async_result": {"ansible_job_id": "170668647605.13966", "changed": false, "finished": 0, "invocation": {"module_args": {"jid": "170668647605.13966", "mode": "status"}, "module_name": "async_status"}, "started": 1}, "changed": false, "failed": true, "msg": "async task produced unparseable results"}

NO MORE HOSTS LEFT *************************************************************

PLAY RECAP *********************************************************************
hypervisor2.b.b            : ok=19   changed=13   unreachable=0    failed=1
==================================

Version-Release number of selected component (if applicable):


How reproducible:


Steps to Reproduce:
1.
2.
3.

Actual results:


Expected results:


Additional info:
Forgot to enter some essential information:

Version-Release number of selected component (if applicable):
QCI-1.1-RHEL-7-20161013.t.0

How reproducible:
Very

Steps to Reproduce:
1. Do a self-hosted RHV deployment.

Actual results:
It fails with the error shown above.

Expected results:
No failures.
What version of ansible is installed on the Satellite?
I don't have that version of QCI installed anywhere anymore. Do you need me to install it again somewhere and see what version of ansible gets installed (I'm assuming ansible comes in from running the fusor-installer)?
Oh, wait. Aren't the versions of the RPMs on the system listed in the sos report?
No, the SOS report is for the hypervisor; ansible is only installed on the Satellite.
It was ansible 2.1.1.0-2.el7
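For reference, the version can be confirmed either directly on the Satellite or from a Satellite sosreport; a quick sketch, assuming the usual sosreport layout with an installed-rpms listing at the top level of the archive (the hypervisor sosreport attached here will not show it, since ansible only runs on the Satellite):

# Directly on the Satellite host:
rpm -q ansible

# Or from an extracted Satellite sosreport:
grep '^ansible-' sosreport-*/installed-rpms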
https://github.com/fusor/ansible-ovirt/pull/9
Expected in 11/21 ISO
Verified on QCI-1.1-RHEL-7-20161128.t.0.

The older version of ebtables is installed on the RHV host:

# rpm -q ebtables
ebtables-2.0.10-13.el7.x86_64

and the engine VM starts up successfully:

# cat /var/log/ovirt-hosted-engine-setup/ovirt-hosted-engine-setup-20161129162356-56bko9.log
[...]
2016-11-29 16:47:01 DEBUG otopi.plugins.gr_he_setup.engine.add_host add_host._wait_host_ready:96 VDSM host in initializing state
2016-11-29 16:47:02 DEBUG otopi.plugins.gr_he_setup.engine.add_host add_host._wait_host_ready:96 VDSM host in initializing state
2016-11-29 16:47:03 DEBUG otopi.plugins.gr_he_setup.engine.add_host add_host._wait_host_ready:96 VDSM host in up state
2016-11-29 16:47:03 INFO otopi.plugins.gr_he_setup.engine.add_host add_host._wait_host_ready:107 The VDSM Host is now operational
[...]
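For completeness, the same verification can be repeated from a shell on the RHV host after deployment; a short sketch, assuming the standard hosted-engine tooling from ovirt-hosted-engine-setup and ovirt-hosted-engine-ha is installed:

# Package version the fix cares about:
rpm -q ebtables

# HA services and engine VM state:
systemctl status ovirt-ha-agent ovirt-ha-broker
hosted-engine --vm-status

# Most recent setup log, which should end with the host going operational:
tail -n 50 "$(ls -t /var/log/ovirt-hosted-engine-setup/*.log | head -n 1)"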
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA.

For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHEA-2017:0335