Bug 1382678

Summary: RHV: ansible: OSError: [Errno 38] Function not implemented
Product: Red Hat Quickstart Cloud Installer
Reporter: Antonin Pagac <apagac>
Component: Installation - RHEV
Assignee: dgao
Status: CLOSED ERRATA
QA Contact: James Olin Oden <joden>
Severity: unspecified
Docs Contact: Dan Macpherson <dmacpher>
Priority: unspecified
Version: 1.1
CC: bthurber, jmatthew, joden
Target Milestone: ---
Keywords: Triaged
Target Release: 1.1
Hardware: Unspecified
OS: Unspecified
Doc Type: If docs needed, set a value
Last Closed: 2017-02-28 01:39:57 UTC
Type: Bug
Attachments:
Description: excerpt from production.log; Flags: none

Description Antonin Pagac 2016-10-07 11:36:51 UTC
Created attachment 1208139
excerpt from production.log

Description of problem:
While deploying RHV, a timeout occurred because the HW machine did not boot from the correct interface (a known user configuration error). The message was:

'[foreman-tasks/action] [E] Connection timed out - connect(2) for "mac6451060d9f95.example.com" port 22 (Errno::ETIMEDOUT)'

After fixing the problem and resuming the task, it failed with:

'ansible-RHEV returned a non-zero return code
Traceback (most recent call last):
  File "/usr/bin/ansible-playbook", line 42, in <module>
    debug_lock = Lock()
  File "/usr/lib64/python2.7/multiprocessing/__init__.py", line 176, in Lock
    return Lock()
  File "/usr/lib64/python2.7/multiprocessing/synchronize.py", line 147, in __init__
    SemLock.__init__(self, SEMAPHORE, 1, 1)
  File "/usr/lib64/python2.7/multiprocessing/synchronize.py", line 75, in __init__
    sl = self._semlock = _multiprocessing.SemLock(kind, value, maxvalue)
OSError: [Errno 38] Function not implemented'

See attachment for complete traceback from production.log.
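
The traceback above ends in the creation of a multiprocessing lock, which on Linux relies on POSIX semaphores. A minimal sketch for checking the same call in isolation follows. It assumes, as is commonly the case for this errno, that ENOSYS here means the process cannot use /dev/shm (for example because it is not mounted or access is denied by policy); the report itself does not state the root cause. The function name check_semlock and its lock_factory parameter are hypothetical and only serve the illustration.

```python
import errno
import multiprocessing


def check_semlock(lock_factory=multiprocessing.Lock):
    """Return (ok, detail) describing whether a multiprocessing lock can be created.

    lock_factory is injectable so the failure path can be exercised
    without a broken host; by default the real multiprocessing.Lock is used.
    """
    try:
        lock = lock_factory()
    except OSError as exc:
        if exc.errno == errno.ENOSYS:
            # Same errno as in the traceback: semaphore creation is unavailable.
            return False, "ENOSYS: POSIX semaphores unavailable (check /dev/shm access)"
        return False, "OSError: %s" % exc
    except ImportError as exc:
        # Some platforms lack a working semaphore implementation entirely.
        return False, "ImportError: %s" % exc
    # Acquire and release once to confirm the lock is usable.
    lock.acquire()
    lock.release()
    return True, "ok"


if __name__ == "__main__":
    ok, detail = check_semlock()
    print("semaphore check: %s (%s)" % ("passed" if ok else "failed", detail))
```

Running this in the same context as the failing ansible-playbook process (same user and confinement) would show whether the lock creation fails independently of Ansible.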

Version-Release number of selected component (if applicable):
QCI-1.1-RHEL-7-20161006.t.0

How reproducible:
Happened to me once

Steps to Reproduce:
1. Start a RHV deployment
2. When the HW hypervisor restarts during the installation, shut the machine down
3. Wait for a timeout error to be displayed in the UI
4. Start the hypervisor machine
5. Try to resume the failing task in the UI

Actual results:
Error appears, task is not resumed

Expected results:
Able to resume the task when the problem is fixed

Additional info:

Comment 2 John Matthews 2016-10-07 19:56:50 UTC
https://github.com/fusor/fusor-selinux/pull/35

Will be in the 1/7/2016 compose

Comment 3 James Olin Oden 2016-10-18 13:30:00 UTC
Verified in QCI-1.1-RHEL-7-20161013.t.0

Note: RHV self-hosted still fails to install, but it gets past this problem and now hits a different error. I will create a new Bugzilla report for it.

Here is the output from the new error:

---------------------------------------------------------------
ansible-ovirt returned a non-zero return code

PLAY [self_hosted_first_host] **************************************************

TASK [wait_for_host_up : wait for SSH to respond on host] **********************
ok: [hypervisor2.b.b -> localhost]

TASK [wait_for_host_up : Gather facts] *****************************************
ok: [hypervisor2.b.b]

TASK [override_tty : Override tty] *********************************************
changed: [hypervisor2.b.b]

TASK [subscription : print repositories] ***************************************
ok: [hypervisor2.b.b] => {
    "msg": [
        "rhel-7-server-beta-rpms", 
        "rhel-7-server-satellite-tools-6.2-rpms", 
        "rhel-7-server-rhv-4-mgmt-agent-rpms", 
        "rhel-7-server-supplementary-beta-rpms", 
        "rhel-7-server-optional-beta-rpms"
    ]
}

TASK [subscription : disable all] **********************************************
changed: [hypervisor2.b.b]

TASK [subscription : enable repos] *********************************************
changed: [hypervisor2.b.b]

TASK [self_hosted_first_host : Install dependences (~2GB)] *********************
changed: [hypervisor2.b.b] => (item=[u'genisoimage', u'rhevm-appliance', u'glusterfs-fuse', u'ovirt-hosted-engine-setup'])

TASK [self_hosted_first_host : Stop and disable NetworkManager] ****************
changed: [hypervisor2.b.b]

TASK [self_hosted_first_host : Create qemu group] ******************************
ok: [hypervisor2.b.b]

TASK [self_hosted_first_host : Create qemu user] *******************************
ok: [hypervisor2.b.b]

TASK [self_hosted_first_host : Find the path to the appliance image] ***********
changed: [hypervisor2.b.b]

TASK [self_hosted_first_host : get the provisioning nic for the machine] *******
changed: [hypervisor2.b.b]

TASK [self_hosted_first_host : create config directory] ************************
changed: [hypervisor2.b.b]

TASK [self_hosted_first_host : Get the answer file over there] *****************
changed: [hypervisor2.b.b]

TASK [self_hosted_first_host : Create cloud init temp directory] ***************
changed: [hypervisor2.b.b]

TASK [self_hosted_first_host : Copy over the cloud init data] ******************
changed: [hypervisor2.b.b] => (item={u'dest': u'/etc/qci//cloud_init/user-data', u'src': u'user-data.j2'})
changed: [hypervisor2.b.b] => (item={u'dest': u'/etc/qci//cloud_init/meta-data', u'src': u'meta-data.j2'})

TASK [self_hosted_first_host : Generate cloud-init iso] ************************
changed: [hypervisor2.b.b]

TASK [self_hosted_first_host : Fix permissions on iso] *************************
changed: [hypervisor2.b.b]

TASK [self_hosted_first_host : check if the setup has already run] *************
fatal: [hypervisor2.b.b]: FAILED! => {"changed": false, "cmd": ["systemctl", "status", "ovirt-ha-agent"], "delta": "0:00:00.014686", "end": "2016-10-14 15:19:25.162781", "failed": true, "rc": 3, "start": "2016-10-14 15:19:25.148095", "stderr": "", "stdout": "● ovirt-ha-agent.service - RHEV Hosted Engine High Availability Monitoring Agent
   Loaded: loaded (/usr/lib/systemd/system/ovirt-ha-agent.service; disabled; vendor preset: disabled)
   Active: inactive (dead)", "stdout_lines": ["● ovirt-ha-agent.service - RHEV Hosted Engine High Availability Monitoring Agent", "   Loaded: loaded (/usr/lib/systemd/system/ovirt-ha-agent.service; disabled; vendor preset: disabled)", "   Active: inactive (dead)"], "warnings": []}
...ignoring

TASK [self_hosted_first_host : Execute hosted-engine setup] ********************
fatal: [hypervisor2.b.b]: FAILED! => {"async_result": {"ansible_job_id": "170668647605.13966", "changed": false, "finished": 0, "invocation": {"module_args": {"jid": "170668647605.13966", "mode": "status"}, "module_name": "async_status"}, "started": 1}, "changed": false, "failed": true, "msg": "async task produced unparseable results"}

NO MORE HOSTS LEFT *************************************************************

PLAY RECAP *********************************************************************
hypervisor2.b.b            : ok=19   changed=13   unreachable=0    failed=1
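
When triaging output like the above, the PLAY RECAP line is the quickest indicator of which host failed. The following is a minimal sketch, assuming the recap format shown in this log (host name, a colon, then key=value counters); the helper name parse_recap_line is hypothetical and not part of Ansible.

```python
import re

# Matches lines such as:
#   hypervisor2.b.b            : ok=19   changed=13   unreachable=0    failed=1
RECAP_LINE = re.compile(r"^(?P<host>\S+)\s*:\s*(?P<counters>(?:\w+=\d+\s*)+)$")


def parse_recap_line(line):
    """Return (host, counters) for a PLAY RECAP line, or None if it does not match."""
    match = RECAP_LINE.match(line.strip())
    if match is None:
        return None
    counters = {}
    for pair in match.group("counters").split():
        key, value = pair.split("=")
        counters[key] = int(value)
    return match.group("host"), counters


if __name__ == "__main__":
    sample = "hypervisor2.b.b            : ok=19   changed=13   unreachable=0    failed=1"
    host, counters = parse_recap_line(sample)
    if counters.get("failed", 0) or counters.get("unreachable", 0):
        print("%s did not complete cleanly: %r" % (host, counters))
```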

Comment 6 errata-xmlrpc 2017-02-28 01:39:57 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHEA-2017:0335