Bug 1439277

Summary: Ansible Install is unable to complete install due to module loading issues.
Product: OpenShift Container Platform
Reporter: Eric Rich <erich>
Component: Installer
Assignee: Scott Dodson <sdodson>
Status: CLOSED ERRATA
QA Contact: Johnny Liu <jialiu>
Severity: medium
Docs Contact:
Priority: medium
Version: 3.4.0
CC: abutcher, aos-bugs, erich, ghuang, gpei, jmeyer, jokerman, jtanner, mmccomas, mnozell, sdodson
Target Milestone: ---
Target Release: 3.4.z
Hardware: Unspecified
OS: Unspecified
Whiteboard:
Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Running ansible via 'batch' systems like the `nohup` command caused ansible to leak file descriptors and abort playbooks whenever the maximum number of open file descriptors was reached. Ansible 2.2.3.0 includes a fix for this problem and OCP channels have been updated to include this version.
Story Points: ---
Clone Of:
Environment:
Last Closed: 2017-05-17 17:39:15 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:

Description Eric Rich 2017-04-05 14:57:21 UTC
Description of problem:

The conditional check 'openshift.common.is_containerized | bool' failed. The error was: No module named ''

The modules that fail to load are: log, ntpath, cmd

Version-Release number of selected component (if applicable): 3.4
How reproducible: Unconfirmed 

Steps to Reproduce:
1. Run the Ansible install using the attached hosts file.

Actual results:

TASK [openshift_version : fail] ************************************************
fatal: [HOST1.DOMAIN.com]: FAILED! => {
    "failed": true
}

MSG:

The conditional check 'openshift.common.is_containerized | bool' failed. The error was: cannot import name log

The error appears to have been in '/usr/share/ansible/openshift-ansible/roles/openshift_version/tasks/main.yml': line 10, column 3, but may
be elsewhere in the file depending on the exact syntax problem.

The offending line appears to be:

# be used by default. Users must indicate what they want.
- fail:
  ^ here

fatal: [HOST3.DOMAIN.com]: FAILED! => {
    "failed": true
}

MSG:

The conditional check 'openshift.common.is_containerized | bool' failed. The error was: No module named ntpath

The error appears to have been in '/usr/share/ansible/openshift-ansible/roles/openshift_version/tasks/main.yml': line 10, column 3, but may
be elsewhere in the file depending on the exact syntax problem.

The offending line appears to be:

# be used by default. Users must indicate what they want.
- fail:
  ^ here

fatal: [HOST2.DOMAIN.com]: FAILED! => {
    "failed": true
}

MSG:

The conditional check 'openshift.common.is_containerized | bool' failed. The error was: No module named cmd

The error appears to have been in '/usr/share/ansible/openshift-ansible/roles/openshift_version/tasks/main.yml': line 10, column 3, but may
be elsewhere in the file depending on the exact syntax problem.

The offending line appears to be:

# be used by default. Users must indicate what they want.
- fail:
  ^ here

Expected results:

The install should complete. 

Additional info:

Comment 16 Scott Dodson 2017-04-24 14:26:24 UTC
The root cause is that ansible leaks file descriptors when run via nohup. This will be fixed in the next ansible release.

https://github.com/ansible/ansible/issues/23541
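
A descriptor leak of this kind can be watched from inside a process by comparing the number of open descriptors with the per-process limit. The following is only a minimal sketch, not something taken from ansible itself: the helper names `open_fd_count` and `fd_headroom` are hypothetical, and it assumes a Linux host where `/proc/self/fd` is available.

```python
import os
import resource


def open_fd_count():
    """Count descriptors open in this process (assumes Linux /proc).

    Listing the directory briefly opens one descriptor of its own, so the
    result is best used for comparing two readings, not as an exact figure.
    """
    return len(os.listdir("/proc/self/fd"))


def fd_headroom():
    """Return (open descriptors, soft RLIMIT_NOFILE) for this process."""
    soft_limit, _hard_limit = resource.getrlimit(resource.RLIMIT_NOFILE)
    return open_fd_count(), soft_limit


if __name__ == "__main__":
    used, limit = fd_headroom()
    print("open file descriptors: %d (soft limit %d)" % (used, limit))
```

If the first number keeps growing across tasks while the limit stays fixed, the process is heading toward the "Too many open files" failure.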

Comment 18 Scott Dodson 2017-05-01 19:49:57 UTC
Lowering severity as a clear workaround is available, i.e. don't use 'nohup'. This should be fixed in the next ansible release, which we'll ship when it becomes available.

Comment 19 Scott Dodson 2017-05-15 18:30:28 UTC
Fixed in ansible-2.2.3.0
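
For anyone scripting a pre-flight check, the version named above can be compared numerically instead of as a string. This is a rough sketch under stated assumptions: `parse_version` and `has_fd_leak_fix` are hypothetical helpers, the input is assumed to be the version field only (optionally followed by an RPM release such as `-1.el7`), and only purely numeric dotted versions are handled.

```python
def parse_version(text):
    """Turn '2.2.3.0' or '2.2.3.0-1.el7' into a tuple of ints.

    Anything after the first '-' (the RPM release) is ignored. This is an
    illustrative parser, not a full RPM version comparison.
    """
    version_only = text.split("-", 1)[0]
    return tuple(int(part) for part in version_only.split("."))


# Version that carries the fix, per the comment above.
FIXED_IN = parse_version("2.2.3.0")


def has_fd_leak_fix(installed):
    """Return True if the installed ansible version is at least 2.2.3.0."""
    return parse_version(installed) >= FIXED_IN


if __name__ == "__main__":
    for candidate in ("2.2.2.0-1.el7", "2.2.3.0-1.el7"):
        print(candidate, has_fd_leak_fix(candidate))
```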

Comment 21 Gan Huang 2017-05-16 05:16:18 UTC
Verified with ansible-2.2.3.0-1.el7.noarch, openshift-ansible-3.4.89-1.git.0.ac29ce8.el7.noarch

1) Installation failed with ansible-2.2.2.0-1.el7.noarch

#nohup ansible-playbook /usr/share/ansible/openshift-ansible/playbooks/byo/config.yml -i /tmp/hosts  -vvvv

TASK [openshift_version : set_fact] ********************************************
task path: /usr/share/ansible/openshift-ansible/roles/openshift_version/tasks/main.yml:27
Process WorkerProcess-1011:
Traceback (most recent call last):
  File "/usr/lib64/python2.7/multiprocessing/process.py", line 258, in _bootstrap
  File "/usr/lib/python2.7/site-packages/ansible/executor/process/worker.py", line 99, in run
  File "/usr/lib64/python2.7/site-packages/Crypto/Random/__init__.py", line 37, in atfork
  File "/usr/lib64/python2.7/site-packages/Crypto/Random/_UserFriendlyRNG.py", line 224, in reinit
  File "/usr/lib64/python2.7/site-packages/Crypto/Random/_UserFriendlyRNG.py", line 215, in _get_singleton
  File "/usr/lib64/python2.7/site-packages/Crypto/Random/_UserFriendlyRNG.py", line 159, in __init__
  File "/usr/lib64/python2.7/site-packages/Crypto/Random/_UserFriendlyRNG.py", line 86, in __init__
  File "/usr/lib64/python2.7/site-packages/Crypto/Random/_UserFriendlyRNG.py", line 53, in __init__
  File "/usr/lib64/python2.7/site-packages/Crypto/Random/OSRNG/posix.py", line 83, in new
  File "/usr/lib64/python2.7/site-packages/Crypto/Random/OSRNG/posix.py", line 44, in __init__
IOError: [Errno 24] Too many open files: '/dev/urandom'
Exception AttributeError: "'DevURandomRNG' object has no attribute 'closed'" in <bound method DevURandomRNG.__del__ of <Crypto.Random.OSRNG.posix.DevURandomRNG object at 0x87de510>> ignored
skipping: [qe-ghuang-master-1.0516-uzt.qe.rhcloud.com] => {
    "changed": false,
    "skip_reason": "Conditional check failed",
    "skipped": true
}
 [WARNING]: Failure using method (v2_runner_on_skipped) in callback plugin
(</usr/lib/python2.7/site-packages/ara/plugins/callbacks/log_ara.CallbackModule
object at 0x501eb50>): (sqlite3.OperationalError) unable to open database file
[SQL: u'INSERT INTO task_results (id, task_id, host_id, status, changed,
failed, skipped, unreachable, ignore_errors, result, time_start, time_end)
VALUES (?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?)'] [parameters: ('26e29845-127b-
4c2a-a372-8f1bef050bc1', u'3cc09168-a79a-48b3-805f-5a291cf31bd4', u'481e015a-
beda-4d7f-a3d0-6fdaf23b9bfc', 'skipped', 0, 0, 1, 0, 0, <read-only buffer for
0x7f60d50, size -1, offset 0 at 0x8cb4c70>, '2017-05-16 01:04:41.675759',
'2017-05-16 01:04:41.773230')]

2) Installation succeeded after upgrading to ansible-2.2.3.0-1.el7.noarch

#nohup ansible-playbook /usr/share/ansible/openshift-ansible/playbooks/byo/config.yml -i /tmp/hosts  -vvvv
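
The failing run above ends in `IOError: [Errno 24] Too many open files`, which is what any process sees once it reaches its descriptor limit. The sketch below reproduces that error in isolation; it is not the ansible code path. The function name `exhaust_descriptors` and the limit of 64 are assumptions chosen for illustration. It temporarily lowers the soft limit, opens `/dev/null` until the OS refuses, and then restores everything.

```python
import errno
import os
import resource


def exhaust_descriptors(soft_limit=64):
    """Open /dev/null until the descriptor limit is hit; return the errno.

    The soft RLIMIT_NOFILE is lowered first so the loop ends quickly. All
    descriptors opened here are closed and the old limit is restored.
    """
    old_soft, old_hard = resource.getrlimit(resource.RLIMIT_NOFILE)
    # The soft limit may not exceed the hard limit.
    if old_hard != resource.RLIM_INFINITY:
        soft_limit = min(soft_limit, old_hard)
    resource.setrlimit(resource.RLIMIT_NOFILE, (soft_limit, old_hard))
    opened = []
    try:
        while True:
            opened.append(os.open(os.devnull, os.O_RDONLY))
    except OSError as exc:
        return exc.errno
    finally:
        for fd in opened:
            os.close(fd)
        resource.setrlimit(resource.RLIMIT_NOFILE, (old_soft, old_hard))


if __name__ == "__main__":
    code = exhaust_descriptors()
    print(code, errno.errorcode[code], os.strerror(code))
```

A leak has the same effect as this loop, only more slowly: descriptors that are never closed accumulate until some unrelated open, here `/dev/urandom`, is the one that fails.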

Comment 24 errata-xmlrpc 2017-05-17 17:39:15 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2017:1244