Bug 1497421 - Cannot allocate memory when deploying logging on one Env [NEEDINFO]
Status: CLOSED ERRATA
Product: OpenShift Container Platform
Classification: Red Hat
Component: Installer
Version: 3.7.0
Hardware: Unspecified
OS: Unspecified
Priority: urgent
Severity: urgent
Target Milestone: ---
Target Release: 3.7.z
Assigned To: Michael Gugino
QA Contact: Anping Li
Keywords: Reopened
Depends On:
Blocks: 1557290

Reported: 2017-09-30 05:34 EDT by Anping Li
Modified: 2018-05-04 08:44 EDT
CC List: 10 users

Doc Type: If docs needed, set a value
Clones: 1557290 (view as bug list)
Last Closed: 2018-05-04 08:44:22 EDT
Type: Bug
Flags: sdodson: needinfo? (mmariyan)


Attachments
The memory allocate error (52.22 KB, application/x-gzip), attached 2017-09-30 05:35 EDT by Anping Li


External Trackers
Red Hat Product Errata RHBA-2018:0636, last updated 2018-04-05 05:30 EDT

Description Anping Li 2017-09-30 05:34:49 EDT
Description of problem:
"Cannot allocate memory" was reported when deploying logging. The free memory seems to be enough for the tasks.


On localhost:
# free -h
              total        used        free      shared  buff/cache   available
Mem:           3.7G        2.1G        984M        182M        675M        1.2G
Swap:            0B          0B          0B

On the first master:
        "              total        used        free      shared  buff/cache   available", 
        "Mem:           7.1G        883M        1.5G        796K        4.8G        5.9G", 
        "Swap:            0B          0B          0B"


Version-Release number of the following components:
openshift-ansible-3.7.0-0.134.0.git.0.6f43fc3.el7.noarch

How reproducible:
Always on one environment, but no such issue on the other environment.


Steps to Reproduce:
1. Deploy logging on OCP using the openshift-ansible playbook (see the sketch below).
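
For reference, a rough sketch of the kind of invocation meant here, assuming a standard inventory path and the playbook location cited later in comment 17 (the exact playbook path differs between OCP releases):

# Hypothetical invocation; /etc/ansible/hosts is an assumed inventory path.
ansible-playbook -vvv -i /etc/ansible/hosts \
    /usr/share/ansible/openshift-ansible/playbooks/openshift-logging/config.yml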

Actual results:
TASK [openshift_logging : copy] ***********************************************************************************************************************************************************************************
task path: /usr/share/ansible/openshift-ansible/roles/openshift_logging/tasks/generate_certs.yaml:106
skipping: [ec2-54-85-72-229.compute-1.amazonaws.com] => {
    "changed": false, 
    "skip_reason": "Conditional result was False", 
    "skipped": true
}

TASK [openshift_logging : Generate PEM certs] *********************************************************************************************************************************************************************
task path: /usr/share/ansible/openshift-ansible/roles/openshift_logging/tasks/generate_certs.yaml:115
included: /usr/share/ansible/openshift-ansible/roles/openshift_logging/tasks/generate_pems.yaml for ec2-54-85-72-229.compute-1.amazonaws.com
included: /usr/share/ansible/openshift-ansible/roles/openshift_logging/tasks/generate_pems.yaml for ec2-54-85-72-229.compute-1.amazonaws.com
included: /usr/share/ansible/openshift-ansible/roles/openshift_logging/tasks/generate_pems.yaml for ec2-54-85-72-229.compute-1.amazonaws.com
included: /usr/share/ansible/openshift-ansible/roles/openshift_logging/tasks/generate_pems.yaml for ec2-54-85-72-229.compute-1.amazonaws.com

TASK [openshift_logging : Checking for system.logging.fluentd.key] ************************************************************************************************************************************************
task path: /usr/share/ansible/openshift-ansible/roles/openshift_logging/tasks/generate_pems.yaml:3
ERROR! Unexpected Exception, this is probably a bug: [Errno 12] Cannot allocate memory
the full traceback was:

Traceback (most recent call last):
  File "/usr/bin/ansible-playbook", line 106, in <module>
    exit_code = cli.run()
  File "/usr/lib/python2.7/site-packages/ansible/cli/playbook.py", line 130, in run
    results = pbex.run()
  File "/usr/lib/python2.7/site-packages/ansible/executor/playbook_executor.py", line 154, in run
    result = self._tqm.run(play=play)
  File "/usr/lib/python2.7/site-packages/ansible/executor/task_queue_manager.py", line 292, in run
    play_return = strategy.run(iterator, play_context)
  File "/usr/lib/python2.7/site-packages/ansible/plugins/strategy/linear.py", line 277, in run
    self._queue_task(host, task, task_vars, play_context)
  File "/usr/lib/python2.7/site-packages/ansible/plugins/strategy/__init__.py", line 222, in _queue_task
    worker_prc.start()
  File "/usr/lib64/python2.7/multiprocessing/process.py", line 130, in start
    self._popen = Popen(self)
  File "/usr/lib64/python2.7/multiprocessing/forking.py", line 121, in __init__
    self.pid = os.fork()
OSError: [Errno 12] Cannot allocate memory


Expected results:

Additional info:
Comment 1 Anping Li 2017-09-30 05:35 EDT
Created attachment 1332643 [details]
The memory allocate error
Comment 5 Anping Li 2018-02-03 20:17:52 EST
I guess the bug should be fixed in openshift-ansible:v3.7.25 and later.
Comment 7 Anping Li 2018-02-06 03:36:33 EST
The error appears again when installing OpenShift with logging using openshift-ansible-3.9.0-0.38.0.0.

ansible slave:
free -h
              total        used        free      shared  buff/cache   available
Mem:           3.9G        1.5G        1.5G        1.2M        853M        1.9G


TASK [openshift_logging_fluentd : Generate logging-fluentd daemonset definition] ***
Tuesday 06 February 2018  07:46:31 +0000 (0:00:02.093)       0:19:01.565 ****** 
An exception occurred during task execution. To see the full traceback, use -vvv. The error was: OSError: [Errno 12] Cannot allocate memory
fatal: [ec2-54-87-30-170.compute-1.amazonaws.com]: FAILED! => {"msg": "Unexpected failure during module execution.", "stdout": ""}
Comment 8 Jeff Cantrill 2018-02-06 15:22:10 EST
Isn't this related to ansible and not logging specifically?
Comment 9 Anping Li 2018-02-06 21:02:33 EST
We didn't hit this issue with the separate logging deploy.
Comment 12 Jeff Cantrill 2018-02-09 10:55:06 EST
@Scott,  do we advocate using ansible 2.4.x with ose-ansible 3.7?
Comment 13 Scott Dodson 2018-02-09 10:59:00 EST
(In reply to Jeff Cantrill from comment #12)
> @Scott,  do we advocate using ansible 2.4.x with ose-ansible 3.7?

Ansible 2.4 is not required until OCP 3.9, but it should work. If downgrading to ansible-2.3.2, as shipped in the OCP channel, fixes the problem, then that's a perfectly valid workaround and we can lower the priority based on having a confirmed workaround.
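
For anyone applying that workaround, a minimal sketch, assuming the OCP 3.7 channel still carries the 2.3.2 build (the exact package version string is an assumption):

# Check the currently installed version, then downgrade to the 2.3.2 build.
ansible --version
yum downgrade 'ansible-2.3.2*'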
Comment 14 Zisis Lianas 2018-03-02 08:13:25 EST
I had the same issue with the OCP 3.7 installer:
openshift-ansible-3.7.23-1.git.0.bc406aa.el7.noarch

$ ansible --version
ansible 2.3.1.0
  config file = /etc/ansible/ansible.cfg
  configured module search path = Default w/o overrides
  python version = 2.7.5 (default, May  3 2017, 07:55:04) [GCC 4.8.5 20150623 (Red Hat 4.8.5-14)]



The workaround was to increase the memory of the bastion/installer host (in my case, from 2 GB to 8 GB of RAM).
Comment 15 Scott Dodson 2018-03-27 09:50:45 EDT
The current release-3.7 code no longer has the include_tasks calls which led to this problem. Can you please test the latest 3.7.z?
Comment 16 Michael Gugino 2018-03-27 10:02:40 EDT
We're not going to be able to apply the same workaround as we did for the node role in this case. The logging role requires the use of dynamic imports due to the way it's constructed.

For now, the workaround for logging is to increase memory sufficiently.
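
For context, the node-role workaround referenced above amounted to avoiding dynamic includes. A minimal sketch of the difference, using the generate_pems.yaml file name from the traceback purely as an illustration (Ansible 2.4 syntax):

# Dynamic include: evaluated per host at runtime, can be looped and
# conditioned on runtime facts, which is why the logging role needs it.
- include_tasks: generate_pems.yaml

# Static import: resolved when the playbook is parsed, before execution,
# so it avoids the per-task runtime processing but loses that flexibility.
- import_tasks: generate_pems.yaml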
Comment 17 Michael Gugino 2018-03-27 13:55:57 EDT
I have been unsuccessful in replicating this with either the logging play (playbooks/openshift-logging/config.yml) or synthetically with contrived plays and tasks, using release-3.9 plus RPM-installed ansible 2.4.3 on a RHEL localhost.
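
For anyone else trying to reproduce this synthetically, a contrived play along these lines (entirely hypothetical, not taken from this report) is the sort of thing that forces many dynamic includes and worker forks on the control host:

# repro.yml - hypothetical stress play; leaf.yml is assumed to contain a
# single trivial task (e.g. debug). Each loop iteration is a dynamic include.
- hosts: localhost
  gather_facts: false
  tasks:
    - include_tasks: leaf.yml
      with_sequence: count=200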
Comment 18 Anping Li 2018-03-28 01:13:08 EDT
Logging can be installed with OCP and redeployed, and memory usage is reduced with openshift3/ose-ansible/images/v3.7.40-1, so moving to VERIFIED.
Comment 22 errata-xmlrpc 2018-04-05 05:29:37 EDT
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2018:0636
Comment 25 Scott Dodson 2018-05-04 08:39:23 EDT
Please have your customer downgrade to ansible-2.3.2 and let us know if that improves the situation. That's the version that ships in the OCP 3.7 channels and it's the preferred version for use with 3.7.
Comment 26 Scott Dodson 2018-05-04 08:44:22 EDT
Also, if you find that doesn't resolve the issue, please open a new bug. We don't re-open bugs that have had an errata shipped for them.
