Bug 1575063 - OSError: [Errno 12] Cannot allocate memory when deploying logging
Status: CLOSED WONTFIX
Product: OpenShift Container Platform
Classification: Red Hat
Component: Installer
Version: 3.7.1
Hardware: Unspecified
OS: Unspecified
Priority: urgent
Severity: urgent
Target Milestone: ---
Target Release: 3.7.z
Assigned To: Scott Dodson
QA Contact: Johnny Liu
Depends On:
Blocks: 1610420

Reported: 2018-05-04 12:29 EDT by mmariyan
Modified: 2018-10-25 08:08 EDT
CC: 9 users

See Also:
Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of:
Clones: 1610420
Environment:
Last Closed: 2018-07-31 10:59:55 EDT
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---

Attachments: None
Description mmariyan 2018-05-04 12:29:13 EDT
Description of problem:

A "Cannot allocate memory" error occurs when deploying logging; moreover, it is the same as Bugzilla [0]. That bugfix is available in 3.7.42-1, but it did not resolve the issue.

[0]https://bugzilla.redhat.com/show_bug.cgi?id=1497421

The workaround suggested by the Engineering team, using ansible version 2.3.2, also did not resolve the issue.

Version-Release number of the following components:

openshift-ansible-3.7.42-1.git.2.9ee4e71.el7.noarch
openshift-ansible-playbooks-3.7.42-1.git.2.9ee4e71.el7.noarch
ansible-2.4.3.0-1.el7.noarch

How reproducible:

Steps to Reproduce:
1.
2.
3.

Actual results:

TASK [openshift_logging_fluentd : include] ********************************************************************************************
included: /usr/share/ansible/openshift-ansible/roles/openshift_logging_fluentd/tasks/label_and_wait.yaml for xxxx
included: /usr/share/ansible/openshift-ansible/roles/openshift_logging_fluentd/tasks/label_and_wait.yaml for xxx
included: /usr/share/ansible/openshift-ansible/roles/openshift_logging_fluentd/tasks/label_and_wait.yaml for xxx
included: /usr/share/ansible/openshift-ansible/roles/openshift_logging_fluentd/tasks/label_and_wait.yaml for xxx
included: /usr/share/ansible/openshift-ansible/roles/openshift_logging_fluentd/tasks/label_and_wait.yaml for xxx


TASK [openshift_logging_fluentd : Label xxxxxxxxxx.xxx.xx Fluentd deployment] ************************************************
An exception occurred during task execution. To see the full traceback, use -vvv. The error was: OSError: [Errno 12] Cannot allocate memory
fatal: [xxxx]: FAILED! => {"failed": true, "msg": "Unexpected failure during module execution.", "stdout": ""}


Expected results:

The installation should complete without error.
Comment 1 Scott Dodson 2018-05-04 12:42:56 EDT
Needs backport of https://github.com/openshift/openshift-ansible/pull/8165
Comment 2 Michael Gugino 2018-05-07 10:43:39 EDT
PR Created: https://github.com/openshift/openshift-ansible/pull/8284
Comment 4 Anping Li 2018-05-09 23:04:25 EDT
Still getting "Cannot allocate memory"; the fix is not in openshift-ansible-3.7.46-1.git.0.37f607e.el7.noarch.
Comment 6 Scott Dodson 2018-05-14 10:50:43 EDT
(In reply to Anping Li from comment #4)
> Still Cannot allocate memory, the fix is not in
> openshift-ansible-3.7.46-1.git.0.37f607e.el7.noarch.

The fix is only in openshift-ansible-3.7.47-1 and newer.
Comment 7 Anping Li 2018-05-22 03:56:07 EDT
The "Cannot allocate memory" reported in 'Generate Kibana DC template' this time when I redeploy logging. 

TASK [openshift_logging_kibana : Set Kibana Proxy secret] **********************
ok: [ec2-34-230-65-4.compute-1.amazonaws.com]

TASK [openshift_logging_kibana : Generate Kibana DC template] ******************
An exception occurred during task execution. To see the full traceback, use -vvv. The error was: OSError: [Errno 12] Cannot allocate memory
fatal: [ec2-34-230-65-4.compute-1.amazonaws.com]: FAILED! => {"failed": true, "msg": "Unexpected failure during module execution.", "stdout": ""}

RUNNING HANDLER [openshift_logging_elasticsearch : Restarting logging-{{ _cluster_component }} cluster] ***

RUNNING HANDLER [openshift_logging_elasticsearch : set_fact] *******************
Comment 8 Michael Gugino 2018-05-22 09:50:41 EDT
Please describe host details such as memory on both the ansible host and the target host.

This latest failure does not resemble previous scenarios.  There are no dynamic includes and no looping for that task.
Comment 9 Anping Li 2018-05-22 21:39:05 EDT
Ansible slave: 8Gi, openshift-ansible-3.7.48, running as docker containers.
Hosts: 8Gi in AWS
Comment 10 Anping Li 2018-05-22 22:32:13 EDT
Logging inventory variables:

openshift_logging_fluentd_audit_container_engine=true
openshift_logging_install_eventrouter=true
openshift_logging_elasticsearch_kibana_index_mode=shared_ops
openshift_logging_es_allow_external=True
openshift_logging_es_ops_pvc_dynamic=true
openshift_logging_use_ops=true
openshift_logging_es_pvc_dynamic=true
openshift_logging_es_number_of_shards=1
openshift_logging_es_number_of_replicas=1
openshift_logging_es_memory_limit=2Gi
openshift_logging_es_cluster_size=3
openshift_logging_image_prefix=registry.reg-aws.openshift.com:443/openshift3/
openshift_logging_install_logging=true
Comment 16 Michael Gugino 2018-07-06 14:18:56 EDT
I don't see a reason for this to be happening with this role. Perhaps you have reverted to using a newer version of ansible with 3.7? It's important to use a 2.3 release: in 2.4, 'include_role' and similar statements are dynamic includes, while in 2.3 those same statements would be static by default.
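To illustrate the distinction, here is a hypothetical minimal playbook (not taken from openshift-ansible; label_and_wait.yaml stands in for any task file) showing the two include styles as they exist in Ansible 2.4:

- hosts: all
  tasks:
    # Dynamic include: processed at runtime, once per host, which can
    # multiply memory use on large inventories.
    - include_tasks: label_and_wait.yaml

    # Static import: inlined once at parse time, closer to the 2.3
    # static-by-default behavior.
    - import_tasks: label_and_wait.yaml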

Also, it's possible that the host or container is running out of memory due to memory consumption by other processes.

You can try adding a temporary swap file on the host running ansible to increase memory, or you can try limiting the number of nodes in inventory when running the kibana plays.
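For the swap option, a rough sketch as a hypothetical localhost play (the /tmp/ansible-swap path and 2 GiB size are arbitrary examples, not tested recommendations):

- hosts: localhost
  become: true
  tasks:
    # Create the backing file only if it does not already exist.
    - name: Create a 2 GiB temporary swap file
      command: dd if=/dev/zero of=/tmp/ansible-swap bs=1M count=2048
      args:
        creates: /tmp/ansible-swap

    - name: Restrict permissions on the swap file
      file:
        path: /tmp/ansible-swap
        mode: '0600'

    # Format and enable the swap space; it can be swapoff'd and
    # removed manually after the run.
    - name: Format and enable the swap file
      shell: mkswap /tmp/ansible-swap && swapon /tmp/ansible-swap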
Comment 17 Scott Dodson 2018-07-24 13:03:17 EDT
Anping, which version of ansible was used in your testing?

mmariyan@redhat.com, can you please confirm whether the problem is alleviated by running ansible 2.3? They should be able to simply run `yum downgrade ansible-2.3*` to get ansible 2.3 re-installed.
Comment 18 Scott Dodson 2018-07-24 13:20:18 EDT
(In reply to Scott Dodson from comment #17)
> Anping, which version of ansible was used in your testing?
> 
> mmariyan@redhat.com, can you please confirm whether the problem is
> alleviated by running ansible 2.3? They should be able to simply run `yum
> downgrade ansible-2.3*` to get ansible 2.3 re-installed.

Specifically, test with Ansible 2.3 and openshift-ansible-3.7.47 or newer; the original bug was opened against openshift-ansible-3.7.42.
Comment 19 Anping Li 2018-07-24 21:46:21 EDT
Scott, I am using the ose-ansible image; it should be ansible 2.3.2.0.
Comment 21 Brenton Leanhardt 2018-07-31 10:52:35 EDT
For Documentation, could the 3.7 "known issues" page be updated to state that only the released version of Ansible 2.3 in the OCP channel should be used with OCP 3.7?

The challenge is that RHEL released Ansible 2.4, which means some customers install it. They need to be instructed to downgrade if they are using OCP 3.7.
Comment 22 Brenton Leanhardt 2018-07-31 10:59:55 EDT
Sorry for the noise. I originally moved this to a documentation bug, but later thought it would be less confusing to clone the bug and change the title.
