Description of problem:
"Cannot allocate memory" error when deploying logging. This is the same problem as Bugzilla [0]. The fix for that bug is available in 3.7.42-1, but it did not resolve the issue.

[0] https://bugzilla.redhat.com/show_bug.cgi?id=1497421

The workaround suggested by the Engineering team, using Ansible 2.3.2, also did not resolve the issue.

Version-Release number of the following components:
openshift-ansible-3.7.42-1.git.2.9ee4e71.el7.noarch
openshift-ansible-playbooks-3.7.42-1.git.2.9ee4e71.el7.noarch
ansible-2.4.3.0-1.el7.noarch

How reproducible:

Steps to Reproduce:
1.
2.
3.

Actual results:
TASK [openshift_logging_fluentd : include] ********************************************************************************************
included: /usr/share/ansible/openshift-ansible/roles/openshift_logging_fluentd/tasks/label_and_wait.yaml for xxxx
included: /usr/share/ansible/openshift-ansible/roles/openshift_logging_fluentd/tasks/label_and_wait.yaml for xxx
included: /usr/share/ansible/openshift-ansible/roles/openshift_logging_fluentd/tasks/label_and_wait.yaml for xxx
included: /usr/share/ansible/openshift-ansible/roles/openshift_logging_fluentd/tasks/label_and_wait.yaml for xxx
included: /usr/share/ansible/openshift-ansible/roles/openshift_logging_fluentd/tasks/label_and_wait.yaml for xxx

TASK [openshift_logging_fluentd : Label xxxxxxxxxx.xxx.xx Fluentd deployment] ************************************************
An exception occurred during task execution. To see the full traceback, use -vvv. The error was: OSError: [Errno 12] Cannot allocate memory
fatal: [xxxx]: FAILED! => {"failed": true, "msg": "Unexpected failure during module execution.", "stdout": ""}

Expected results:
Logging should install without error.
Needs backport of https://github.com/openshift/openshift-ansible/pull/8165
PR Created: https://github.com/openshift/openshift-ansible/pull/8284
Still getting "Cannot allocate memory"; the fix is not in openshift-ansible-3.7.46-1.git.0.37f607e.el7.noarch.
(In reply to Anping Li from comment #4)
> Still Cannot allocate memory, the fix is not in
> openshift-ansible-3.7.46-1.git.0.37f607e.el7.noarch.

The fix is only in openshift-ansible-3.7.47-1 and newer.
This time, "Cannot allocate memory" is reported in 'Generate Kibana DC template' when I redeploy logging.

TASK [openshift_logging_kibana : Set Kibana Proxy secret] **********************
ok: [ec2-34-230-65-4.compute-1.amazonaws.com]

TASK [openshift_logging_kibana : Generate Kibana DC template] ******************
An exception occurred during task execution. To see the full traceback, use -vvv. The error was: OSError: [Errno 12] Cannot allocate memory
fatal: [ec2-34-230-65-4.compute-1.amazonaws.com]: FAILED! => {"failed": true, "msg": "Unexpected failure during module execution.", "stdout": ""}

RUNNING HANDLER [openshift_logging_elasticsearch : Restarting logging-{{ _cluster_component }} cluster] ***

RUNNING HANDLER [openshift_logging_elasticsearch : set_fact] *******************
Please describe host details such as memory on both the ansible host and the target host. This latest failure does not resemble previous scenarios. There are no dynamic includes and no looping for that task.
Ansible slave: 8Gi, openshift-ansible-3.7.48, running as a Docker container.
Hosts: 8Gi in AWS.
Logging inventory variables:

openshift_logging_fluentd_audit_container_engine=true
openshift_logging_install_eventrouter=true
openshift_logging_elasticsearch_kibana_index_mode=shared_ops
openshift_logging_es_allow_external=True
openshift_logging_es_ops_pvc_dynamic=true
openshift_logging_use_ops=true
openshift_logging_es_pvc_dynamic=true
openshift_logging_es_number_of_shards=1
openshift_logging_es_number_of_replicas=1
openshift_logging_es_memory_limit=2Gi
openshift_logging_es_cluster_size=3
openshift_logging_image_prefix=registry.reg-aws.openshift.com:443/openshift3/
openshift_logging_install_logging=true
I don't see a reason for this to be happening with this role. Perhaps you have reverted to using a newer version of ansible with 3.7? It's important to use a 2.3 release as in 2.4 'include_role' and similar statements are dynamic includes; in 2.3 those same statements would be static by default. Also, it's possible that the host or container is running out of memory due to memory consumption by other processes. You can try adding a temporary swap file on the host running ansible to increase memory, or you can try limiting the number of nodes in inventory when running the kibana plays.
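To judge whether the host or container running ansible is short on memory before re-running the playbook, a quick look at /proc/meminfo may help. The following is a minimal sketch and not part of openshift-ansible: the function names and the 1 GiB threshold are assumptions chosen purely for illustration.

```python
import re


def parse_meminfo(text):
    """Parse /proc/meminfo-style text into a dict of field name -> integer.

    Most fields (including MemAvailable and SwapFree) are reported in kB.
    """
    values = {}
    for line in text.splitlines():
        match = re.match(r"^(\w+):\s+(\d+)", line)
        if match:
            values[match.group(1)] = int(match.group(2))
    return values


def low_on_memory(meminfo, threshold_kb=1024 * 1024):
    """Return True if MemAvailable plus SwapFree is below the threshold.

    The 1 GiB default is an arbitrary illustration, not a documented limit.
    """
    available = meminfo.get("MemAvailable", 0) + meminfo.get("SwapFree", 0)
    return available < threshold_kb
```

On the ansible host, pass the contents of /proc/meminfo to parse_meminfo() and the result to low_on_memory(). A True result would suggest trying the temporary swap file or the reduced inventory mentioned above.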
Anping, which version of ansible was used in your testing? mmariyan, can you please confirm whether the problem is alleviated by running ansible 2.3? They should be able to simply run `yum downgrade ansible-2.3*` to get ansible 2.3 re-installed.
(In reply to Scott Dodson from comment #17)
> Anping, which version of ansible was used in your testing?
>
> mmariyan, can you please confirm whether the problem is
> alleviated by running ansible 2.3? They should be able to simply run `yum
> downgrade ansible-2.3*` to get ansible 2.3 re-installed.

Specifically, with Ansible 2.3 and openshift-ansible-3.7.47 or newer; the original bug was opened against openshift-ansible-3.7.42.
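The combination being asked about (Ansible 2.3 together with openshift-ansible-3.7.47 or newer) can be checked mechanically against package strings such as those printed by `rpm -q openshift-ansible ansible`. This is a rough sketch only: the helper names are hypothetical, and it assumes the first x.y.z triple in the string is the package version.

```python
import re

# Per this thread, the fix is in openshift-ansible-3.7.47-1 and newer.
FIXED_IN = (3, 7, 47)


def package_version(nvr):
    """Extract the first x.y.z triple from an RPM name-version-release string."""
    match = re.search(r"(\d+)\.(\d+)\.(\d+)", nvr)
    if match is None:
        raise ValueError("no version found in %r" % nvr)
    return tuple(int(part) for part in match.groups())


def has_memory_fix(openshift_ansible_nvr):
    """True if the openshift-ansible package is 3.7.47 or newer."""
    return package_version(openshift_ansible_nvr) >= FIXED_IN


def is_ansible_2_3(ansible_nvr):
    """True if the ansible package is a 2.3.x release."""
    return package_version(ansible_nvr)[:2] == (2, 3)


def is_requested_combination(openshift_ansible_nvr, ansible_nvr):
    """True only when both conditions named in this comment hold."""
    return has_memory_fix(openshift_ansible_nvr) and is_ansible_2_3(ansible_nvr)
```

For the packages in the original report (openshift-ansible-3.7.42-1 with ansible-2.4.3.0-1), neither condition holds.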
Scott, I am using the ose-ansible image, so it should be Ansible 2.3.2.0.
For Documentation, could the 3.7 "known issues" page be updated to state that only the released version of Ansible 2.3 in the OCP channel should be used with OCP 3.7? The challenge is that RHEL released Ansible 2.4 which means some customers install it. They need to be instructed to downgrade if they are using OCP 3.7.
Sorry for the noise. I originally moved this to a documentation bug, but later thought it would be less confusing to clone the bug and change the title.