Cloned bug to trace this issue in v3.9: "Cannot allocate memory" when redeploying logging with openshift3/ose-ansible/images/v3.9.11-1.

Ansible host:
# free -h
              total        used        free      shared  buff/cache   available
Mem:           7.6G        1.6G        5.2G        121M        893M        5.6G
After adding an 8 Gi swapfile, the deploy succeeded. 16 G of memory is enough to deploy logging with both ops and eventrouter enabled.

[OSEv3:vars]
openshift_logging_install_eventrouter=true
openshift_logging_elasticsearch_kibana_index_mode=shared_ops
openshift_logging_es_ops_pvc_dynamic=true
openshift_logging_es_ops_memory_limit=2Gi
openshift_logging_use_ops=true
openshift_logging_es_pvc_dynamic=true
openshift_logging_es_memory_limit=2Gi
openshift_logging_es_cluster_size=1
openshift_logging_namespace=logging
openshift_logging_image_prefix=registry.example.com/openshift3/
openshift_logging_install_logging=true

[masters]
master1.example.com

[nodes]
node1.example.com
node2.example.com
node3.example.com
node4.example.com
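The swapfile workaround above can be sketched as shell commands. The /swapfile path and 8G size are assumptions; swapon and writing to / require root, so both are guarded here.

```shell
# Sketch of the 8 Gi swapfile workaround; /swapfile is an assumed path.
SWAPFILE=/swapfile
SIZE=8G
if [ "$(id -u)" -ne 0 ]; then
  # Without root we cannot write /swapfile or call swapon;
  # fall back to formatting a small scratch file as a demonstration.
  SWAPFILE=$(mktemp)
  SIZE=16M
fi
fallocate -l "$SIZE" "$SWAPFILE"   # or: dd if=/dev/zero of="$SWAPFILE" bs=1M
chmod 600 "$SWAPFILE"              # swap files must not be world-readable
mkswap "$SWAPFILE"                 # format the file as swap space
if [ "$(id -u)" -eq 0 ]; then
  swapon "$SWAPFILE"               # enable it; verify with: free -h
fi
```

To keep the swapfile across reboots, an /etc/fstab entry such as "/swapfile none swap defaults 0 0" is typically added as well.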
The issue still exists with openshift3/ose-ansible/images/v3.9.14-3:

TASK [openshift_logging_curator : Generate Curator deploymentconfig] ***********
An exception occurred during task execution. To see the full traceback, use -vvv. The error was: OSError: [Errno 12] Cannot allocate memory
fatal: [host-8-246-167.host.centralci.eng.rdu2.redhat.com]: FAILED! => {"msg": "Unexpected failure during module execution.", "stdout": ""}
RUNNING HANDLER [openshift_logging_elasticsearch : Restarting logging-{{ _cluster_component }} cluster] ***
This is suspected to be linked to a Python garbage-collection bug [1] that was fixed in RHEL 7.5. For anyone hitting this, we'd be interested to see whether updating to python-2.7.5-68.el7.x86_64 fixes the problem.

[1] https://bugzilla.redhat.com/show_bug.cgi?id=1468410
Hitting this as well.

Env:
ansible-2.4.3.0-1.el7ae.noarch
python-2.7.5-68.el7.x86_64
openshift-ansible-3.9.14-1.git.3.c62bc34.el7.noarch
Red Hat Enterprise Linux Server release 7.5

HW:
Memory: Total: 8192 MiB (8 GiB)

....
TASK [openshift_logging_fluentd : Label example.node for Fluentd deployment] ************************************************************************
task path: /usr/share/ansible/openshift-ansible/roles/openshift_logging_fluentd/tasks/label_and_wait.yaml:2
ERROR! Unexpected Exception, this is probably a bug: [Errno 12] Cannot allocate memory
the full traceback was:

Traceback (most recent call last):
  File "/usr/bin/ansible-playbook", line 106, in <module>
    exit_code = cli.run()
  File "/usr/lib/python2.7/site-packages/ansible/cli/playbook.py", line 122, in run
    results = pbex.run()
  File "/usr/lib/python2.7/site-packages/ansible/executor/playbook_executor.py", line 154, in run
    result = self._tqm.run(play=play)
  File "/usr/lib/python2.7/site-packages/ansible/executor/task_queue_manager.py", line 290, in run
    play_return = strategy.run(iterator, play_context)
  File "/usr/lib/python2.7/site-packages/ansible/plugins/strategy/linear.py", line 277, in run
    self._queue_task(host, task, task_vars, play_context)
  File "/usr/lib/python2.7/site-packages/ansible/plugins/strategy/__init__.py", line 254, in _queue_task
    worker_prc.start()
  File "/usr/lib64/python2.7/multiprocessing/process.py", line 130, in start
    self._popen = Popen(self)
  File "/usr/lib64/python2.7/multiprocessing/forking.py", line 121, in __init__
    self.pid = os.fork()
OSError: [Errno 12] Cannot allocate memory

Variables used:

# Aggregated logging
openshift_logging_image_prefix=openshift3/
openshift_logging_image_version=v3.9
# No separate ops logging
openshift_logging_use_ops=False
openshift_logging_install_eventrouter=True
# run all logging on infra nodes
openshift_logging_curator_nodeselector={'region': 'infra', 'component': 'logging'}
openshift_logging_es_nodeselector={'region': 'infra', 'component': 'logging'}
openshift_logging_kibana_nodeselector={'region': 'infra', 'component': 'logging'}
openshift_logging_eventrouter_nodeselector={'region': 'infra', 'component': 'logging'}
openshift_logging_es_cluster_size=4
openshift_logging_es_number_of_replicas=1
openshift_logging_es_allows_cluster_reader=True
openshift_logging_es_pv_selector={'logging-infra':'es'}
openshift_logging_es_pvc_size=200G
openshift_logging_es_pvc_storage_class_name=""
openshift_logging_es_cpu_request=400m
openshift_logging_fluentd_cpu_request=50m
openshift_logging_kibana_cpu_request=50m
openshift_logging_kibana_proxy_cpu_request=50m
openshift_logging_curator_cpu_request=50m
openshift_logging_eventrouter_cpu_request=50m
openshift_logging_es_memory_limit=12Gi
openshift_logging_curator_run_timezone=Europe/Oslo
openshift_logging_curator_default_days=90
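The traceback shows the failure happens in os.fork() when Ansible's multiprocessing layer spawns a worker per task. A minimal sketch of the failure mode (the function name here is illustrative, not Ansible's actual code): with copy-on-write, a fork commits little physical memory, but the kernel's overcommit accounting can still refuse it when the parent's address space is large and there is no swap headroom, and Python surfaces that as OSError with errno.ENOMEM:

```python
import errno
import os


def fork_worker():
    """Fork a short-lived child, mirroring what multiprocessing does
    for each Ansible worker. Returns the child pid, or None if the
    kernel refused the fork with ENOMEM ([Errno 12])."""
    try:
        pid = os.fork()
    except OSError as e:
        if e.errno == errno.ENOMEM:
            return None  # the "[Errno 12] Cannot allocate memory" case
        raise
    if pid == 0:
        os._exit(0)       # child: would run the task's work, then exit
    os.waitpid(pid, 0)    # parent: reap the child
    return pid
```

This is why adding swap unblocked the deploy above: extra swap raises the commit limit the kernel checks at fork time, even if the swap is never actually used.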
PR created against master: https://github.com/openshift/openshift-ansible/pull/8165
PR for 3.9: https://github.com/openshift/openshift-ansible/pull/8210
The redeploy succeeded without the memory allocation error with openshift-ansible:v3.9.40, so moving the bug to VERIFIED.
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHBA-2018:1796