Red Hat Bugzilla – Bug 1434069
[RFE] max_memory_per_executor support
Last modified: 2018-06-01 10:24:05 EDT
Since Ruby does not return allocated memory to the operating system once it has been allocated, a larger set of bigger actions can lead to significant memory consumption that persists and accumulates over time. This makes it hard to keep memory consumption fully under control, especially in an environment shared with other services (Passenger, Pulp, Candlepin, Qpid). Since the executors can terminate cleanly without affecting the tasks themselves, it should be fairly easy to extend them to watch their own memory consumption.

The idea:

1. Config options:
   - max_memory_per_executor - the memory threshold per executor
   - min_executors_count - minimal number of executors (default 1)
   - minimal_executor_age - minimal age an executor must reach before it becomes eligible for restart (default 1h)
2. The executor periodically checks its own memory usage (http://stackoverflow.com/a/24423978/457560 seems to be a sane approach for us).
3. If the memory usage exceeds `max_memory_per_executor`, the executor is older than `minimal_executor_age` (to prevent a situation where memory grows past the limit so quickly that we would do nothing but restart executors without getting any work done), and the number of running executors would not drop below `min_executors_count`, politely terminate the executor.
4. The polite termination hands over all tasks to the other executors; once everything on the executor is finalized, it simply exits.
5. The daemon monitor notices the executor shutting down and starts a new executor.

It would be configurable and turned off by default (for development), but we would enable it in production, where we can rely on the monitor being present.
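A minimal sketch of steps 1-3 above, assuming a Linux /proc filesystem for the RSS check (the approach from the linked Stack Overflow answer). The option names come from the proposal; the `MemoryWatch` class and method names are hypothetical, not the actual Dynflow implementation:

```ruby
# Read this process's resident set size in bytes from /proc
# (Linux-only; VmRSS is reported in kB).
def rss_bytes
  File.readlines("/proc/#{Process.pid}/status")
      .find { |line| line.start_with?('VmRSS:') }
      .split[1].to_i * 1024
end

# Hypothetical watchdog combining the proposed config options.
class MemoryWatch
  def initialize(max_memory_per_executor:, minimal_executor_age: 3600)
    @limit      = max_memory_per_executor # bytes
    @min_age    = minimal_executor_age    # seconds
    @started_at = Time.now
  end

  # True only when the executor is both over the memory limit and old
  # enough that restarting it would not just thrash (step 3).
  def should_terminate?
    rss_bytes > @limit && (Time.now - @started_at) > @min_age
  end
end
```

The age check is what prevents a restart loop: a freshly started executor that immediately exceeds the limit is left alone until it has had a chance to do useful work.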
Created from redmine issue http://projects.theforeman.org/issues/17175
Upstream bug assigned to sshtein@redhat.com
Moving this bug to POST for triage into Satellite 6 since the upstream issue http://projects.theforeman.org/issues/17175 has been resolved.
Version tested: Satellite 6.3 Snap 11

I've set this in /etc/sysconfig/foreman-tasks:

EXECUTOR_MEMORY_LIMIT=400MB
EXECUTOR_MEMORY_MONITOR_DELAY=60

Then I followed https://bugzilla.redhat.com/show_bug.cgi?id=1406489#c19 while watching:

watch 'ps aux | grep "\bdynflow_executor\b"'

The executor went through the 400MB threshold and paused the task, but it didn't finish terminating the dynflow process. This is because the termination waits for some actions to finish without defining any timeouts, so it can hang forever.
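The hang described above comes from waiting on in-flight actions with no upper bound. A sketch of the fix direction, using Ruby's stdlib Timeout; the `wait_for_tasks_to_finish` and `force_exit` names are placeholders I made up, not dynflow's actual API:

```ruby
require 'timeout'

# Placeholder for handing over / finalizing running tasks; stands in
# for the executor's real shutdown logic.
def wait_for_tasks_to_finish
  sleep 0.1
end

# Placeholder for a hard exit; stubbed so the sketch is runnable.
def force_exit
  $forced = true
end

# Polite termination, but bounded: if actions don't finish within the
# timeout, stop waiting instead of hanging forever.
def terminate_politely(timeout: 60)
  Timeout.timeout(timeout) { wait_for_tasks_to_finish }
rescue Timeout::Error
  force_exit
end
```

With a bound like this, a stuck action can delay the restart by at most the timeout rather than blocking it indefinitely.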
Upstream bug assigned to inecas@redhat.com
*** Bug 1492768 has been marked as a duplicate of this bug. ***
*** Bug 1416241 has been marked as a duplicate of this bug. ***
VERIFIED

Version tested: Satellite 6.3 snap 35

# rpm -qa | grep get_process_mem
tfm-rubygem-get_process_mem-0.2.1-1.el7sat.noarch
# rpm -q tfm-rubygem-foreman-tasks
tfm-rubygem-foreman-tasks-0.9.6.4-1.fm1_15.el7sat.noarch
# rpm -q tfm-rubygem-dynflow
tfm-rubygem-dynflow-0.8.34-1.fm1_15.el7sat.noarch

Steps:
1. Configured /etc/sysconfig/foreman-tasks:
   EXECUTOR_MEMORY_LIMIT=400MB
   EXECUTOR_MEMORY_MONITOR_DELAY=10
2. Ran a Remote Execution job on 350 hosts
3. watch 'ps aux | grep "\bdynflow_executor\b"'

Termination of the dynflow process occurred once memory usage reached 400 MB (see attachment).

Also tried configuring:
EXECUTOR_MEMORY_MONITOR_INTERVAL=15
EXECUTORS_COUNT=3
Once memory usage exceeded the limit for an executor, another executor started running (see attachment 2 [details]).
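Pulling together the settings exercised in this verification, a combined /etc/sysconfig/foreman-tasks fragment might look like the following. The variable names and values are taken from the comments above; the per-knob comments are my reading of their roles, not documented semantics:

```shell
# /etc/sysconfig/foreman-tasks (sketch based on the settings tested above)
EXECUTOR_MEMORY_LIMIT=400MB          # restart an executor once its RSS exceeds this
EXECUTOR_MEMORY_MONITOR_DELAY=10     # assumed: seconds before monitoring starts
EXECUTOR_MEMORY_MONITOR_INTERVAL=15  # assumed: seconds between memory checks
EXECUTORS_COUNT=3                    # number of dynflow executor processes
```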
Created attachment 1392687 [details] screenrecord of real memory size(RSS) for dynflow_executor
Created attachment 1392688 [details] screenrecord for EXECUTORS_COUNT=3
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHSA-2018:0336