Red Hat Bugzilla – Bug 1624038
Improve MonitorEventQueue performance for large workloads
Last modified: 2018-10-11 11:18:30 EDT
+++ This bug was initially created as a clone of Bug #1622206 +++ +++ This bug was initially created as a clone of Bug #1614063 +++ On systems with large numbers of registered hosts (20-50,000) there are cases when synchronization events may trigger large volume of entries landing in the katello_events table (40,000+ rows). The majority of these rows are import_host_applicability events that need to be processed. This processing can often take hours and sometimes an entire day. Due to load induced in processing all these events, there are times when foreman-tasks gets overloaded and may stop processing entirely and the queue ceases to drain. We need to optimize the processing time of this queue of events to ensure they can be worked through in a faster manner than is being done now. Some analysis has taken place that shows that a good portion of the work going through the katello_events table entries is SELECTS/INSERTS/UPDATES to the dynflow_steps table which can often take 100-300ms or more. We may update the routine to reduce this and perform other changes to make this task perform better. --- Additional comment from pm-rhel@redhat.com on 20180808T22:45:11 Since this issue was entered in Red Hat Bugzilla, the pm_ack has been set to + automatically for the next planned release --- Additional comment from jsherril@redhat.com on 20180809T20:19:03 Created redmine issue https://projects.theforeman.org/issues/24576 from this bug --- Additional comment from pm-sat@redhat.com on 20180809T22:10:21 Upstream bug assigned to jsherril@redhat.com --- Additional comment from pm-sat@redhat.com on 20180809T22:10:24 Upstream bug assigned to jsherril@redhat.com --- Additional comment from jdickers@redhat.com on 20180810T18:20:10 Walmart attempted this hotfix. The MonitorEventQueue task would only run for a few minutes and then go to paused/error state. Name: Monitor Event Queue {"locale"=>"en"} Result: Triggered by: Execution type: Immediate Start at: 3 minutes ago Started at: 3 minutes ago Ended at: N/A Start before: - State: paused 50% Complete Output: {"not_serializable"=>true} Errors: undefined method `to_hash' for 2018-08-10 17:48:42 +0000:Time Did you mean? to_s Here's the trace from dynflow console: 2: Actions::Katello::EventQueue::Monitor (error) [ 81.07s / 0.04s ] Skip Started at: 2018-08-10 17:47:21 UTC Ended at: 2018-08-10 17:48:42 UTC Real time: 81.07s Execution time (excluding suspended state): 0.04s Input: --- locale: en Output: --- not_serializable: true Error: NoMethodError undefined method `to_hash' for 2018-08-10 17:48:42 +0000:Time Did you mean? to_s --- - "/opt/theforeman/tfm/root/usr/share/gems/gems/dynflow-0.8.34/lib/dynflow/serializable.rb:50:in `recursive_to_hash'" - "/opt/theforeman/tfm/root/usr/share/gems/gems/dynflow-0.8.34/lib/dynflow/serializable.rb:46:in `block in recursive_to_hash'" - "/opt/theforeman/tfm/root/usr/share/gems/gems/dynflow-0.8.34/lib/dynflow/serializable.rb:46:in `each'" - "/opt/theforeman/tfm/root/usr/share/gems/gems/dynflow-0.8.34/lib/dynflow/serializable.rb:46:in `inject'" - "/opt/theforeman/tfm/root/usr/share/gems/gems/dynflow-0.8.34/lib/dynflow/serializable.rb:46:in `recursive_to_hash'" - "/opt/theforeman/tfm/root/usr/share/gems/gems/dynflow-0.8.34/lib/dynflow/action.rb:548:in `check_serializable'" - "/opt/theforeman/tfm/root/usr/share/gems/gems/dynflow-0.8.34/lib/dynflow/action.rb:525:in `block in execute_run'" - "/opt/theforeman/tfm/root/usr/share/gems/gems/dynflow-0.8.34/lib/dynflow/action.rb:431:in `block in with_error_handling'" - "/opt/theforeman/tfm/root/usr/share/gems/gems/dynflow-0.8.34/lib/dynflow/action.rb:431:in `catch'" - "/opt/theforeman/tfm/root/usr/share/gems/gems/dynflow-0.8.34/lib/dynflow/action.rb:431:in `with_error_handling'" - "/opt/theforeman/tfm/root/usr/share/gems/gems/dynflow-0.8.34/lib/dynflow/action.rb:511:in `execute_run'" - "/opt/theforeman/tfm/root/usr/share/gems/gems/dynflow-0.8.34/lib/dynflow/action.rb:268:in `execute'" - "/opt/theforeman/tfm/root/usr/share/gems/gems/dynflow-0.8.34/lib/dynflow/execution_plan/steps/abstract_flow_step.rb:9:in `block (2 levels) in execute'" - "/opt/theforeman/tfm/root/usr/share/gems/gems/dynflow-0.8.34/lib/dynflow/execution_plan/steps/abstract.rb:155:in `with_meta_calculation'" - "/opt/theforeman/tfm/root/usr/share/gems/gems/dynflow-0.8.34/lib/dynflow/execution_plan/steps/abstract_flow_step.rb:8:in `block in execute'" - "/opt/theforeman/tfm/root/usr/share/gems/gems/dynflow-0.8.34/lib/dynflow/execution_plan/steps/abstract_flow_step.rb:22:in `open_action'" - "/opt/theforeman/tfm/root/usr/share/gems/gems/dynflow-0.8.34/lib/dynflow/execution_plan/steps/abstract_flow_step.rb:7:in `execute'" - "/opt/theforeman/tfm/root/usr/share/gems/gems/dynflow-0.8.34/lib/dynflow/director.rb:55:in `execute'" - "/opt/theforeman/tfm/root/usr/share/gems/gems/dynflow-0.8.34/lib/dynflow/executors/parallel/worker.rb:11:in `on_message'" - "/opt/theforeman/tfm/root/usr/share/gems/gems/concurrent-ruby-edge-0.2.3/lib/concurrent/actor/context.rb:46:in `on_envelope'" - "/opt/theforeman/tfm/root/usr/share/gems/gems/concurrent-ruby-edge-0.2.3/lib/concurrent/actor/behaviour/executes_context.rb:7:in `on_envelope'" - "/opt/theforeman/tfm/root/usr/share/gems/gems/concurrent-ruby-edge-0.2.3/lib/concurrent/actor/behaviour/abstract.rb:25:in `pass'" - "/opt/theforeman/tfm/root/usr/share/gems/gems/dynflow-0.8.34/lib/dynflow/actor.rb:26:in `on_envelope'" - "/opt/theforeman/tfm/root/usr/share/gems/gems/concurrent-ruby-edge-0.2.3/lib/concurrent/actor/behaviour/abstract.rb:25:in `pass'" - "/opt/theforeman/tfm/root/usr/share/gems/gems/concurrent-ruby-edge-0.2.3/lib/concurrent/actor/behaviour/awaits.rb:15:in `on_envelope'" - "/opt/theforeman/tfm/root/usr/share/gems/gems/concurrent-ruby-edge-0.2.3/lib/concurrent/actor/behaviour/abstract.rb:25:in `pass'" - "/opt/theforeman/tfm/root/usr/share/gems/gems/concurrent-ruby-edge-0.2.3/lib/concurrent/actor/behaviour/sets_results.rb:14:in `on_envelope'" - "/opt/theforeman/tfm/root/usr/share/gems/gems/concurrent-ruby-edge-0.2.3/lib/concurrent/actor/behaviour/abstract.rb:25:in `pass'" - "/opt/theforeman/tfm/root/usr/share/gems/gems/concurrent-ruby-edge-0.2.3/lib/concurrent/actor/behaviour/buffer.rb:38:in `process_envelope'" - "/opt/theforeman/tfm/root/usr/share/gems/gems/concurrent-ruby-edge-0.2.3/lib/concurrent/actor/behaviour/buffer.rb:31:in `process_envelopes?'" - "/opt/theforeman/tfm/root/usr/share/gems/gems/concurrent-ruby-edge-0.2.3/lib/concurrent/actor/behaviour/buffer.rb:20:in `on_envelope'" - "/opt/theforeman/tfm/root/usr/share/gems/gems/concurrent-ruby-edge-0.2.3/lib/concurrent/actor/behaviour/abstract.rb:25:in `pass'" - "/opt/theforeman/tfm/root/usr/share/gems/gems/concurrent-ruby-edge-0.2.3/lib/concurrent/actor/behaviour/termination.rb:55:in `on_envelope'" - "/opt/theforeman/tfm/root/usr/share/gems/gems/concurrent-ruby-edge-0.2.3/lib/concurrent/actor/behaviour/abstract.rb:25:in `pass'" - "/opt/theforeman/tfm/root/usr/share/gems/gems/concurrent-ruby-edge-0.2.3/lib/concurrent/actor/behaviour/removes_child.rb:10:in `on_envelope'" - "/opt/theforeman/tfm/root/usr/share/gems/gems/concurrent-ruby-edge-0.2.3/lib/concurrent/actor/behaviour/abstract.rb:25:in `pass'" - "/opt/theforeman/tfm/root/usr/share/gems/gems/concurrent-ruby-edge-0.2.3/lib/concurrent/actor/behaviour/sets_results.rb:14:in `on_envelope'" - "/opt/theforeman/tfm/root/usr/share/gems/gems/concurrent-ruby-edge-0.2.3/lib/concurrent/actor/core.rb:161:in `process_envelope'" - "/opt/theforeman/tfm/root/usr/share/gems/gems/concurrent-ruby-edge-0.2.3/lib/concurrent/actor/core.rb:95:in `block in on_envelope'" - "/opt/theforeman/tfm/root/usr/share/gems/gems/concurrent-ruby-edge-0.2.3/lib/concurrent/actor/core.rb:118:in `block (2 levels) in schedule_execution'" - "/opt/theforeman/tfm/root/usr/share/gems/gems/concurrent-ruby-1.0.3/lib/concurrent/synchronization/mri_lockable_object.rb:38:in `block in synchronize'" - "/opt/theforeman/tfm/root/usr/share/gems/gems/concurrent-ruby-1.0.3/lib/concurrent/synchronization/mri_lockable_object.rb:38:in `synchronize'" - "/opt/theforeman/tfm/root/usr/share/gems/gems/concurrent-ruby-1.0.3/lib/concurrent/synchronization/mri_lockable_object.rb:38:in `synchronize'" - "/opt/theforeman/tfm/root/usr/share/gems/gems/concurrent-ruby-edge-0.2.3/lib/concurrent/actor/core.rb:115:in `block in schedule_execution'" - "/opt/theforeman/tfm/root/usr/share/gems/gems/concurrent-ruby-1.0.3/lib/concurrent/executor/serialized_execution.rb:18:in `call'" - "/opt/theforeman/tfm/root/usr/share/gems/gems/concurrent-ruby-1.0.3/lib/concurrent/executor/serialized_execution.rb:96:in `work'" - "/opt/theforeman/tfm/root/usr/share/gems/gems/concurrent-ruby-1.0.3/lib/concurrent/executor/serialized_execution.rb:77:in `block in call_job'" - "/opt/theforeman/tfm/root/usr/share/gems/gems/concurrent-ruby-1.0.3/lib/concurrent/executor/ruby_thread_pool_executor.rb:348:in `run_task'" - "/opt/theforeman/tfm/root/usr/share/gems/gems/concurrent-ruby-1.0.3/lib/concurrent/executor/ruby_thread_pool_executor.rb:337:in `block (3 levels) in create_worker'" - "/opt/theforeman/tfm/root/usr/share/gems/gems/concurrent-ruby-1.0.3/lib/concurrent/executor/ruby_thread_pool_executor.rb:320:in `loop'" - "/opt/theforeman/tfm/root/usr/share/gems/gems/concurrent-ruby-1.0.3/lib/concurrent/executor/ruby_thread_pool_executor.rb:320:in `block (2 levels) in create_worker'" - "/opt/theforeman/tfm/root/usr/share/gems/gems/concurrent-ruby-1.0.3/lib/concurrent/executor/ruby_thread_pool_executor.rb:319:in `catch'" - "/opt/theforeman/tfm/root/usr/share/gems/gems/concurrent-ruby-1.0.3/lib/concurrent/executor/ruby_thread_pool_executor.rb:319:in `block in create_worker'" - "/opt/theforeman/tfm/root/usr/share/gems/gems/logging-1.8.2/lib/logging/diagnostic_context.rb:323:in `block in create_with_logging_context'" --- Additional comment from pm-sat@redhat.com on 20180820T16:10:45 Moving this bug to POST for triage into Satellite 6 since the upstream issue https://projects.theforeman.org/issues/24576 has been resolved. --- Additional comment from mmccune@redhat.com on 20180821T14:01:40 Follow on comment to Jason's error from above: We re-worked the code to accommodate this and delivered a follow on hotfix that is available upon request. --- Additional comment from sat6-jenkins@redhat.com on 20180821T22:28:29 build status: succeeded brew: * rubygem-katello: closed - https://brewweb.engineering.redhat.com/brew/taskinfo?taskID=17983455 --- Additional comment from pm-rhel@redhat.com on 20180824T17:41:16 This bug report previously had all acks and release flag approved. However since at least one of its acks has been changed, the release flag has been reset to ? by the bugbot (pm-rhel). The ack needs to become approved before the release flag can become approved again. --- Additional comment from pm-rhel@redhat.com on 20180824T17:45:20 Since this issue was entered in Red Hat Bugzilla, the pm_ack has been set to + automatically for the next planned release --- Additional comment from jcallaha@redhat.com on 20180827T17:14:41 Mike, can you add in your verification steps for this bug? Thanks!
Justin, This failed cherrypick, can you take a look?
MR open!
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHBA-2018:2915