Red Hat Bugzilla – Bug 1399877
Improve ListenOnCandlepinEvents throughput
Last modified: 2018-04-24 03:41:47 EDT
The main loop of the task has a redundant "sleep 1"; see (current) https://github.com/Katello/katello/blob/master/app/lib/actions/candlepin/candlepin_listening_service.rb#L86 . This means the task can process at most 1 Candlepin event per second. That can be a limiting factor when a larger burst of events arrives, and also when katello_event_queue has a backlog of thousands of messages, where processing takes hours. I see this in the field relatively often: the ListenOnCandlepinEvents task is stopped or paused due to some issue, and recovering from it means several hours of backlog processing, solely because of the "sleep 1" after every message.

I have successfully tested removing the sleep by having >20k messages in the queue and restarting the foreman-tasks service. Once the task subscribed to the queue, the 21k messages were consumed within 35 seconds; the only impact was high CPU usage by dynflow_executor. I therefore see no reason to keep the sleep (as far as I understand, it is there only for historical reasons).
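To put the numbers above side by side, here is a minimal arithmetic sketch in Ruby. It is not taken from the Katello code; the helper names min_drain_seconds and throughput are hypothetical, and the estimate ignores the time spent actually handling each message, so it is only a lower bound.

```ruby
# Hypothetical helpers for illustration; not part of Katello.

# Lower bound on the time needed to drain a backlog when the loop
# sleeps a fixed number of seconds after every message.
def min_drain_seconds(backlog, sleep_per_message)
  backlog * sleep_per_message
end

# Average messages per second for an observed drain.
def throughput(messages, seconds)
  messages / seconds.to_f
end

# Figures from the report: 21k messages, "sleep 1", 35 seconds without sleep.
with_sleep_hours = min_drain_seconds(21_000, 1) / 3600.0
observed_rate = throughput(21_000, 35)

puts format("with sleep 1: at least %.1f hours", with_sleep_hours)
puts format("without sleep: about %.0f messages/second", observed_rate)
```

With the reported figures this gives at least 5.8 hours with the sleep in place, against roughly 600 messages per second observed without it.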
Created from redmine issue http://projects.theforeman.org/issues/17498
Upstream bug component is Candlepin
Moving this bug to POST for triage into Satellite 6 since the upstream issue http://projects.theforeman.org/issues/17498 has been resolved.
Created attachment 1243776 [details] Patch
To reproduce:
1. Edit /opt/theforeman/tfm/root/usr/share/gems/gems/katello-3.0.*/lib/katello/engine.rb and comment out this line (around line 118):
   ::Actions::Candlepin::ListenOnCandlepinEvents.ensure_running(world)
   so that it looks like:
   #::Actions::Candlepin::ListenOnCandlepinEvents.ensure_running(world)
2. Restart foreman-tasks.
3. Register a bunch of clients while checking:
   qpid-stat -q --ssl-certificate=/etc/pki/katello/qpid_client_striped.crt -b amqps://localhost:5671 | grep katello_event_queue
   You should see the queue message count going up. Wait until it reaches 1-2 thousand.
4. Uncomment the same line in engine.rb.
5. Restart foreman-tasks.
6. Monitor the qpid-stat command again; you should see the count decrease fairly quickly. 1000 messages should be processed in ~10-20 seconds.
VERIFIED on sat6.2.7-1: the message count in the katello_event_queue queue dropped from about 2.79k to 27 in about 30 seconds. No complications detected.
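For comparing such before/after readings, a rough Ruby sketch follows. The helpers parse_count and drain_rate are hypothetical, and the suffix handling is an assumption based only on the "2.79k" style of count quoted above, not on any documented qpid-stat output format.

```ruby
# Hypothetical helpers for illustration; not part of Katello or qpid-tools.

# Convert an abbreviated count such as "2.79k" into an integer.
# Assumption: "k" means thousand and "m" means million.
def parse_count(text)
  match = text.strip.match(/\A(\d+(?:\.\d+)?)([km])?\z/i)
  raise ArgumentError, "unrecognized count: #{text.inspect}" unless match

  multiplier = { nil => 1, "k" => 1_000, "m" => 1_000_000 }.fetch(match[2]&.downcase)
  (match[1].to_f * multiplier).round
end

# Average number of messages consumed per second between two readings.
def drain_rate(before, after, seconds)
  (before - after) / seconds.to_f
end

# Readings from the verification above: about 2.79k down to 27 in about 30 seconds.
before = parse_count("2.79k")
after = parse_count("27")
puts format("%.1f messages/second", drain_rate(before, after, 30))
```

For the readings above this comes to roughly 92 messages per second, in line with the 1000 messages in ~10-20 seconds expected in the reproducer.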
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHBA-2017:0197