Bug 1399877 - Improve ListenOnCandlepinEvents throughput
Summary: Improve ListenOnCandlepinEvents throughput
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Satellite
Classification: Red Hat
Component: Candlepin
Version: 6.2.0
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: high
Target Milestone: Unspecified
Assignee: Eric Helms
QA Contact: Roman Plevka
URL:
Whiteboard:
Depends On:
Blocks: 1405526
Reported: 2016-11-30 00:45 UTC by Mike McCune
Modified: 2020-04-15 14:55 UTC

Fixed In Version: rubygem-katello-3.0.0.91-1
Doc Text:
Clone Of:
: 1405526 (view as bug list)
Environment:
Last Closed: 2017-01-26 10:46:27 UTC
Target Upstream Version:


Attachments
Patch (1.72 KB, patch)
2017-01-23 22:24 UTC, Justin Sherrill


Links
System | ID | Private | Priority | Status | Summary | Last Updated
Foreman Issue Tracker | 17498 | 0 | Normal | Closed | Improve ListenOnCandlepinEvents throughput | 2020-12-30 18:16:54 UTC
Red Hat Product Errata | RHBA-2017:0197 | 0 | normal | SHIPPED_LIVE | Satellite 6.2.7 Async Bug Release | 2017-01-26 15:38:38 UTC

Description Mike McCune 2016-11-30 00:45:02 UTC
The main loop of the task has a redundant "sleep 1"; see (currently) https://github.com/Katello/katello/blob/master/app/lib/actions/candlepin/candlepin_listening_service.rb#L86 . That means the task can process at most one Candlepin event per second.

This can be a limiting factor when a large burst of events arrives, and also when katello_event_queue has a backlog of thousands of messages, which then takes hours to process. I see this in the field relatively often: the ListenOnCandlepinEvents task is stopped or paused for whatever reason, and recovering means several hours of backlog processing, solely because of the "sleep 1" after every processed message.

I have successfully tested removing the sleep with more than 20k messages in the queue and restarting the foreman-tasks service. Once the task subscribed to the queue, the 21k messages were consumed within 35 seconds; the only impact was dynflow_executor consuming high CPU.

Therefore I see no reason to keep the sleep there (as far as I understand, it is present only for historical reasons).
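
To illustrate the cost of a per-message sleep, here is a minimal sketch. It is not the Katello listening service or the attached patch: `drain`, `simulated_seconds`, and the Array standing in for katello_event_queue are hypothetical, and the sleep is injected as a lambda so the example counts sleep time instead of actually waiting.

```ruby
# Hypothetical stand-in for the listening loop; NOT the Katello implementation.
# `queue` is a plain Array standing in for katello_event_queue, and `sleeper`
# is injected so the example can add up sleep time instead of really sleeping.
def drain(queue, sleep_per_message:, sleeper:)
  processed = 0
  until queue.empty?
    queue.shift                          # stand-in for handling one event
    processed += 1
    sleeper.call(1) if sleep_per_message # the "sleep 1" after every message
  end
  processed
end

# Returns the total number of seconds the loop would have spent sleeping.
def simulated_seconds(message_count, sleep_per_message:)
  slept = 0
  drain(Array.new(message_count) { |i| i },
        sleep_per_message: sleep_per_message,
        sleeper: ->(seconds) { slept += seconds })
  slept
end

BACKLOG = 21_000 # backlog size taken from the test described above

with_sleep = simulated_seconds(BACKLOG, sleep_per_message: true)
without_sleep = simulated_seconds(BACKLOG, sleep_per_message: false)

puts "with sleep 1:    at least #{with_sleep} s (~#{(with_sleep / 3600.0).round(1)} h) spent sleeping"
puts "without sleep 1: #{without_sleep} s spent sleeping"
```

With a fixed one-second sleep per message, the sleep alone puts a floor of 21,000 seconds (almost six hours) under a 21k backlog, regardless of how quickly each event is handled. That is consistent with the hours-long recovery described above.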

Comment 1 Mike McCune 2016-11-30 00:45:07 UTC
Created from redmine issue http://projects.theforeman.org/issues/17498

Comment 4 Bryan Kearney 2016-12-01 15:15:19 UTC
Upstream bug component is Candlepin

Comment 6 Bryan Kearney 2016-12-02 17:15:18 UTC
Moving this bug to POST for triage into Satellite 6 since the upstream issue http://projects.theforeman.org/issues/17498 has been resolved.

Comment 7 Justin Sherrill 2017-01-23 22:24:50 UTC
Created attachment 1243776
Patch

Comment 8 Justin Sherrill 2017-01-24 15:25:48 UTC
To reproduce:

1.  edit /opt/theforeman/tfm/root/usr/share/gems/gems/katello-3.0.*/lib/katello/engine.rb

comment out this line:

::Actions::Candlepin::ListenOnCandlepinEvents.ensure_running(world)

(around line 118), so it looks like:

#::Actions::Candlepin::ListenOnCandlepinEvents.ensure_running(world)

2.  restart foreman-tasks

3.  Register a bunch of clients, checking:

qpid-stat -q --ssl-certificate=/etc/pki/katello/qpid_client_striped.crt -b amqps://localhost:5671 | grep katello_event_queue

and you should see the queue message count going up.  When it gets up to 1-2 thousand:

4.  Uncomment the same line in engine.rb

5.  Restart foreman-tasks

6.  Monitor the qpid-stat command again; you should see the count decrease fairly quickly.  1000 messages should be processed in ~10-20 seconds.
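
To turn two readings of the queue message count into a throughput figure, a small helper like the following may be handy. It is only a sketch: `drain_rate` is a hypothetical name, and the message counts are assumed to be read manually from the qpid-stat output rather than parsed from it.

```ruby
# Hypothetical helper: estimates how fast a queue drains, in messages per
# second, from two message-count readings taken `elapsed_seconds` apart.
def drain_rate(depth_before, depth_after, elapsed_seconds)
  raise ArgumentError, "elapsed_seconds must be positive" unless elapsed_seconds.positive?

  (depth_before - depth_after) / elapsed_seconds.to_f
end

# Expected range from step 6: 1000 messages in roughly 10-20 seconds.
fastest = drain_rate(1000, 0, 10)
slowest = drain_rate(1000, 0, 20)

puts "expected drain rate: #{slowest}-#{fastest} messages/s"
puts "with a sleep 1 per message the rate could not exceed 1 message/s"
```

A measured rate far above one message per second indicates that the per-message sleep is no longer throttling the task.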

Comment 9 Roman Plevka 2017-01-24 17:44:40 UTC
VERIFIED
on sat6.2.7-1

The message count in katello_event_queue dropped from about 2.79k to 27 in about 30 seconds.
No complications detected.

Comment 11 errata-xmlrpc 2017-01-26 10:46:27 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2017:0197

