Bug 1672706 - candlepin's CertificateRevocationListTask does not scale well for 2M+ certificates
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Satellite 6
Classification: Red Hat
Component: Candlepin
Version: 6.3.5
Hardware: Unspecified
OS: Unspecified
Priority: high
Severity: high
Target Milestone: 6.6.0
Assignee: satellite6-bugs
QA Contact: jcallaha
URL:
Whiteboard:
Depends On: 1620226
Blocks:
 
Reported: 2019-02-05 16:28 UTC by Pavel Moravec
Modified: 2019-12-02 17:40 UTC (History)

Fixed In Version: candlepin-2.6.1-1
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2019-10-22 12:47:16 UTC
Target Upstream Version:




Links
System                             ID              Priority  Status  Summary  Last Updated
Red Hat Knowledge Base (Solution)  3888591         None      None    None     2019-02-05 19:31:35 UTC
Red Hat Product Errata             RHSA-2019:3172  None      None    None     2019-10-22 12:47:28 UTC

Description Pavel Moravec 2019-02-05 16:28:53 UTC
Description of problem:
With 2.2M certificates in cp_cert_serial on Satellite 6.3.5, we are already hitting bz1620226, at least to some extent: candlepin got progressively slower, the CRL job started but silently terminated after one hour, and performance has been good again since then.

Please backport the fix for bz1620226 to 6.4.z and newer (by default, it will appear in 6.6 only).


Version-Release number of selected component (if applicable):
6.3.5 / 6.4


How reproducible:
100% with the customer DB
Should be straightforward to reproduce in a scaled environment (many hosts with many subscriptions)


Steps to Reproduce:
1. Have 40k hosts with 50 subscriptions each (or a similarly scaled Satellite)
2. Observe Satellite/candlepin performance at noon
3. Optionally, change the "at noon" schedule by adding:

pinsetter.org.candlepin.pinsetter.tasks.CertificateRevocationListTask.schedule=0 0/5 * * * ?

to candlepin.conf and restarting tomcat
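
For reference, the schedule value above is a Quartz cron expression, whose fields are, in order: seconds, minutes, hours, day-of-month, month, day-of-week, plus an optional year. The following is only a small Python sketch for reading such an expression; the helper name describe_quartz_cron is hypothetical and is not part of candlepin or Quartz.

```python
# Field order of a Quartz cron expression; the 7th field (year) is optional.
QUARTZ_FIELDS = ["seconds", "minutes", "hours",
                 "day_of_month", "month", "day_of_week", "year"]


def describe_quartz_cron(expr):
    """Map each field of a Quartz cron expression to its name (hypothetical helper)."""
    parts = expr.split()
    if len(parts) not in (6, 7):
        raise ValueError("expected 6 or 7 fields, got %d" % len(parts))
    return dict(zip(QUARTZ_FIELDS, parts))


# The schedule from step 3 of the reproducer.
fields = describe_quartz_cron("0 0/5 * * * ?")
for name, value in fields.items():
    print(name, "=", value)
```

The minutes field 0/5 means "every 5 minutes, starting at minute 0", so with this setting the CRL job fires every five minutes rather than only at noon, which makes the problem quicker to observe.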


Actual results:
2. shows logs like:
2019-02-05 10:50:00,177 [thread=QuartzScheduler_Worker-12] [job=CertificateRevocationListTask-b5e0a33b-7ecb-4951-9ba6-30c87d95f73f, org=, csid=] INFO  org.candlepin.pinsetter.tasks.KingpinJob - Starting job: org.candlepin.pinsetter.tasks.CertificateRevocationListTask
2019-02-05 10:50:00,178 [thread=QuartzScheduler_Worker-12] [job=CertificateRevocationListTask-b5e0a33b-7ecb-4951-9ba6-30c87d95f73f, org=, csid=] INFO  org.candlepin.pinsetter.tasks.CertificateRevocationListTask - Executing CRL Job. CRL filePath=/var/lib/candlepin/candlepin-crl.crl

but _without_ a termination log like:

2019-02-04 12:00:00,110 [thread=QuartzScheduler_Worker-13] [job=CertificateRevocationListTask-f44c921f-8dc8-4928-ace1-3ebd9fb31f0c, org=, csid=] INFO  org.candlepin.pinsetter.tasks.KingpinJob - Job completed: time=8

2. shows high CPU usage, worse latency, etc., worsening over time.
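
To spot this symptom without reading the log by eye, one could pair up the "Starting job" and "Job completed" lines per job id. Below is a minimal Python sketch under the assumption that the log lines look exactly like the samples above; the function name unfinished_crl_jobs is hypothetical, and the sample lines are copied from this report.

```python
import re

# Captures the job id from "[job=CertificateRevocationListTask-<uuid>, ...]".
JOB_RE = re.compile(r"\[job=(CertificateRevocationListTask-[0-9a-f-]+),")


def unfinished_crl_jobs(lines):
    """Return ids of CRL jobs that logged 'Starting job' but never 'Job completed'."""
    started, completed = [], set()
    for line in lines:
        match = JOB_RE.search(line)
        if not match:
            continue
        job_id = match.group(1)
        if "Starting job:" in line:
            if job_id not in started:
                started.append(job_id)
        elif "Job completed:" in line:
            completed.add(job_id)
    return [job_id for job_id in started if job_id not in completed]


# Sample lines taken from this bug report.
sample = [
    "2019-02-05 10:50:00,177 [thread=QuartzScheduler_Worker-12] "
    "[job=CertificateRevocationListTask-b5e0a33b-7ecb-4951-9ba6-30c87d95f73f, org=, csid=] "
    "INFO  org.candlepin.pinsetter.tasks.KingpinJob - Starting job: "
    "org.candlepin.pinsetter.tasks.CertificateRevocationListTask",
    "2019-02-04 12:00:00,110 [thread=QuartzScheduler_Worker-13] "
    "[job=CertificateRevocationListTask-f44c921f-8dc8-4928-ace1-3ebd9fb31f0c, org=, csid=] "
    "INFO  org.candlepin.pinsetter.tasks.KingpinJob - Job completed: time=8",
]
result = unfinished_crl_jobs(sample)
print(result)
```

On the two sample lines this reports only the b5e0a33b job, the one that started on 2019-02-05 and has no matching completion line. A job that is still running is reported as well, so the result is only meaningful some time after the start line was logged.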


Expected results:
2. the job completes in a reasonable time, with no big CPU or latency impact


Additional info:

Comment 5 Barnaby Court 2019-02-07 19:00:39 UTC
Moving to modified & tagging with candlepin-2.6.1-1 as that is the build that already contains a fix for this.

Comment 7 jcallaha 2019-05-15 17:23:36 UTC
Verified in Satellite 6.6.0 Snap 2

Set up a system with 50,011 content hosts.
I then attached 51 subscriptions to each.

I added the cron line to candlepin.conf and tailed the candlepin logs to look for the task.

Results:
The task completed within a very reasonable time.

2019-05-15 12:00:00,205 [thread=QuartzScheduler_Worker-12] [job=CertificateRevocationListTask-f6f6356c-6241-40fc-a41e-c72d38d2d2c6, org=, csid=] INFO  org.candlepin.pinsetter.tasks.KingpinJob - Starting job: org.candlepin.pinsetter.tasks.CertificateRevocationListTask
2019-05-15 12:00:00,205 [thread=QuartzScheduler_Worker-12] [job=CertificateRevocationListTask-f6f6356c-6241-40fc-a41e-c72d38d2d2c6, org=, csid=] INFO  org.candlepin.pinsetter.tasks.CertificateRevocationListTask - Executing CRL Job. CRL filePath=/var/lib/candlepin/candlepin-crl.crl
2019-05-15 12:00:00,266 [thread=QuartzScheduler_Worker-12] [job=CertificateRevocationListTask-f6f6356c-6241-40fc-a41e-c72d38d2d2c6, org=, csid=] INFO  org.candlepin.util.CrlFileUtil - CRL sync processed a total of 0 serials.
2019-05-15 12:00:00,266 [thread=QuartzScheduler_Worker-12] [job=CertificateRevocationListTask-f6f6356c-6241-40fc-a41e-c72d38d2d2c6, org=, csid=] INFO  org.candlepin.pinsetter.tasks.KingpinJob - Job completed: time=61
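
As a cross-check of the log above, the elapsed time can be computed from the first and last timestamps. This is a small Python sketch; the helper name elapsed_ms is hypothetical. That the result matches time=61 suggests the reported value is in milliseconds, which is an inference from this log and is not stated anywhere in this report.

```python
from datetime import datetime, timedelta

# Timestamp layout used at the start of each candlepin log line above.
FMT = "%Y-%m-%d %H:%M:%S,%f"


def elapsed_ms(start, end):
    """Whole milliseconds between two log timestamps (hypothetical helper)."""
    delta = datetime.strptime(end, FMT) - datetime.strptime(start, FMT)
    return delta // timedelta(milliseconds=1)


# "Starting job" and "Job completed" timestamps from the verification log.
duration = elapsed_ms("2019-05-15 12:00:00,205", "2019-05-15 12:00:00,266")
print(duration)
```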

Comment 9 errata-xmlrpc 2019-10-22 12:47:16 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2019:3172

