Description of problem:
With 2.2M certificates in cp_cert_serial on Satellite 6.3.5, we have already hit bz1620226 (to some extent: candlepin gets slower and slower, the CRL job started but silently terminated after one hour, and since then performance has been good again).

Please backport bz1620226 to 6.4.z and newer (by default, bz1620226 will appear in 6.6 only).

Version-Release number of selected component (if applicable):
6.3.5 / 6.4

How reproducible:
100% with the customer DB; should be straightforward on a scaled environment (many hosts with many subs)

Steps to Reproduce:
1. Have 40k hosts with 50 subs each (or a similarly scaled Sat)
2. Observe Sat/candlepin performance at noon
3. Optionally, change the "at noon" schedule by adding:
pinsetter.org.candlepin.pinsetter.tasks.CertificateRevocationListTask.schedule=0 0/5 * * * ?
to candlepin.conf and restarting tomcat

Actual results:
2. shows logs like:
2019-02-05 10:50:00,177 [thread=QuartzScheduler_Worker-12] [job=CertificateRevocationListTask-b5e0a33b-7ecb-4951-9ba6-30c87d95f73f, org=, csid=] INFO org.candlepin.pinsetter.tasks.KingpinJob - Starting job: org.candlepin.pinsetter.tasks.CertificateRevocationListTask
2019-02-05 10:50:00,178 [thread=QuartzScheduler_Worker-12] [job=CertificateRevocationListTask-b5e0a33b-7ecb-4951-9ba6-30c87d95f73f, org=, csid=] INFO org.candlepin.pinsetter.tasks.CertificateRevocationListTask - Executing CRL Job. CRL filePath=/var/lib/candlepin/candlepin-crl.crl

but _without_ a termination log like:
2019-02-04 12:00:00,110 [thread=QuartzScheduler_Worker-13] [job=CertificateRevocationListTask-f44c921f-8dc8-4928-ace1-3ebd9fb31f0c, org=, csid=] INFO org.candlepin.pinsetter.tasks.KingpinJob - Job completed: time=8

2. also shows high CPU, worse latency, etc., worsening over time.

Expected results:
2. the job completes in reasonable time, with no big CPU or latency impact

Additional info:
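A CRL job that logs "Starting job" but never logs "Job completed", as in the Actual results, can be spotted with a short script instead of by eye. The following is only a minimal sketch: the regular expression is derived from the log lines quoted in this report, and the names `summarize_crl_jobs`, `unfinished` and `SAMPLE` are hypothetical helpers, not part of candlepin.

```python
import re
from datetime import datetime

# Pattern inferred from the log lines quoted in this report (an assumption,
# not an official candlepin log format specification).
LINE_RE = re.compile(
    r"^(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3}) "
    r"\[thread=[^\]]*\] "
    r"\[job=(?P<job>CertificateRevocationListTask-[0-9a-f-]+),[^\]]*\] "
    r"\w+\s+\S+ - (?P<msg>.*)$"
)

# Sample input: three of the log lines quoted in the description.
SAMPLE = """
2019-02-05 10:50:00,177 [thread=QuartzScheduler_Worker-12] [job=CertificateRevocationListTask-b5e0a33b-7ecb-4951-9ba6-30c87d95f73f, org=, csid=] INFO org.candlepin.pinsetter.tasks.KingpinJob - Starting job: org.candlepin.pinsetter.tasks.CertificateRevocationListTask
2019-02-05 10:50:00,178 [thread=QuartzScheduler_Worker-12] [job=CertificateRevocationListTask-b5e0a33b-7ecb-4951-9ba6-30c87d95f73f, org=, csid=] INFO org.candlepin.pinsetter.tasks.CertificateRevocationListTask - Executing CRL Job. CRL filePath=/var/lib/candlepin/candlepin-crl.crl
2019-02-04 12:00:00,110 [thread=QuartzScheduler_Worker-13] [job=CertificateRevocationListTask-f44c921f-8dc8-4928-ace1-3ebd9fb31f0c, org=, csid=] INFO org.candlepin.pinsetter.tasks.KingpinJob - Job completed: time=8
"""


def summarize_crl_jobs(lines):
    """Map each CRL job id to its start timestamp and reported time= value."""
    jobs = {}
    for line in lines:
        match = LINE_RE.match(line.strip())
        if not match:
            continue  # not a CRL task line, or a format this sketch does not know
        entry = jobs.setdefault(match["job"], {"started": None, "time": None})
        msg = match["msg"]
        if msg.startswith("Starting job:"):
            entry["started"] = datetime.strptime(match["ts"], "%Y-%m-%d %H:%M:%S,%f")
        elif msg.startswith("Job completed:"):
            reported = re.search(r"time=(\d+)", msg)
            if reported:
                entry["time"] = int(reported.group(1))
    return jobs


def unfinished(jobs):
    """Return ids of jobs that logged a start but no completion."""
    return sorted(
        job for job, entry in jobs.items()
        if entry["started"] is not None and entry["time"] is None
    )


if __name__ == "__main__":
    for job_id in unfinished(summarize_crl_jobs(SAMPLE.splitlines())):
        print("started but never completed:", job_id)
```

To check a real system, pass the lines of the candlepin log file to `summarize_crl_jobs` instead of `SAMPLE`. A job that is still running at the moment the log is read is also reported as unfinished, so compare its start timestamp with the current time before drawing conclusions.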
Moving to modified & tagging with candlepin-2.6.1-1 as that is the build that already contains a fix for this.
Verified in Satellite 6.6.0 Snap 2

Set up a system with 50,011 content hosts, then attached 51 subscriptions to each. I added the cron line to candlepin.conf and tailed the candlepin logs to look for the task.

Results: The task completed within a very reasonable time.

2019-05-15 12:00:00,205 [thread=QuartzScheduler_Worker-12] [job=CertificateRevocationListTask-f6f6356c-6241-40fc-a41e-c72d38d2d2c6, org=, csid=] INFO org.candlepin.pinsetter.tasks.KingpinJob - Starting job: org.candlepin.pinsetter.tasks.CertificateRevocationListTask
2019-05-15 12:00:00,205 [thread=QuartzScheduler_Worker-12] [job=CertificateRevocationListTask-f6f6356c-6241-40fc-a41e-c72d38d2d2c6, org=, csid=] INFO org.candlepin.pinsetter.tasks.CertificateRevocationListTask - Executing CRL Job. CRL filePath=/var/lib/candlepin/candlepin-crl.crl
2019-05-15 12:00:00,266 [thread=QuartzScheduler_Worker-12] [job=CertificateRevocationListTask-f6f6356c-6241-40fc-a41e-c72d38d2d2c6, org=, csid=] INFO org.candlepin.util.CrlFileUtil - CRL sync processed a total of 0 serials.
2019-05-15 12:00:00,266 [thread=QuartzScheduler_Worker-12] [job=CertificateRevocationListTask-f6f6356c-6241-40fc-a41e-c72d38d2d2c6, org=, csid=] INFO org.candlepin.pinsetter.tasks.KingpinJob - Job completed: time=61
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA.

For information on the advisory, and where to find the updated files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2019:3172