Large customer environments occasionally generate a large CRL file after a big subscription update or refresh.

Error:

2021-02-08 12:01:46,929 [thread=QuartzScheduler_Worker-6] [job=CertificateRevocationListTask-911d9c76-5768-4b10-b827-16dfac3c8b78, org=, csid=] ERROR org.quartz.core.JobRunShell - Job cron group.CertificateRevocationListTask-911d9c76-5768-4b10-b827-16dfac3c8b78 threw an unhandled Exception:
java.lang.OutOfMemoryError: input is too large to fit in a byte array
    at com.google.common.io.ByteStreams.toByteArrayInternal(ByteStreams.java:195)

Customer in question had a 2.8G CRL in:

/var/lib/candlepin/candlepin-crl.crl

This blew past the 1.99 GB limit for processing this file, and the customer will be forced to take manual steps to get past the collection process.

WORKAROUND:

1) stop services:

# satellite-maintain service stop

2) start Postgres:

# systemctl start postgresql

3) move CRL out of the way:

# mv /var/lib/candlepin/candlepin-crl.crl /var/lib/candlepin/candlepin-crl.BAK

4) Update database:

# echo "UPDATE cp_cert_serial SET collected=true WHERE revoked=true;" | sudo -u postgres psql -d candlepin
UPDATE 134330

5) start services and resume operations:

# satellite-maintain service start
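To tell whether a given Satellite is at risk before the job fails, the size of the CRL file can be compared against the byte-array ceiling. The following is a minimal sketch, not part of the original report. The function name `crl_too_large` is hypothetical, and the default limit of 2147483647 bytes is an assumption based on the fact that a Java array cannot hold more than 2^31 - 1 elements; the effective limit inside Guava may be slightly lower.

```shell
#!/usr/bin/env bash
# Hypothetical helper: succeeds (exit 0) when the file is larger than the limit.
# $1 = path to the CRL file
# $2 = limit in bytes (assumed default: 2^31 - 1, the maximum Java array length)
crl_too_large() {
    local file="$1" limit="${2:-2147483647}" size
    # A missing file cannot be too large.
    [ -f "$file" ] || return 1
    # wc -c prints the byte count; arithmetic expansion strips any padding.
    size=$(( $(wc -c <"$file") ))
    [ "$size" -gt "$limit" ]
}
```

For example, `crl_too_large /var/lib/candlepin/candlepin-crl.crl && echo "CRL exceeds the byte-array limit"` would report the 2.8G file described above.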
An update on how this issue will be resolved:

We have a solution currently under review that removes the CertificateRevocationListTask job entirely and replaces it with a new job, CertificateCleanupJob, which will run periodically and:

- Will no longer generate a CRL file.
- Will revoke all expired (but not yet revoked) identity and SCA certificates. These can pile up when hosts register and never unregister: the certificates eventually expire, and are therefore invalid, but are never revoked.
- Will delete all certificate serials that are both expired and revoked. This includes serials of all 3 types of certificates: identity, SCA, and entitlement.
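The rules described for the new job can be illustrated with a small decision sketch. This is not Candlepin code: the function name `plan_cleanup` and the flat input format (serial, type, expiry as epoch seconds, revoked flag) are hypothetical, chosen only for illustration. It also assumes that each serial is judged by its state at the start of a run, and that expired entitlement certificates that were never revoked are left alone, since the comment above does not say otherwise.

```shell
#!/usr/bin/env bash
# Hypothetical input on stdin, one serial per line (not Candlepin's schema):
#   <serial> <identity|sca|entitlement> <expiry epoch seconds> <true|false revoked>
# $1 = current time in epoch seconds
# Prints the action implied by the described rules for each serial.
plan_cleanup() {
    local now="$1" serial type expiry revoked
    while read -r serial type expiry revoked; do
        [ -z "$serial" ] && continue
        if [ "$expiry" -ge "$now" ]; then
            echo "keep $serial"      # not expired: untouched
        elif [ "$revoked" = "true" ]; then
            echo "delete $serial"    # expired and revoked: any of the 3 types
        elif [ "$type" = "identity" ] || [ "$type" = "sca" ]; then
            echo "revoke $serial"    # expired, not yet revoked
        else
            echo "keep $serial"      # assumption: expired, unrevoked entitlement
        fi
    done
}
```

Feeding it the line `1 identity 100 false` with a current time of `500` prints `revoke 1`; on a later run, once the serial is marked revoked, the same line would yield `delete 1`.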
Verified on Satellite 7.0.0, snap 4 running on RHEL 7 and RHEL 8.

Steps to Test:

1. Add the following line to /etc/candlepin/candlepin.conf to set the CRLUpdateJob to run every 3 minutes:
   candlepin.async.jobs.CRLUpdateJob.schedule=0 0/3 * * * ?
2. Restart the tomcat service:
   # systemctl restart tomcat
3. Register a host to Satellite.
4. Verify that /var/lib/candlepin/candlepin-crl.crl is not present.
5. Follow the Candlepin log and wait for the CRLUpdateJob to attempt to run.
6. Between two runs of the job, attach a subscription to the host registered in step 3, then immediately remove that subscription.
7. After the job runs again, verify that the CRL file is still not present.

Expected Results:

The CRL file is not present when Satellite is installed, and the file is not created when the CRLUpdateJob is triggered.

Actual Results:

The CRL file is not present when Satellite is installed, and the file is not created when the CRLUpdateJob is triggered. The attempted job run results in an error in /var/log/candlepin/candlepin.log:

```
2022-01-11 14:57:00,006 [thread=QuartzScheduler_Worker-14] [=, org=, csid=] INFO org.candlepin.async.JobManager - Job queued: AsyncJobStatus [id: 8a81828b7e4a0179017e4ab716e4096f, name: CRLUpdateJob, key: CRLUpdateJob, state: QUEUED]
2022-01-11 14:57:00,048 [thread=Thread-151 (ActiveMQ-client-global-threads)] [job=8a81828b7e4a0179017e4ab716e4096f, job_key=CRLUpdateJob, org=, csid=] ERROR org.candlepin.async.JobManager - No registered job class for job: CRLUpdateJob
2022-01-11 14:57:00,048 [thread=Thread-151 (ActiveMQ-client-global-threads)] [=, org=, csid=] ERROR org.candlepin.async.JobMessageReceiver - Job processing failed terminally; committing job message as acknowledged: Message [id: 1292, address: job, body: {"jobId":"8a81828b7e4a0179017e4ab716e4096f","jobKey":"CRLUpdateJob"}]
org.candlepin.async.JobInitializationException: No registered job class for job: CRLUpdateJob
```

This error reflects the fact that the CRLUpdateJob was removed from Candlepin and replaced with the CertificateCleanupJob in https://github.com/candlepin/candlepin/pull/3078.
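The two manual checks in the test steps (no CRL file, and the "No registered job class" error in the log) could be scripted roughly as follows. This is a sketch only: the function name `verify_crl_job_removed` and its output strings are hypothetical, and it merely searches the log for the error text quoted above.

```shell
#!/usr/bin/env bash
# Hypothetical helper for the verification steps.
# $1 = path where the CRL file would be written
# $2 = path to the Candlepin log
verify_crl_job_removed() {
    local crl_file="$1" log_file="$2"
    # The CRL file must not exist at any point during the test.
    if [ -e "$crl_file" ]; then
        echo "FAIL: CRL file exists: $crl_file"
        return 1
    fi
    # Fixed-string search for the error logged when the removed job is triggered.
    if grep -qF 'No registered job class for job: CRLUpdateJob' "$log_file" 2>/dev/null; then
        echo "OK: no CRL file; CRLUpdateJob rejected as unregistered"
    else
        echo "OK: no CRL file; no CRLUpdateJob attempt logged yet"
    fi
}
```

On the system under test this would be called as `verify_crl_job_removed /var/lib/candlepin/candlepin-crl.crl /var/log/candlepin/candlepin.log`, once before the scheduled run and once after it.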
*** Bug 1999089 has been marked as a duplicate of this bug. ***
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory (Moderate: Satellite 6.11 Release), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHSA-2022:5498