Bug 1928161 - Large CRL file operation causes OOM error in Candlepin
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: Candlepin
Classification: Community
Component: candlepin
Version: 3.1
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: high
Target Milestone: ---
Target Release: ---
Assignee: candlepin-bugs
QA Contact:
URL:
Whiteboard:
Duplicates: 1806626
Depends On:
Blocks:
Reported: 2021-02-12 14:16 UTC by Rehana
Modified: 2021-09-24 11:39 UTC (History)

Fixed In Version: candlepin-4.1.6-1
Clone Of: 1927532
Environment:
Last Closed: 2021-09-24 11:39:26 UTC
Embargoed:




Links
System: Github
ID: candlepin candlepin pull 3078
Private: 0
Priority: None
Status: open
Summary: 1928161: Large CRL file operation causes OOM error in Candlepin
Last Updated: 2021-07-19 07:38:55 UTC

Description Rehana 2021-02-12 14:16:53 UTC
+++ This bug was initially created as a clone of Bug #1927532 +++

Large customer environments occasionally generate a very large CRL file after a big subscription update or refresh. The CertificateRevocationListTask job then fails with this error:

2021-02-08 12:01:46,929 [thread=QuartzScheduler_Worker-6] [job=CertificateRevocationListTask-911d9c76-5768-4b10-b827-16dfac3c8b78, org=, csid=] ERROR org.quartz.core.JobRunShell - Job cron group.CertificateRevocationListTask-911d9c76-5768-4b10-b827-16dfac3c8b78 threw an unhandled Exception:
java.lang.OutOfMemoryError: input is too large to fit in a byte array
        at com.google.common.io.ByteStreams.toByteArrayInternal(ByteStreams.java:195)

The customer in question had a 2.8 GB CRL at:

/var/lib/candlepin/candlepin-crl.crl

This exceeded the 1.99 GB limit for processing the file (a Java byte array can hold at most 2^31 - 1 bytes), so the customer is forced to take manual steps to get past the collection process.
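
To tell whether an installation is at risk, the CRL size can be compared against that limit. The following is only a minimal sketch: the helper name crl_too_large is hypothetical (it is not part of Candlepin or satellite-maintain), and the default limit of 2^31 - 1 bytes is an assumption based on the maximum Java array length; the exact cutoff inside the library may be slightly lower.

```shell
#!/bin/sh
# Hypothetical helper, not shipped with Candlepin or satellite-maintain.
# Assumed limit: the maximum Java array length, 2^31 - 1 bytes.
MAX_JAVA_ARRAY_BYTES=2147483647

# Returns 0 when the file is larger than the limit, 1 when it is not,
# and 2 when the file does not exist.
# $1: path to the file; $2: limit in bytes (optional).
crl_too_large() {
    file=$1
    limit=${2:-$MAX_JAVA_ARRAY_BYTES}
    [ -f "$file" ] || return 2
    # wc -c prints the size in bytes; tr strips padding some wc versions add.
    size=$(wc -c < "$file" | tr -d ' ')
    [ "$size" -gt "$limit" ]
}

# Example, using the path from this report:
# if crl_too_large /var/lib/candlepin/candlepin-crl.crl; then
#     echo "CRL is larger than a Java byte array can hold"
# fi
```

A check like this only reports the condition; it does not replace the workaround below.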

WORKAROUND:

1) Stop services:

# satellite-maintain service stop

2) Start PostgreSQL:

# systemctl start postgresql

3) Move the CRL out of the way:

# mv /var/lib/candlepin/candlepin-crl.crl /var/lib/candlepin/candlepin-crl.BAK

4) Mark all revoked serials as collected in the database:

# echo "UPDATE cp_cert_serial SET collected=true WHERE revoked=true;" | sudo -u postgres psql -d candlepin
UPDATE 134330

5) Start services and resume operations:

# satellite-maintain service start

Comment 1 Nikos Moumoulidis 2021-05-07 09:16:04 UTC
*** Bug 1806626 has been marked as a duplicate of this bug. ***

Comment 3 Samson Wick 2021-08-31 13:50:01 UTC
In case it assists in getting this bug assigned to a release, I'm working with a very large RH customer that has encountered this issue as well. In their case it seems to be caused by the way they're doing content management: regularly reassigning large numbers of hosts to different lifecycle environments (LCEs).

