1928161 – Large CRL file operation causes OOM error in Candlepin

Summary: Large CRL file operation causes OOM error in Candlepin

Keywords:
Status:	CLOSED CURRENTRELEASE
Alias:	None
Product:	Candlepin
Classification:	Community
Component:	candlepin
Sub Component:
Version:	3.1
Hardware:	Unspecified
OS:	Unspecified
Priority:	unspecified
Severity:	high
Target Milestone:	---
Target Release:	---
Assignee:	candlepin-bugs
QA Contact:
Docs Contact:
URL:
Whiteboard:
Duplicates (1):	1806626
Depends On:
Blocks:

Reported:	2021-02-12 14:16 UTC by Rehana
Modified:	2021-09-24 11:39 UTC
CC List:	7 users
Fixed In Version:	candlepin-4.1.6-1
Clone Of:	1927532
Environment:
Last Closed:	2021-09-24 11:39:26 UTC
Embargoed:

Links
System	ID	Private	Priority	Status	Summary	Last Updated
Github	candlepin candlepin pull 3078	0	None	open	1928161: Large CRL file operation causes OOM error in Candlepin	2021-07-19 07:38:55 UTC

Description Rehana 2021-02-12 14:16:53 UTC

+++ This bug was initially created as a clone of Bug #1927532 +++

Large customer environments occasionally generate a very large CRL file after a big subscription update or refresh. The CertificateRevocationListTask job then fails with:

2021-02-08 12:01:46,929 [thread=QuartzScheduler_Worker-6] [job=CertificateRevocationListTask-911d9c76-5768-4b10-b827-16dfac3c8b78, org=, csid=] ERROR org.quartz.core.JobRunShell - Job cron group.CertificateRevocationListTask-911d9c76-5768-4b10-b827-16dfac3c8b78 threw an unhandled Exception:
java.lang.OutOfMemoryError: input is too large to fit in a byte array
        at com.google.common.io.ByteStreams.toByteArrayInternal(ByteStreams.java:195)

The customer in question had a 2.8 GB CRL at:

/var/lib/candlepin/candlepin-crl.crl

This exceeded the 1.99 GB limit for reading the file into a single byte array (see the ByteStreams.toByteArrayInternal frame in the trace above), so the customer has to take manual steps to get past the CRL collection process.
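
A minimal sketch of a pre-check for this condition, not part of Candlepin or Satellite. It assumes the relevant threshold is 2^31 - 1 bytes, the largest size a Java array can have; the exact limit applied by the library may be slightly lower. The function name crl_too_large and the LIMIT and CRL variables are hypothetical.

```shell
#!/bin/sh
# Hypothetical pre-check: is the CRL too large to fit in one Java byte array?
# LIMIT is an assumption (2^31 - 1 bytes); override it if the real limit differs.
LIMIT="${LIMIT:-2147483647}"
CRL="${CRL:-/var/lib/candlepin/candlepin-crl.crl}"

# crl_too_large FILE [LIMIT_BYTES]
# Returns 0 if FILE is larger than the limit, 1 if it is not,
# and 2 if FILE does not exist or is not a regular file.
crl_too_large() {
    file=$1
    limit=${2:-$LIMIT}
    [ -f "$file" ] || return 2
    # wc -c reports the size in bytes; strip any padding some platforms add
    size=$(wc -c < "$file" | tr -d '[:space:]')
    [ "$size" -gt "$limit" ]
}

if crl_too_large "$CRL"; then
    echo "WARNING: $CRL is larger than $LIMIT bytes"
fi
```

Run periodically, a check like this would flag the problem before the CertificateRevocationListTask job hits the OutOfMemoryError.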

WORKAROUND:

1) stop services:

# satellite-maintain service stop

2) start Postgres:

# systemctl start postgresql

3) move CRL out of the way:

# mv /var/lib/candlepin/candlepin-crl.crl /var/lib/candlepin/candlepin-crl.BAK

4) update the database to mark all revoked serials as collected:

# echo "UPDATE cp_cert_serial SET collected=true WHERE revoked=true;" | sudo -u postgres psql -d candlepin
UPDATE 134330

5) start services and resume operations

# satellite-maintain service start
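
The workaround steps above can be sketched as a single script. This is an illustration only: the run helper, the crl_workaround function, and the DRY_RUN and CRL variables are hypothetical, and the commands themselves are the ones listed in the steps. DRY_RUN defaults to 1, so the script only prints what it would do.

```shell
#!/bin/sh
# Sketch of the manual workaround as one script.
# DRY_RUN=1 (the default) only prints each command; run as root with
# DRY_RUN=0 to actually execute them.
set -eu

CRL="${CRL:-/var/lib/candlepin/candlepin-crl.crl}"
DRY_RUN="${DRY_RUN:-1}"

# Print the command in dry-run mode, execute it otherwise.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        printf '+ %s\n' "$*"
    else
        "$@"
    fi
}

crl_workaround() {
    # 1) stop services
    run satellite-maintain service stop
    # 2) start Postgres on its own
    run systemctl start postgresql
    # 3) move the oversized CRL out of the way (same .BAK name as in step 3)
    run mv "$CRL" "${CRL%.crl}.BAK"
    # 4) mark all revoked serials as collected
    run sudo -u postgres psql -d candlepin \
        -c "UPDATE cp_cert_serial SET collected=true WHERE revoked=true;"
    # 5) start services again
    run satellite-maintain service start
}

crl_workaround
```

Because of set -e, the script stops at the first failing step when run with DRY_RUN=0, which leaves the services stopped until the failure is investigated.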

Comment 1 Nikos Moumoulidis 2021-05-07 09:16:04 UTC

*** Bug 1806626 has been marked as a duplicate of this bug. ***

Comment 3 Samson Wick 2021-08-31 13:50:01 UTC

In case it helps get this bug assigned to a release: I'm working with a very large Red Hat customer that has hit this issue as well. In their case it seems to be caused by how they do content management: they regularly reassign large numbers of hosts to different lifecycle environments (LCEs).
