Bug 1840805

Summary: [candlepin] Unable to import subscription manifest in large environment (60K+ consumers/hosts)
Product: Red Hat Satellite
Reporter: Mike McCune <mmccune>
Component: Candlepin
Assignee: satellite6-bugs <satellite6-bugs>
Status: CLOSED CURRENTRELEASE
QA Contact: Danny Synk <dsynk>
Severity: high
Priority: unspecified
Version: 6.7.0
CC: ahumbe, ehelms, hhudgeon, nmoumoul, pcreech, zhunting
Target Milestone: Unspecified
Keywords: PrioBumpField, Reopened
Target Release: Unused
Hardware: Unspecified
OS: Unspecified
Fixed In Version: candlepin-2.9.31-1, candlepin-3.1.24-1
Doc Type: If docs needed, set a value
Clone Of:
: 1840823 1841596 1841598 1919418
Last Closed: 2022-06-13 13:41:28 UTC
Type: Bug
Bug Depends On: 1840823, 1840850, 1841596, 1841598
Attachments:
Description Flags
Import, refresh, delete none

Description Mike McCune 2020-05-27 16:11:22 UTC
In one of our large customer environments we are completely unable to import new subscriptions due to an error during the transaction:

2020-05-22 11:56:13,039 [thread=http-bio-8443-exec-7] [req=eb941534-31bd-4116-a687-3263630c82f2, org=ExampleTech, csid=] INFO  org.candlepin.policy.js.pool.PoolRules - Checking if bonus pools need to be created for pool: Pool [id=null, type=NORMAL, product=ESA0001, productName=Red Hat Enterprise Linux, Premium (One Year, Enterprise Program), quantity=50000]
2020-05-22 11:56:13,110 [thread=http-bio-8443-exec-7] [req=eb941534-31bd-4116-a687-3263630c82f2, org=ExampleTech, csid=] INFO  org.candlepin.controller.CandlepinPoolManager - Attempting to delete 4 pools...
2020-05-22 11:56:13,110 [thread=http-bio-8443-exec-7] [req=eb941534-31bd-4116-a687-3263630c82f2, org=ExampleTech, csid=] INFO  org.candlepin.controller.CandlepinPoolManager - Fetching related pools and entitlements...
2020-05-22 11:56:13,807 [thread=http-bio-8443-exec-7] [req=eb941534-31bd-4116-a687-3263630c82f2, org=ExampleTech, csid=] INFO  org.candlepin.controller.CandlepinPoolManager - Locked 4 pools for deletion...
2020-05-22 11:57:42,097 [thread=http-bio-8443-exec-7] [req=eb941534-31bd-4116-a687-3263630c82f2, org=ExampleTech, csid=] INFO  org.candlepin.controller.CandlepinPoolManager - Revoking 56323 entitlements...
2020-05-22 11:58:28,327 [thread=http-bio-8443-exec-7] [req=eb941534-31bd-4116-a687-3263630c82f2, org=ExampleTech, csid=] WARN  org.hibernate.engine.jdbc.spi.SqlExceptionHelper - SQL Error: 0, SQLState: 08006
2020-05-22 11:58:28,327 [thread=http-bio-8443-exec-7] [req=eb941534-31bd-4116-a687-3263630c82f2, org=ExampleTech, csid=] ERROR org.hibernate.engine.jdbc.spi.SqlExceptionHelper - An I/O error occurred while sending to the backend.
2020-05-22 11:58:28,328 [thread=http-bio-8443-exec-7] [req=eb941534-31bd-4116-a687-3263630c82f2, org=ExampleTech, csid=] WARN  com.mchange.v2.c3p0.impl.NewPooledConnection - [c3p0] A PooledConnection that has already signalled a Connection error is still in use!
2020-05-22 11:58:28,328 [thread=http-bio-8443-exec-7] [req=eb941534-31bd-4116-a687-3263630c82f2, org=ExampleTech, csid=] WARN  com.mchange.v2.c3p0.impl.NewPooledConnection - [c3p0] Another error has occurred [ org.postgresql.util.PSQLException: This connection has been closed. ] which will not be reported to listeners!
org.postgresql.util.PSQLException: This connection has been closed.
        at org.postgresql.jdbc.PgConnection.checkClosed(PgConnection.java:767)
        at org.postgresql.jdbc.PgConnection.rollback(PgConnection.java:774)
        at com.mchange.v2.c3p0.impl.NewProxyConnection.rollback(NewProxyConnection.java:1033)
        at org.hibernate.resource.jdbc.internal.AbstractLogicalConnectionImplementor.rollback(AbstractLogicalConnectionImplementor.java:116)
        at org.hibernate.resource.transaction.backend.jdbc.internal.JdbcResourceLocalTransactionCoordinatorImpl$TransactionDriverControlImpl.rollback(JdbcResourceLocalTransactionCoordinatorImpl.java:294)
        at org.hibernate.engine.transaction.internal.TransactionImpl.rollback(TransactionImpl.java:145)
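
To put numbers on the log excerpt above, here is a minimal sketch that computes the gaps between its timestamps. The helper names to_ms and elapsed_ms are made up for this illustration and are not part of Candlepin. It assumes both timestamps fall on the same day.

```shell
# Convert an "HH:MM:SS,mmm" log timestamp to milliseconds since midnight.
# (to_ms and elapsed_ms are hypothetical helper names, not Candlepin tools.)
to_ms() {
    # Split the timestamp on ':' and ',' and combine the four fields.
    echo "$1" | awk -F'[:,]' '{ print (($1 * 60 + $2) * 60 + $3) * 1000 + $4 }'
}

# Print the elapsed milliseconds between two timestamps on the same day.
elapsed_ms() {
    echo $(( $(to_ms "$2") - $(to_ms "$1") ))
}

# "Locked 4 pools for deletion" -> "Revoking 56323 entitlements"
elapsed_ms 11:56:13,807 11:57:42,097   # prints 88290
# "Revoking 56323 entitlements" -> first SQL error (SQLState 08006)
elapsed_ms 11:57:42,097 11:58:28,327   # prints 46230
```

So in this excerpt roughly 88 seconds pass between locking the pools and starting the revocation, and the database connection fails about 46 seconds into the revocation.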

Comment 1 Mike McCune 2020-08-27 16:01:44 UTC
We shipped candlepin-2.9.28-1 in Satellite 6.7.2, which includes the Candlepin portion of this fix:

https://access.redhat.com/errata/RHBA-2020:3255

The Katello portion remains in:

https://bugzilla.redhat.com/show_bug.cgi?id=1840784

Comment 3 Mike McCune 2020-10-05 16:33:40 UTC
You hit:

"""
Error:

RestClient::Exceptions::ReadTimeout

Katello::Resources::Candlepin::Owner: Timed out reading data from server (POST /candlepin/owners/Default_Organization/imports) 
"""

You can see the timeout here:

https://dhcp-3-175.vms.sat.rdu2.redhat.com/foreman_tasks/dynflow/94bc4def-d95d-416b-abe0-328d07436c29

Real time: 3601.18s
Execution time (excluding suspended state): 3600.15s


The timeout for Katello API calls to Candlepin is 1 hour, as the installer shows:


# satellite-installer --full-help |grep katello-rest-client-timeout
    --katello-rest-client-timeout  Timeout for Katello rest API (current: 3600)


Please rerun the installer with the timeout raised tenfold (add a 0, from 3600 to 36000):

# satellite-installer --katello-rest-client-timeout 36000

Then repeat the refresh and the scenario above.
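
The check and the tenfold increase above could be scripted roughly as follows. This is only a sketch: current_timeout is a hypothetical helper name, and the sample line is copied from the installer output quoted in this comment rather than read from a live system.

```shell
# Extract the "(current: N)" value from the installer help line for
# --katello-rest-client-timeout. Reads from stdin.
# (current_timeout is a hypothetical helper name.)
current_timeout() {
    sed -n 's/.*--katello-rest-client-timeout.*(current: \([0-9][0-9]*\)).*/\1/p'
}

# Sample line taken from the help output quoted above.
line='    --katello-rest-client-timeout  Timeout for Katello rest API (current: 3600)'

cur=$(printf '%s\n' "$line" | current_timeout)
echo "current=$cur proposed=$((cur * 10))"   # prints current=3600 proposed=36000
```

On a real Satellite the input would come from `satellite-installer --full-help | current_timeout`, and the proposed value would then be passed to `satellite-installer --katello-rest-client-timeout`.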

Comment 4 Lai 2020-10-05 17:39:34 UTC
Steps to retest

1. On a Satellite with 60k+ hosts, go to Content -> Subscriptions
2. Perform the following tasks:
   a. Refresh manifest (if there is one)
   b. Delete existing manifest
   c. Import Manifest

Expected result:
Refreshing, deleting, and reimporting a manifest should all be successful and shouldn't take a long time.

Actual result:
Refreshing, deleting, and reimporting a manifest are all successful.  It took about 10 - 30 seconds to complete.

candlepin-3.1.21-1.el7sat.noarch

Verified on 6.8.0_017
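
For repeated retests, the same three steps can also be driven from the hammer CLI instead of the web UI. The sketch below is an assumption about how one might script it, not the procedure QA used: the organization name, the manifest path, and the run/retest_manifest helpers are placeholders, and it defaults to a dry run that only prints the commands.

```shell
ORG="${ORG:-Default Organization}"     # assumed organization name
MANIFEST="${MANIFEST:-manifest.zip}"   # hypothetical manifest path
DRY_RUN="${DRY_RUN:-1}"                # 1 = only print the commands

# Print the command in dry-run mode, otherwise execute it.
# Note: the dry-run output does not preserve shell quoting.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Refresh, delete, then import the manifest, in the order of the steps above.
retest_manifest() {
    run hammer subscription refresh-manifest --organization "$ORG"
    run hammer subscription delete-manifest --organization "$ORG"
    run hammer subscription upload --organization "$ORG" --file "$MANIFEST"
}
```

Calling `retest_manifest` prints the three commands. Setting `DRY_RUN=0` before calling it would execute them against the Satellite, so that should only be done on a test system.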

Comment 11 Lai 2020-11-16 14:37:56 UTC
Created attachment 1729776 [details]
Import, refresh, delete

Attachment for import, refresh, delete

Comment 12 Lai 2020-11-16 14:39:39 UTC
Steps to retest

1. On a Satellite with 60k+ hosts, go to Content -> Subscriptions
2. Perform the following tasks:
   a. Refresh manifest (if there is one)
   b. Delete existing manifest
   c. Import Manifest

Expected result:
Refreshing, deleting, and reimporting a manifest should all be successful and shouldn't take a long time.

Actual result:
Refreshing, deleting, and reimporting a manifest are all successful.  It took about 30 seconds to complete.

Candlepin version check: candlepin-3.1.22-1.el7sat.noarch

Verified in 6.8.1_02

Comment 14 Lai 2020-11-17 21:07:51 UTC
Reassessing this. The initial setup only had regular hosts that were never registered. After correctly registering all 60k+ hosts and running delete and refresh, there was a warning:

Katello::Errors::CandlepinError: Runtime Error This connection has been closed. at org.postgresql.jdbc.PgConnection.checkClosed:767

Failing QA

Candlepin version - candlepin-3.1.23-1.el7sat.noarch

On 6.8.1_snap 3

Comment 18 Brad Buckingham 2021-06-04 19:27:58 UTC
The fix for this bugzilla was delivered in Satellite 6.8.4 with bug 1919418.  The solution is also in 6.9.0 and beyond.

Based upon this, I am going to close this bugzilla.

Comment 19 Brad Buckingham 2021-06-04 19:33:06 UTC
Moving this back to MODIFIED.  This issue is still applicable for Satellite 6.7.z.  While we have no 6.7.z zstreams planned, it could be a candidate if we have one before Satellite 6.7.z goes EOL.