Bug 2245930

Summary: Capsule slowdown from 10 minutes to 2 hours
Product: Red Hat Satellite
Reporter: Ian Ballou <iballou>
Component: Pulp
Assignee: satellite6-bugs <satellite6-bugs>
Status: CLOSED ERRATA
QA Contact: Vladimír Sedmík <vsedmik>
Severity: high
Priority: unspecified
Version: 6.13.5
CC: dalley, dkliban, ggainey, momran, osousa, rchan, vsedmik
Target Milestone: 6.14.0
Keywords: Regression, Triaged
Target Release: Unused
Hardware: Unspecified
OS: Unspecified
Fixed In Version: python-pulpcore-3.22.15-2.el8pc
Clone Of:
: 2246550
Last Closed: 2023-11-08 14:20:21 UTC
Type: Bug

Description Ian Ballou 2023-10-24 15:39:18 UTC
Description of problem:

An upstream user reported an issue about slow capsule syncs (2 hours vs 10 minutes) on our upstream forums: https://community.theforeman.org/t/slow-content-proxy-sync/35162

ATIX also reported seeing the issue on "two internal systems": https://community.theforeman.org/t/slow-content-proxy-sync/35162/27

The bug has yet to be reproduced in-house; however, it was discovered that reverting the following Pulpcore patch fixes the issue: https://github.com/pulp/pulpcore/pull/4275

The issue has been identified with some certainty to be caused by differences between PostgreSQL 12 and 13. Pulp is developed on 13+, whereas Satellite uses 12. On PostgreSQL 13, the patch speeds up repository syncing. However, on PostgreSQL 12, the patch can slow syncing down. Based on this comment (https://community.theforeman.org/t/slow-content-proxy-sync/35162/31), there is a change in PostgreSQL 13 that lines up well with the slowdown issue.
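
For triage, a minimal sketch of the version check implied above. It assumes the value comes from PostgreSQL's integer `server_version_num` setting (for example 120016 for a 12.16 server); the function names are hypothetical, and the "older than 13" criterion is only the working theory from this report, not a confirmed rule.

```python
def pg_major(server_version_num: int) -> int:
    """Major version from PostgreSQL's integer server_version_num (valid for 10 and later)."""
    return server_version_num // 10000


def may_hit_slowdown(server_version_num: int) -> bool:
    # Assumption taken from this report: the patch only slows syncs
    # on PostgreSQL servers older than 13.
    return pg_major(server_version_num) < 13


print(may_hit_slowdown(120016))  # a 12.x server
print(may_hit_slowdown(130004))  # a 13.x server
```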

The current recommendation is to revert the patch on Pulpcore 3.22 and 3.21, and create a Katello-only patch for Pulpcore 3.28.

Version-Release number of selected component (if applicable):

6.13.5 and up

How reproducible:

Not reproduced by anyone outside of the upstream thread yet

Steps to Reproduce:

No info on reproducing the issue yet.

Actual results:
Hours-longer capsule sync


Expected results:
Regular capsule sync

Additional info:
There are still unknowns about the issue. The only thing verified by two users is that reverting the Pulpcore patch above fixes the slowness.

Comment 2 Ian Ballou 2023-10-25 18:08:45 UTC
Reproducing details:

Repositories: RHEL 8 and RHEL 9 appstream and baseos (so 4), Satellite 6 Client, Satellite 6.13 maintenance, Satellite 6.13 RPMs, and the upstream Foreman repository. That makes 8 repositories total.

CVs: I have a "RHEL 8" content view with RHEL 8 appstream + baseos and the 3 Satellite repositories. The RHEL 8 CV has 2 versions, with 4 LCEs promoted between the 2 of them. I also have a "RHEL 9" content view with RHEL 9 baseos and appstream. The RHEL 9 CV also has 2 versions, with 4 LCEs promoted between the 2 of them.

With all of this combined, it ends up being 36 repositories synced to a capsule (assuming the capsule is using every LCE).
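
One breakdown that reproduces the 36 figure, as a rough sketch. It assumes the 8 base repositories are counted once and each content view contributes its repositories once per LCE; that split is an inference from the numbers in this comment, not something stated here, and the variable names are made up for illustration.

```python
# Assumption: the 8 base repositories are synced once on their own.
base_repos = 8

# Assumption: each content view is present in 4 LCEs.
lces_per_cv = 4

# Repositories per content view, as described in this comment.
cv_repos = {
    "RHEL 8": 5,  # appstream + baseos + 3 Satellite repositories
    "RHEL 9": 2,  # baseos + appstream
}

total = base_repos + sum(lces_per_cv * count for count in cv_repos.values())
print(total)  # 36
```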

I'd bet that you don't even need 36 repos synced to a capsule at the same time to see the problem. Maybe even 10-15.

With 36 repos synced to capsule, I saw 24 minutes without the bug and 1 hour 51 minutes with the bug.

You can check the number of repositories synced by running `pulp --profile proxy rpm repository list --limit 1` and looking at the count.
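
If the count needs to be picked up by a script, something like the following might work. It assumes the CLI prints a notice of the form "Not all 36 entries were shown." (the wording seen in comment 5); capturing the CLI output is left out, and `total_from_notice` is a hypothetical helper name.

```python
import re
from typing import Optional


def total_from_notice(cli_output: str) -> Optional[int]:
    """Extract the total from a 'Not all N entries were shown.' notice, if present."""
    match = re.search(r"Not all (\d+) entries were shown", cli_output)
    return int(match.group(1)) if match else None


print(total_from_notice("Not all 36 entries were shown."))  # 36
```

When the notice is absent the helper returns `None`, since the listing may then already show every repository.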

Comment 3 Daniel Alley 2023-10-26 01:33:27 UTC
Here's the patch to apply

https://github.com/pulp/pulpcore/pull/4615/files

Comment 5 Vladimír Sedmík 2023-10-26 20:16:09 UTC
Verified on 6.14.0 snap 22 with python39-pulpcore-3.22.15-2.el8pc.noarch

Using the setup from comment#2 on two HW-identical boxes, the capsule sync time improved by a factor of 6.54 (3845 seconds in snap 21 vs. 588 seconds in snap 22).
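
For reference, the ratio can be rechecked with a few lines. This is only arithmetic on the timings quoted in this bug; the `speedup` helper is a hypothetical name.

```python
def speedup(before_seconds: float, after_seconds: float) -> float:
    """How many times faster the second run was than the first."""
    return before_seconds / after_seconds


# Timings from this comment: snap 21 vs. snap 22.
print(round(speedup(3845, 588), 2))  # 6.54

# Timings from comment#2: 1 hour 51 minutes with the bug vs. 24 minutes without.
print(speedup(111 * 60, 24 * 60))
```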

Repos count as well as artifacts count and size on both capsules were exactly the same:

[root@satellite ~]# pulp --profile capsule rpm repository list
Not all 36 entries were shown.
...
[root@capsule ~]# ll -R /var/lib/pulp/media/artifact/ | wc -l
342
[root@capsule ~]# du -s /var/lib/pulp/media/artifact/
454336	/var/lib/pulp/media/artifact/

Comment 7 errata-xmlrpc 2023-11-08 14:20:21 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Important: Satellite 6.14 security and bug fix update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2023:6818