Bug 1970246 - Publishing a repository in Capsule can take more than an hour to finish if many cloned repositories have been synced to the Capsule
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: Red Hat Satellite
Classification: Red Hat
Component: Pulp
Version: 6.9.0
Hardware: Unspecified
OS: Unspecified
Priority: high
Severity: medium
Target Milestone: Unspecified
Assignee: satellite6-bugs
QA Contact: Vladimír Sedmík
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2021-06-10 07:11 UTC by Hao Chang Yu
Modified: 2024-12-20 20:13 UTC (History)

Fixed In Version: pulp-rpm-2.21.5.2-1
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2024-05-29 17:24:55 UTC
Target Upstream Version:
Embargoed:




Links
System ID Private Priority Status Summary Last Updated
Github pulp pulp_rpm pull 2017 0 None Merged Fix slow publish when errata are associated to many repos 2022-11-01 07:20:29 UTC
Pulp Redmine 8890 0 Normal MODIFIED Publishing a repository can take longer time to finish if many same errata are in many synced repositories 2021-06-21 17:23:02 UTC
Red Hat Issue Tracker SAT-25390 0 None None None 2024-05-29 17:25:52 UTC

Description Hao Chang Yu 2021-06-10 07:11:30 UTC
Description of problem:
A Capsule can take more than an hour to publish a repository when a large number of cloned repositories have been synced from the Satellite (many combinations of content views, composite content views, and lifecycle environments) and the same errata exist in all the cloned repositories, for example across RHEL 7.x, RHEL 7 EUS, and different architectures.

The more cloned repositories the Capsule has synced, the more "erratum pkglist" entries are created in MongoDB, which degrades performance.

For example:
> db.erratum_pkglists.find({errata_id: "RHSA-2018:2557"}).count()
387
> db.erratum_pkglists.find({errata_id: "RHBA-2019:2180"}).count()
217

When publishing errata, Pulp uses the above query to get all package lists of an erratum. Processing takes a long time when the query returns many package lists and each package list consists of many packages.
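
To illustrate why the number of package lists matters, here is a minimal sketch in plain Python. It is not Pulp code. It assumes each "erratum pkglist" document has been reduced to a list of package dicts with a "filename" key; the function name summarize_pkglists and the sample data are hypothetical. It contrasts the number of package entries that have to be walked with the number of distinct packages among them.

```python
from typing import Dict, List, Tuple

Package = Dict[str, str]


def summarize_pkglists(pkglists: List[List[Package]]) -> Tuple[int, int]:
    """Return (entries_walked, distinct_packages) for one erratum.

    Hypothetical helper: models the cost of iterating every package
    of every package list returned for a single errata_id.
    """
    entries_walked = 0
    seen = set()
    for pkglist in pkglists:
        for pkg in pkglist:
            entries_walked += 1
            seen.add(pkg["filename"])
    return entries_walked, len(seen)


# Hypothetical erratum with 3 packages whose package list exists 387 times,
# matching the count shown for RHSA-2018:2557 above.
one_list = [{"filename": f"pkg-{i}-1.0-1.el7.x86_64.rpm"} for i in range(3)]
walked, distinct = summarize_pkglists([one_list] * 387)
print(walked, distinct)
```

In this toy case the same three packages are walked 387 times, so the work grows with the number of cloned repositories even though the content of the erratum does not change.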

As shown below, the "Publishing Errata" step is very slow. About 51 minutes after the task's start_time, only 2073 of 4789 errata have been processed, so the step will take more than an hour to finish.
...
            {
                "description": "Publishing Errata",
                "details": "",
                "error_details": [],
                "items_total": 4789,
                "num_failures": 0,
                "num_processed": 2073, 
                "num_success": 2073,
                "state": "IN_PROGRESS",
                "step_id": "2f09190d-013a-4300-9445-eccb52ad94fe",
                "step_type": "errata"
            },
...
    "start_time": "2021-06-09T12:41:13Z",

# date
Wed Jun  9 13:32:41 UTC 2021
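
The "more than an hour" estimate can be reproduced with a simple linear extrapolation from the numbers above. This is a rough sketch only: it assumes the errata step began at the reported "start_time" and that the processing rate stays constant, neither of which the task report confirms. The function name estimate_remaining_seconds is hypothetical.

```python
from datetime import datetime, timezone


def estimate_remaining_seconds(start, now, processed, total):
    """Linear extrapolation; assumes a constant processing rate."""
    elapsed = (now - start).total_seconds()
    rate = processed / elapsed  # errata per second
    return (total - processed) / rate


# Values taken from the task report and the `date` output above.
start = datetime(2021, 6, 9, 12, 41, 13, tzinfo=timezone.utc)  # "start_time"
now = datetime(2021, 6, 9, 13, 32, 41, tzinfo=timezone.utc)    # `date`
remaining = estimate_remaining_seconds(start, now, processed=2073, total=4789)

print(f"elapsed: {(now - start).total_seconds() / 60:.1f} min")
print(f"estimated remaining: {remaining / 60:.1f} min")
```

Under these assumptions the step still needs roughly 67 more minutes, for a total of close to two hours for the errata step alone.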

Comment 8 pulp-infra@redhat.com 2021-06-21 17:23:03 UTC
The Pulp upstream bug status is at MODIFIED. Updating the external tracker on this bug.

Comment 9 pulp-infra@redhat.com 2021-06-21 17:23:04 UTC
The Pulp upstream bug priority is at Normal. Updating the external tracker on this bug.

Comment 10 pulp-infra@redhat.com 2021-06-21 18:08:08 UTC
All upstream Pulp bugs are at MODIFIED+. Moving this bug to POST.

Comment 15 Lai 2021-11-01 13:02:30 UTC
Steps to retest

1. Enable and sync the following repositories on the Satellite.

Red Hat Enterprise Linux 7 Server Debug RPMs x86_64 7.8
Red Hat Enterprise Linux 7 Server Debug RPMs x86_64 7.9
Red Hat Enterprise Linux 7 Server Debug RPMs x86_64 7Server
Red Hat Enterprise Linux 7 Server - Extended Update Support RPMs x86_64 7.6
Red Hat Enterprise Linux 7 Server - Extended Update Support RPMs x86_64 7.7
Red Hat Enterprise Linux 7 Server - Optional Debug RPMs x86_64 7.8
Red Hat Enterprise Linux 7 Server - Optional Debug RPMs x86_64 7.9
Red Hat Enterprise Linux 7 Server - Optional Debug RPMs x86_64 7Server
Red Hat Enterprise Linux 7 Server - Optional RPMs x86_64 7.8
Red Hat Enterprise Linux 7 Server - Optional RPMs x86_64 7.9
Red Hat Enterprise Linux 7 Server - Optional RPMs x86_64 7Server
Red Hat Enterprise Linux 7 Server RPMs x86_64 7.8
Red Hat Enterprise Linux 7 Server RPMs x86_64 7.9
Red Hat Enterprise Linux 7 Server RPMs x86_64 7Server
Red Hat Enterprise Linux 7 Server - Extended Update Support Debug RPMs x86_64 7.6
Red Hat Enterprise Linux 7 Server - Extended Update Support Debug RPMs x86_64 7.7
Red Hat Enterprise Linux 7 Server - Extended Update Support - Optional Debug RPMs x86_64 7.6
Red Hat Enterprise Linux 7 Server - Extended Update Support - Optional Debug RPMs x86_64 7.7
Red Hat Enterprise Linux 7 Server - Extended Update Support - Optional RPMs x86_64 7.6
Red Hat Enterprise Linux 7 Server - Extended Update Support - Optional RPMs x86_64 7.7

2. Create a content view and add all 20 of the above repositories to it.
3. Create 9 more copies of the content view (Content view page -> Select Action -> Copy Content View).
4. Publish and promote the content views to 1 or more lifecycle environments. Make sure the Capsule is syncing the environments you promoted to.
5. Wait until all Capsule syncs are done (they should trigger automatically; if not, trigger a Capsule sync manually).
6. Trigger a full Capsule sync and wait until it finishes successfully: Infrastructure -> Capsules -> <capsule name> -> Synchronize -> Complete Sync
7. Note the time the sync takes.

Expected result:
Complete capsule sync should be successful

Actual result:
Error message (repeated 9 times in the output): "RPM1015: Malformed repository: "primary" metadata is not found in repomd.xml"

Failed on 6.7.1 snap 1

You can check the vm: https://dhcp-3-207.vms.sat.rdu2.redhat.com/

Comment 16 pulp-infra@redhat.com 2021-11-01 13:17:23 UTC
Requesting needsinfo from upstream developer ttereshc, ggainey because the 'FailedQA' flag is set.

Comment 17 Pavel Moravec 2021-11-01 13:53:34 UTC
Checking the tester Satellite, *all* RHEL7 repositories lack primary metadata, e.g.:

(the first repository that the Capsule sync complains about is Red_Hat_Enterprise_Linux_7_Server_RPMs_x86_64_7Server / 1-v_10-dev-655c19d2-1a6a-471a-b624-15def02f0063):

# cat /var/lib/pulp/published/yum/master/yum_distributor/1-v_10-dev-655c19d2-1a6a-471a-b624-15def02f0063/1635542838.73/repodata/repomd.xml 
<?xml version='1.0' encoding='UTF-8'?>
<repomd xmlns="http://linux.duke.edu/metadata/repo" xmlns:rpm="http://linux.duke.edu/metadata/rpm"><revision>1635542844</revision>
<data type="filelists"><location href="repodata/f1ce899b6d83631af877d341f679cf87d29e7045-filelists.xml.gz"/><timestamp>1635542844</timestamp><size>49086817</size><checksum type="sha1">f1ce899b6d83631af877d341f679cf87d29e7045</checksum><open-size>684949560</open-size><open-checksum type="sha1">4fc5c734093fb14e02cfdcfc9191355bdf5302f7</open-checksum></data>
<data type="updateinfo"><location href="repodata/eb516980ab2908253ffb14e5f0d6a17ef5e9a877-updateinfo.xml.gz"/><timestamp>1635542844</timestamp><size>4276993</size><checksum type="sha1">eb516980ab2908253ffb14e5f0d6a17ef5e9a877</checksum><open-size>23742830</open-size><open-checksum type="sha1">ea5711920aed50fd7628411cce5d72240d800cbc</open-checksum></data>
<data type="group"><location href="repodata/fc25ff618b8438b53e302f944a2848e4493f6a9a-comps.xml"/><timestamp>1635542844</timestamp><size>645871</size><checksum type="sha1">fc25ff618b8438b53e302f944a2848e4493f6a9a</checksum></data>
<data type="productid"><location href="repodata/c76c2299-12f3-4f9c-b7bd-03bacee2c363"/><timestamp>1635542844</timestamp><size>2159</size><checksum type="sha1">c1e113a23f2e5caf3402ef1b0c4e5b270276afdf</checksum></data>
</repomd>
#

But *every* copy of that RHEL7 7Server repo lacks the primary repodata:

# grep primary /var/lib/pulp/published/yum/master/yum_distributor/*655c19d2-1a6a-471a-b624-15def02f0063/*/repodata/repomd.xml
#

I am checking further (but does the above provide a hint to somebody?)
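
The same check can be scripted instead of grepping. The following is a minimal sketch using only the Python standard library; the helper name metadata_types is hypothetical, and SAMPLE is a trimmed-down stand-in for the repomd.xml shown above (locations, checksums, and sizes omitted).

```python
import xml.etree.ElementTree as ET

# Namespace used by the <repomd> document shown above.
REPO_NS = "{http://linux.duke.edu/metadata/repo}"


def metadata_types(repomd_xml: str) -> list:
    """Return the `type` attribute of every <data> element in repomd.xml."""
    root = ET.fromstring(repomd_xml)
    return [data.get("type") for data in root.findall(f"{REPO_NS}data")]


# Trimmed-down stand-in for the published repomd.xml quoted above.
SAMPLE = """<repomd xmlns="http://linux.duke.edu/metadata/repo">
<revision>1635542844</revision>
<data type="filelists"/>
<data type="updateinfo"/>
<data type="group"/>
<data type="productid"/>
</repomd>"""

types = metadata_types(SAMPLE)
print(types)
print("primary present:", "primary" in types)
```

For a real check, the file content would be read from the published repodata directory and passed to the helper in place of SAMPLE.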

Comment 18 Pavel Moravec 2021-11-01 15:23:49 UTC
It *sounds* like some OOM or full disk issue caused this. Some facts:

1) Just the RHEL7 7Server repo (and *all* its clones) was affected.
2) Re-publishing the repo (on both patched and unpatched pulp) does generate proper metadata.
3) The root repo 655c19d2-1a6a-471a-b624-15def02f0063 was published *twice*, where:
  - the first publish (task_id 011e06ec-7bd9-4672-b87a-4d400b7b825b) was *incomplete*, and _this_ repodata seems to have persisted and been used for the cloned repos
  - the second publish (task_id b3afceab-a141-4f59-af0f-88dfd32d192f) was complete, but apparently its repodata was not used (which is strange..)
4) foreman/dynflow tasks are aware of the second publish task only; where/why the first one was triggered is a mystery (deleted foreman task?)

Anyway, I understand the same happened on 2 VMs, each of which was hit by OOM or a full disk during repo sync/publish; that sounds like the root cause. Repo publish itself seems to work properly.

Comment 19 Grant Gainey 2021-11-01 15:34:36 UTC
Came here to say that it looks like "something catastrophic" happened (maybe on the 27th, based on some metadata timestamps). journalctl doesn't go back far enough to give me data, alas. Also, I note that this machine has no swap, which has caused similar problems in the past.

Comment 20 pulp-infra@redhat.com 2021-11-01 16:12:28 UTC
Requesting needsinfo from upstream developer ggainey because the 'FailedQA' flag is set.

Comment 33 pulp-infra@redhat.com 2021-11-02 13:16:38 UTC
Requesting needsinfo from upstream developer ttereshc because the 'FailedQA' flag is set.

Comment 44 Brad Buckingham 2024-05-29 17:24:55 UTC
Closing as this was VERIFIED on Satellite 6.9.

