2033174 – Large repo sync failed with "Katello::Errors::Pulp3Error: Response payload is not completed"


Bug 2033174 - Large repo sync failed with "Katello::Errors::Pulp3Error: Response payload is not completed"


Keywords:
Status:	CLOSED ERRATA
Alias:	None
Product:	Red Hat Satellite
Classification:	Red Hat
Component:	Pulp
Sub Component:
Version:	6.10.1
Hardware:	Unspecified
OS:	Unspecified
Priority:	high
Severity:	medium
Target Milestone:	6.11.0
Assignee:	satellite6-bugs
QA Contact:	Brian Herring
Docs Contact:
URL:
Whiteboard:
Depends On:
Blocks:

Reported:	2021-12-16 06:03 UTC by matt jia
Modified:	2023-12-20 00:42 UTC
CC List:	21 users
Fixed In Version:	pulpcore-3.16.8
Doc Type:	If docs needed, set a value
Doc Text:
Clone Of:
Clones:	2053723 2170945
Environment:
Last Closed:	2022-07-05 14:31:10 UTC
Target Upstream Version:
Embargoed:

Links
System	ID	Priority	Status	Summary	Last Updated
Github	pulp pulpcore issues 2576	None	closed	Downloader doesn't reset file on retries.	2022-05-11 13:26:39 UTC
Github	pulp pulpcore pull 1807	None	Merged	Retry downloads on checksum mismatch	2022-02-11 11:08:02 UTC
Github	pulp pulpcore pull 1851	None	Merged	Fix retry logic of partially downloaded files	2022-02-11 11:08:02 UTC
Github	pulp pulpcore pull 2122	None	Merged	Reset size and digests when retrying downloads	2022-02-11 11:08:02 UTC
Pulp Redmine	9667	None	None	None	2022-02-09 17:59:37 UTC
Red Hat Knowledge Base (Solution)	6596191	None	None	None	2022-01-31 10:52:52 UTC
Red Hat Product Errata	RHSA-2022:5498	None	None	None	2022-07-05 14:31:47 UTC

Internal Links: 2093028

Description matt jia 2021-12-16 06:03:32 UTC

Description of problem:

In some environments with a slow network or a slow system, syncing large repos (for example, RHEL7 Server) failed with "Katello::Errors::Pulp3Error: Response payload is not completed" and also caused the following task error:

pulpcore.tasking.pulpcore_worker:INFO: Task 7fc09c6a-d7a1-4dfd-a9ed-117f7eb207c5 failed (A file located at the url https://cdn.redhat.com/content/dist/rhel/server/7/7Server/x86_64/os/repodata/18f74a403ac596dbdf0aecb35a63e1cf43ad6b8a-filelists.xml.gz failed validation due to checksum.)


Version-Release number of selected component (if applicable):

6.10

How reproducible:

Easy


Steps to Reproduce:
1. Install 6.10 on a slow machine
2. Sync the RHEL7 Server repo

Actual results:

Sync failed with:


pulpcore.tasking.pulpcore_worker:INFO: Task 7fc09c6a-d7a1-4dfd-a9ed-117f7eb207c5 failed (A file located at the url https://cdn.redhat.com/content/dist/rhel/server/7/7Server/x86_64/os/repodata/18f74a403ac596dbdf0aecb35a63e1cf43ad6b8a-filelists.xml.gz failed validation due to checksum.)

Expected results:

Sync should complete successfully



Additional info:

Comment 1 matt jia 2021-12-16 06:04:30 UTC

Setting download_concurrency to 1 is a valid workaround, according to this upstream thread:

https://community.theforeman.org/t/katello-response-payload-is-not-completed/23918/2
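To illustrate what a download_concurrency of 1 means in practice, here is a minimal, generic sketch of capping parallel downloads with a semaphore. It is not Pulp or Katello code: `fetch`, `sync_all`, `peak_concurrency`, and the `pkg-N.rpm` names are hypothetical, and the sleep only stands in for network I/O.

```python
import asyncio


async def fetch(name, semaphore, state):
    """Pretend to download `name`; `state` records how many fetches overlap."""
    async with semaphore:  # at most `download_concurrency` holders at once
        state["active"] += 1
        state["peak"] = max(state["peak"], state["active"])
        await asyncio.sleep(0.01)  # stand-in for network I/O (assumption)
        state["active"] -= 1
    return name


async def sync_all(names, download_concurrency):
    """Fetch every name, never running more than `download_concurrency` at once."""
    semaphore = asyncio.Semaphore(download_concurrency)
    state = {"active": 0, "peak": 0}
    results = await asyncio.gather(*(fetch(n, semaphore, state) for n in names))
    return results, state["peak"]


def peak_concurrency(download_concurrency, count=5):
    """Return the highest number of simultaneous fetches observed."""
    names = [f"pkg-{i}.rpm" for i in range(count)]  # hypothetical file names
    _, peak = asyncio.run(sync_all(names, download_concurrency))
    return peak
```

With the limit set to 1 the fetches run strictly one after another, which trades sync speed for less pressure on a slow network or system.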

Comment 3 matt jia 2022-02-02 05:30:21 UTC

We seem to have a fix for this:

https://pulp.plan.io/issues/9667
https://github.com/pulp/pulpcore/commit/42edd37de23c338d987c48b774c24f0d855459a8

Could we please backport it to 6.10?

Thanks,

Matt
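The linked upstream changes are titled "Fix retry logic of partially downloaded files" and "Reset size and digests when retrying downloads". The sketch below shows the general idea those titles suggest, under stated assumptions: it is not the pulpcore implementation, and `flaky_source`, `download`, `TruncatedTransfer`, and the `reset_on_retry` flag are hypothetical names used only for illustration. If a retry keeps the bytes and digest state from a truncated attempt, the final checksum cannot match, which is consistent with the "failed validation due to checksum" error in the description.

```python
import hashlib


class TruncatedTransfer(Exception):
    """Raised by the fake source when a transfer ends early."""


def flaky_source(payload, fail_first):
    """Return a stream opener that truncates its first `fail_first` attempts."""
    attempts = {"n": 0}

    def open_stream():
        attempts["n"] += 1
        half = len(payload) // 2
        yield payload[:half]
        if attempts["n"] <= fail_first:
            raise TruncatedTransfer("Response payload is not completed")
        yield payload[half:]

    return open_stream


def download(open_stream, max_tries=5, reset_on_retry=True):
    """Download with retries; optionally discard partial state before each try."""
    hasher = hashlib.sha256()
    buffer = bytearray()
    for attempt in range(1, max_tries + 1):
        if reset_on_retry:
            # Start from a clean slate so partial data never leaks into a retry.
            hasher = hashlib.sha256()
            buffer = bytearray()
        try:
            for chunk in open_stream():
                buffer.extend(chunk)
                hasher.update(chunk)
            return bytes(buffer), hasher.hexdigest()
        except TruncatedTransfer:
            if attempt == max_tries:
                raise
```

Calling `download(flaky_source(data, 1))` yields the correct digest, while passing `reset_on_retry=False` leaves the first half of the payload duplicated in the buffer and produces a mismatching digest.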

Comment 17 Brian Herring 2022-06-02 19:15:48 UTC

Close Out:

* The performance of this feature is improved in Satellite 6.11 from 6.10
* A linked documentation / performance bug should highlight network and system expectations for repo syncing to function within reasonable expected parameters

Comment 22 errata-xmlrpc 2022-07-05 14:31:10 UTC

Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Moderate: Satellite 6.11 Release), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2022:5498

Comment 23 Ben 2023-03-08 16:05:15 UTC

This issue is still occurring in 6.12.2 when trying to do an initial sync of RHEL 8.6 BaseOS and AppStream repositories.

Mar  8 15:59:58 satellite1 pulpcore-worker-6[901775]: Giving up download_wrapper(...) after 5 tries (aiohttp.client_exceptions.ClientPayloadError: Response payload is not completed)
Mar  8 15:59:58 satellite1 pulpcore-worker-6[901775]: pulp [d9a81197-97f6-4cdc-8e47-1640adf60e1c]: backoff:ERROR: Giving up download_wrapper(...) after 5 tries (aiohttp.client_exceptions.ClientPayloadError: Response payload is not completed)

2023-03-08T15:59:59 [E|bac|d9a81197] Response payload is not completed (Katello::Errors::Pulp3Error)
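The "Giving up download_wrapper(...) after 5 tries" lines above indicate a bounded retry with backoff around the download. A minimal standard-library sketch of that pattern follows. It is an illustration only and not the code that produced the log: `retry_with_backoff`, `PayloadError`, `base_delay`, and the injectable `sleep` parameter are hypothetical.

```python
import time


class PayloadError(Exception):
    """Stand-in for a transfer that ended before the full body arrived."""


def retry_with_backoff(func, max_tries=5, base_delay=1.0, sleep=time.sleep):
    """Call `func` until it succeeds or `max_tries` attempts have failed.

    The wait doubles after each failure: base_delay, 2x, 4x, and so on.
    """
    for attempt in range(1, max_tries + 1):
        try:
            return func()
        except PayloadError as exc:
            if attempt == max_tries:
                # Out of attempts: surface the last error to the caller.
                raise RuntimeError(
                    f"Giving up after {max_tries} tries ({exc})"
                ) from exc
            sleep(base_delay * 2 ** (attempt - 1))
```

A persistent upstream problem, such as the CDN issue suspected later in this thread, exhausts every attempt, so retries alone cannot mask it.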

Comment 24 Daniel Alley 2023-03-08 20:56:13 UTC

@Ben Could you file a new BZ please with fresh details?  It may be a very similar issue but it is unlikely to have the exact same cause as the original.

Comment 25 Daniel Alley 2023-03-08 22:56:46 UTC

For what it's worth, I was just able to sync both repos without error.  It could be a very precise timing issue or maybe a CDN issue.  How reliably does it occur for you?

Comment 26 Ben 2023-03-09 09:09:57 UTC

Hi Daniel,

Amusingly, overnight, my Satellite has seemingly managed to sync both RHEL 8.6 repos without a whimper.  It was happening continually/repeatedly for most of yesterday afternoon from when I first started trying at 13:00 BST (GMT+1) until I gave up at 17:40 BST on 2023-03-08.  Coincidentally, all that morning yum updates for RHEL 7 were failing due to one or two packages not being downloadable (git and certain samba RPMs), and LEAPPs from RHEL 7 to RHEL 8 were failing with the target systems unable to get the dnf* packages from RHEL 8 repos (see Red Hat Support case #03456297).

So yes, I very much think it is/was a CDN issue.  I'm just having trouble getting word one from Red Hat about what was going on.

