Bug 2033174 - Large repo sync failed with "Katello::Errors::Pulp3Error: Response payload is not completed"
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Satellite
Classification: Red Hat
Component: Pulp
Version: 6.10.1
Hardware: Unspecified
OS: Unspecified
Priority: high
Severity: medium
Target Milestone: 6.11.0
Assignee: satellite6-bugs
QA Contact: Brian Herring
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2021-12-16 06:03 UTC by matt jia
Modified: 2023-12-20 00:42 UTC (History)

Fixed In Version: pulpcore-3.16.8
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Clones: 2053723 2170945
Environment:
Last Closed: 2022-07-05 14:31:10 UTC
Target Upstream Version:
Embargoed:


Links
System | ID | Private | Priority | Status | Summary | Last Updated
Github pulp pulpcore issues | 2576 | 0 | None | closed | Downloader doesn't reset file on retries. | 2022-05-11 13:26:39 UTC
Github pulp pulpcore pull | 1807 | 0 | None | Merged | Retry downloads on checksum mismatch | 2022-02-11 11:08:02 UTC
Github pulp pulpcore pull | 1851 | 0 | None | Merged | Fix retry logic of partially downloaded files | 2022-02-11 11:08:02 UTC
Github pulp pulpcore pull | 2122 | 0 | None | Merged | Reset size and digests when retrying downloads | 2022-02-11 11:08:02 UTC
Pulp Redmine | 9667 | 0 | None | None | None | 2022-02-09 17:59:37 UTC
Red Hat Knowledge Base (Solution) | 6596191 | 0 | None | None | None | 2022-01-31 10:52:52 UTC
Red Hat Product Errata | RHSA-2022:5498 | 0 | None | None | None | 2022-07-05 14:31:47 UTC

Internal Links: 2093028

Description matt jia 2021-12-16 06:03:32 UTC
Description of problem:

In some environments with a slow network or a slow system, syncing large repos (for example, RHEL 7 Server) fails with "Katello::Errors::Pulp3Error: Response payload is not completed" and also causes the following task error:

pulpcore.tasking.pulpcore_worker:INFO: Task 7fc09c6a-d7a1-4dfd-a9ed-117f7eb207c5 failed (A file located at the url https://cdn.redhat.com/content/dist/rhel/server/7/7Server/x86_64/os/repodata/18f74a403ac596dbdf0aecb35a63e1cf43ad6b8a-filelists.xml.gz failed validation due to checksum.)


Version-Release number of selected component (if applicable):

6.10

How reproducible:

Easy


Steps to Reproduce:
1. Install Satellite 6.10 on a slow machine.
2. Sync the RHEL 7 Server repo.

Actual results:

Sync failed with:


pulpcore.tasking.pulpcore_worker:INFO: Task 7fc09c6a-d7a1-4dfd-a9ed-117f7eb207c5 failed (A file located at the url https://cdn.redhat.com/content/dist/rhel/server/7/7Server/x86_64/os/repodata/18f74a403ac596dbdf0aecb35a63e1cf43ad6b8a-filelists.xml.gz failed validation due to checksum.)

Expected results:

The sync should complete successfully.



Additional info:

Comment 1 matt jia 2021-12-16 06:04:30 UTC
Setting download_concurrency to 1 is a valid workaround, according to this upstream thread:

https://community.theforeman.org/t/katello-response-payload-is-not-completed/23918/2
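
To illustrate what a concurrency limit of 1 means in practice, here is a minimal sketch using asyncio.Semaphore. It is not Pulp's implementation: fetch_all, run_demo, and the example.invalid URLs are hypothetical, and the network transfer is replaced by a no-op await. The sketch only shows that with a limit of 1 the downloads run one at a time, which is why the workaround trades sync speed for less pressure on a slow network or system.

```python
import asyncio


async def fetch_all(urls, download_concurrency=1):
    """Run simulated downloads with at most `download_concurrency` in flight.

    Hypothetical helper for illustration only; not part of Pulp or Katello.
    """
    semaphore = asyncio.Semaphore(download_concurrency)
    in_flight = 0
    peak = 0

    async def fetch(url):
        nonlocal in_flight, peak
        async with semaphore:
            in_flight += 1
            peak = max(peak, in_flight)
            # Stand-in for the real network transfer.
            await asyncio.sleep(0)
            in_flight -= 1
            return url

    results = await asyncio.gather(*(fetch(u) for u in urls))
    return results, peak


def run_demo(download_concurrency):
    # Placeholder URLs; nothing is actually downloaded.
    urls = [f"https://example.invalid/pkg-{i}.rpm" for i in range(5)]
    return asyncio.run(fetch_all(urls, download_concurrency))
```

Calling run_demo(1) returns all five URLs together with the peak number of simultaneous downloads, which is 1 in that case.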

Comment 3 matt jia 2022-02-02 05:30:21 UTC
We seem to have a fix for this:

https://pulp.plan.io/issues/9667
https://github.com/pulp/pulpcore/commit/42edd37de23c338d987c48b774c24f0d855459a8

Could we please backport it to 6.10?

Thanks,

Matt
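
For readers following the linked fixes ("Downloader doesn't reset file on retries", "Reset size and digests when retrying downloads"), the sketch below shows how a retry that keeps stale state can produce a checksum failure after a partial download. It is an illustration under assumptions, not pulpcore's code: PartialDownload, flaky_chunks, and download_with_retry are hypothetical names, and the dropped connection is simulated in memory.

```python
import hashlib
import io


class PartialDownload(Exception):
    """Raised when the simulated connection drops mid-transfer."""


def flaky_chunks(payload, fail, chunk_size=4):
    """Yield chunks of `payload`; if `fail` is set, drop out halfway through."""
    for offset in range(0, len(payload), chunk_size):
        if fail and offset >= len(payload) // 2:
            raise PartialDownload("Response payload is not completed")
        yield payload[offset:offset + chunk_size]


def download_with_retry(payload, expected_sha256, reset_on_retry, tries=5):
    """Return True if the final digest matches `expected_sha256`.

    The first attempt always fails halfway. With reset_on_retry=False the
    bytes and digest from the failed attempt leak into the next attempt.
    """
    buffer = io.BytesIO()
    digest = hashlib.sha256()
    for attempt in range(tries):
        if reset_on_retry:
            # Start each attempt from a clean file and a clean digest.
            buffer = io.BytesIO()
            digest = hashlib.sha256()
        try:
            for chunk in flaky_chunks(payload, fail=(attempt == 0)):
                buffer.write(chunk)
                digest.update(chunk)
        except PartialDownload:
            continue
        return digest.hexdigest() == expected_sha256
    raise PartialDownload(f"giving up after {tries} tries")
```

With a 16-byte payload, the non-resetting variant hashes the 8 stale bytes plus the full 16 bytes on the second attempt, so validation fails even though the second transfer itself was complete.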

Comment 17 Brian Herring 2022-06-02 19:15:48 UTC
Close Out:

* The performance of this feature is improved in Satellite 6.11 from 6.10
* A linked documentation / performance bug should highlight network and system expectations for repo syncing to function within reasonable expected parameters

Comment 22 errata-xmlrpc 2022-07-05 14:31:10 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Moderate: Satellite 6.11 Release), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2022:5498

Comment 23 Ben 2023-03-08 16:05:15 UTC
This issue is still occurring in 6.12.2 when trying to do an initial sync of RHEL 8.6 BaseOS and AppStream repositories.

Mar  8 15:59:58 satellite1 pulpcore-worker-6[901775]: Giving up download_wrapper(...) after 5 tries (aiohttp.client_exceptions.ClientPayloadError: Response payload is not completed)
Mar  8 15:59:58 satellite1 pulpcore-worker-6[901775]: pulp [d9a81197-97f6-4cdc-8e47-1640adf60e1c]: backoff:ERROR: Giving up download_wrapper(...) after 5 tries (aiohttp.client_exceptions.ClientPayloadError: Response payload is not completed)

2023-03-08T15:59:59 [E|bac|d9a81197] Response payload is not completed (Katello::Errors::Pulp3Error)

Comment 24 Daniel Alley 2023-03-08 20:56:13 UTC
@Ben Could you file a new BZ please with fresh details?  It may be a very similar issue but it is unlikely to have the exact same cause as the original.

Comment 25 Daniel Alley 2023-03-08 22:56:46 UTC
For what it's worth, I was just able to sync both repos without error.  It could be a very precise timing issue or maybe a CDN issue.  How reliably does it occur for you?

Comment 26 Ben 2023-03-09 09:09:57 UTC
Hi Daniel,

Amusingly, overnight, my Satellite has seemingly managed to sync both RHEL 8.6 repos without a whimper.  It was happening continually/repeatedly for most of yesterday afternoon from when I first started trying at 13:00 BST (GMT+1) until I gave up at 17:40 BST on 2023-03-08.  Coincidentally, all that morning yum updates for RHEL 7 were failing due to one or two packages not being downloadable (git and certain samba RPMs), and LEAPPs from RHEL 7 to RHEL 8 were failing with the target systems unable to get the dnf* packages from RHEL 8 repos (see Red Hat Support case #03456297).

So yes, I very much think it is/was a CDN issue.  I'm just having trouble getting word one from Red Hat about what was going on.

