Bug 2033174

Summary: Large repo sync failed with "Katello::Errors::Pulp3Error: Response payload is not completed"
Product: Red Hat Satellite Reporter: matt jia <mjia>
Component: Pulp Assignee: satellite6-bugs <satellite6-bugs>
Status: CLOSED ERRATA QA Contact: Brian Herring <bherring>
Severity: medium Docs Contact:
Priority: high    
Version: 6.10.1 CC: ahumbe, ben.argyle, bherring, dalley, dkliban, ggainey, hakon.gislason, jkrajice, jsenkyri, mmccune, nikhjain, pcreech, rakumar, rchan, sabuchan, sadas, saydas, spurrier, vijsingh, wclark, wpinheir
Target Milestone: 6.11.0 Keywords: Triaged, WorkAround
Target Release: Unused   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version: pulpcore-3.16.8 Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of:
: 2053723 2170945 Environment:
Last Closed: 2022-07-05 14:31:10 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:

Description matt jia 2021-12-16 06:03:32 UTC
Description of problem:

In some environments with a slow network or a slow system, syncing large repos (for example, RHEL 7 Server) fails with "Katello::Errors::Pulp3Error: Response payload is not completed" and also causes the following task error:

pulpcore.tasking.pulpcore_worker:INFO: Task 7fc09c6a-d7a1-4dfd-a9ed-117f7eb207c5 failed (A file located at the url https://cdn.redhat.com/content/dist/rhel/server/7/7Server/x86_64/os/repodata/18f74a403ac596dbdf0aecb35a63e1cf43ad6b8a-filelists.xml.gz failed validation due to checksum.)
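
For context, a download that is cut off part-way will generally fail checksum validation, which is one plausible way the two errors above could be related. A minimal sketch, assuming a SHA-1 digest (the 40-character hex prefix in the file name is consistent with SHA-1, but this report does not state the algorithm); `matches_expected_digest` is a hypothetical helper for illustration, not Pulp's code:

```python
import hashlib


def matches_expected_digest(payload: bytes, expected_hex: str, algorithm: str = "sha1") -> bool:
    """Return True when the payload hashes to the expected hex digest."""
    digest = hashlib.new(algorithm, payload).hexdigest()
    return digest == expected_hex.lower()


# Stand-in bytes for a metadata file; not real repository content.
complete = b"example filelists payload" * 1000
expected = hashlib.sha1(complete).hexdigest()

# A response body that was cut off part-way through the transfer.
truncated = complete[: len(complete) // 2]

complete_ok = matches_expected_digest(complete, expected)
truncated_ok = matches_expected_digest(truncated, expected)
print(complete_ok, truncated_ok)
```

The complete payload validates and the truncated one does not, so an incomplete response body would surface as a checksum failure rather than as a parse error.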


Version-Release number of selected component (if applicable):

6.10

How reproducible:

Easy


Steps to Reproduce:
1. Install Satellite 6.10 on a slow machine.
2. Sync the RHEL 7 Server repo.

Actual results:

Sync failed with:


pulpcore.tasking.pulpcore_worker:INFO: Task 7fc09c6a-d7a1-4dfd-a9ed-117f7eb207c5 failed (A file located at the url https://cdn.redhat.com/content/dist/rhel/server/7/7Server/x86_64/os/repodata/18f74a403ac596dbdf0aecb35a63e1cf43ad6b8a-filelists.xml.gz failed validation due to checksum.)

Expected results:

Sync should complete successfully.



Additional info:

Comment 1 matt jia 2021-12-16 06:04:30 UTC
Setting download_concurrency to 1 is a valid workaround, as described in this upstream thread:

https://community.theforeman.org/t/katello-response-payload-is-not-completed/23918/2
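
To illustrate what the workaround changes, here is a rough sketch of a concurrency cap on downloads, written with plain asyncio. It is not Pulp's downloader; `download_all`, the example URLs, and the sleep that stands in for a network transfer are all assumptions made for illustration:

```python
import asyncio


async def download_all(urls, download_concurrency):
    """Fetch every URL while keeping at most `download_concurrency` in flight."""
    limiter = asyncio.Semaphore(download_concurrency)
    in_flight = 0
    peak = 0

    async def fetch(url):
        nonlocal in_flight, peak
        async with limiter:
            in_flight += 1
            peak = max(peak, in_flight)
            await asyncio.sleep(0.01)  # placeholder for the real network transfer
            in_flight -= 1
            return url

    results = await asyncio.gather(*(fetch(u) for u in urls))
    return results, peak


# Hypothetical URLs, used only to drive the sketch.
urls = [f"https://example.invalid/pkg-{i}.rpm" for i in range(5)]
_, peak_default = asyncio.run(download_all(urls, download_concurrency=3))
_, peak_serial = asyncio.run(download_all(urls, download_concurrency=1))
print(peak_default, peak_serial)
```

With a concurrency of 1 the transfers run strictly one at a time, which trades sync speed for less pressure on a slow network or system.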

Comment 3 matt jia 2022-02-02 05:30:21 UTC
We seem to have a fix for this:

https://pulp.plan.io/issues/9667
https://github.com/pulp/pulpcore/commit/42edd37de23c338d987c48b774c24f0d855459a8

Could we please backport it to 6.10?

Thanks,

Matt

Comment 17 Brian Herring 2022-06-02 19:15:48 UTC
Close Out:

* The performance of this feature is improved in Satellite 6.11 compared to 6.10
* A linked documentation / performance bug should highlight network and system expectations for repo syncing to function within reasonable expected parameters

Comment 22 errata-xmlrpc 2022-07-05 14:31:10 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Moderate: Satellite 6.11 Release), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2022:5498

Comment 23 Ben 2023-03-08 16:05:15 UTC
This issue is still occurring in 6.12.2 when trying to do an initial sync of RHEL 8.6 BaseOS and AppStream repositories.

Mar  8 15:59:58 satellite1 pulpcore-worker-6[901775]: Giving up download_wrapper(...) after 5 tries (aiohttp.client_exceptions.ClientPayloadError: Response payload is not completed)
Mar  8 15:59:58 satellite1 pulpcore-worker-6[901775]: pulp [d9a81197-97f6-4cdc-8e47-1640adf60e1c]: backoff:ERROR: Giving up download_wrapper(...) after 5 tries (aiohttp.client_exceptions.ClientPayloadError: Response payload is not completed)

2023-03-08T15:59:59 [E|bac|d9a81197] Response payload is not completed (Katello::Errors::Pulp3Error)
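
The "Giving up ... after 5 tries" lines above describe a bounded retry. A minimal sketch of that pattern in plain asyncio, assuming exponential delays; `PayloadError` is a hypothetical stand-in for `aiohttp.client_exceptions.ClientPayloadError`, and `retry_with_backoff` is not the wrapper that pulpcore actually uses:

```python
import asyncio


class PayloadError(Exception):
    """Hypothetical stand-in for aiohttp's ClientPayloadError."""


async def retry_with_backoff(operation, max_tries=5, base_delay=0.01):
    """Call `operation` until it succeeds or `max_tries` attempts have failed."""
    for attempt in range(1, max_tries + 1):
        try:
            return await operation()
        except PayloadError:
            if attempt == max_tries:
                raise  # give up and surface the last error
            # Wait longer after each failure: base, 2x base, 4x base, ...
            await asyncio.sleep(base_delay * 2 ** (attempt - 1))


calls = {"count": 0}


async def always_truncated():
    """Simulate a download whose response body is cut off every time."""
    calls["count"] += 1
    raise PayloadError("Response payload is not completed")


try:
    asyncio.run(retry_with_backoff(always_truncated))
    gave_up = False
except PayloadError:
    gave_up = True

print(calls["count"], gave_up)
```

If every attempt fails the same way, retries alone cannot help, which is consistent with the error persisting until the upstream condition changes.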

Comment 24 Daniel Alley 2023-03-08 20:56:13 UTC
@Ben Could you file a new BZ please with fresh details?  It may be a very similar issue but it is unlikely to have the exact same cause as the original.

Comment 25 Daniel Alley 2023-03-08 22:56:46 UTC
For what it's worth, I was just able to sync both repos without error.  It could be a very precise timing issue or maybe a CDN issue.  How reliably does it occur for you?

Comment 26 Ben 2023-03-09 09:09:57 UTC
Hi Daniel,

Amusingly, overnight, my Satellite has seemingly managed to sync both RHEL 8.6 repos without a whimper.  It was happening continually/repeatedly for most of yesterday afternoon from when I first started trying at 13:00 BST (GMT+1) until I gave up at 17:40 BST on 2023-03-08.  Coincidentally, all that morning yum updates for RHEL 7 were failing due to one or two packages not being downloadable (git and certain samba RPMs), and LEAPPs from RHEL 7 to RHEL 8 were failing with the target systems unable to get the dnf* packages from RHEL 8 repos (see Red Hat Support case #03456297).

So yes, I very much think it is/was a CDN issue.  I'm just having trouble getting word one from Red Hat about what was going on.