Bug 2104498

Summary: Unable to sync jfrog artifactory-pro-rpms repository
Product: Red Hat Satellite
Reporter: Jan Senkyrik <jsenkyri>
Component: Pulp
Assignee: satellite6-bugs <satellite6-bugs>
Status: CLOSED ERRATA
Severity: medium
Priority: unspecified
Version: 6.10.5
CC: dalley, dkliban, ggainey, pulp-infra, rchan, ssydoren, zhunting
Target Milestone: 6.12.0
Keywords: Regression, Triaged
Target Release: Unused
Hardware: x86_64
OS: Linux
: 2116566 (view as bug list) Environment:
Last Closed: 2022-11-16 13:34:16 UTC
Type: Bug
Attachments:
sync logs (flags: none)

Description Jan Senkyrik 2022-07-06 13:18:30 UTC
Description of problem:
Repository sync for jfrog artifactory-pro-rpms repository [0] fails.

I am able to sync this repository successfully using Pulp 2 (Satellite 6.9.9), so this looks like a regression.

Version-Release number of selected component (if applicable):
Satellite 6.10.5

How reproducible:
Always

Steps to Reproduce:
1. Create custom product and within this product a custom repo:
~~~
hammer repository info --id 323
Id:                      323
Name:                    03257334_jfrogpro
Label:                   03257334_jfrogpro
Description:             
Organization:            Default Organization
Red Hat Repository:      no
Content Type:            yum
Mirror on Sync:          no
Url:                     https://releases.jfrog.io/artifactory/artifactory-pro-rpms/
Publish Via HTTP:        yes
Published At:            https://jsenkyri-satellite-latest.sysmgmt.lan/pulp/content/Default_Organization/Library/custom/03257334_jfrogpro/03257334_jfrogpro/
Relative Path:           Default_Organization/Library/custom/03257334_jfrogpro/03257334_jfrogpro
Download Policy:         immediate
Ignorable Content Units: 
HTTP Proxy:              
    HTTP Proxy Policy: global_default_http_proxy
Product:                 
    Id:   378
    Name: 03257334_jfrogpro
GPG Key:                 

Sync:                    
    Status:         Warning
    Last Sync Date: 5 minutes
Created:                 2022/07/06 10:13:49
Updated:                 2022/07/06 13:06:44
Content Counts:          
    Packages:       0
    Source RPMS:    0
    Package Groups: 0
    Errata:         0
    Module Streams: 0

~~~

2. Sync it.

3. The sync task fails:

# production.log
~~~
2022-07-06T12:30:35 [I|bac|d1ab3e27] Task {label: Actions::Katello::Repository::Sync, id: a58fcbad-cc37-47e5-a382-1f35abf8f39c, execution_plan_id: 578c829f-1d8e-4aaf-9dea-52eee5cb4ec7} state changed: running 
2022-07-06T12:30:45 [E|bac|d1ab3e27] File not found: https://releases-cdn.jfrog.io/filestore/8d/8d1ae0d797592d8c41d62e13b7e074aa0a7c82b2?response-content-type=application/x-gzip&response-content-disposition=attachment%3Bfilename%3D%228d1ae0d797592d8c41d62e13b7e074aa0a7c82b2-filelists.xml.gz%22&x-jf-traceId=146a32fc2c1f520e&X-Artifactory-repositoryKey=artifactory-pro-rpms&X-Artifactory-projectKey=default&X-Artifactory-artifactPath=repodata/8d1ae0d797592d8c41d62e13b7e074aa0a7c82b2-filelists.xml.gz&X-Artifactory-username=anonymous&Expires=1657103500&Signature=SwXx45ES9XCUEpp2ik5aA~3eaBvvk5Ct3mLM4E57yeGjUvslEZPygtccvPRhwVGHxVsZxP68JeonUnYs0JD3GtgWAW31vDQ3a-gkyu8xc1bNdb3yoi-vorXXDaH16wjD14mKGPXMx~Kv5rdsnf2LMoB0K8KDPlsGEgvTawj1CM~NVsVfJfhd-IwWyAF0WMgvacYdvG5Ap0T6VXRKvOBz1V56YDiaYIhfZmfohT08beXQyvEOJ94CILau6FoHlH1jtomh0H7MPNOWrG00EEDYRHTxLSARzoHu5ApSZYCIyqjzG1Nu5bUwFqNWl2oJIWDNoZZfBVmZTuleRl-2Kl8Wog__&Key-Pair-Id=APKAJ6NHFWMVU3M6DPBA (Katello::Errors::Pulp3Error)
 d1ab3e27 | /opt/theforeman/tfm/root/usr/share/gems/gems/katello-4.1.1.56/app/lib/actions/pulp3/abstract_async_task.rb:108:in `block in check_for_errors'
~~~

# messages
~~~
Jul  6 12:30:44 jsenkyri-satellite-latest pulpcore-worker-2: Giving up download_wrapper(...) after 1 tries (aiohttp.client_exceptions.ClientResponseError: 403, message='Forbidden', url=URL('https://releases-cdn.jfrog.io/filestore/8d/8d1ae0d797592d8c41d62e13b7e074aa0a7c82b2?response-content-type=application/x-gzip&response-content-disposition=attachment%3Bfilename%3D%228d1ae0d797592d8c41d62e13b7e074aa0a7c82b2-filelists.xml.gz%22&x-jf-traceId=146a32fc2c1f520e&X-Artifactory-repositoryKey=artifactory-pro-rpms&X-Artifactory-projectKey=default&X-Artifactory-artifactPath=repodata/8d1ae0d797592d8c41d62e13b7e074aa0a7c82b2-filelists.xml.gz&X-Artifactory-username=anonymous&Expires=1657103500&Signature=SwXx45ES9XCUEpp2ik5aA~3eaBvvk5Ct3mLM4E57yeGjUvslEZPygtccvPRhwVGHxVsZxP68JeonUnYs0JD3GtgWAW31vDQ3a-gkyu8xc1bNdb3yoi-vorXXDaH16wjD14mKGPXMx~Kv5rdsnf2LMoB0K8KDPlsGEgvTawj1CM~NVsVfJfhd-IwWyAF0WMgvacYdvG5Ap0T6VXRKvOBz1V56YDiaYIhfZmfohT08beXQyvEOJ94CILau6FoHlH1jtomh0H7MPNOWrG00EEDYRHTxLSARzoHu5ApSZYCIyqjzG1Nu5bUwFqNWl2oJIWDNoZZfBVmZTuleRl-2Kl8Wog__&Key-Pair-Id=APKAJ6NHFWMVU3M6DPBA'))
~~~


Actual results:
Repository fails to sync.

Expected results:
Repository syncs successfully.

Additional Info:
Tried all Mirroring Policy options.


[0] https://releases.jfrog.io/artifactory/artifactory-pro-rpms/

Comment 1 Jan Senkyrik 2022-07-06 13:19:29 UTC
Created attachment 1894937 [details]
sync logs

Comment 2 Daniel Alley 2022-07-08 03:18:51 UTC
This appears to be an issue in the aiohttp library that we use. I can reproduce it with aiohttp alone:

import asyncio

import aiohttp


async def main():
    # Plain aiohttp session with default settings; no Pulp code involved.
    async with aiohttp.ClientSession() as session:
        async with session.get('https://releases.jfrog.io/artifactory/artifactory-pro-rpms/repodata/8c87521e43dbed223c90e23922c0c4bbe7159112-primary.xml.gz') as response:
            print("Status:", response.status)
            print("Content-type:", response.headers['content-type'])

            html = await response.text()
            print("Body:", html)


asyncio.run(main())
Status: 403
Content-type: text/xml
Body: <?xml version="1.0" encoding="UTF-8"?><Error><Code>AccessDenied</Code><Message>Access denied</Message></Error>


Artifactory serves up a redirect, and aiohttp doesn't seem to encode the redirect URL to the webserver's liking.  The first URL below is the redirect URL that pulp tries to use; the second is the one that wget and httpie use.  If you look at the "X-Artifactory-artifactPath" parameter, the first URL uses a literal forward slash ("repodata/...") and the second uses URL encoding ("repodata%2F...").

https://releases-cdn.jfrog.io/filestore/8c/8c87521e43dbed223c90e23922c0c4bbe7159112?response-content-type=application/x-gzip&response-content-disposition=attachment%3Bfilename%3D%228c87521e43dbed223c90e23922c0c4bbe7159112-primary.xml.gz%22&x-jf-traceId=da959416009abe79&X-Artifactory-repositoryKey=artifactory-pro-rpms&X-Artifactory-projectKey=default&X-Artifactory-artifactPath=repodata/8c87521e43dbed223c90e23922c0c4bbe7159112-primary.xml.gz&X-Artifactory-username=anonymous&Expires=1657248968&Signature=MA21YnR3A8BncqlPFTcELv15ndn2B6yNQpaag2JuuPiWgiwjYnGHlbQkDbXp6Gk3ygrKOPvUETeb-gSzv6g64kFVnNzYzDxMRdV71nIP2YRWkufG-2R9AGSA9OtAevmAWYG-wmGazZt0L6VqX-4XDfpQPoVG5RITCMzQnQ3W~XbX8lB2AhliU0GI8QNOFzVKB8bAiFIFjh0HBV-P~~DlRaOn0ouKkdS-6paLjDq7NPEmlXrt2A~oOd5NvwViRpUNSsbov6WugJHPMMxPb5wj3B7hISYfwN6~6TWgbZ8Y6o3GXV-3zhxN4idpkFuvF90d2Aoep~gmeGKUysk3EohzVQ__&Key-Pair-Id=APKAJ6NHFWMVU3M6DPBA

https://releases-cdn.jfrog.io/filestore/8c/8c87521e43dbed223c90e23922c0c4bbe7159112?response-content-type=application/x-gzip&response-content-disposition=attachment%3Bfilename%3D%228c87521e43dbed223c90e23922c0c4bbe7159112-primary.xml.gz%22&x-jf-traceId=e4edec532ab037c3&X-Artifactory-repositoryKey=artifactory-pro-rpms&X-Artifactory-projectKey=default&X-Artifactory-artifactPath=repodata%2F8c87521e43dbed223c90e23922c0c4bbe7159112-primary.xml.gz&X-Artifactory-username=anonymous&Expires=1657249015&Signature=dwsfXduAz0Avg62JQfIgXuQGP2cLv6AExW4u9gR3B0wzOu5R599u~satGkD6yLIzB4Y1uBCLhU6IO7b-ovr0wT9AJTxMaE-eiF2awcaj2eGzXpFibJVoUuzhsG3CW5koby8cytTThxRYewn1nACLna58og14o~EkXIgA3~c32y5ltR1rtJDyclQ~wpQNTpc4vX4uk2hr5z0lolmCGy0fgLxmuiFkzNmsHFAUbo1qqb8kJF~SAE-l4erLPAIlyIuDpAl6myyJ1cHmS-dolXiz8M6yNel-sJY5jaOH-iAlZoZm9PHdqmKbMqrzdgzLS2emZF5zER9g068lY~7fLqgqrw__&Key-Pair-Id=APKAJ6NHFWMVU3M6DPBA
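
To make the difference between the two URLs easier to see, here is a minimal sketch using only the Python standard library. The URLs are shortened versions of the two above (an assumption for readability: every query parameter except "X-Artifactory-artifactPath" is dropped), and the helper name artifact_path is hypothetical. It only shows that both forms decode to the same value while differing as raw strings; it does not show how the server validates the request.

```python
from urllib.parse import parse_qs, urlsplit

BASE = "https://releases-cdn.jfrog.io/filestore/8c/8c87521e43dbed223c90e23922c0c4bbe7159112"
NAME = "8c87521e43dbed223c90e23922c0c4bbe7159112-primary.xml.gz"

# Shortened forms of the two redirect URLs; only the differing parameter is kept.
AIOHTTP_URL = BASE + "?X-Artifactory-artifactPath=repodata/" + NAME
WGET_URL = BASE + "?X-Artifactory-artifactPath=repodata%2F" + NAME


def artifact_path(url: str) -> str:
    """Return the percent-decoded X-Artifactory-artifactPath parameter."""
    query = urlsplit(url).query
    return parse_qs(query)["X-Artifactory-artifactPath"][0]


# After decoding, both URLs carry the same parameter value ...
same_value = artifact_path(AIOHTTP_URL) == artifact_path(WGET_URL)
# ... but the raw query strings sent over the wire are different.
same_raw_query = urlsplit(AIOHTTP_URL).query == urlsplit(WGET_URL).query

print("same decoded value:", same_value)
print("same raw query:", same_raw_query)
```

A server that compares the raw query string rather than the decoded values would treat these two requests differently, which is consistent with the 403 seen above.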

Comment 3 Daniel Alley 2022-07-08 04:12:51 UTC
Root cause: https://github.com/aio-libs/yarl/issues/245

Comment 4 Daniel Alley 2022-07-08 04:29:02 UTC
Although going by some of the comments, it's plausible that Yarl is abiding strictly by the spec, whereas whichever webserver Artifactory is using is rejecting requests that it ought to accept, because '/' is (supposed to be?) an allowed character.

It seems like most other implementations are more aggressive with urlencoding than Yarl, however.  wget and httpie, as mentioned, and I believe urllib as well.
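
As a rough illustration of the urllib point, the following sketch shows how the Python standard library encodes the same parameter value. The value is taken from the URLs in comment 2; the variable names are only for illustration. urlencode() percent-encodes "/" in query values by default, while quote() leaves "/" alone unless told otherwise.

```python
from urllib.parse import quote, urlencode

value = "repodata/8c87521e43dbed223c90e23922c0c4bbe7159112-primary.xml.gz"

# urlencode() builds a query string; by default "/" is not treated as safe,
# so it is written as %2F (the form wget and httpie send).
encoded_query = urlencode({"X-Artifactory-artifactPath": value})

# quote() defaults to safe="/", so the slash is kept literally.
path_style = quote(value)

# With an empty safe set, quote() also encodes the slash.
strict = quote(value, safe="")

print(encoded_query)
print(path_style)
print(strict)
```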

Comment 5 Daniel Alley 2022-07-08 04:48:42 UTC
A decent argument can also be made for https://github.com/aio-libs/aiohttp/issues/5319 being the "real" issue.

Comment 7 Sviatoslav Sydorenko 2022-07-12 21:28:43 UTC
@dalley try https://docs.aiohttp.org/en/stable/client_reference.html#aiohttp.ClientSession.requote_redirect_url to see if it helps.

Comment 8 Daniel Alley 2022-07-13 05:34:32 UTC
Yes, this works.  Patch incoming.
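
For reference, a minimal sketch of what using that option looks like in the standalone reproducer from comment 2. This is not the actual Pulp patch: the names SESSION_KWARGS and fetch_status are hypothetical, and aiohttp is imported lazily only so that the snippet can be loaded on a machine without it. The only part taken from the aiohttp documentation linked in comment 7 is the requote_redirect_url parameter of ClientSession.

```python
import asyncio

# URL from the reproducer in comment 2.
URL = (
    "https://releases.jfrog.io/artifactory/artifactory-pro-rpms/repodata/"
    "8c87521e43dbed223c90e23922c0c4bbe7159112-primary.xml.gz"
)

# requote_redirect_url=False asks aiohttp to follow the redirect Location
# as the server sent it instead of re-quoting it first.
SESSION_KWARGS = {"requote_redirect_url": False}


async def fetch_status(url: str) -> int:
    """Fetch url with redirect requoting disabled and return the HTTP status."""
    import aiohttp  # third-party dependency, imported lazily (see note above)

    async with aiohttp.ClientSession(**SESSION_KWARGS) as session:
        async with session.get(url) as response:
            return response.status


# Example call (needs network access and aiohttp installed):
#     print(asyncio.run(fetch_status(URL)))
```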

Comment 11 Daniel Alley 2022-08-05 15:33:23 UTC
*** Bug 2115882 has been marked as a duplicate of this bug. ***

Comment 12 Daniel Alley 2022-08-09 13:57:53 UTC
*** Bug 2115878 has been marked as a duplicate of this bug. ***

Comment 13 Griffin Sullivan 2022-08-11 17:02:15 UTC
Verified in 6.12 snap 6

Repository sync runs successfully with URL: https://releases.jfrog.io/artifactory/artifactory-pro-rpms/

Steps to Reproduce:

1) Create a new product and yum repo with URL set to https://releases.jfrog.io/artifactory/artifactory-pro-rpms/

2) Sync the repository

Expected Results:

Sync runs successfully and all rpms are downloaded.

Actual Results:

Sync runs successfully and all rpms are downloaded.


Notes:

This repo is quite large and requires over 100 GB to sync. It took my Satellite about 30 minutes to finish.

Comment 18 errata-xmlrpc 2022-11-16 13:34:16 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Important: Satellite 6.12 Release), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2022:8506