Description of problem:
Repository sync for jfrog artifactory-pro-rpms repository [0] fails. I am able to sync this repository successfully using pulp2 (Satellite 6.9.9) so this looks like a regression.

Version-Release number of selected component (if applicable):
Satellite 6.10.5

How reproducible:
Always

Steps to Reproduce:
1. Create custom product and within this product a custom repo:
~~~
hammer repository info --id 323
Id: 323
Name: 03257334_jfrogpro
Label: 03257334_jfrogpro
Description:
Organization: Default Organization
Red Hat Repository: no
Content Type: yum
Mirror on Sync: no
Url: https://releases.jfrog.io/artifactory/artifactory-pro-rpms/
Publish Via HTTP: yes
Published At: https://jsenkyri-satellite-latest.sysmgmt.lan/pulp/content/Default_Organization/Library/custom/03257334_jfrogpro/03257334_jfrogpro/
Relative Path: Default_Organization/Library/custom/03257334_jfrogpro/03257334_jfrogpro
Download Policy: immediate
Ignorable Content Units:
HTTP Proxy:
    HTTP Proxy Policy: global_default_http_proxy
Product:
    Id: 378
    Name: 03257334_jfrogpro
GPG Key:
Sync:
    Status: Warning
    Last Sync Date: 5 minutes
Created: 2022/07/06 10:13:49
Updated: 2022/07/06 13:06:44
Content Counts:
    Packages: 0
    Source RPMS: 0
    Package Groups: 0
    Errata: 0
    Module Streams: 0
~~~
2. Sync it.
3.
Task fails:

# production.log
~~~
2022-07-06T12:30:35 [I|bac|d1ab3e27] Task {label: Actions::Katello::Repository::Sync, id: a58fcbad-cc37-47e5-a382-1f35abf8f39c, execution_plan_id: 578c829f-1d8e-4aaf-9dea-52eee5cb4ec7} state changed: running
2022-07-06T12:30:45 [E|bac|d1ab3e27] File not found: https://releases-cdn.jfrog.io/filestore/8d/8d1ae0d797592d8c41d62e13b7e074aa0a7c82b2?response-content-type=application/x-gzip&response-content-disposition=attachment%3Bfilename%3D%228d1ae0d797592d8c41d62e13b7e074aa0a7c82b2-filelists.xml.gz%22&x-jf-traceId=146a32fc2c1f520e&X-Artifactory-repositoryKey=artifactory-pro-rpms&X-Artifactory-projectKey=default&X-Artifactory-artifactPath=repodata/8d1ae0d797592d8c41d62e13b7e074aa0a7c82b2-filelists.xml.gz&X-Artifactory-username=anonymous&Expires=1657103500&Signature=SwXx45ES9XCUEpp2ik5aA~3eaBvvk5Ct3mLM4E57yeGjUvslEZPygtccvPRhwVGHxVsZxP68JeonUnYs0JD3GtgWAW31vDQ3a-gkyu8xc1bNdb3yoi-vorXXDaH16wjD14mKGPXMx~Kv5rdsnf2LMoB0K8KDPlsGEgvTawj1CM~NVsVfJfhd-IwWyAF0WMgvacYdvG5Ap0T6VXRKvOBz1V56YDiaYIhfZmfohT08beXQyvEOJ94CILau6FoHlH1jtomh0H7MPNOWrG00EEDYRHTxLSARzoHu5ApSZYCIyqjzG1Nu5bUwFqNWl2oJIWDNoZZfBVmZTuleRl-2Kl8Wog__&Key-Pair-Id=APKAJ6NHFWMVU3M6DPBA (Katello::Errors::Pulp3Error)
 d1ab3e27 | /opt/theforeman/tfm/root/usr/share/gems/gems/katello-4.1.1.56/app/lib/actions/pulp3/abstract_async_task.rb:108:in `block in check_for_errors'
~~~

# messages
~~~
Jul 6 12:30:44 jsenkyri-satellite-latest pulpcore-worker-2: Giving up download_wrapper(...) after 1 tries (aiohttp.client_exceptions.ClientResponseError: 403, message='Forbidden', url=URL('https://releases-cdn.jfrog.io/filestore/8d/8d1ae0d797592d8c41d62e13b7e074aa0a7c82b2?response-content-type=application/x-gzip&response-content-disposition=attachment%3Bfilename%3D%228d1ae0d797592d8c41d62e13b7e074aa0a7c82b2-filelists.xml.gz%22&x-jf-traceId=146a32fc2c1f520e&X-Artifactory-repositoryKey=artifactory-pro-rpms&X-Artifactory-projectKey=default&X-Artifactory-artifactPath=repodata/8d1ae0d797592d8c41d62e13b7e074aa0a7c82b2-filelists.xml.gz&X-Artifactory-username=anonymous&Expires=1657103500&Signature=SwXx45ES9XCUEpp2ik5aA~3eaBvvk5Ct3mLM4E57yeGjUvslEZPygtccvPRhwVGHxVsZxP68JeonUnYs0JD3GtgWAW31vDQ3a-gkyu8xc1bNdb3yoi-vorXXDaH16wjD14mKGPXMx~Kv5rdsnf2LMoB0K8KDPlsGEgvTawj1CM~NVsVfJfhd-IwWyAF0WMgvacYdvG5Ap0T6VXRKvOBz1V56YDiaYIhfZmfohT08beXQyvEOJ94CILau6FoHlH1jtomh0H7MPNOWrG00EEDYRHTxLSARzoHu5ApSZYCIyqjzG1Nu5bUwFqNWl2oJIWDNoZZfBVmZTuleRl-2Kl8Wog__&Key-Pair-Id=APKAJ6NHFWMVU3M6DPBA'))
~~~

Actual results:
Repository fails to sync.

Expected results:
Repository syncs successfully.

Additional Info:
Tried all Mirroring Policy options.

[0] https://releases.jfrog.io/artifactory/artifactory-pro-rpms/
Created attachment 1894937: sync logs
This appears to be an issue in the aiohttp library that we use; I can reproduce it with aiohttp alone.

~~~
import asyncio

import aiohttp


async def main():
    # Default session settings: redirects are followed and the redirect URL is re-quoted.
    async with aiohttp.ClientSession() as session:
        async with session.get('https://releases.jfrog.io/artifactory/artifactory-pro-rpms/repodata/8c87521e43dbed223c90e23922c0c4bbe7159112-primary.xml.gz') as response:
            print("Status:", response.status)
            print("Content-type:", response.headers['content-type'])

            html = await response.text()
            print("Body:", html)


asyncio.run(main())
~~~

Output:

~~~
Status: 403
Content-type: text/xml
Body: <?xml version="1.0" encoding="UTF-8"?><Error><Code>AccessDenied</Code><Message>Access denied</Message></Error>
~~~

Artifactory serves up a redirect, which aiohttp doesn't seem to encode to the webserver's liking. Of the two URLs that follow, the top one is the redirect URL that pulp tries to use, and the bottom one is the one that wget and httpie use. In the "X-Artifactory-artifactPath" parameter, the top URL uses a literal forward slash, while the bottom URL percent-encodes it as %2F.
https://releases-cdn.jfrog.io/filestore/8c/8c87521e43dbed223c90e23922c0c4bbe7159112?response-content-type=application/x-gzip&response-content-disposition=attachment%3Bfilename%3D%228c87521e43dbed223c90e23922c0c4bbe7159112-primary.xml.gz%22&x-jf-traceId=da959416009abe79&X-Artifactory-repositoryKey=artifactory-pro-rpms&X-Artifactory-projectKey=default&X-Artifactory-artifactPath=repodata/8c87521e43dbed223c90e23922c0c4bbe7159112-primary.xml.gz&X-Artifactory-username=anonymous&Expires=1657248968&Signature=MA21YnR3A8BncqlPFTcELv15ndn2B6yNQpaag2JuuPiWgiwjYnGHlbQkDbXp6Gk3ygrKOPvUETeb-gSzv6g64kFVnNzYzDxMRdV71nIP2YRWkufG-2R9AGSA9OtAevmAWYG-wmGazZt0L6VqX-4XDfpQPoVG5RITCMzQnQ3W~XbX8lB2AhliU0GI8QNOFzVKB8bAiFIFjh0HBV-P~~DlRaOn0ouKkdS-6paLjDq7NPEmlXrt2A~oOd5NvwViRpUNSsbov6WugJHPMMxPb5wj3B7hISYfwN6~6TWgbZ8Y6o3GXV-3zhxN4idpkFuvF90d2Aoep~gmeGKUysk3EohzVQ__&Key-Pair-Id=APKAJ6NHFWMVU3M6DPBA

https://releases-cdn.jfrog.io/filestore/8c/8c87521e43dbed223c90e23922c0c4bbe7159112?response-content-type=application/x-gzip&response-content-disposition=attachment%3Bfilename%3D%228c87521e43dbed223c90e23922c0c4bbe7159112-primary.xml.gz%22&x-jf-traceId=e4edec532ab037c3&X-Artifactory-repositoryKey=artifactory-pro-rpms&X-Artifactory-projectKey=default&X-Artifactory-artifactPath=repodata%2F8c87521e43dbed223c90e23922c0c4bbe7159112-primary.xml.gz&X-Artifactory-username=anonymous&Expires=1657249015&Signature=dwsfXduAz0Avg62JQfIgXuQGP2cLv6AExW4u9gR3B0wzOu5R599u~satGkD6yLIzB4Y1uBCLhU6IO7b-ovr0wT9AJTxMaE-eiF2awcaj2eGzXpFibJVoUuzhsG3CW5koby8cytTThxRYewn1nACLna58og14o~EkXIgA3~c32y5ltR1rtJDyclQ~wpQNTpc4vX4uk2hr5z0lolmCGy0fgLxmuiFkzNmsHFAUbo1qqb8kJF~SAE-l4erLPAIlyIuDpAl6myyJ1cHmS-dolXiz8M6yNel-sJY5jaOH-iAlZoZm9PHdqmKbMqrzdgzLS2emZF5zER9g068lY~7fLqgqrw__&Key-Pair-Id=APKAJ6NHFWMVU3M6DPBA
Root cause: https://github.com/aio-libs/yarl/issues/245
Although, going by some of the comments there, it's plausible that yarl is abiding strictly by the spec, whereas whichever webserver Artifactory is using is rejecting requests that it ought to accept, because '/' is (supposed to be?) an allowed character in a query string. It seems like most other implementations are more aggressive with percent-encoding than yarl, however. As mentioned, there's wget and httpie, but I believe also urllib.
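To make the difference between the two encodings concrete, here is a small standard-library sketch. It only illustrates how `urllib.parse.urlencode` treats the '/' in the `X-Artifactory-artifactPath` value taken from the URLs in this report; it does not reproduce what yarl, wget, or httpie do internally, and the variable names are made up for the illustration.

```python
from urllib.parse import parse_qs, urlencode

# Parameter value copied from the redirect URLs quoted in this report.
ARTIFACT_PATH = "repodata/8c87521e43dbed223c90e23922c0c4bbe7159112-primary.xml.gz"

# urlencode() percent-encodes "/" in query values by default,
# which matches the form the CDN accepted (the wget/httpie URL).
encoded = urlencode({"X-Artifactory-artifactPath": ARTIFACT_PATH})

# The form seen in the failing aiohttp request: "/" left literal.
literal = "X-Artifactory-artifactPath=" + ARTIFACT_PATH

print("encoded:", encoded)
print("literal:", literal)

# Both strings decode to the same parameter value...
print("same decoded value:", parse_qs(encoded) == parse_qs(literal))
# ...but they are not the same sequence of characters on the wire.
print("same raw string:", encoded == literal)
```

Both forms decode to the same value, so a lenient server would treat them alike. A server that compares or validates the raw query string would not, which is consistent with the 403 seen here, although this report does not confirm why the CDN rejects the literal form.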
A decent argument can also be made for https://github.com/aio-libs/aiohttp/issues/5319 being the "real" issue.
@dalley try https://docs.aiohttp.org/en/stable/client_reference.html#aiohttp.ClientSession.requote_redirect_url to see if it helps.
Yes, this works. Patch incoming.
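For anyone hitting this outside of Pulp, the suggestion above can be sketched roughly as follows. This is not the actual Pulp patch, just a minimal illustration that assumes aiohttp is installed; `make_session` and `fetch_status` are hypothetical helper names. The `requote_redirect_url` option is the one documented at the aiohttp link above.

```python
import asyncio

import aiohttp


def make_session() -> aiohttp.ClientSession:
    """Hypothetical helper: a session that follows redirects verbatim.

    requote_redirect_url=False tells aiohttp to use the URL from the
    Location header as sent, instead of re-quoting it first.
    Call this from inside a running event loop.
    """
    return aiohttp.ClientSession(requote_redirect_url=False)


async def fetch_status(url: str) -> int:
    """Hypothetical helper: GET a URL and return the final HTTP status."""
    async with make_session() as session:
        async with session.get(url) as response:
            return response.status


if __name__ == "__main__":
    # Repository URL from this report; needs network access to run.
    repo_url = "https://releases.jfrog.io/artifactory/artifactory-pro-rpms/repodata/repomd.xml"
    print("Status:", asyncio.run(fetch_status(repo_url)))
```

The setting applies to every request made through that session, so it is best kept to sessions that talk to servers known to need the exact redirect URL.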
*** Bug 2115882 has been marked as a duplicate of this bug. ***
*** Bug 2115878 has been marked as a duplicate of this bug. ***
Verified in 6.12 snap 6

Repository sync runs successfully with URL: https://releases.jfrog.io/artifactory/artifactory-pro-rpms/

Steps to Reproduce:
1) Create a new product and yum repo with URL set to https://releases.jfrog.io/artifactory/artifactory-pro-rpms/
2) Sync the repository

Expected Results:
Sync runs successfully and all rpms are downloaded.

Actual Results:
Sync runs successfully and all rpms are downloaded.

Notes:
This repo is quite large and requires over 100 GB to sync. It took my Satellite about 30 minutes to finish.
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory (Important: Satellite 6.12 Release), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHSA-2022:8506