Bug 2173692
| Field | Value |
|---|---|
| Summary | [Pulp-3] The Repair API fails to perform any further repair if it has encountered errors while processing any docker manifest |
| Product | Red Hat Satellite |
| Reporter | Sayan Das <saydas> |
| Component | Pulp |
| Assignee | satellite6-bugs <satellite6-bugs> |
| Status | CLOSED ERRATA |
| QA Contact | visawant |
| Severity | high |
| Priority | medium |
| Version | 6.13.0 |
| CC | ahumbe, dalley, dkliban, ggainey, hyu, ipanova, osousa, pcreech, rchan, shwsingh, visawant |
| Target Milestone | 6.14.0 |
| Keywords | Triaged |
| Target Release | Unused |
| Hardware | All |
| OS | All |
| Fixed In Version | pulpcore-3.21.12, pulpcore-3.22.9, pulpcore-3.23.10, pulpcore-3.18.22 |
| Doc Type | If docs needed, set a value |
| Story Points | --- |
| | 2218980 |
| Last Closed | 2023-11-08 14:18:33 UTC |
| Type | Bug |
| Regression | --- |
| Mount Type | --- |
| Documentation | --- |
| Category | --- |
| oVirt Team | --- |
| Cloudforms Team | --- |
Description
Sayan Das
2023-02-27 17:10:48 UTC
Is this a regression from 6.12? Thanks!

Hello,

The behavior is the same in 6.12 as well, so this is not a regression. I was not able to test it properly during 6.12 GA and hence tested it during 6.13.

The "Repair API" of Pulp is really important for Satellite to have working properly. It is the key to getting back all downloadable Pulp artifacts after a backup without Pulp data was restored, or after the Pulp filesystem was corrupted and later recreated.

The Pulp upstream bug status is at closed. Updating the external tracker on this bug.

Requesting needsinfo from upstream developer dkliban, ggainey because the 'FailedQA' flag is set.

This fix was primarily focused on covering ``raise ClientResponseError(\n", "description": "404, message='Not Found', url=URL('https://registry.redhat.io/containers/content/dist/containers/rhel8/multiarch/containers/redhat-rhel8-toolbox/manifests/1/sha256:0165232f07e945f1c1831eebb8c42e5695f80d47d7c3d787af20752c89333276?_auth_=exp=1677520840~hmac=4c1a924c0c8b5af741a5025eaa730f8b5fca99f539f5004d65aeb6bc136c3db4')"`` as stated in the BZ report.

Verification steps fail with ``raise TimeoutException(self.url)\n", "description": "Request timed out for https://registry.redhat.io/v2/rhel8/toolbox/manifests/sha256:629545c824752b6d75f7f22a73b6c7fcba9413b6017f2f67f58494a958bcd4ce. Increasing the total_timeout value on the remote might help."`` which is a different error and not covered by this fix.

We need to agree on the definition of done for this BZ request, because it was not intended to catch all the errors that might come from the server. What is the preferred way to proceed?

Requesting needsinfo from upstream developer dkliban, ggainey because the 'FailedQA' flag is set.
(In reply to Ina Panova from comment #11)
> This fix was primary focused to cover ``raise ClientResponseError(\n", "description": "404, message='Not Found', url=URL('https://registry.redhat.io/containers/content/dist/containers/rhel8/multiarch/containers/redhat-rhel8-toolbox/manifests/1/sha256:0165232f07e945f1c1831eebb8c42e5695f80d47d7c3d787af20752c89333276?_auth_=exp=1677520840~hmac=4c1a924c0c8b5af741a5025eaa730f8b5fca99f539f5004d65aeb6bc136c3db4')"`` as stated in the BZ report.
>
> Verification steps fail with ``raise TimeoutException(self.url)\n", "description": "Request timed out for https://registry.redhat.io/v2/rhel8/toolbox/manifests/sha256:629545c824752b6d75f7f22a73b6c7fcba9413b6017f2f67f58494a958bcd4ce. Increasing the total_timeout value on the remote might help."`` which is a different error and not covered by this fix.
>
> We need agree on the definition of done this BZ request, because it was not intended to catch all the errors that might come from the server.
> What is the preferred way to proceed?

Hello,

Definitely, 404 was the error that I received initially because of the way it was reproduced. From that point of view, ClientResponseError was handled via https://github.com/pulp/pulpcore/commit/72f1902bd9d5b4d4237297b3c68a617682f4b5f7 and that is fine.

But that is not the only error that someone can receive. If the Satellite/Pulp remote is associated with a proxy, and that proxy is limiting connections or is misconfigured in some way, then TimeoutException would be a common occurrence as well. If the firewall is not properly configured, that may even result in "Connection Reset", but I don't know how Pulp will show that message (this is not very important at the moment).
Anyway, the point is that no matter what the failure is, the Repair API should not halt but should continue processing the rest of the units that it has identified as corrupted or missing. That has always been the goal of this BZ (see the "Expected Results" section in the Description).

My request would be to handle TimeoutException in the same way as ClientResponseError, as these two are the errors commonly seen when Pulp tries to download something from upstream or from another Pulp instance.

-- Sayan

Sayan, thanks for clarifying the scope. I have filed a new upstream request: https://github.com/pulp/pulpcore/issues/4111

Verified.
Version Tested: Satellite 6.14.0 Snap 9

*** Bug 2240519 has been marked as a duplicate of this bug. ***

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory (Important: Satellite 6.14 security and bug fix update), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.
https://access.redhat.com/errata/RHSA-2023:6818

The needinfo request[s] on this closed bug have been removed as they have been unresolved for 120 days.
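For readers following the thread, the behavior requested above (record a per-unit download failure and keep repairing the remaining units) can be illustrated with a minimal sketch. This is not the pulpcore implementation or the referenced commit. The exception classes below are local stand-ins that only borrow the names seen in the tracebacks, and `repair_units`, `fake_download`, `run_demo`, and the unit names are hypothetical.

```python
import asyncio


class ClientResponseError(Exception):
    """Local stand-in for an HTTP error response such as a 404 (hypothetical)."""


class TimeoutException(Exception):
    """Local stand-in for a download timeout (hypothetical)."""


async def repair_units(units, download):
    """Try to re-download every unit; record failures instead of aborting."""
    repaired, failed = [], {}
    for unit in units:
        try:
            await download(unit)
        except (ClientResponseError, TimeoutException) as exc:
            # Remember why this unit failed and move on to the next one.
            failed[unit] = f"{type(exc).__name__}: {exc}"
        else:
            repaired.append(unit)
    return repaired, failed


async def fake_download(unit):
    """Hypothetical downloader that fails for two of the units."""
    if unit == "manifest-404":
        raise ClientResponseError("404, message='Not Found'")
    if unit == "manifest-timeout":
        raise TimeoutException("Request timed out")


def run_demo():
    """Run the repair loop over four made-up units, two of which fail."""
    units = ["blob-a", "manifest-404", "manifest-timeout", "blob-b"]
    return asyncio.run(repair_units(units, fake_download))
```

Calling `run_demo()` returns the list of repaired units and a mapping of failed units to their error text, so a failure on one manifest does not prevent the units after it from being processed.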