The image append test fails infrequently (1/30) with
error: uploading the source layer sha256:cca21acb641a96561e0cf9a0c1c7b7ffbaaefc92185bd8a9440f6049c838e33b failed: Patch "https://image-registry.openshift-image-registry.svc:5000/v2/e2e-test-image-append-vvcfr/test/blobs/uploads/5b6c7125-b1a6-4054-8757-bfb769013ce6?_state=8dVf5ydO_2F_Tz_oTw8iYngy2Bsl8F-wF3QaEJF7IVF7Ik5hbWUiOiJlMmUtdGVzdC1pbWFnZS1hcHBlbmQtdnZjZnIvdGVzdCIsIlVVSUQiOiI1YjZjNzEyNS1iMWE2LTQwNTQtODc1Ny1iZmI3NjkwMTNjZTYiLCJPZmZzZXQiOjAsIlN0YXJ0ZWRBdCI6IjIwMjEtMDEtMzFUMDM6MjI6NTcuOTUyMTIwMDA1WiJ9": unknown blob
In the registry logs we see
time="2021-01-31T03:22:58.420412156Z" level=error msg="Error statting blob sha256:cca21acb641a96561e0cf9a0c1c7b7ffbaaefc92185bd8a9440f6049c838e33b in remote repository \"quay.io/openshift-release-dev/ocp-v4.0-art-dev\": Head \"https://quay.io/v2/openshift-release-dev/ocp-v4.0-art-dev/blobs/sha256:cca21acb641a96561e0cf9a0c1c7b7ffbaaefc92185bd8a9440f6049c838e33b\": error parsing HTTP 429 response body: invalid character '<' looking for beginning of value: \"<html>\\r\\n<head><title>429 Too Many Requests</title></head>\\r\\n<body bgcolor=\\\"white\\\">\\r\\n<center><h1>429 Too Many Requests</h1></center>\\r\\n<hr><center>nginx/1.12.1</center>\\r\\n</body>\\r\\n</html>\\r\\n\"" go.version=go1.15.5 http.request.host="image-registry.openshift-image-registry.svc:5000" http.request.id=e6614325-c675-4bda-897b-9723aaa3b1de http.request.method=GET http.request.remoteaddr="10.128.2.35:36468" http.request.uri="/v2/openshift/tools/blobs/sha256:cca21acb641a96561e0cf9a0c1c7b7ffbaaefc92185bd8a9440f6049c838e33b" http.request.useragent=Go-http-client/2.0 openshift.auth.user="system:serviceaccount:e2e-test-image-append-vvcfr:builder" vars.digest="sha256:cca21acb641a96561e0cf9a0c1c7b7ffbaaefc92185bd8a9440f6049c838e33b" vars.name=openshift/tools
which then becomes a 404 ("blob unknown to registry") rather than a 429
time="2021-01-31T03:22:58.420594499Z" level=error msg="response completed with error" err.code="blob unknown" err.detail="sha256:cca21acb641a96561e0cf9a0c1c7b7ffbaaefc92185bd8a9440f6049c838e33b" err.message="blob unknown to registry" go.version=go1.15.5 http.request.host="image-registry.openshift-image-registry.svc:5000" http.request.id=e6614325-c675-4bda-897b-9723aaa3b1de http.request.method=GET http.request.remoteaddr="10.128.2.35:36468" http.request.uri="/v2/openshift/tools/blobs/sha256:cca21acb641a96561e0cf9a0c1c7b7ffbaaefc92185bd8a9440f6049c838e33b" http.request.useragent=Go-http-client/2.0 http.response.contenttype="application/json; charset=utf-8" http.response.duration=380.131435ms http.response.status=404 http.response.written=157 openshift.auth.user="system:serviceaccount:e2e-test-image-append-vvcfr:builder" vars.digest="sha256:cca21acb641a96561e0cf9a0c1c7b7ffbaaefc92185bd8a9440f6049c838e33b" vars.name=openshift/tools
Our clients already support 429, so when an upstream tells us to slow down we should pass that back to the client so they can see the backpressure. Then we need to verify that this actually resolves the issue (i.e. if the 429s continue, we may want to flag backpressure on a per-upstream basis from the registry so that all clients slow down).
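To make the proposed behaviour concrete, here is a minimal sketch of the status mapping, written in Python for brevity (the registry itself is Go). The names `Response` and `map_upstream_stat` are hypothetical and are not part of the image-registry code. The 502 fallback for other upstream errors is an assumption for illustration only.

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class Response:
    """Hypothetical stand-in for an HTTP response (status code plus headers)."""
    status: int
    headers: Dict[str, str] = field(default_factory=dict)


def map_upstream_stat(upstream: Response) -> Response:
    """Map the upstream's reply to a blob stat onto the reply sent to the client.

    The point of the sketch: a 429 from the upstream is passed through as a 429
    instead of being collapsed into 404 "blob unknown".
    """
    if upstream.status == 200:
        return Response(200)
    if upstream.status == 429:
        # Pass the backpressure through, keeping Retry-After when present.
        headers = {}
        if "Retry-After" in upstream.headers:
            headers["Retry-After"] = upstream.headers["Retry-After"]
        return Response(429, headers)
    if upstream.status == 404:
        # Only a real "not found" from the upstream means the blob is unknown.
        return Response(404)
    # Assumption: any other upstream failure is reported as a gateway error.
    return Response(502)


if __name__ == "__main__":
    throttled = Response(429, {"Retry-After": "5"})
    print(map_upstream_stat(throttled))
```

With this mapping, a client that already handles 429 would back off and retry rather than fail the upload with "unknown blob".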
*** Bug 1932643 has been marked as a duplicate of this bug. ***
*** Bug 1929767 has been marked as a duplicate of this bug. ***
Do you have any idea how to reproduce this bug?
Increasing severity and priority as this problem complicates investigation of CI problems. For example, BZ 2026104.
I have tested https://github.com/openshift/image-registry/pull/273 and although it seems to address the original error, now I'm getting a different one.
Looks like there are other places we need to handle the 429 from the origin registry before it can make its way to the client. I'm investigating.
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.
For information on the advisory (Important: OpenShift Container Platform 4.11.0 bug fix and security update), and where to find the updated
files, follow the link below.
If the solution does not work for you, open a new bug report.
The needinfo request[s] on this closed bug have been removed as they have been unresolved for 365 days