Bug 2028960 - oc adm catalog mirror through the proxy with errors
Summary: oc adm catalog mirror through the proxy with errors
Keywords:
Status: CLOSED DEFERRED
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: oc
Version: 4.8
Hardware: x86_64
OS: Linux
Priority: medium
Severity: medium
Target Milestone: ---
Target Release: ---
Assignee: Nobody
QA Contact: zhou ying
URL:
Whiteboard:
Depends On:
Blocks:
 
Reported: 2021-12-03 20:13 UTC by borazem
Modified: 2023-09-18 04:29 UTC
CC: 3 users

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2022-10-05 13:09:18 UTC
Target Upstream Version:
Embargoed:


Attachments
mirror.log and something.log (260.22 KB, application/zip), 2021-12-03 20:13 UTC, borazem

Description borazem 2021-12-03 20:13:07 UTC
Created attachment 1844639 [details]
mirror.log and something.log

Description of problem:
If we run catalog mirror behind the HTTP proxy, we experience various types of error messages. Interestingly, if we manually pull some of the images whose layers are failing with podman, the pulls succeed.
For the mirror we followed the procedure from the installation guide: https://docs.openshift.com/container-platform/4.8/operators/admin/olm-restricted-networks.html
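
For reference, the mirroring step from that procedure looks roughly like this (the index image tag and credentials path are illustrative placeholders, not the exact values from our run):

oc adm catalog mirror \
    registry.redhat.io/redhat/redhat-operator-index:v4.8 \
    file://local/index \
    -a ${REG_CREDS}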


The types of errors we experienced are:

-----
uploading: file://local/index/openshift4-ohi/redhat-operator-index/openshift4/ose-cli sha256:85208539a036cf3e1df573dc8985305c9680df9b74aee962e6b8511e1e5d9f9b 18.83MiB error: unable to copy layer sha256:85208539a036cf3e1df573dc8985305c9680df9b74aee962e6b8511e1e5d9f9b to file://local/index/openshift4-ohi/redhat-operator-index/openshift4/ose-cli: content integrity error: the blob streamed from digest sha256:85208539a036cf3e1df573dc8985305c9680df9b74aee962e6b8511e1e5d9f9b does not match the digest calculated from the content sha256:d4bd5dd16881046d857d5ace812f678173a95e8a4ca35f8ca289d777a6d46595 
-----
uploading: file://local/index/openshift4-ohi/redhat-operator-index/openshift-gitops-1/kam-delivery-rhel8 sha256:cd50c1c5fe4a03fb64368f18720df651ef9b17fa71955af2b1a45d7f6eb23dee 72.89MiB error: unable to copy layer sha256:cd50c1c5fe4a03fb64368f18720df651ef9b17fa71955af2b1a45d7f6eb23dee to file://local/index/openshift4-ohi/redhat-operator-index/openshift-gitops-1/kam-delivery-rhel8: unexpected EOF 
-----
error: unable to open source layer sha256:024e289f9d3059db8f47e89e0e13314df01f8651dcf9afb50c4c68d58cb92520 to copy to file://local/index/openshift4-ohi/redhat-operator-index/rhel8/skopeo: received unexpected HTTP status: 504 Connection Timed Out warning: Expected to mount sha256:024e289f9d3059db8f47e89e0e13314df01f8651dcf9afb50c4c68d58cb92520 from /local/index/openshift4-ohi/redhat-operator-index/rhel8/skopeo but mount was ignored 
-----
error: unable to push registry.redhat.io/jboss-eap-7/eap73-rhel8-operator: failed to retrieve blob sha256:dde5a2af38a210f820a27ca4ae77d9cdee95f68f67838831c6c6abdbcd78d822: received unexpected HTTP status: 502 Server Hangup
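
One way to check whether the proxy itself corrupts a specific layer would be to copy an affected image through the same proxy with skopeo and compare the downloaded blobs against their digests. A sketch, assuming we are already logged in to registry.redhat.io (the proxy address, tag, and target directory are illustrative):

export HTTPS_PROXY=http://proxy.example.com:3128
skopeo copy docker://registry.redhat.io/openshift4/ose-cli:latest dir:/tmp/ose-cli
# the layer files written to /tmp/ose-cli (alongside manifest.json) are named by
# their sha256 digest, so sha256sum on a layer file should print its own file name

skopeo also verifies digests while copying, so a stream corrupted by the proxy should already fail there with a digest mismatch.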

############
I can also see that where some blobs are shared, the blob may eventually be downloaded, but I guess that does not mean the references for each specific download were created correctly. Here I tried to trace what happens for some of the shared components/digests/blobs.

In the following example, the blob with digest sha256:ec168... is shared among a couple of images, and the last upload completes without error:

sha256:ec1681b6a383e4ecedbeddd5abc596f3de835aed6db39a735f62395c8edbff30
      registry.redhat.io/jboss-eap-7/eap73-rhel8-operator
      registry.redhat.io/ocp-tools-43-tech-preview/source-to-image-rhel8
      registry.redhat.io/rhel8/buildah
      registry.redhat.io/rhel8/skopeo
uploading: file://local/index/openshift4-ohi/redhat-operator-index/rhel8/skopeo sha256:ec1681b6a383e4ecedbeddd5abc596f3de835aed6db39a735f62395c8edbff30 70.44MiB
error: unable to copy layer sha256:ec1681b6a383e4ecedbeddd5abc596f3de835aed6db39a735f62395c8edbff30 to file://local/index/openshift4-ohi/redhat-operator-index/rhel8/skopeo: unexpected EOF
warning: Expected to mount sha256:ec1681b6a383e4ecedbeddd5abc596f3de835aed6db39a735f62395c8edbff30 from /local/index/openshift4-ohi/redhat-operator-index/rhel8/skopeo but mount was ignored
uploading: file://local/index/openshift4-ohi/redhat-operator-index/rhel8/buildah sha256:ec1681b6a383e4ecedbeddd5abc596f3de835aed6db39a735f62395c8edbff30 70.44MiB

Or, in the following case, I can see two additional "mounted:" messages, which I assume means the image layer was successfully pulled:

sha256:024e289f9d3059db8f47e89e0e13314df01f8651dcf9afb50c4c68d58cb92520
      registry.redhat.io/jboss-eap-7/eap73-rhel8-operator
      registry.redhat.io/ocp-tools-43-tech-preview/source-to-image-rhel8
      registry.redhat.io/rhel8/buildah
      registry.redhat.io/rhel8/skopeo
error: unable to open source layer sha256:024e289f9d3059db8f47e89e0e13314df01f8651dcf9afb50c4c68d58cb92520 to copy to file://local/index/openshift4-ohi/redhat-operator-index/rhel8/skopeo: received unexpected HTTP status: 504 Connection Timed Out
warning: Expected to mount sha256:024e289f9d3059db8f47e89e0e13314df01f8651dcf9afb50c4c68d58cb92520 from /local/index/openshift4-ohi/redhat-operator-index/rhel8/skopeo but mount was ignored
uploading: file://local/index/openshift4-ohi/redhat-operator-index/rhel8/buildah sha256:024e289f9d3059db8f47e89e0e13314df01f8651dcf9afb50c4c68d58cb92520 68.07MiB
mounted: file://local/index/openshift4-ohi/redhat-operator-index/ocp-tools-43-tech-preview/source-to-image-rhel8 sha256:024e289f9d3059db8f47e89e0e13314df01f8651dcf9afb50c4c68d58cb92520 68.07MiB
mounted: file://local/index/openshift4-ohi/redhat-operator-index/jboss-eap-7/eap73-rhel8-operator sha256:024e289f9d3059db8f47e89e0e13314df01f8651dcf9afb50c4c68d58cb92520 68.07MiB

###########

If we repeat the oc adm catalog mirror, some of the above errors are always there. We tried to mirror the operators directly (registry to registry) as well as through a portable device and experienced the same types of errors, so I assume it is related to pulling through the proxy.


If I manually pull, for example, the following image, the pull is successful:
podman pull registry.redhat.io/jboss-eap-7/eap73-rhel8-operator:2.2-1

I wonder whether the successful manual podman pulls are pure luck, or whether there is some difference between "podman" and "oc adm catalog mirror" in how images are pulled through the proxy?
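
One thing that might help narrow this down is forcing podman through the same proxy the mirror host uses via the standard proxy environment variables, which podman honors (the proxy address below is illustrative):

export HTTPS_PROXY=http://proxy.example.com:3128
podman pull registry.redhat.io/jboss-eap-7/eap73-rhel8-operator:2.2-1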

Interestingly, we were able to mirror the images for the installation of OCP 4.0 successfully. Luck or pattern?

Are there any requirements on what a proxy can or must not do in order for such an operation to work well?

I was able to capture two mirroring exercises: the initial one (something.log) and one after the proxy team changed some proxy settings, I think related to how much of a "file" is downloaded before it starts streaming to the requester. The log after that change is mirror.log.




Version-Release number of selected component (if applicable):


How reproducible:
It only happens in this specific environment. In my test environment with a squid proxy I did not experience problems, but I am not inspecting the traffic...

However, I often experience various issues with pulling images even when installing OCP online via a proxy :-(

Steps to Reproduce:
1.
2.
3.

Actual results:


Expected results:


Additional info:

Comment 1 Jian Zhang 2021-12-22 10:25:17 UTC
I haven't hit this issue yet. Pushing uses HTTP PATCH to send the layers over in segments.
Maybe something went wrong with the proxy server: the streamed data may have been reordered, truncated, or otherwise altered by the proxy. Refer to: https://docs.docker.com/registry/spec/api/#content-digests
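
For reference, the chunked upload flow from that spec looks roughly like this (names, URLs, and offsets are illustrative):

POST  /v2/<name>/blobs/uploads/            -> 202 Accepted, Location: <upload URL>
PATCH <upload URL>                            Content-Range: 0-<n>, body: layer bytes (possibly repeated per chunk)
PUT   <upload URL>?digest=sha256:<digest>  -> the registry recomputes the digest and rejects the upload if it does not match

If a proxy in the path truncates or reorders the streamed bytes, the recomputed digest no longer matches, which would be consistent with the "content integrity error" messages above.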

Comment 3 Michal Fojtik 2022-02-07 08:23:23 UTC
This bug hasn't had any activity in the last 30 days. Maybe the problem got resolved, was a duplicate of something else, or became less pressing for some reason - or maybe it's still relevant but just hasn't been looked at yet. As such, we're marking this bug as "LifecycleStale" and decreasing the severity/priority. If you have further information on the current state of the bug, please update it, otherwise this bug can be closed in about 7 days. The information can be, for example, that the problem still occurs, that you still want the feature, that more information is needed, or that the bug is (for whatever reason) no longer relevant. Additionally, you can add LifecycleFrozen into Whiteboard if you think this bug should never be marked as stale. Please consult with bug assignee before you do that.

Comment 4 Ross Peoples 2022-10-04 17:19:09 UTC
Is this issue still valid with the latest oc 4.11?

Comment 5 borazem 2022-10-05 07:48:10 UTC
(In reply to Ross Peoples from comment #4)
> Is this issue still valid with the latest oc 4.11?

Hi @rpeoples, after we eventually transferred all the images needed to the local registry and performed the deployment, we concluded the engagement, so I don't have further information on the client's subsequent oc mirror runs. Although it was a client-specific issue, I often face problems when working through a proxy.

As we don't have further information, I guess we can close this case and perhaps reopen it, or refer to it, when the client hopefully mirrors the images for some upgrade/update.

Thank you, wish you a nice day.

Comment 6 Ross Peoples 2022-10-05 13:09:18 UTC
Hi @borazem,

Feel free to open a new bug if it happens again, and relate it to this one so we have some history on it. Thank you and enjoy your day as well!

Comment 7 Red Hat Bugzilla 2023-09-18 04:29:00 UTC
The needinfo request[s] on this closed bug have been removed as they have been unresolved for 120 days

