Bug 1879877

Summary: oc adm catalog mirror --filter-by-os="linux/amd64" doesn't mirror all necessary images in "Using Operator Lifecycle Manager on restricted networks" guide
Product: OpenShift Container Platform
Reporter: Andreas Karis <akaris>
Component: Documentation
Assignee: Alex Dellapenta <adellape>
Status: CLOSED CURRENTRELEASE
QA Contact: Jian Zhang <jiazha>
Severity: high
Docs Contact: Vikram Goyal <vigoyal>
Priority: high
Version: 4.5
CC: aos-bugs, hfukumot, jiazha, jokerman, jonwilli, kahara, mfuruta, mharri, rbohne
Target Milestone: ---   
Target Release: 4.5.z   
Hardware: Unspecified   
OS: Unspecified   
Last Closed: 2021-07-30 20:36:27 UTC
Type: Bug

Description Andreas Karis 2020-09-17 09:19:41 UTC
Document URL: 
https://docs.openshift.com/container-platform/4.5/operators/olm-restricted-networks.html#olm-updating-operator-catalog-image_olm-restricted-networks

Section Number and Name: 

Describe the issue: 

Suggestions for improvement: 

Additional information: 

One of our consultants ran into this while helping a customer set up their OpenShift environment. They recently upgraded from OCP 4.4.3 to OCP 4.5.7 (going through 4.4.14). The customer runs a disconnected environment with a mirror registry that was originally set up as per [1].

In preparation, they mirrored the new release into the mirror registry, created a new OLM catalog version, and mirrored its content to the mirror registry as per [2]. For the oc adm catalog mirror command, they specified --filter-by-os="linux/amd64" so that no unnecessary images would be mirrored to the registry. The mirroring procedure completed without any trouble, but after they activated the new OLM catalog in the upgraded cluster, OLM was not able to pull the operator images. They noticed the problem with the Elasticsearch operator and the Logging operator, both of which failed with an Image Pull Error.

After digging some more, they found the following BZ: https://bugzilla.redhat.com/show_bug.cgi?id=1800674

The last comment [3] pointed them toward using the following filter: "--filter-by-os=.*". The bug was closed recently, stating that this filter should be used.

After they ran oc adm catalog mirror with this filter, it synced more images into the registry, including the missing images for ES and Logging.

The consultant thinks oc adm catalog mirror is not behaving correctly: he expects it to sync all the images necessary for an amd64 cluster when linux/amd64 is specified as the filter. Also, the documentation [2] tells the user to use this filter and doesn't mention the .* filter at all, so IMHO users who adhere to the documentation will end up in the situation he outlined here.

[1] https://docs.openshift.com/container-platform/4.2/installing/install_config/installing-restricted-networks-preparations.html#installation-creating-mirror-registry_installing-restricted-networks-preparations
https://docs.openshift.com/container-platform/4.5/operators/olm-restricted-networks.html
[2] https://docs.openshift.com/container-platform/4.5/operators/olm-restricted-networks.html#olm-updating-operator-catalog-image_olm-restricted-networks
[3] https://bugzilla.redhat.com/show_bug.cgi?id=1800674#c42

Comment 2 Andreas Karis 2020-09-25 10:30:31 UTC
This might rather be a problem in the product itself:

The way our consultant sees this, the oc tool doesn't do the job that it is supposed to do. A user who follows the documentation doesn't get the expected result.

The documentation makes sense, as a customer normally has a cluster with one architecture. They made it work by syncing all the images, but that's not what they want, because unnecessary images (s390, for instance) just consume storage on their mirror registry. Hence the oc tool itself needs a fix; a fix to the documentation might just be a workaround.

Comment 3 Andreas Karis 2020-09-25 10:31:31 UTC
I can open a BZ against the oc client directly, or let you talk to the devs about how this should actually work. Which way do you prefer?

Comment 4 Andreas Karis 2020-09-25 10:34:46 UTC
I created https://bugzilla.redhat.com/show_bug.cgi?id=1882689 against `oc`. So either the doc needs to be fixed or the `oc` command, let's see :-)

Comment 5 Jonquil Williams 2020-09-28 15:53:28 UTC
(In reply to Andreas Karis from comment #4)
> I created https://bugzilla.redhat.com/show_bug.cgi?id=1882689 against `oc`.
> So either the doc needs to be fixed or the `oc` command, let's see :-)

Hi, Andreas.

This might also be related to or a duplicate of https://bugzilla.redhat.com/show_bug.cgi?id=1880421, though I cannot tell without some Dev assistance. Can you give me some Dev contacts to sort this out? Thanks.

Comment 9 Robert Bohne 2020-12-02 12:26:32 UTC
Document URL: https://docs.openshift.com/container-platform/4.6/operators/admin/olm-restricted-networks.html#olm-mirror-catalog_olm-restricted-networks

Section Number and Name: Mirroring an Operator catalog

Describe the issue: 

Solving Bug 1882689/1890951 will take a while because it cannot be solved on the oc CLI side. It needs some more design decisions.


Suggestions for improvement: 

As a short/mid-term solution please add `--filter-by-os='.*'` to oc adm catalog mirror and a reference to:

- https://bugzilla.redhat.com/show_bug.cgi?id=1890951
- https://bugzilla.redhat.com/show_bug.cgi?id=1882689


Changed to high because this affects all customers who run operators in a restricted network!

Comment 12 Alex Dellapenta 2021-07-30 20:36:27 UTC
With the release of OCP 4.8, version 4.5 is now end-of-life (EOL); see https://docs.openshift.com/container-platform/4.8/release_notes/ocp-4-8-release-notes.html#ocp-4-8-about-this-release.

Because this BZ is against the now EOL and unsupported 4.5 docs, I am closing it. However, the 4.5 catalog mirroring docs do currently have a WARNING admonition box that discusses the need for --filter-by-os='.*' in https://docs.openshift.com/container-platform/4.5/operators/admin/olm-restricted-networks.html#olm-restricted-networks-operatorhub_olm-restricted-networks:

====
[WARNING]	
If the --filter-by-os flag remains unset or set to any value other than .*, the command filters out different architectures, which changes the digest of the manifest list, also known as a multi-arch image. The incorrect digest causes deployments of those images and Operators on disconnected clusters to fail. For more information, see BZ#1890951.
====
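
To illustrate why dropping architectures changes the digest, here is a minimal Python sketch. It is not how oc or a registry is implemented: the manifest list, its placeholder per-image digests, and the helper names digest and keep_architecture are all hypothetical, and hashing canonical JSON is only a stand-in for a registry hashing the exact bytes it stores.

```python
import hashlib
import json


def digest(document: dict) -> str:
    """Return a sha256 digest over a canonical JSON encoding of the document.

    Assumption: a real registry hashes the exact bytes it stores; canonical
    JSON is used here only so the example is deterministic.
    """
    raw = json.dumps(document, sort_keys=True, separators=(",", ":")).encode("utf-8")
    return "sha256:" + hashlib.sha256(raw).hexdigest()


def keep_architecture(manifest_list: dict, architecture: str) -> dict:
    """Return a copy of the manifest list with only one architecture kept."""
    kept = [
        entry
        for entry in manifest_list["manifests"]
        if entry["platform"]["architecture"] == architecture
    ]
    return {**manifest_list, "manifests": kept}


# Hypothetical multi-arch manifest list; the per-image digests are placeholders.
manifest_list = {
    "schemaVersion": 2,
    "mediaType": "application/vnd.docker.distribution.manifest.list.v2+json",
    "manifests": [
        {"digest": "sha256:" + "a" * 64, "platform": {"os": "linux", "architecture": "amd64"}},
        {"digest": "sha256:" + "b" * 64, "platform": {"os": "linux", "architecture": "s390x"}},
        {"digest": "sha256:" + "c" * 64, "platform": {"os": "linux", "architecture": "ppc64le"}},
    ],
}

original_digest = digest(manifest_list)
filtered_digest = digest(keep_architecture(manifest_list, "amd64"))

# An Operator that references the image by the original digest cannot be
# served by the filtered list, because the filtered list hashes differently.
print("digests match after filtering:", original_digest == filtered_digest)
```

Because the list's digest covers every entry in it, removing the s390x and ppc64le entries yields a different document with a different digest, even though the amd64 image itself is unchanged.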

In 4.7, the --filter-by-os flag was deprecated in favor of the --index-filter-by-os flag, as discussed in https://docs.openshift.com/container-platform/4.7/release_notes/ocp-4-7-release-notes.html#ocp-4-7-filterbyos-deprecated. This change was also backported to the 4.6 oc client, as seen in the following "oc adm catalog mirror -h" output:

====
--filter-by-os='': Use --index-filter-by-os instead. A regular expression to control which index image is picked
when multiple variants are available. Images will be passed as '<platform>/<architecture>[/<variant>]'. This does not
apply to images referenced by the index.

--index-filter-by-os='': A regular expression to control which index image is picked when multiple variants are
available. Images will be passed as '<platform>/<architecture>[/<variant>]'. This does not apply to images referenced by
the index.
====
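
As a rough illustration of how a regular expression selects among platform strings of the form '<platform>/<architecture>[/<variant>]', here is a small Python sketch. The platform list and the select_platforms helper are hypothetical, and unanchored matching (re.search) is an assumption; the help output above does not say whether oc anchors the expression.

```python
import re

# Hypothetical set of variants offered by a multi-arch image.
PLATFORMS = ["linux/amd64", "linux/arm64", "linux/ppc64le", "linux/s390x"]


def select_platforms(pattern: str, platforms: list[str]) -> list[str]:
    """Return the platform strings that the regular expression matches.

    Assumption: the match is unanchored, so the pattern may match anywhere
    in the string.
    """
    regex = re.compile(pattern)
    return [platform for platform in platforms if regex.search(platform)]


for pattern in ["linux/amd64", "linux/(amd64|arm64)", ".*"]:
    print(pattern, "->", select_platforms(pattern, PLATFORMS))
```

Under these assumptions, 'linux/amd64' selects a single variant, while '.*' selects every variant.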

The 4.6 and later catalog mirroring docs were also updated to no longer show usage of the --filter-by-os flag, and instead show the (optional) usage of the --index-filter-by-os flag, e.g. in https://docs.openshift.com/container-platform/4.6/operators/admin/olm-restricted-networks.html#olm-mirror-catalog_olm-restricted-networks.

Setting this BZ's CLOSED status to CURRENTRELEASE since the issue is better addressed in the currently supported releases (4.6 and later). If there are outstanding requests for the 4.6 and later docs related to this issue, please let me know.