Bug 1951203 - oc adm catalog mirror can generate ICSPs that exceed etcd's size limit
Summary: oc adm catalog mirror can generate ICSPs that exceed etcd's size limit
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: OLM
Version: 4.8
Hardware: Unspecified
OS: Unspecified
medium
medium
Target Milestone: ---
: 4.8.0
Assignee: Alexander Greene
QA Contact: Jian Zhang
URL:
Whiteboard:
Depends On:
Blocks:
TreeView+ depends on / blocked
 
Reported: 2021-04-19 19:34 UTC by Kevin Rizza
Modified: 2022-10-11 09:31 UTC (History)
4 users (show)

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Cause: It is possible for the `oc adm catalog mirror` command to generate ICSPs that are greater than 262144 bytes in size. Consequence: ICSPs that exceed 262144 bytes are likely to fail when applied to the cluster if the objec already existed in an early state as the `kubectl.kubernetes.io/last-applied-configuration` annotation will likely exceed the 262144 annotation byte limit. Fix: Introduce a the max-icsp-size flag to the `oc adm catalog mirror` command, allowing users to specify the maximum byte size of ICSP files generated by the command. If an ICSP would exceed this limit, the command creates a new ICSP to write to. Result: Users are now able to generate ICSPs that can be applied to the cluster in cases where many registries are being mirrored.
Clone Of:
Environment:
Last Closed: 2021-07-27 23:01:52 UTC
Target Upstream Version:
Embargoed:


Attachments (Terms of Use)


Links
System ID Private Priority Status Summary Last Updated
Github openshift oc pull 818 0 None open Bug 1951203: Allow users to set a limit on ICSP file size 2021-05-08 13:57:49 UTC
Red Hat Product Errata RHSA-2021:2438 0 None None None 2021-07-27 23:02:07 UTC

Description Kevin Rizza 2021-04-19 19:34:01 UTC
Description of problem:

oc adm catalog mirror will generate an image content source policy that the user needs to apply to their cluster to redirect public images referenced in catalogs and operators with their mirrored counterparts. With the right catalog images, ICSPs that are scoped to repositories can be generated that exceed the max etcd object size -- this is further exacerbated by the fact that the previous ICSP is stored as an annotation on update for any given ICSP which effectively cuts the size of an ICSP in half of the normal object size. oc adm catalog mirror should be aware of the size of the objects it is creating and instead generate multiple ICSPs if this limitation is hit


How reproducible:

Always

Steps to Reproduce:
1. Run oc adm catalog mirror --icsp-scope=repository against the official release redhat-operators index image
2. Apply the generated ICSP to a cluster, modify something in the spec, and apply it again

Actual results:

The ICSP fails to apply due to size requirements

Expected results:

Any ICSP that is generated by oc adm catalog mirror should be less than 50% of the default size limit of etcd (which is 2mb) or just under 1mb.

Additional info:

oc adm catalog mirror can use the --icsp-scope=registry flag which should effectively never hit the purported size limits.

Comment 3 Jian Zhang 2021-06-15 06:49:45 UTC
Hi,

Install the oc client which contains the fixed PR.
[cloud-user@preserve-olm-env jian]$ oc version -o yaml
clientVersion:
  buildDate: "2021-06-13T01:50:31Z"
  compiler: gc
  gitCommit: 1077b0516d5baf6f2717e4cb34f58236c0fb7a8c
  gitTreeState: clean
  gitVersion: 4.8.0-202106130124.p0.git.1077b05.assembly.stream-1077b05
  goVersion: go1.16.4
  major: ""
  minor: ""
  platform: linux/amd64
openshiftVersion: 4.8.0-0.nightly-2021-06-14-145150
releaseClientVersion: 4.8.0-0.nightly-2021-06-14-145150
serverVersion:
  buildDate: "2021-06-12T07:15:42Z"
  compiler: gc
  gitCommit: 120883fe7ff642c7de79ec1e533e4c63cf099eb0
  gitTreeState: clean
  gitVersion: v1.21.0-rc.0+120883f
  goVersion: go1.16.4
  major: "1"
  minor: 21+
  platform: linux/amd64

2, Test the scenario when the --max-icsp-size=0, but it hangs there for a long time.
[cloud-user@preserve-olm-env jian]$ oc --loglevel=8 adm catalog mirror --icsp-scope=repository --manifests-only --max-icsp-size=0  registry.redhat.io/redhat/redhat-operator-index:v4.8 upshift-quay.mirror-registry.qe.devcluster.openshift.com:5001/olmqe --to-manifests=bug-manifests-01
...


no digest mapping available for registry.redhat.io/rhpam-7/rhpam-rhel8-operator:7.5.1, skip writing to ImageContentSourcePolicy
no digest mapping available for registry.redhat.io/container-native-virtualization/virt-cdi-importer:v2.2.0-5, skip writing to ImageContentSourcePolicy
no digest mapping available for registry.redhat.io/container-native-virtualization/bridge-marker:v2.2.0-3, skip writing to ImageContentSourcePolicy


^C

3, Set the --max-icsp-size to a negative value, such as -1, it hangs there for a long time too.

[cloud-user@preserve-olm-env jian]$ oc --loglevel=8 adm catalog mirror --icsp-scope=repository --manifests-only --max-icsp-size=-1  registry.redhat.io/redhat/redhat-operator-index:v4.8 upshift-quay.mirror-registry.qe.devcluster.openshift.com:5001/olmqe --to-manifests=bug-manifests-01
...
...
no digest mapping available for registry.redhat.io/openshift-service-mesh/kiali-rhel7:1.12.7, skip writing to ImageContentSourcePolicy
no digest mapping available for registry.redhat.io/fuse7-tech-preview/fuse-apicurito-operator:1.7, skip writing to ImageContentSourcePolicy
no digest mapping available for registry.redhat.io/amq7/amq-interconnect:1.7, skip writing to ImageContentSourcePolicy
no digest mapping available for registry.redhat.io/container-native-virtualization/virt-launcher:v2.2.0-15, skip writing to ImageContentSourcePolicy


^C

I think here we should set the minimal size, or report errors.

4, When setting --max-icsp-size=1000, there are 181 ICSP generated.
[cloud-user@preserve-olm-env jian]$ oc --loglevel=8 adm catalog mirror --icsp-scope=repository --manifests-only --max-icsp-size=1000  registry.redhat.io/redhat/redhat-operator-index:v4.8 upshift-quay.mirror-registry.qe.devcluster.openshift.com:5001/olmqe --to-manifests=bug-manifests-01 
...

[cloud-user@preserve-olm-env jian]$ oc create -f bug-manifests-01/imageContentSourcePolicy.yaml 
imagecontentsourcepolicy.operator.openshift.io/redhat-operator-index-0 created
imagecontentsourcepolicy.operator.openshift.io/redhat-operator-index-1 created
...
imagecontentsourcepolicy.operator.openshift.io/redhat-operator-index-180 created
imagecontentsourcepolicy.operator.openshift.io/redhat-operator-index-181 created

But, one concern, there is no `metadata.annotations` fields generated for those ICSP objects, for example,
[cloud-user@preserve-olm-env jian]$ oc get imagecontentsourcepolicy redhat-operator-index-1    -o yaml
apiVersion: operator.openshift.io/v1alpha1
kind: ImageContentSourcePolicy
metadata:
  creationTimestamp: "2021-06-15T06:37:33Z"
  generation: 1
  labels:
    operators.openshift.org/catalog: "true"
  name: redhat-operator-index-1
  resourceVersion: "165398"
  uid: 1c461f3a-4ef8-40ba-abed-69b8aa8e87f1
spec:
  repositoryDigestMirrors:
  - mirrors:
    - upshift-quay.mirror-registry.qe.devcluster.openshift.com:5001/olmqe/codeready-workspaces-crw-2-rhel8-operator-metadata
    source: registry.redhat.io/codeready-workspaces/crw-2-rhel8-operator-metadata
  - mirrors:
    - upshift-quay.mirror-registry.qe.devcluster.openshift.com:5001/olmqe/openshift4-ose-prometheus-alertmanager
    source: registry.redhat.io/openshift4/ose-prometheus-alertmanager
  - mirrors:
    - upshift-quay.mirror-registry.qe.devcluster.openshift.com:5001/olmqe/rhacm2-kui-web-terminal-rhel8
    source: registry.redhat.io/rhacm2/kui-web-terminal-rhel8

Comment 4 Jian Zhang 2021-06-15 06:53:27 UTC
One more concern, seems like we should not allow the max size higher than 250000, which means, `--max-icsp-size=270000` should NOT work, but it did work.

[cloud-user@preserve-olm-env jian]$ oc  adm catalog mirror --icsp-scope=repository --manifests-only --max-icsp-size=270000  registry.redhat.io/redhat/redhat-operator-index:v4.8 upshift-quay.mirror-registry.qe.devcluster.openshift.com:5001/olmqe --to-manifests=bug-manifests-03
...

Comment 5 Alexander Greene 2021-06-15 17:59:49 UTC
@jiazha 

Regarding points 2 and 3, we could introduce a requirement that the specified file size be greater than 0, but I am not convinced that we should arbitrarily set a limit to handle ICSP sizes that are too small to add a single entry.

Regarding Point 4, could you explain what the issue is? As far as I can tell, it correctly limited each ICSP size according to the max-icsp-size value you provided.

Regarding "One more concern, seems like we should not allow the max size higher than 250000, which means, `--max-icsp-size=270000` should NOT work, but it did work.", the flag is meant to configure the maximum ICSP generated by the command. There is no reason a customer cannot create an ICSP larger that 250000 as it will work when using `oc create` as opposed to `oc apply`. 250000 is simply a sane default.

Comment 6 Alexander Greene 2021-06-15 18:40:32 UTC
@jiazha it might make sense to:
- Fail immediately if the --max-icsp-size is set to a value less than 1
- Fail if a single mirror cannot be added to an ICSP without exceeding the provided icsp-size-limit

Comment 7 Alexander Greene 2021-06-15 18:59:30 UTC
@jiazha 

It also seems like the validation above should be done as a separate bug.

Comment 8 Jian Zhang 2021-06-16 01:59:31 UTC
Hi Alexander,

> Regarding points 2 and 3, we could introduce a requirement that the specified file size be greater than 0, but I am not convinced that we should arbitrarily set a limit to handle ICSP sizes that are too small to add a single entry.

Yes, but we should cover this situation. Actually, it will hang there until the --max-icsp-size >= 600, maybe we should set the minimum size limits.


> Regarding Point 4, could you explain what the issue is? As far as I can tell, it correctly limited each ICSP size according to the max-icsp-size value you provided.

Yes, it works as expected. Please ignore it, sorry, my mistake, the `metadata.annotations` fields will be generated when using `oc apply`, not `oc create`.


> the flag is meant to configure the maximum ICSP generated by the command. There is no reason a customer cannot create an ICSP larger that 250000 as it will work when using `oc create` as opposed to `oc apply`. 250000 is simply a sane default.


Yes, totally agreed. But, if the users set a very large limit, they will still encounter this issue when using `oc apply`. But, seems like this situation is more like a wrong operation, not a bug. It's OK for me.

> It also seems like the validation above should be done as a separate bug.

Yeah, but this bug introduced a new flag: "--max-icsp-size", if we fixed it in a new bug, I'm not sure if the fix can be released with 4.8.0.

Comment 9 Kevin Rizza 2021-06-16 12:41:52 UTC
Hi Jian

4.8 code freeze has already happened, and I do not believe that this particular edge case should be considered a blocker for the OpenShift release. I'm very strongly of the opinion that this request, while valid, is a nice to have improvement rather than something that should impact this bug from being verified. If you would like to continue tracking this in bugzilla (which I think is reasonable), we should open a separate bug for it rather than holding up the openshift release for another build to handle this edge case. The bug description in question has been resolved by the changes that were already merged, and the remaining is an additional artifact that won't block any customer from actually using the product.

I'm putting this bug back into the MODIFIED state. Please file another bug to handle the hanging case, we should fix that separately and add validation to prevent invalid values from getting passed in.

Comment 11 Jian Zhang 2021-06-17 01:22:26 UTC
Hi Kevin, 

Make sense, I open a new bug https://bugzilla.redhat.com/show_bug.cgi?id=1972962 to trace that issue. Verified this one.

Comment 14 errata-xmlrpc 2021-07-27 23:01:52 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Moderate: OpenShift Container Platform 4.8.2 bug fix and security update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2021:2438


Note You need to log in before you can comment on or make changes to this bug.