Bug 1926893

Summary: When deploying the operator via OLM (after creating the respective catalogsource), the deployment "lost" the `resources` section.
Product: OpenShift Container Platform
Reporter: Vinu K <vkochuku>
Component: OLM
Assignee: Joe Lanford <jlanford>
OLM sub component: OLM
QA Contact: xzha
Status: CLOSED ERRATA
Docs Contact:
Severity: medium
Priority: high
CC: anbhatta, jlanford, kuiwang, kwall, vdinh
Version: 4.6   
Target Milestone: ---   
Target Release: 4.8.0   
Hardware: x86_64   
OS: Linux   
Whiteboard:
Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Cause: The subscription.spec.config.resources is ALWAYS applied to the installed deployment, even when it is unset/empty.
Consequence: Resources defined in the CSV are ignored. Only resources defined in the subscription.spec.config.resources are honored.
Fix: Update OLM to override deployment-specific resources only when the subscription.spec.config.resources field is non-nil and non-empty.
Result: Resources defined in the deployment are only overridden when the subscription.spec.config.resources field is set to a non-empty value.
Story Points: ---
Clone Of:
: 1937375
Environment:
Last Closed: 2021-07-27 22:42:47 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:
Bug Depends On:
Bug Blocks: 1937375

Description Vinu K 2021-02-09 15:57:12 UTC
Description of problem:

When a CSV is generated via the Operator SDK (version 0.19.4, the supported version for 4.6 as per the docs), the requests and limits are taken from the operator deployment manifest, as they should be. However, when the operator is then deployed via OLM (after creating the respective catalogsource), the deployment loses the `resources` section. This means that the requests and limits specified in the CSV are not applied.
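
For orientation, the fragment below sketches where the `resources` section sits inside a CSV. The CSV name, deployment name, and resource values are the ones that appear in the verification output later in this bug. The container name and image are placeholders (assumptions, not taken from the reproducer), and the rest of the CSV is omitted, so this is not a complete manifest.

```yaml
apiVersion: operators.coreos.com/v1alpha1
kind: ClusterServiceVersion
metadata:
  name: helloworld-operator.v0.0.1
spec:
  install:
    strategy: deployment
    spec:
      deployments:
      - name: helloworld-operator
        spec:
          template:
            spec:
              containers:
              - name: operator                                    # placeholder name (assumption)
                image: example.invalid/helloworld-operator:0.0.1  # placeholder image (assumption)
                resources:                                        # the section that goes missing on the installed deployment
                  limits:
                    cpu: 500m
                    memory: 500Mi
                  requests:
                    cpu: 100m
                    memory: 100Mi
```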

Version-Release number of selected component (if applicable):
4.6

How reproducible:
Easy

Steps to Reproduce:

1. Create the following catalogsource in your cluster:

---
apiVersion: operators.coreos.com/v1alpha1
kind: CatalogSource
metadata:
  labels:
    olm-visibility: hidden
    opsrc-datastore: "true"
    opsrc-provider: reproducer
  name: reproducer-operators
  namespace: openshift-marketplace
spec:
  displayName: Reproducer Operators
  icon:
    base64data: ""
    mediatype: ""
  image: quay.io/jaeichle/reproducer-hello-world-index:0.0.1
  priority: -400
  publisher: jaeichle
  sourceType: grpc
  updateStrategy:
    registryPoll:
      interval: 100m0s
---

2. Once the catalogsource is there and has fetched the index image, go to the OperatorHub and install the operator called Reproducer Requests Limits into a specific namespace.

3. Check the clusterserviceversion and notice the requests and limits defined in the deployment.

4. When the operator installation has finished, check the deployment or pod of the operator: no requests or limits are defined.

Actual results:

The requests and limits are not used as specified in the CSV.

Expected results:

The requests and limits should be used as specified in the CSV.

Additional info:

This leads to OOM killed pods as the default limits are, by design, very low.

Comment 1 Joe Lanford 2021-02-16 17:40:22 UTC
It appears that the issue here is that the subscription.spec.config.resources is ALWAYS applied to the installed deployment, even when it is unset/empty.

To fix this, I think we'll need to use a pointer for subscription.spec.config.resources, and then only override the deployment resources when subscription.spec.config.resources is not nil.

The workaround for this issue is to set subscription.spec.config.resources to the desired resources, which will then be propagated back to the deployment when it is installed.
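
A minimal sketch of that workaround, assuming the reproducer from the description: the package name, catalog source, and source namespace come from this bug, while the Subscription name, target namespace, and channel are placeholders to adapt. The resource values are examples matching those in the reproducer's CSV.

```yaml
apiVersion: operators.coreos.com/v1alpha1
kind: Subscription
metadata:
  name: helloworld-operator          # placeholder name (assumption)
  namespace: my-operator-namespace   # placeholder: the namespace chosen at install time
spec:
  name: helloworld-operator          # package name of the reproducer operator
  channel: alpha                     # placeholder: use the channel the package actually offers
  source: reproducer-operators
  sourceNamespace: openshift-marketplace
  config:
    resources:                       # set explicitly so these values end up on the deployment
      limits:
        cpu: 500m
        memory: 500Mi
      requests:
        cpu: 100m
        memory: 100Mi
```

An OperatorGroup in the target namespace is still required for the installation; it is omitted here.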

Comment 3 xzha 2021-03-05 09:42:37 UTC
verify:

[root@preserve-olm-env ~]# oc adm release info registry.ci.openshift.org/ocp/release:4.8.0-0.nightly-2021-03-05-015511 --commits|grep operator-lifecycle-manager
  operator-lifecycle-manager                     https://github.com/operator-framework/operator-lifecycle-manager            37878747f2019f3f840dcc414345aab6ae34a2d1

zhaoxia@wangshanshandeMacBook-Pro 1926893 % oc get clusterversion
NAME      VERSION                             AVAILABLE   PROGRESSING   SINCE   STATUS
version   4.8.0-0.nightly-2021-03-05-015511   True        False         22m     Cluster version is 4.8.0-0.nightly-2021-03-05-015511

1) create catalogsource
zhaoxia@wangshanshandeMacBook-Pro 1926893 % cat catsrc.yaml 
apiVersion: operators.coreos.com/v1alpha1
kind: CatalogSource
metadata:
  labels:
    olm-visibility: hidden
    opsrc-datastore: "true"
    opsrc-provider: reproducer
  name: reproducer-operators
  namespace: openshift-marketplace
spec:
  displayName: Reproducer Operators
  icon:
    base64data: ""
    mediatype: ""
  image: quay.io/jaeichle/reproducer-hello-world-index:0.0.1
  priority: -400
  publisher: jaeichle
  sourceType: grpc
  updateStrategy:
    registryPoll:
      interval: 100m0s
zhaoxia@wangshanshandeMacBook-Pro 1926893 % oc apply -f catsrc.yaml 
catalogsource.operators.coreos.com/reproducer-operators created
zhaoxia@wangshanshandeMacBook-Pro 1926893 % oc get catsrc -A
NAMESPACE               NAME                   DISPLAY                TYPE   PUBLISHER      AGE
openshift-marketplace   certified-operators    Certified Operators    grpc   Red Hat        38m
openshift-marketplace   community-operators    Community Operators    grpc   Red Hat        38m
openshift-marketplace   qe-app-registry        Production Operators   grpc   OpenShift QE   16m
openshift-marketplace   redhat-marketplace     Red Hat Marketplace    grpc   Red Hat        38m
openshift-marketplace   redhat-operators       Red Hat Operators      grpc   Red Hat        38m
openshift-marketplace   reproducer-operators   Reproducer Operators   grpc   jaeichle       39s
zhaoxia@wangshanshandeMacBook-Pro 1926893 % oc get packagemanifest -A | grep Reproducer
openshift-marketplace   helloworld-operator                                  Reproducer Operators   76s

2) install operator Reproducer Requests Limits

zhaoxia@wangshanshandeMacBook-Pro 1926893 % oc get csv helloworld-operator.v0.0.1  -o yaml | grep limit -A 5
                  limits:
                    cpu: 500m
                    memory: 500Mi
                  requests:
                    cpu: 100m
                    memory: 100Mi

zhaoxia@wangshanshandeMacBook-Pro 1926893 % oc get deployment
NAME                  READY   UP-TO-DATE   AVAILABLE   AGE
helloworld-operator   1/1     1            1           9m9s
zhaoxia@wangshanshandeMacBook-Pro 1926893 % oc get deployment helloworld-operator   -o yaml | grep resources -A 6

                f:resources:
                  .: {}
                  f:limits:
                    .: {}
                    f:cpu: {}
                    f:memory: {}
                  f:requests:
--
        resources:
          limits:
            cpu: 500m
            memory: 500Mi
          requests:
            cpu: 100m
            memory: 100Mi
zhaoxia@wangshanshandeMacBook-Pro 1926893 % oc get pod
NAME                                   READY   STATUS    RESTARTS   AGE
helloworld-operator-5b6988cc6d-vr4r4   1/1     Running   0          9m59s
zhaoxia@wangshanshandeMacBook-Pro 1926893 % oc get pod helloworld-operator-5b6988cc6d-vr4r4 -o yaml | grep resources -A 6
            f:resources:
              .: {}
              f:limits:
                .: {}
                f:cpu: {}
                f:memory: {}
              f:requests:
--
    resources:
      limits:
        cpu: 500m
        memory: 500Mi
      requests:
        cpu: 100m
        memory: 100Mi

verified

Comment 6 errata-xmlrpc 2021-07-27 22:42:47 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Moderate: OpenShift Container Platform 4.8.2 bug fix and security update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2021:2438