Bug 1938492 - Marketplace extract container does not request CPU or memory
Summary: Marketplace extract container does not request CPU or memory
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: OLM
Version: 4.8
Hardware: Unspecified
OS: Unspecified
Priority: high
Severity: high
Target Milestone: ---
Target Release: 4.8.0
Assignee: Kevin Rizza
QA Contact: Salvatore Colangelo
URL:
Whiteboard:
Depends On:
Blocks: 1952851
Reported: 2021-03-13 22:27 UTC by Clayton Coleman
Modified: 2021-07-27 22:53 UTC (History)

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Clones: 1952851
Environment:
Last Closed: 2021-07-27 22:53:17 UTC
Target Upstream Version:
Embargoed:


Links
System | ID | Private | Priority | Status | Summary | Last Updated
Github | openshift operator-framework-olm pull 55 | 0 | None | closed | Bug 1938492: Add resource requests for bundle unpacker | 2021-04-16 02:56:54 UTC
Github | openshift origin pull 26085 | 0 | None | open | Bug 1938492: test/extended/operators/resources: Marketplace Job sets requests | 2021-04-16 02:58:20 UTC
Github | operator-framework operator-lifecycle-manager pull 2099 | 0 | None | closed | Add resource requests for bundle unpacker | 2021-04-16 02:56:52 UTC
Red Hat Product Errata | RHSA-2021:2438 | 0 | None | None | None | 2021-07-27 22:53:36 UTC

Description Clayton Coleman 2021-03-13 22:27:22 UTC
All payload components should request a reasonable minimum CPU and p90 memory usage

https://github.com/openshift/enhancements/blob/master/CONVENTIONS.md#resources-and-limits

The openshift-marketplace batch job container "extract" does not request CPU or memory. Please follow the recommendations in that doc.

Referenced from the new e2e test which gates components without resource requests and enforces the resource conventions.

  batch/v1/Job/openshift-marketplace/ab0ec41ac51719de72554e09c32400b13c6d15dcf7d38302d5ed14fcb2e8839/container/extract does not have a cpu request (rule: "batch/v1/Job/openshift-marketplace/ab0ec41ac51719de72554e09c32400b13c6d15dcf7d38302d5ed14fcb2e8839/container/extract/request[cpu]")
  batch/v1/Job/openshift-marketplace/ab0ec41ac51719de72554e09c32400b13c6d15dcf7d38302d5ed14fcb2e8839/container/extract does not have a memory request (rule: "batch/v1/Job/openshift-marketplace/ab0ec41ac51719de72554e09c32400b13c6d15dcf7d38302d5ed14fcb2e8839/container/extract/request[memory]")

Comment 1 Joe Lanford 2021-03-15 13:59:14 UTC
From the CONVENTIONS doc:

The memory request of cluster components should be set to a value 10% higher than their 90th percentile actual consumption over a standard end-to-end suite run.
The CPU request of cluster components is based on the following formula and lower/upper bound rules:

> floor(baseline_request / baseline_actual * component_actual)
>
> Then, these rules for lower and upper bounds should be applied:
>
>    The CPU request should never be lower than 5m. Setting a 5m limit avoids extreme ratio calculations when the node is stressed, while still representing the noise of a mostly idle workload.
>    If the computed value is more than 100m, use the lower of the computed value and 200% of the usage of the component in an idle cluster. This cap means components that require bursts of CPU time may be throttled on busy hosts, but they are more likely to be schedulable in the first place.

Since the openshift-marketplace's "extract" batch job is part of a control plane component, we will use etcd as a baseline to compute its CPU resource requests.
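
As a rough illustration of the rules quoted above, the following Python sketch computes a CPU request in millicores and a memory request in bytes. The function names, parameter names, and sample numbers are hypothetical and are not taken from the CONVENTIONS doc or from the eventual fix. The order in which the 100m cap and the 5m floor are applied is also an assumption.

```python
import math


def cpu_request_millicores(baseline_request_m, baseline_actual_m,
                           component_actual_m, component_idle_m):
    """Apply the quoted CPU formula and its lower/upper bound rules.

    All arguments are in millicores. The baseline values would come from
    the baseline component (etcd, per the comment above); the component
    values come from the workload being tuned.
    """
    computed = math.floor(
        baseline_request_m / baseline_actual_m * component_actual_m
    )
    # Upper-bound rule: above 100m, take the lower of the computed value
    # and 200% of the component's usage in an idle cluster.
    if computed > 100:
        computed = min(computed, math.floor(2 * component_idle_m))
    # Lower-bound rule: never request less than 5m.
    return max(computed, 5)


def memory_request_bytes(p90_bytes):
    """Return a value 10% above the 90th-percentile usage, rounded up."""
    # Integer arithmetic avoids floating-point rounding surprises.
    return (p90_bytes * 110 + 99) // 100


# Hypothetical sample numbers, for illustration only.
print(cpu_request_millicores(300, 150, 20, 5))   # small workload
print(cpu_request_millicores(300, 150, 80, 30))  # computed value above 100m
print(memory_request_bytes(40 * 1024 * 1024))    # p90 of 40 MiB
```

The result would still need to be rounded to a convenient value and checked against real measurements from the end-to-end run.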

Both CPU and memory request formulas use numbers based on the end-to-end parallel conformance test job. After running the tests, use the Prometheus instance in the cluster to query the kube_pod_resource_request and kube_pod_resource_limit metrics and find numbers for the Pod(s) for the component being tuned.
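
One way to look up the metrics named above is through the Prometheus HTTP API's instant-query endpoint (`/api/v1/query`). The sketch below only builds the request URL. The base URL is a placeholder, the helper name is hypothetical, and the `namespace` and `resource` label names are assumptions that should be checked against the labels actually exposed in the cluster.

```python
from urllib.parse import urlencode


def build_instant_query_url(base_url, metric, namespace, resource):
    """Build a Prometheus instant-query URL for one metric selector.

    The label names used in the selector (namespace, resource) are
    assumptions; adjust them to match the metric's real labels.
    """
    selector = f'{metric}{{namespace="{namespace}",resource="{resource}"}}'
    query_string = urlencode({"query": selector})
    return base_url.rstrip("/") + "/api/v1/query?" + query_string


# Placeholder host; substitute the cluster's Prometheus route.
url = build_instant_query_url(
    "https://prometheus.example.invalid",
    "kube_pod_resource_request",
    "openshift-marketplace",
    "cpu",
)
print(url)
```

An authenticated GET against the resulting URL (for example with a bearer token) would then return the matching series as JSON.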

Comment 3 Clayton Coleman 2021-03-25 22:16:33 UTC
Moving to high severity: payload workloads may not run without requests.

This may not be deferred from 4.8

Comment 4 W. Trevor King 2021-04-02 21:40:19 UTC
> This may not be deferred from 4.8

That means blocker+.

Comment 8 W. Trevor King 2021-04-16 02:55:58 UTC
pulling back to POST briefly while I update the test suite.  This will not affect verifying the marketplace fix, so feel free to go ahead with that.  Will be back to ON_QA shortly.

Comment 10 Salvatore Colangelo 2021-04-20 16:00:31 UTC
[scolange@scolange go]$ oc get clusterversion
NAME      VERSION                             AVAILABLE   PROGRESSING   SINCE   STATUS
version   4.8.0-0.nightly-2021-04-18-101412   True        False         26h     Cluster version is 4.8.0-0.nightly-2021-04-18-101412


1. Install an operator in a general namespace

[scolange@scolange go]$ oc -n scolange get sub
NAME                             PACKAGE                          SOURCE                CHANNEL
couchbase-enterprise-certified   couchbase-enterprise-certified   certified-operators   stable
[scolange@scolange go]$ oc -n scolange get csv
NAME                        DISPLAY              VERSION   REPLACES                    PHASE
couchbase-operator.v2.1.0   Couchbase Operator   2.1.0     couchbase-operator.v2.0.2   Succeeded
[scolange@scolange go]$ oc -n scolange get ip
NAME            CSV                         APPROVAL    APPROVED
install-4vgjl   couchbase-operator.v2.1.0   Automatic   true

2. Verify the jobs in the origin namespace (openshift-marketplace in this case)

[scolange@scolange go]$ oc -n openshift-marketplace get jobs
NAME                                                              COMPLETIONS   DURATION   AGE
5c410a08445875ef0dd1a81b992b068f3a86bd2f5a79c433ad9e0bc4d62ef09   1/1           25s        19m
 

3. Verify that the spec.template.spec.containers[].resources.requests fields are set in the job

[scolange@scolange go]$ oc -n openshift-marketplace get jobs 5c410a08445875ef0dd1a81b992b068f3a86bd2f5a79c433ad9e0bc4d62ef09 -o yaml
apiVersion: batch/v1
kind: Job
metadata:
  creationTimestamp: "2021-04-20T15:38:07Z"
  labels:
    controller-uid: 52bcb04d-571c-408b-9b5d-69f2466e7806
    job-name: 5c410a08445875ef0dd1a81b992b068f3a86bd2f5a79c433ad9e0bc4d62ef09
  managedFields:
  - apiVersion: batch/v1
    fieldsType: FieldsV1
    fieldsV1:
      f:metadata:
        f:ownerReferences:
          .: {}
          k:{"uid":"dc09c497-b0c5-4f82-a70b-31fc03ce774a"}:
            .: {}
            f:apiVersion: {}
            f:blockOwnerDeletion: {}
....
...
....

        resources:
          requests:
            cpu: 10m
            memory: 50Mi
....
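
The manual check above could also be scripted. The following Python sketch takes a Job object as a dictionary (for example, parsed from `oc get job ... -o json`) and lists any containers that lack a CPU or memory request. The helper name is hypothetical, and the sample dictionary is a trimmed, hand-written stand-in rather than the full output shown above. Containers in a Job live under `spec.template.spec.containers`.

```python
def containers_missing_requests(job, required=("cpu", "memory")):
    """Return (container_name, resource) pairs that have no request set."""
    pod_spec = job.get("spec", {}).get("template", {}).get("spec", {})
    missing = []
    for container in pod_spec.get("containers", []):
        requests = (container.get("resources") or {}).get("requests") or {}
        for resource in required:
            if resource not in requests:
                missing.append((container.get("name", "<unnamed>"), resource))
    return missing


# Trimmed stand-in for the Job above; only the fields the check reads.
fixed_job = {
    "spec": {"template": {"spec": {"containers": [
        {"name": "extract",
         "resources": {"requests": {"cpu": "10m", "memory": "50Mi"}}},
    ]}}}
}
# Stand-in for the pre-fix state, where no requests were set.
unfixed_job = {
    "spec": {"template": {"spec": {"containers": [
        {"name": "extract", "resources": {}},
    ]}}}
}

print(containers_missing_requests(fixed_job))
print(containers_missing_requests(unfixed_job))
```

An empty list for the fixed Job corresponds to the passing state that the e2e test enforces.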








LGTM

Comment 13 errata-xmlrpc 2021-07-27 22:53:17 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Moderate: OpenShift Container Platform 4.8.2 bug fix and security update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2021:2438

