Summary: | Marketplace extract container does not request CPU or memory | |||
---|---|---|---|---|
Product: | OpenShift Container Platform | Reporter: | Clayton Coleman <ccoleman> | |
Component: | OLM | Assignee: | Kevin Rizza <krizza> | |
OLM sub component: | OLM | QA Contact: | Salvatore Colangelo <scolange> | |
Status: | CLOSED ERRATA | Docs Contact: | ||
Severity: | high | |||
Priority: | high | CC: | jiazha, jlanford, nhale, tflannag, wking | |
Version: | 4.8 | |||
Target Milestone: | --- | |||
Target Release: | 4.8.0 | |||
Hardware: | Unspecified | |||
OS: | Unspecified | |||
: | 1952851 (view as bug list) | Environment: | ||
Last Closed: | 2021-07-27 22:53:17 UTC | Type: | Bug | |
Bug Blocks: | 1952851 |
Description
Clayton Coleman
2021-03-13 22:27:22 UTC
From the CONVENTIONS doc:

The memory request of cluster components should be set to a value 10% higher than their 90th percentile actual consumption over a standard end-to-end suite run.

The CPU request of cluster components is based on the following formula and lower/upper bound rules:

> floor(baseline_request / baseline_actual * component_actual)
>
> Then, these rules for lower and upper bounds should be applied:
>
> The CPU request should never be lower than 5m. Setting a 5m limit avoids extreme ratio calculations when the node is stressed, while still representing the noise of a mostly idle workload.
> If the computed value is more than 100m, use the lower of the computed value and 200% of the usage of the component in an idle cluster. This cap means components that require bursts of CPU time may be throttled on busy hosts, but they are more likely to be schedulable in the first place.

Since the openshift-marketplace's "extract" batch job is part of a control plane component, we will use etcd as a baseline to compute its CPU resource requests.

Both CPU and memory request formulas use numbers based on the end-to-end parallel conformance test job. After running the tests, use the Prometheus instance in the cluster to query the kube_pod_resource_request and kube_pod_resource_limit metrics and find numbers for the Pod(s) for the component being tuned.

Moving to high severity, payload workloads may not run without requests.

This may not be deferred from 4.8

> This may not be deferred from 4.8
That means blocker+.
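The request rules quoted from the CONVENTIONS doc in the description can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function and parameter names are hypothetical, values are whole numbers in a single unit (for example MiB for memory, millicores for CPU), and rounding the memory request up is an assumption, since the quoted text does not say how to round.

```python
import math


def memory_request(p90_usage: int) -> int:
    """Memory request: 10% above the 90th percentile actual consumption.

    Integer arithmetic avoids floating-point surprises; rounding up is an
    assumption, the quoted convention does not specify a rounding rule.
    """
    return (p90_usage * 11 + 9) // 10


def cpu_request_millicores(
    baseline_request: float,
    baseline_actual: float,
    component_actual: float,
    component_idle: float,
) -> int:
    """CPU request following the quoted formula and bound rules.

    baseline_* describe the baseline component (etcd in this bug),
    component_actual is the tuned component's usage during the e2e run,
    component_idle is its usage in an idle cluster. All in millicores.
    """
    computed = math.floor(baseline_request / baseline_actual * component_actual)
    # Upper bound: above 100m, cap at 200% of idle-cluster usage.
    if computed > 100:
        computed = min(computed, math.floor(2 * component_idle))
    # Lower bound: never request less than 5m.
    return max(computed, 5)
```

With made-up inputs, `cpu_request_millicores(300, 600, 20, 10)` yields 10 and `memory_request(45)` yields 50; the real inputs would have to come from the Prometheus metrics of an end-to-end run.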
Pulling back to POST briefly while I update the test suite. This will not affect verifying the marketplace fix, so feel free to go ahead with that. Will be back to ON_QA shortly.

    [scolange@scolange go]$ oc get clusterversion
    NAME      VERSION                             AVAILABLE   PROGRESSING   SINCE   STATUS
    version   4.8.0-0.nightly-2021-04-18-101412   True        False         26h     Cluster version is 4.8.0-0.nightly-2021-04-18-101412

1. Install an operator in a general namespace

    [scolange@scolange go]$ oc -n scolange get sub
    NAME                             PACKAGE                          SOURCE                CHANNEL
    couchbase-enterprise-certified   couchbase-enterprise-certified   certified-operators   stable

    [scolange@scolange go]$ oc -n scolange get csv
    NAME                        DISPLAY              VERSION   REPLACES                    PHASE
    couchbase-operator.v2.1.0   Couchbase Operator   2.1.0     couchbase-operator.v2.0.2   Succeeded

    [scolange@scolange go]$ oc -n scolange get ip
    NAME            CSV                         APPROVAL    APPROVED
    install-4vgjl   couchbase-operator.v2.1.0   Automatic   true

2. Verify the jobs in the origin namespace (in this case, openshift-marketplace)

    [scolange@scolange go]$ oc -n openshift-marketplace get jobs
    NAME                                                              COMPLETIONS   DURATION   AGE
    5c410a08445875ef0dd1a81b992b068f3a86bd2f5a79c433ad9e0bc4d62ef09   1/1           25s        19m

3. Verify that the spec.containers[].resources.requests field is set inside the job

    [scolange@scolange go]$ oc -n openshift-marketplace get jobs 5c410a08445875ef0dd1a81b992b068f3a86bd2f5a79c433ad9e0bc4d62ef09 -o yaml
    apiVersion: batch/v1
    kind: Job
    metadata:
      creationTimestamp: "2021-04-20T15:38:07Z"
      labels:
        controller-uid: 52bcb04d-571c-408b-9b5d-69f2466e7806
        job-name: 5c410a08445875ef0dd1a81b992b068f3a86bd2f5a79c433ad9e0bc4d62ef09
      managedFields:
      - apiVersion: batch/v1
        fieldsType: FieldsV1
        fieldsV1:
          f:metadata:
            f:ownerReferences:
              .: {}
              k:{"uid":"dc09c497-b0c5-4f82-a70b-31fc03ce774a"}:
                .: {}
                f:apiVersion: {}
                f:blockOwnerDeletion: {}
    ....
    ...
    ....
            resources:
              requests:
                cpu: 10m
                memory: 50Mi
    ....

LGTM

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA.

For information on the advisory (Moderate: OpenShift Container Platform 4.8.2 bug fix and security update), and where to find the updated files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2021:2438
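The manual check from verification step 3 could also be scripted. The following is a minimal sketch, not part of the fix or of any OLM tooling: `containers_missing_requests` is a hypothetical helper that takes a Job manifest already parsed into a dict (for example from `oc get job <name> -o json`) and lists every container or init container in the pod template that lacks a CPU or memory request.

```python
from typing import Any, Dict, List


def containers_missing_requests(job: Dict[str, Any]) -> List[str]:
    """Return "<container>: <resource>" for each missing CPU/memory request.

    Looks at spec.template.spec of a batch/v1 Job and checks both
    initContainers and containers. An empty list means every container
    declares both requests.
    """
    pod_spec = job.get("spec", {}).get("template", {}).get("spec", {})
    missing: List[str] = []
    for kind in ("initContainers", "containers"):
        for container in pod_spec.get(kind) or []:
            requests = (container.get("resources") or {}).get("requests") or {}
            for resource in ("cpu", "memory"):
                if resource not in requests:
                    name = container.get("name", "<unnamed>")
                    missing.append(f"{name}: {resource}")
    return missing


# Illustrative manifest fragment; the container names are placeholders.
sample_job = {
    "apiVersion": "batch/v1",
    "kind": "Job",
    "spec": {
        "template": {
            "spec": {
                "initContainers": [{"name": "init-example", "resources": {}}],
                "containers": [
                    {
                        "name": "extract",
                        "resources": {"requests": {"cpu": "10m", "memory": "50Mi"}},
                    }
                ],
            }
        }
    },
}
```

To use it against a live cluster, one option is to load the output of `oc -n openshift-marketplace get job <name> -o json` with `json.load` and pass the result to the helper.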