Bug 1726455 - Limits are not reconciled by CVO
Summary: Limits are not reconciled by CVO
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: Cluster Version Operator
Version: 4.1.z
Hardware: Unspecified
OS: Unspecified
Priority: high
Severity: high
Target Milestone: ---
Target Release: 4.2.0
Assignee: Joseph Callen
QA Contact: Oleg Nesterov
URL:
Whiteboard:
Depends On:
Blocks:
 
Reported: 2019-07-02 21:36 UTC by Justin Pierce
Modified: 2019-10-16 06:33 UTC (History)
CC List: 7 users

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2019-10-16 06:33:03 UTC
Target Upstream Version:
Embargoed:




Links:
Github openshift cluster-version-operator pull 240 (last updated 2019-08-20 16:48:25 UTC)
Red Hat Product Errata RHBA-2019:2922 (last updated 2019-10-16 06:33:14 UTC)

Description Justin Pierce 2019-07-02 21:36:04 UTC
Description of problem:
It was noted during an upgrade from 4.1.3 to 4.1.4 that a change (the removal of the memory limit spec for the CCO deployment) was not reconciled. As a result, the overly restrictive limit remained active.

Version-Release number of the following components:
4.1.4

How reproducible:
by design: https://github.com/openshift/cluster-version-operator/blob/08cac1c02538c279d9aac094d42cf8788dc4e45c/lib/resourcemerge/core.go#L95-L149
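To illustrate the pre-fix behavior the linked code implements, here is a minimal Go sketch (simplified; mergeResources and its exact shape are approximations for illustration, not the upstream code): values set in the required manifest are copied into the in-cluster object, but keys the manifest no longer sets are never deleted.

package main

import (
	"fmt"

	corev1 "k8s.io/api/core/v1"
	"k8s.io/apimachinery/pkg/api/resource"
)

// mergeResources sketches the pre-fix merge: everything the cluster already
// has is kept, and manifest values overwrite on a key-by-key basis. A key
// removed from the manifest therefore survives in the cluster indefinitely.
func mergeResources(existing, required corev1.ResourceList) corev1.ResourceList {
	merged := corev1.ResourceList{}
	for name, quantity := range existing {
		merged[name] = quantity // keep whatever is already in the cluster
	}
	for name, quantity := range required {
		merged[name] = quantity // overwrite only keys the manifest still sets
	}
	return merged
}

func main() {
	existing := corev1.ResourceList{
		corev1.ResourceMemory: resource.MustParse("150Mi"),
	}
	required := corev1.ResourceList{} // the new manifest dropped the memory limit
	merged := mergeResources(existing, required)
	_, stillPresent := merged[corev1.ResourceMemory]
	fmt.Println("memory limit still present after merge:", stillPresent) // true
}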

Steps to Reproduce:
1. Install 4.1.3 and note the limits on the openshift-cloud-credential-operator pod.
2. Upgrade to 4.1.4.

Actual results:
The CVO will not remove the limit for the deployment/pod despite it being removed from the CCO manifest. 

Expected results:
ccoleman noted that the limit should be reconciled (removed) to match the manifest.

Comment 2 Oleg Nesterov 2019-08-23 08:10:45 UTC
Checked the memory limit of the pod after upgrading from 4.1.11 to 4.1.12 and from 4.1.0-0.nightly-2019-08-21-150150 to 4.1.0-0.nightly-2019-08-22-165647. The memory limit still exists in both cases:

[cloud-user@preserve-qe-olnester-ocp-workstation openshift]$ oc get pods cloud-credential-operator-7cdfbcf58b-ch7t6 -o yaml
apiVersion: v1
kind: Pod
metadata:
  annotations:
    k8s.v1.cni.cncf.io/networks-status: ""
  creationTimestamp: "2019-08-23T08:00:29Z"
  deletionGracePeriodSeconds: 600
  deletionTimestamp: "2019-08-23T08:13:49Z"
  generateName: cloud-credential-operator-7cdfbcf58b-
  labels:
    control-plane: controller-manager
    controller-tools.k8s.io: "1.0"
    pod-template-hash: 7cdfbcf58b
  name: cloud-credential-operator-7cdfbcf58b-ch7t6
  namespace: openshift-cloud-credential-operator
  ownerReferences:
  - apiVersion: apps/v1
    blockOwnerDeletion: true
    controller: true
    kind: ReplicaSet
    name: cloud-credential-operator-7cdfbcf58b
    uid: cb70f11f-c57a-11e9-98f9-023b25b70b2c
  resourceVersion: "39590"
  selfLink: /api/v1/namespaces/openshift-cloud-credential-operator/pods/cloud-credential-operator-7cdfbcf58b-ch7t6
  uid: 132d5d0d-c57c-11e9-98f9-023b25b70b2c
spec:
  containers:
  - command:
    - /root/manager
    - --log-level
    - debug
    env:
    - name: RELEASE_VERSION
      value: 4.1.0-0.nightly-2019-08-22-165647
    image: quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:4e5dfbb88c2ea1f35f68899ca5b40b309280ae996459e063fb1f5c05b0c568e8
    imagePullPolicy: IfNotPresent
    name: manager
    ports:
    - containerPort: 9876
      name: webhook-server
      protocol: TCP
    resources:
      requests:
        cpu: 10m
        memory: 150Mi
    terminationMessagePath: /dev/termination-log

Comment 4 Oleg Nesterov 2019-08-23 14:49:54 UTC
The ticket was opened for 4.1.x, which is why I tested with 4.1.x. I can test it with 4.2.x too.

Comment 5 Joseph Callen 2019-08-23 15:35:40 UTC
(In reply to Oleg Nesterov from comment #4)
> Ticket is opened for 4.1.x , it's why I tested with 4.1.x. I can test it
> with 4.2.x too

Yes please test with 4.2.x, thanks!

Comment 6 Oleg Nesterov 2019-08-23 20:21:19 UTC
After upgrading from 4.2.0-0.nightly-2019-08-22-201424 to 4.2.0-0.nightly-2019-08-23-004712, I observe the same situation: the memory: 150Mi spec is still present:

[onest@localhost go_openshift]$ oc get pods cloud-credential-operator-6cc8645548-4vxft -o yaml -n openshift-cloud-credential-operator
apiVersion: v1
kind: Pod
metadata:
  creationTimestamp: "2019-08-23T19:54:07Z"
  generateName: cloud-credential-operator-6cc8645548-
  labels:
    control-plane: controller-manager
    controller-tools.k8s.io: "1.0"
    pod-template-hash: 6cc8645548
  name: cloud-credential-operator-6cc8645548-4vxft
  namespace: openshift-cloud-credential-operator
  ownerReferences:
  - apiVersion: apps/v1
    blockOwnerDeletion: true
    controller: true
    kind: ReplicaSet
    name: cloud-credential-operator-6cc8645548
    uid: f7afb1d5-c5dd-11e9-8991-02684f45b248
  resourceVersion: "92004"
  selfLink: /api/v1/namespaces/openshift-cloud-credential-operator/pods/cloud-credential-operator-6cc8645548-4vxft
  uid: c4870c07-c5df-11e9-b245-0e41aff0a3ba
spec:
  containers:
  - command:
    - /root/manager
    - --log-level
    - debug
    env:
    - name: RELEASE_VERSION
      value: 4.2.0-0.nightly-2019-08-23-004712
    image: quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:60be08d77af406589526adafe8e9a3eb9b5f03df7c04f1826677751b521104eb
    imagePullPolicy: IfNotPresent
    name: manager
    ports:
    - containerPort: 9876
      name: webhook-server
      protocol: TCP
    resources:
      requests:
        cpu: 10m
        memory: 150Mi

Comment 7 Joseph Callen 2019-08-23 20:30:52 UTC
(In reply to Oleg Nesterov from comment #6)
> After upgrade from 4.2.0-0.nightly-2019-08-22-201424 to
> 4.2.0-0.nightly-2019-08-23-004712, I observe the same situation: memory:
> 150Mi spec still present

It is supposed to be present:

https://github.com/openshift/cloud-credential-operator/blob/14ea16f11aaf6b77d3bce0dcfb93546e12da6942/manifests/01_deployment.yaml#L168

Comment 8 Joseph Callen 2019-08-23 20:44:44 UTC
If a container's resource requirements change between releases, the newest (required) values are used.
The easiest way to test this is probably to modify a deployment and remove or change the requests or limits, then upgrade and make sure the requirements are changed back to what is defined in the manifest.


https://github.com/openshift/cluster-version-operator/blob/817a2d5f256526bfab7368a978655c9cc06334ad/lib/resourcemerge/core.go#L449-L459
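In other words, once the in-cluster requirements differ at all from the manifest, the manifest value wins wholesale. A minimal sketch of that post-fix shape (simplified and with approximate names, not the exact upstream code):

package main

import (
	"fmt"

	corev1 "k8s.io/api/core/v1"
	"k8s.io/apimachinery/pkg/api/equality"
	"k8s.io/apimachinery/pkg/api/resource"
)

// setResourceRequirements sketches the post-fix behavior: if the in-cluster
// requirements differ from the manifest's, replace them wholesale, so a
// limit that was removed from the manifest is removed from the cluster too.
func setResourceRequirements(existing *corev1.ResourceRequirements, required corev1.ResourceRequirements) bool {
	if equality.Semantic.DeepEqual(*existing, required) {
		return false // already in sync, nothing to update
	}
	*existing = required
	return true // modified; the caller would issue an update
}

func main() {
	existing := corev1.ResourceRequirements{
		Limits:   corev1.ResourceList{corev1.ResourceMemory: resource.MustParse("150Mi")},
		Requests: corev1.ResourceList{corev1.ResourceCPU: resource.MustParse("10m")},
	}
	// The new manifest keeps the request but no longer sets any limit.
	required := corev1.ResourceRequirements{
		Requests: corev1.ResourceList{corev1.ResourceCPU: resource.MustParse("10m")},
	}
	modified := setResourceRequirements(&existing, required)
	fmt.Println("modified:", modified, "limits now:", existing.Limits) // modified: true, limits now: map[]
}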

Comment 9 Oleg Nesterov 2019-08-24 04:59:28 UTC
Verified. It works with an upgrade from 4.2.0-0.nightly-2019-08-23-004712 to 4.2.0-0.nightly-2019-08-23-034826.

Comment 10 errata-xmlrpc 2019-10-16 06:33:03 UTC
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA.

For information on the advisory, and where to find the updated files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2019:2922

