Description of problem:

During an upgrade from 4.1.3 to 4.1.4 it was noted that a change (the removal of the memory limit spec for the CCO deployment) was not reconciled. As a result, the overly restrictive limit remained active.

Version-Release number of the following components: 4.1.4

How reproducible:
By design: https://github.com/openshift/cluster-version-operator/blob/08cac1c02538c279d9aac094d42cf8788dc4e45c/lib/resourcemerge/core.go#L95-L149

Steps to Reproduce:
1. Install 4.1.3 and note the limits on the openshift-cloud-credential-operator pod.
2. Upgrade to 4.1.4.

Actual results:
The CVO does not remove the limit from the deployment/pod even though it was removed from the CCO manifest.

Expected results:
Per ccoleman, the limit should be reconciled (i.e. removed to match the manifest).
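For illustration only, here is a minimal Go sketch of a "set only when specified" merge, the pattern that lets a removed limit survive: this is not the actual cluster-version-operator code (the authoritative logic is at the core.go link above), and the function names are mine. The point is that an empty Limits in the manifest never clears the limit already on the in-cluster object.

// Hypothetical illustration; the real CVO merge lives in
// lib/resourcemerge/core.go at the commit linked above.
package main

import (
	"fmt"

	corev1 "k8s.io/api/core/v1"
	"k8s.io/apimachinery/pkg/api/resource"
)

// mergeRequirements copies required values onto existing but skips fields
// the manifest leaves empty -- which is why a removed limit survives.
func mergeRequirements(existing *corev1.ResourceRequirements, required corev1.ResourceRequirements) {
	if required.Limits != nil {
		existing.Limits = required.Limits
	}
	if required.Requests != nil {
		existing.Requests = required.Requests
	}
}

func main() {
	// What is already on the cluster after installing 4.1.3.
	existing := corev1.ResourceRequirements{
		Limits:   corev1.ResourceList{corev1.ResourceMemory: resource.MustParse("150Mi")},
		Requests: corev1.ResourceList{corev1.ResourceMemory: resource.MustParse("150Mi")},
	}
	// The 4.1.4 manifest dropped the limit entirely, so required.Limits is nil.
	required := corev1.ResourceRequirements{
		Requests: corev1.ResourceList{corev1.ResourceMemory: resource.MustParse("150Mi")},
	}
	mergeRequirements(&existing, required)
	fmt.Println(existing.Limits) // the stale limit is still present
}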
Checked the memory limit of the pod after upgrading from 4.1.11 to 4.1.12 and from 4.1.0-0.nightly-2019-08-21-150150 to 4.1.0-0.nightly-2019-08-22-165647. The memory limit still exists in both cases.

[cloud-user@preserve-qe-olnester-ocp-workstation openshift]$ oc get pods cloud-credential-operator-7cdfbcf58b-ch7t6 -o yaml
apiVersion: v1
kind: Pod
metadata:
  annotations:
    k8s.v1.cni.cncf.io/networks-status: ""
  creationTimestamp: "2019-08-23T08:00:29Z"
  deletionGracePeriodSeconds: 600
  deletionTimestamp: "2019-08-23T08:13:49Z"
  generateName: cloud-credential-operator-7cdfbcf58b-
  labels:
    control-plane: controller-manager
    controller-tools.k8s.io: "1.0"
    pod-template-hash: 7cdfbcf58b
  name: cloud-credential-operator-7cdfbcf58b-ch7t6
  namespace: openshift-cloud-credential-operator
  ownerReferences:
  - apiVersion: apps/v1
    blockOwnerDeletion: true
    controller: true
    kind: ReplicaSet
    name: cloud-credential-operator-7cdfbcf58b
    uid: cb70f11f-c57a-11e9-98f9-023b25b70b2c
  resourceVersion: "39590"
  selfLink: /api/v1/namespaces/openshift-cloud-credential-operator/pods/cloud-credential-operator-7cdfbcf58b-ch7t6
  uid: 132d5d0d-c57c-11e9-98f9-023b25b70b2c
spec:
  containers:
  - command:
    - /root/manager
    - --log-level
    - debug
    env:
    - name: RELEASE_VERSION
      value: 4.1.0-0.nightly-2019-08-22-165647
    image: quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:4e5dfbb88c2ea1f35f68899ca5b40b309280ae996459e063fb1f5c05b0c568e8
    imagePullPolicy: IfNotPresent
    name: manager
    ports:
    - containerPort: 9876
      name: webhook-server
      protocol: TCP
    resources:
      requests:
        cpu: 10m
        memory: 150Mi
    terminationMessagePath: /dev/termination-log
The ticket was opened for 4.1.x, which is why I tested with 4.1.x. I can test with 4.2.x too.
(In reply to Oleg Nesterov from comment #4)
> The ticket was opened for 4.1.x, which is why I tested with 4.1.x. I can
> test with 4.2.x too.

Yes, please test with 4.2.x, thanks!
After upgrading from 4.2.0-0.nightly-2019-08-22-201424 to 4.2.0-0.nightly-2019-08-23-004712, I observe the same situation: the "memory: 150Mi" spec is still present.

[onest@localhost go_openshift]$ oc get pods cloud-credential-operator-6cc8645548-4vxft -o yaml -n openshift-cloud-credential-operator
apiVersion: v1
kind: Pod
metadata:
  creationTimestamp: "2019-08-23T19:54:07Z"
  generateName: cloud-credential-operator-6cc8645548-
  labels:
    control-plane: controller-manager
    controller-tools.k8s.io: "1.0"
    pod-template-hash: 6cc8645548
  name: cloud-credential-operator-6cc8645548-4vxft
  namespace: openshift-cloud-credential-operator
  ownerReferences:
  - apiVersion: apps/v1
    blockOwnerDeletion: true
    controller: true
    kind: ReplicaSet
    name: cloud-credential-operator-6cc8645548
    uid: f7afb1d5-c5dd-11e9-8991-02684f45b248
  resourceVersion: "92004"
  selfLink: /api/v1/namespaces/openshift-cloud-credential-operator/pods/cloud-credential-operator-6cc8645548-4vxft
  uid: c4870c07-c5df-11e9-b245-0e41aff0a3ba
spec:
  containers:
  - command:
    - /root/manager
    - --log-level
    - debug
    env:
    - name: RELEASE_VERSION
      value: 4.2.0-0.nightly-2019-08-23-004712
    image: quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:60be08d77af406589526adafe8e9a3eb9b5f03df7c04f1826677751b521104eb
    imagePullPolicy: IfNotPresent
    name: manager
    ports:
    - containerPort: 9876
      name: webhook-server
      protocol: TCP
    resources:
      requests:
        cpu: 10m
        memory: 150Mi
(In reply to Oleg Nesterov from comment #6)
> After upgrading from 4.2.0-0.nightly-2019-08-22-201424 to
> 4.2.0-0.nightly-2019-08-23-004712, I observe the same situation: the
> "memory: 150Mi" spec is still present.

It is supposed to be present; that 150Mi value is the memory request defined in the current manifest, not the removed limit:

https://github.com/openshift/cloud-credential-operator/blob/14ea16f11aaf6b77d3bce0dcfb93546e12da6942/manifests/01_deployment.yaml#L168
If a container's resource requirements change between releases, the newest (required) values are used. Probably the easiest way to test this is to modify the deployment, removing or changing its requests or limits, then upgrade and make sure the requirements are changed back to what is defined in the manifest (see the sketch below).

https://github.com/openshift/cluster-version-operator/blob/817a2d5f256526bfab7368a978655c9cc06334ad/lib/resourcemerge/core.go#L449-L459
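As a rough Go sketch of the "required requirements always win" behavior the linked code implements (the function and helper names here are mine and may not match the real ones; the authoritative code is at the link above):

// Sketch only; see lib/resourcemerge/core.go for the real implementation.
package resourcemerge

import corev1 "k8s.io/api/core/v1"

// ensureResourceRequirements overwrites existing with the manifest's
// (required) values whenever they differ, flagging the object as modified,
// so a limit removed from the manifest is also removed in-cluster.
func ensureResourceRequirements(modified *bool, existing *corev1.ResourceRequirements, required corev1.ResourceRequirements) {
	if !resourceListsEqual(existing.Limits, required.Limits) {
		existing.Limits = required.Limits
		*modified = true
	}
	if !resourceListsEqual(existing.Requests, required.Requests) {
		existing.Requests = required.Requests
		*modified = true
	}
}

// resourceListsEqual compares two resource lists entry by entry.
func resourceListsEqual(a, b corev1.ResourceList) bool {
	if len(a) != len(b) {
		return false
	}
	for name, qa := range a {
		qb, ok := b[name]
		if !ok || qa.Cmp(qb) != 0 {
			return false
		}
	}
	return true
}

In practice, editing the live deployment (e.g. oc edit deployment cloud-credential-operator -n openshift-cloud-credential-operator and deleting the requests) and then confirming the CVO restores them is the quick check described above.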
Verified. The fix works when upgrading from 4.2.0-0.nightly-2019-08-23-004712 to 4.2.0-0.nightly-2019-08-23-034826.
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2019:2922