+++ This bug was initially created as a clone of Bug #2009424 +++

Description of problem:

GitHub issue: https://github.com/openshift-psap/special-resource-operator/issues/94

When the image in a Deployment is updated, a new ReplicaSet is created for it. The availability check for the Deployment then gets stuck, because it keeps referring to the available replica count of the old ReplicaSet.

infoscale-vtas-licensing-controller-68649d5564   0   0   0   22h
infoscale-vtas-licensing-controller-78c4c47dfd   1   1   1   117m

2021-08-24T12:00:04.553Z INFO wait Checking ReplicaSet {"name": "infoscale-vtas-licensing-controller-68649d5564"}
2021-08-24T12:00:04.553Z INFO wait Waiting for availability of {"Kind": "Deployment: infoscale-vtas/infoscale-vtas-licensing-controller"}
2021-08-24T12:00:04.578Z INFO infoscale-vtas RECONCILE REQUEUE: Could not reconcile chart {"error": "Cannot reconcile hardware states: Failed to create state: templates/1000-license-container.yaml: After CRUD hooks failed: Could not wait for resource: Waiting too long for resource: timed out waiting for the condition"}

Version-Release number of selected component (if applicable):
4.9

How reproducible:

Steps to Reproduce:
1. Create a CR that produces a Deployment.
2. Update the CR to use an upgraded Helm chart, for example by changing the image.

Actual results:
The CR is deployed but is never fully reconciled. It keeps looping on:

2021-08-24T12:00:04.578Z INFO infoscale-vtas RECONCILE REQUEUE: Could not reconcile chart {"error": "Cannot reconcile hardware states: Failed to create state: templates/1000-license-container.yaml: After CRUD hooks failed: Could not wait for resource: Waiting too long for resource: timed out waiting for the condition"}

Expected results:
The reconcile loop finishes gracefully.

Additional info:
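To illustrate the failure mode described above, here is a minimal Python sketch. It is not the SRO code (the operator is written in Go) and not necessarily how the fix was implemented. The `ReplicaSet` dataclass, `is_available`, `naive_wait_target`, and `current_replicaset` are hypothetical names introduced for illustration. The sketch assumes each ReplicaSet carries the `deployment.kubernetes.io/revision` annotation that the Deployment controller sets; the revision numbers 1 and 2 are assumed, since the report does not list them.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class ReplicaSet:
    """Hypothetical, simplified stand-in for a Kubernetes ReplicaSet."""
    name: str
    revision: int   # value of the deployment.kubernetes.io/revision annotation
    desired: int
    available: int


def is_available(rs: ReplicaSet) -> bool:
    # Simplification: a ReplicaSet scaled to zero is treated as not available.
    return rs.available > 0 and rs.available >= rs.desired


def naive_wait_target(replicasets: List[ReplicaSet]) -> Optional[ReplicaSet]:
    # Failure mode: take whichever ReplicaSet is listed first, which after
    # an image update can be the old one that was scaled down to 0.
    return replicasets[0] if replicasets else None


def current_replicaset(replicasets: List[ReplicaSet]) -> Optional[ReplicaSet]:
    # Alternative: pick the ReplicaSet with the highest revision, which is
    # the one the Deployment is currently rolling out.
    return max(replicasets, key=lambda rs: rs.revision, default=None)


if __name__ == "__main__":
    # Replica counts taken from the report; revisions 1 and 2 are assumed.
    old = ReplicaSet("infoscale-vtas-licensing-controller-68649d5564", 1, 0, 0)
    new = ReplicaSet("infoscale-vtas-licensing-controller-78c4c47dfd", 2, 1, 1)
    sets = [old, new]
    print("naive check available:", is_available(naive_wait_target(sets)))
    print("revision-based check available:", is_available(current_replicaset(sets)))
```

With the counts from the report, the naive check inspects the old ReplicaSet (0 available) and would wait until it times out, while the revision-based check inspects the new ReplicaSet (1 of 1 available).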
Verified on OCP 4.9rc5 with latest SRO image from release-4.9 github branch (https://github.com/openshift/special-resource-operator.git)

Deployed SRO from github repo.

# TAG=release-4.9 make deploy
# oc get pods -n openshift-special-resource-operator
# VERSION=0.0.1 REPO=example SPECIALRESOURCE=ping-pong make
# oc get pods -n ping-pong
NAME                                READY   STATUS    RESTARTS   AGE
ping-pong-client-7fd9cc6848-vbbq9   1/1     Running   0          11m
ping-pong-server-7b8b5c98c4-2wxpb   1/1     Running   0          11m

# oc get deployment -n ping-pong
NAME               READY   UP-TO-DATE   AVAILABLE   AGE
ping-pong-client   1/1     1            1           12m
ping-pong-server   1/1     1            1           12m

## Change image for client and server in each deployment
# oc edit deployment -n ping-pong ping-pong-client
deployment.apps/ping-pong-client edited
# oc edit deployment -n ping-pong ping-pong-server
deployment.apps/ping-pong-server edited

## pod recreated with new image
# oc get pods -n ping-pong
NAME                                READY   STATUS    RESTARTS      AGE
ping-pong-client-c5bff68d7-tmnt2    1/1     Running   2 (25s ago)   65s
ping-pong-server-7cbd48d69c-z9rph   1/1     Running   0             34s

## check SRO manager logs for reconcile success:
# oc logs -n openshift-special-resource-operator special-resource-controller-manager-7b4898899d-kcbxr -c manager | grep RECONCILE
2021-10-07T00:36:35.975Z INFO status RECONCILE SUCCESS: Reconcile
2021-10-07T00:37:23.187Z INFO cert-manager RECONCILE REQUEUE: Dependency creation failed {"error": "Created new SpecialResource we need to Reconcile"}
2021-10-07T00:38:20.372Z INFO cert-manager RECONCILE REQUEUE: Could not reconcile chart {"error": "Cannot reconcile hardware states: failed post-install: hook execution failed cert-manager-startupapicheck cert-manager/templates/startupapicheck-job.yaml: After CRUD hooks failed: Could not wait for resource: Waiting too long for resource: timed out waiting for the condition"}
2021-10-07T00:40:14.893Z INFO ping-pong RECONCILE SUCCESS: All resources done
2021-10-07T00:40:14.905Z INFO status RECONCILE SUCCESS: Reconcile
2021-10-07T00:40:28.757Z INFO cert-manager RECONCILE SUCCESS: All resources done
2021-10-07T00:40:28.771Z INFO status RECONCILE SUCCESS: Reconcile
2021-10-07T00:41:51.989Z INFO ping-pong RECONCILE SUCCESS: All resources done
2021-10-07T00:41:52.002Z INFO status RECONCILE SUCCESS: Reconcile
2021-10-07T00:57:48.497Z INFO ping-pong RECONCILE SUCCESS: All resources done
2021-10-07T00:57:48.509Z INFO status RECONCILE SUCCESS: Reconcile
2021-10-07T00:59:12.105Z INFO ping-pong RECONCILE SUCCESS: All resources done
2021-10-07T00:59:12.116Z INFO status RECONCILE SUCCESS: Reconcile
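The log check above can also be scripted. The following is a rough Python sketch that reduces `grep RECONCILE` output to the latest status per component. `latest_status` and the regular expression are hypothetical helpers, not part of SRO. The sketch assumes the line layout seen in this log (timestamp, `INFO`, component name, `RECONCILE SUCCESS:` or `RECONCILE REQUEUE:`) and that lines arrive in chronological order.

```python
import re
from typing import Dict, Iterable

# Assumed layout: "<timestamp> INFO <component> RECONCILE <STATUS>: <message>"
RECONCILE_RE = re.compile(
    r"^(?P<ts>\S+)\s+INFO\s+(?P<component>\S+)\s+RECONCILE (?P<status>SUCCESS|REQUEUE):"
)


def latest_status(lines: Iterable[str]) -> Dict[str, str]:
    """Return the last RECONCILE status seen for each component."""
    latest: Dict[str, str] = {}
    for line in lines:
        match = RECONCILE_RE.match(line.strip())
        if match:
            # Later lines overwrite earlier ones, so the final value is
            # the most recent status (given chronological input).
            latest[match.group("component")] = match.group("status")
    return latest


if __name__ == "__main__":
    # Three lines copied from the manager log above.
    sample = [
        '2021-10-07T00:37:23.187Z INFO cert-manager RECONCILE REQUEUE: Dependency creation failed {"error": "Created new SpecialResource we need to Reconcile"}',
        "2021-10-07T00:40:14.893Z INFO ping-pong RECONCILE SUCCESS: All resources done",
        "2021-10-07T00:40:28.757Z INFO cert-manager RECONCILE SUCCESS: All resources done",
    ]
    for component, status in sorted(latest_status(sample).items()):
        print(component, status)
```

A component whose latest status is still REQUEUE after the Deployment image change would suggest the wait is looping again.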
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory (Moderate: OpenShift Container Platform 4.9.0 bug fix and security update), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHSA-2021:3759