Description of problem:

GitHub issue: https://github.com/openshift-psap/special-resource-operator/issues/94

When the image in a Deployment is updated, a new ReplicaSet is created for it. The resource availability check for the Deployment then gets stuck, because it keeps referring to the available replica count of the old ReplicaSet.

infoscale-vtas-licensing-controller-68649d5564   0   0   0   22h
infoscale-vtas-licensing-controller-78c4c47dfd   1   1   1   117m

2021-08-24T12:00:04.553Z INFO wait Checking ReplicaSet {"name": "infoscale-vtas-licensing-controller-68649d5564"}
2021-08-24T12:00:04.553Z INFO wait Waiting for availability of {"Kind": "Deployment: infoscale-vtas/infoscale-vtas-licensing-controller"}
2021-08-24T12:00:04.578Z INFO infoscale-vtas RECONCILE REQUEUE: Could not reconcile chart {"error": "Cannot reconcile hardware states: Failed to create state: templates/1000-license-container.yaml: After CRUD hooks failed: Could not wait for resource: Waiting too long for resource: timed out waiting for the condition"}

Version-Release number of selected component (if applicable):
4.9

How reproducible:

Steps to Reproduce:
1. Create a CR that produces a Deployment.
2. Update the CR to use an upgraded Helm chart, for example by changing the image.

Actual results:
The CR is deployed but is never fully reconciled. It keeps looping on:

2021-08-24T12:00:04.578Z INFO infoscale-vtas RECONCILE REQUEUE: Could not reconcile chart {"error": "Cannot reconcile hardware states: Failed to create state: templates/1000-license-container.yaml: After CRUD hooks failed: Could not wait for resource: Waiting too long for resource: timed out waiting for the condition"}

Expected results:
The reconcile loop finishes gracefully.

Additional info:
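To illustrate the failure mode described above, here is a minimal sketch of an availability check that skips ReplicaSets left over from an earlier rollout. It is not the operator's actual code: the `ReplicaSetStatus` type and the `deployment_available` function are hypothetical names, and treating a ReplicaSet with zero replicas as superseded is an assumption modeled on the listing above.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass
class ReplicaSetStatus:
    """Hypothetical, simplified view of a ReplicaSet's status."""
    name: str
    replicas: int            # replicas currently owned by the ReplicaSet
    available_replicas: int  # replicas that are available


def deployment_available(replica_sets: Iterable[ReplicaSetStatus]) -> bool:
    """Return True when every active ReplicaSet has all replicas available.

    Assumption: a ReplicaSet scaled down to 0 replicas belongs to a previous
    rollout and must not block the wait.
    """
    active = [rs for rs in replica_sets if rs.replicas > 0]
    if not active:
        # Nothing is running yet, so the Deployment cannot be available.
        return False
    return all(rs.available_replicas == rs.replicas for rs in active)


# Values modeled on the listing in this report.
old = ReplicaSetStatus("infoscale-vtas-licensing-controller-68649d5564", 0, 0)
new = ReplicaSetStatus("infoscale-vtas-licensing-controller-78c4c47dfd", 1, 1)
print(deployment_available([old, new]))
```

With a check of this shape, the scaled-down ReplicaSet no longer decides the result; only the ReplicaSet that currently owns pods does.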
Verified on OCP 4.10.0-0.nightly-2021-10-06-093151

1. git clone master branch of: https://github.com/openshift/special-resource-operator.git
2. cd special-resource-operator
3. untar the `ping-pong-0.0.2.tgz` which has a new version in the charts/example dir
4. Build local image of SRO: make local-image-build local-image-push deploy
5. tag and push new SRO image to your local quay.io account
6. IMAGE=quay.io/<your_accountname>/special-resource-operator:master make deploy
7. oc apply -f charts/example/ping-pong-0.0.1
8. oc get all -n ping-pong

NAME                                    READY   STATUS    RESTARTS   AGE
pod/ping-pong-client-7fd9cc6848-m6r5b   1/1     Running   0          7m1s
pod/ping-pong-server-7b8b5c98c4-jr2qn   1/1     Running   0          7m12s

NAME                        TYPE        CLUSTER-IP      EXTERNAL-IP   PORT(S)     AGE
service/ping-pong-service   ClusterIP   172.30.45.234   <none>        12021/TCP   7m12s

NAME                               READY   UP-TO-DATE   AVAILABLE   AGE
deployment.apps/ping-pong-client   1/1     1            1           7m1s
deployment.apps/ping-pong-server   1/1     1            1           7m12s

NAME                                          DESIRED   CURRENT   READY   AGE
replicaset.apps/ping-pong-client-7fd9cc6848   1         1         1       7m1s
replicaset.apps/ping-pong-server-7b8b5c98c4   1         1         1       7m12s

9. oc apply -f charts/example/ping-pong-0.0.2
This will cause the ping-pong pods to redeploy.
10. oc logs -n openshift-special-resource-operator special-resource-controller-manager-7867485ccd-lzh75 -c manager

. . .
2021-10-07T21:21:08.103Z INFO ping-pong NODE {"Setting Label ": "specialresource.openshift.io/state-ping-pong-0004", "on ": "ip-10-0-153-25.us-east-2.compute.internal"}
2021-10-07T21:21:08.112Z INFO ping-pong NODE {"Setting Label ": "specialresource.openshift.io/state-ping-pong-0004", "on ": "ip-10-0-178-128.us-east-2.compute.internal"}
2021-10-07T21:21:08.121Z INFO ping-pong NODE {"Setting Label ": "specialresource.openshift.io/state-ping-pong-0004", "on ": "ip-10-0-192-153.us-east-2.compute.internal"}
2021-10-07T21:21:08.121Z INFO ping-pong Executing {"State": "templates/0005_client.yaml"}
2021-10-07T21:21:09.520Z INFO warning OnError: release: already exists
2021-10-07T21:21:09.520Z INFO helmer Release pre-install hooks
2021-10-07T21:21:09.525Z INFO helmer Hooks {"pre-install": "Ready (Get)"}
2021-10-07T21:21:09.525Z INFO helmer Release manifests
2021-10-07T21:21:09.530Z INFO resource Namespace empty settting {"namespace": "ping-pong"}
2021-10-07T21:21:09.535Z INFO resource Found, not updating, hash the same: Deployment/ping-pong-client {"Kind": "Deployment: ping-pong/ping-pong-client"}
2021-10-07T21:21:09.535Z INFO resource specialresource.openshift.io/wait
2021-10-07T21:21:09.535Z INFO wait ForResource {"Kind": "Deployment"}
2021-10-07T21:21:19.552Z INFO wait Checking ReplicaSet {"name": "ping-pong-client-7fd9cc6848"}
2021-10-07T21:21:19.552Z INFO wait ReplicaSet scheduled for termination {"name": "ping-pong-client-7fd9cc6848"}
2021-10-07T21:21:19.552Z INFO wait Checking ReplicaSet {"name": "ping-pong-client-c5bff68d7"}
2021-10-07T21:21:19.552Z INFO wait Status {"AvailableReplicas": 1, "Replicas": 1}
2021-10-07T21:21:19.552Z INFO wait Checking ReplicaSet {"name": "ping-pong-server-7b8b5c98c4"}
2021-10-07T21:21:19.552Z INFO wait ReplicaSet scheduled for termination {"name": "ping-pong-server-7b8b5c98c4"}
2021-10-07T21:21:19.552Z INFO wait Checking ReplicaSet {"name": "ping-pong-server-7cbd48d69c"}
2021-10-07T21:21:19.552Z INFO wait Status {"AvailableReplicas": 1, "Replicas": 1}
2021-10-07T21:21:19.552Z INFO wait Resource available {"Kind": "Deployment: ping-pong/ping-pong-client"}
2021-10-07T21:21:19.552Z INFO helmer Release post-install hooks
2021-10-07T21:21:19.556Z INFO helmer Hooks {"post-install": "Ready (Get)"}
2021-10-07T21:21:19.587Z INFO cache Nodes cached {"name": "ip-10-0-153-25.us-east-2.compute.internal"}
2021-10-07T21:21:19.587Z INFO cache Nodes cached {"name": "ip-10-0-178-128.us-east-2.compute.internal"}
2021-10-07T21:21:19.587Z INFO cache Nodes cached {"name": "ip-10-0-192-153.us-east-2.compute.internal"}
2021-10-07T21:21:19.587Z INFO cache Node list: {"length": 3}
2021-10-07T21:21:19.587Z INFO cache Nodes {"num": 3}
2021-10-07T21:21:19.596Z INFO ping-pong NODE {"Setting Label ": "specialresource.openshift.io/state-ping-pong-0005", "on ": "ip-10-0-153-25.us-east-2.compute.internal"}
2021-10-07T21:21:19.607Z INFO ping-pong NODE {"Setting Label ": "specialresource.openshift.io/state-ping-pong-0005", "on ": "ip-10-0-178-128.us-east-2.compute.internal"}
2021-10-07T21:21:19.616Z INFO ping-pong NODE {"Setting Label ": "specialresource.openshift.io/state-ping-pong-0005", "on ": "ip-10-0-192-153.us-east-2.compute.internal"}
2021-10-07T21:21:20.707Z INFO warning OnError: release: already exists
2021-10-07T21:21:20.707Z INFO helmer Release pre-install hooks
2021-10-07T21:21:20.711Z INFO helmer Hooks {"pre-install": "Ready (Get)"}
2021-10-07T21:21:20.711Z INFO helmer Release manifests
2021-10-07T21:21:20.716Z INFO helmer Release post-install hooks
2021-10-07T21:21:20.725Z INFO helmer Hooks {"post-install": "Ready (Get)"}
2021-10-07T21:21:20.734Z INFO ping-pong RECONCILE SUCCESS: All resources done
2021-10-07T21:21:20.739Z INFO status Reconciling ClusterOperator
2021-10-07T21:21:20.747Z INFO status Adding to relatedObjects {"namespace": "ping-pong"}
2021-10-07T21:21:20.747Z INFO status Adding to relatedObjects {"namespace": "cert-manager"}
2021-10-07T21:21:20.747Z INFO status Adding to relatedObjects {"namespace": "preamble"}
2021-10-07T21:21:20.754Z INFO status RECONCILE SUCCESS: Reconcile

11. # oc get all -n ping-pong

NAME                                    READY   STATUS    RESTARTS        AGE
pod/ping-pong-client-c5bff68d7-hgbhn    1/1     Running   1 (6m19s ago)   6m21s
pod/ping-pong-server-7cbd48d69c-mq59t   1/1     Running   0               6m32s

NAME                        TYPE        CLUSTER-IP      EXTERNAL-IP   PORT(S)     AGE
service/ping-pong-service   ClusterIP   172.30.45.234   <none>        12021/TCP   15m

NAME                               READY   UP-TO-DATE   AVAILABLE   AGE
deployment.apps/ping-pong-client   1/1     1            1           14m
deployment.apps/ping-pong-server   1/1     1            1           15m

NAME                                          DESIRED   CURRENT   READY   AGE
replicaset.apps/ping-pong-client-7fd9cc6848   0         0         0       14m
replicaset.apps/ping-pong-client-c5bff68d7    1         1         1       6m21s
replicaset.apps/ping-pong-server-7b8b5c98c4   0         0         0       15m
replicaset.apps/ping-pong-server-7cbd48d69c   1         1         1       6m32s
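The final state in step 11 can also be checked mechanically. The following is only a rough sketch for anyone repeating the verification: `parse_replicasets` and `rollout_settled` are hypothetical helpers, the sample text is the replicaset rows from step 11, and the owning Deployment is assumed to be the ReplicaSet name minus its trailing hash suffix.

```python
from collections import defaultdict

# ReplicaSet rows copied from the `oc get all -n ping-pong` output in step 11.
SAMPLE = """\
replicaset.apps/ping-pong-client-7fd9cc6848   0   0   0   14m
replicaset.apps/ping-pong-client-c5bff68d7    1   1   1   6m21s
replicaset.apps/ping-pong-server-7b8b5c98c4   0   0   0   15m
replicaset.apps/ping-pong-server-7cbd48d69c   1   1   1   6m32s
"""


def parse_replicasets(text):
    """Extract (name, desired, current, ready) from replicaset rows."""
    rows = []
    for line in text.splitlines():
        parts = line.split()
        if len(parts) == 5 and parts[0].startswith("replicaset.apps/"):
            name = parts[0].split("/", 1)[1]
            desired, current, ready = (int(p) for p in parts[1:4])
            rows.append((name, desired, current, ready))
    return rows


def rollout_settled(rows):
    """True when each Deployment has exactly one active, fully ready ReplicaSet.

    Assumption: the Deployment name is the ReplicaSet name without the
    trailing "-<hash>" suffix.
    """
    owners = set()
    active = defaultdict(list)
    for name, desired, _current, ready in rows:
        owner = name.rsplit("-", 1)[0]
        owners.add(owner)
        if desired > 0:
            active[owner].append(ready == desired)
    return bool(owners) and all(active[o] == [True] for o in owners)


print(rollout_settled(parse_replicasets(SAMPLE)))
```

A result of True here corresponds to the state shown above: the old ReplicaSets are scaled to 0 and each Deployment has one ready ReplicaSet.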
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory (Moderate: OpenShift Container Platform 4.10.3 security update), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHSA-2022:0056