Description of problem: OLM is unable to recover automatically after a release failed to roll out. Context for the problem: I am part of the MT-SRE team and we are rolling out operator updates to a fleet of OpenShift Dedicated clusters. The operators that we are responsible for are managed and SRE-ed by Red Hat. As we rollout updates to a whole fleet of clusters, requiring manual intervention on every single cluster might take us days or weeks to recover from this situation, so it is very important to us to get this case handled within OLM. Within SRE teams this situation is commonly refered to as the "OLM Dance", required to get services back into an operational state. Version-Release number of selected component (if applicable): Provider: AWS OpenShift version: 4.8.11 Update channel: stable-4.8 How reproducible: Always Steps to Reproduce: 1. Successfully install an operator on a cluster with Subscription set to Automatic approval. (e.g. v0.2.0) 2. Release (e.g. v0.3.0) an Operator Bundle using the wrong image digests or have readiness/liveness probes that never pass. 3. Observe OLM failure to rollout new release 4. Release a new version (e.g. v0.3.1) skipping the previous version (v0.3.0) and providing an upgrade path from the broken release. Actual results: OLM is stuck waiting for manual intervention to reinstall the Operator. Expected results: OLM is rolling back to the previous working version v0.2.0, after progressDeadline is exceeded. OLM is rolling forward when a new upgrade path is available from the failed v0.3.0 to v0.3.1. Additional info: There is a longer Google Doc that explains the whole situation: https://docs.google.com/document/d/125zm8jEhpNF-z1IMoqMMd3IJSTVIwUMylPIgnPpgJLM The affected OpenShift dedicated cluster is also still available for debugging - if required. Edit: OLM is explicitly asking for manual reinstall in it's logs: $ oc logs -n openshift-operator-lifecycle-manager olm-operator-7bfd55d5c7-79jcw time="2021-09-22T10:35:02Z" level=info msg="addon-operator.v0.2.0 replaced by addon-operator.v0.3.0" time="2021-09-22T10:35:02Z" level=warning msg="needs reinstall: deployment addon-operator-manager not ready before timeout: deployment \"addon-operator-manager\" exceeded its progress deadline" csv=addon-operator.v0.3.0 id=6WSIG namespace=redhat-addon-operator phase=Failed strategy=deployment