This issue is a clone of Bug #1821658 and in reality represents a deeper problem. There are likely several other areas of dead code in the 3.x openshift-ansible code base that should be removed.

+++ This bug was initially created as a clone of Bug #1821658 +++

Description of problem:

When installing openshift-autoheal on OCP v3.11, the "receiver" container for pod "autoheal-XXXX-XXXX" failed to pull its image from the "registry.redhat.io" registry.

Pod status :
~~~
$ oc get pods -n openshift-autoheal
NAME                       READY     STATUS             RESTARTS   AGE
autoheal-86bf5d956-vg8ff   1/2       ImagePullBackOff   0          4d
~~~

Project events :
~~~
$ oc get events -n openshift-autoheal
LAST SEEN FIRST SEEN COUNT NAME KIND SUBOBJECT TYPE REASON SOURCE MESSAGE
1h 4d 1122 autoheal-86bf5d956-vg8ff.1602406217cca537 Pod spec.containers{receiver} Warning Failed kubelet, node-0.redhat.com Failed to pull image "registry.redhat.io/openshift3/ose-autoheal:v3.11.170": rpc error: code = Unknown desc = error parsing HTTP 404 response body: invalid character 'F' looking for beginning of value: "File not found.\""
1h 4d 1130 autoheal-86bf5d956-vg8ff.16024061cbbfef27 Pod spec.containers{receiver} Normal Pulling kubelet, node-0.redhat.com pulling image "registry.redhat.io/openshift3/ose-autoheal:v3.11.170"
6m 4d 25129 autoheal-86bf5d956-vg8ff.1602406258423d90 Pod spec.containers{receiver} Normal BackOff kubelet, node-0.redhat.com Back-off pulling image "registry.redhat.io/openshift3/ose-autoheal:v3.11.170"
1m 4d 25150 autoheal-86bf5d956-vg8ff.160240625842a56b Pod spec.containers{receiver} Warning Failed kubelet, node-0.redhat.com Error: ImagePullBackOff
~~~

Tried to manually pull the image :
~~~
$ sudo docker pull registry.redhat.io/openshift3/ose-autoheal:v3.11.170
Trying to pull repository registry.redhat.io/openshift3/ose-autoheal ...
error parsing HTTP 404 response body: invalid character 'F' looking for beginning of value: "File not found.\""
~~~

The image "registry.redhat.io/openshift3/ose-autoheal:v3.11.170" is not present in the registry/Red Hat Container Catalog.

Version-Release number of the following components:

$ oc version
oc v3.11.170
kubernetes v1.11.0+d4cacc0
features: Basic-Auth GSSAPI Kerberos SPNEGO

Server https://<server-URL>:443
openshift v3.11.188
kubernetes v1.11.0+d4cacc0

How reproducible:

Always

Steps to Reproduce:

1. Add the "openshift_autoheal_deploy=true" variable to the inventory file and run the "/usr/share/ansible/openshift-ansible/playbooks/openshift-autoheal/config.yml" playbook.
2. The playbook runs successfully and reports that the installation completed, but the resulting pod fails with the "ImagePullBackOff" error.

Actual results:

The "openshift-autoheal" pod fails with the "ImagePullBackOff" error.

Expected results:

The "openshift-autoheal" should be installed successfully.

--- Additional comment from Brenton Leanhardt on 2020-04-07 17:22:25 UTC ---

This bug seems similar to Bug #1642767.

--- Additional comment from Brenton Leanhardt on 2020-04-07 17:26:21 UTC ---

Either we're having a general registry problem or I'd say there was a problem pushing this release.

[ocp-build-data]$ podman pull registry.redhat.io/openshift3/ose-autoheal:v3.11.170
Trying to pull registry.redhat.io/openshift3/ose-autoheal:v3.11.170...
error parsing HTTP 404 response body: invalid character 'F' looking for beginning of value: "File not found.\""
Error: error pulling image "registry.redhat.io/openshift3/ose-autoheal:v3.11.170": unable to pull registry.redhat.io/openshift3/ose-autoheal:v3.11.170: unable to pull image: Error initializing source docker://registry.redhat.io/openshift3/ose-autoheal:v3.11.170: Error reading manifest v3.11.170 in registry.redhat.io/openshift3/ose-autoheal: error parsing HTTP 404 response body: invalid character 'F' looking for beginning of value: "File not found.\""

--- Additional comment from Vikas Laad on 2020-04-07 18:16:35 UTC ---

The previous bug was closed by mmagnani; adding him here to find out what the reason was. We never released autoheal in 3.11.

--- Additional comment from Ashwini M. Khaire on 2020-04-17 07:52:57 UTC ---

Hello Team,

Any further updates on this?

Thanks!!

--- Additional comment from Brenton Leanhardt on 2020-04-17 12:07:19 UTC ---

Hi Ashwini,

If autoheal was never released in 3.11 then this is a documentation bug. I see it referenced here:

https://docs.openshift.com/container-platform/3.11/install/running_install.html

--- Additional comment from Vikram Goyal on 2020-04-20 08:46:30 UTC ---

@Jeana - looks like we had something documented that was never released and needs to be removed. This should go through change management as described in the manual.

--- Additional comment from Eric Rich on 2020-04-21 19:36:02 UTC ---

We ship the playbooks for the auto-heal functionality (at least as part of our ansible):

> $ rpm -qf /usr/share/ansible/openshift-ansible/playbooks/openshift-autoheal/config.yml
> openshift-ansible-playbooks-3.11.200-1.git.0.3f37acb.el7.noarch

If this is an alignment issue with what image we provide, I want to better understand what happened, because https://github.com/openshift/autoheal looks defunct.
*** Bug 1812600 has been marked as a duplicate of this bug. ***
Another area of dead code that should be resolved with this bug:

https://bugzilla.redhat.com/show_bug.cgi?id=1812600

This should be removed:

roles/openshift_control_plane/tasks/update_master_count.yml
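
Before deleting a task file like the one named above, it may help to confirm that nothing else in the checkout still refers to it. The following is only a rough sketch: the function name `find_task_references` is hypothetical, and it assumes the references live under the `roles/` and `playbooks/` directories of an openshift-ansible checkout. It matches the file name as a plain string, so includes whose paths are built from variables would not be found.

```shell
#!/bin/sh
# Hypothetical helper (not part of openshift-ansible): list files that still
# mention a task file by name, excluding the task file itself.
# Usage: find_task_references <repo_root> <task_file_name>
find_task_references() {
    repo_root=$1
    task_name=$2
    # -r: recurse, -l: print only the names of matching files,
    # -F: treat the name as a fixed string rather than a regex.
    grep -rlF -- "$task_name" "$repo_root/roles" "$repo_root/playbooks" 2>/dev/null \
        | grep -vF -- "/tasks/$task_name" \
        | sort
}
```

Run from the top of a checkout as `find_task_references . update_master_count.yml`. Empty output suggests no remaining plain-text references; any path printed is a file that would need attention before the task file is removed.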
Thank you for continuing to use Red Hat OpenShift. As part of a wider bug review, this bug has been evaluated and we have determined that at this time we do not plan to progress it. As such, we will be closing this bug. If you have need for continued assistance on this issue, please reopen the bug with additional context on why it needs to be reconsidered.