Note: This bug is displayed in read-only format because the product is no longer active in Red Hat Bugzilla.

Bug 1940149

Summary:

[RFE] Retry the getting of the image from quay.io

Product:

OpenShift Container Platform

Reporter:

Udi Kalifon <ukalifon>

Component:

assisted-installer

Assignee:

Eran Cohen <ercohen>

assisted-installer sub component:

Installer

QA Contact:

Udi Kalifon <ukalifon>

Status:

CLOSED CURRENTRELEASE

Docs Contact:

Severity:

medium

Priority:

unspecified

CC:

ercohen, yobshans

Version:

4.7

Keywords:

Reopened

Target Milestone:

---

Target Release:

internal.milestone

Hardware:

Unspecified

OS:

Unspecified

Whiteboard:

Fixed In Version:

OCP-Metal-v1.0.19.1

Doc Type:

No Doc Update

Doc Text:

Story Points:

---

Clone Of:

Environment:

Last Closed:

2022-08-28 08:45:59 UTC

Type:

Bug

Regression:

---

Mount Type:

---

Documentation:

---

CRM:

Verified Versions:

Category:

---

Attachments:

Description	Flags
Failure in run install	none

Description Udi Kalifon 2021-03-17 17:16:36 UTC

Created attachment 1764127
Failure in run install

Description of problem:
My installation failed right at the beginning (within ~20 seconds) with this error:

Cluster installation failed
Failed generating kubeconfig files for cluster 92d85eef-339a-4d80-9e83-361a49a3318f: command oc exited with non-zero exit code 1: error: unable to connect to image repository quay.io/openshift-release-dev/ocp-release:4.7.2-x86_64: Get "https://quay.io/v2/": context deadline exceeded (Client.Timeout exceeded while awaiting headers) .
Reset the installation process to return to the configuration and try again. Some hosts may need to be re-registered by rebooting into the Discovery ISO.


All hosts were in error in step 0/7.

To proceed, I reset the cluster, rebooted the hosts, and started again. This could be avoided by having the agent or installer retry the call to quay.io a few more times before giving up, which would make such errors rarer.
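
To illustrate the kind of retry being requested, here is a minimal sketch in Python. It is not taken from the assisted-installer code: the function name retry_with_backoff, its parameters, and the default values are all hypothetical, and the operation passed in stands for whatever call reaches quay.io.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def retry_with_backoff(
    operation: Callable[[], T],
    attempts: int = 5,            # hypothetical default, not from the bug
    initial_delay: float = 2.0,   # seconds before the first retry (assumed)
    backoff: float = 2.0,         # multiplier applied to the delay each time
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call operation until it succeeds or attempts are exhausted.

    Any exception counts as a failed attempt. The last exception is
    re-raised once all attempts have been used.
    """
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    delay = initial_delay
    for attempt in range(1, attempts + 1):
        try:
            return operation()
        except Exception:
            if attempt == attempts:
                raise  # give up and surface the original error
            sleep(delay)
            delay *= backoff
    raise AssertionError("unreachable")
```

With these assumed defaults, a transient timeout such as the one above would be retried after 2, 4, 8, and 16 seconds before the installation is failed.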


Version-Release number of selected component (if applicable):
Release tag
    stable
Assisted Installer UI version
    quay.io/ocpmetal/ocp-metal-ui:2fe99dd56daff096177e5d9a1b644c8a3ee5b039
Assisted Installer UI library version
    0.0.12-wizard
Assisted Installer
    quay.io/ocpmetal/assisted-installer:c107911c4756e4473405e893ee80f4a6b079ac4f
Assisted Installer Controller
    quay.io/ocpmetal/assisted-installer-controller:c107911c4756e4473405e893ee80f4a6b079ac4f
Assisted Installer Service
    quay.io/ocpmetal/assisted-service:e0df002062f80149769707e72e5952da16897aef
Discovery Agent
    quay.io/ocpmetal/assisted-installer-agent:edbaff3f6b1343b6e51c64d461923ac592820476


How reproducible:
Rarely


Steps to Reproduce:
1. Run the regular Assisted Installer (AI) flow


Additional info:
See screenshot

Comment 1 Ronnie Lazar 2021-03-17 18:51:17 UTC

ercohen, don't we already have retries?

Comment 3 Eran Cohen 2021-03-18 11:25:58 UTC

Note that the cluster might fail during preparing-for-installation for multiple reasons, and there is no reason to require a host reboot when it does.
So I think that's what we should fix.

Comment 4 Eran Cohen 2021-03-21 07:30:37 UTC

There is work in progress that should mitigate this issue (the user won't need to reset the installation & reboot all nodes).
If the assisted-installer fails for any reason during preparing-for-installation, it will set the cluster status to insufficient.
The cluster will then return to ready status if all is well.
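
As a rough sketch of the transitions described here (not the actual assisted-service state machine), the behavior could be modeled as below. The three status strings come from this comment; the ClusterStatus enum, the next_status function, and its two boolean inputs are hypothetical names used only for illustration.

```python
from enum import Enum


class ClusterStatus(Enum):
    PREPARING = "preparing-for-installation"
    INSUFFICIENT = "insufficient"
    READY = "ready"


def next_status(
    current: ClusterStatus,
    preparation_failed: bool = False,  # hypothetical input: preparation step errored
    validations_pass: bool = False,    # hypothetical input: "all is well"
) -> ClusterStatus:
    """Return the next cluster status under the assumed rules."""
    # A failure while preparing no longer ends in an error that needs a reset;
    # the cluster drops back to insufficient instead.
    if current is ClusterStatus.PREPARING and preparation_failed:
        return ClusterStatus.INSUFFICIENT
    # From insufficient, the cluster returns to ready once checks pass.
    if current is ClusterStatus.INSUFFICIENT and validations_pass:
        return ClusterStatus.READY
    return current
```

In this model the user can start the installation again from ready without resetting the cluster or rebooting the hosts.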

Comment 5 Udi Kalifon 2021-03-22 13:44:35 UTC

This will still fail the automation, and I think most users also won't want to retry the installation manually, even if it's simple. Would you consider adding the retry after all?

Comment 6 Eran Cohen 2021-03-25 07:42:34 UTC

Sure, adding retries does make sense regardless of how the installation gets back on track.
I'll reopen and remove the won't fix resolution.

Comment 7 Yuri Obshansky 2021-05-05 13:14:38 UTC

Verified on OCP-Metal-v1.0.19.1