Description of problem:

Version-Release number of selected component (if applicable):
{"release_tag":"v1.0.26.0","versions":{"assisted-installer":"registry-proxy.engineering.redhat.com/rh-osbs/openshift4-assisted-installer-rhel8:v1.0.0-99","assisted-installer-controller":"registry-proxy.engineering.redhat.com/rh-osbs/openshift4-assisted-installer-reporter-rhel8:v1.0.0-123","assisted-installer-service":"quay.io/app-sre/assisted-service:faf9e09","discovery-agent":"registry-proxy.engineering.redhat.com/rh-osbs/openshift4-assisted-installer-agent-rhel8:v1.0.0-69"}}

How reproducible:

Steps to Reproduce:
1. Prepare a cluster with 3 masters/3 workers and the OCS operator, using 4.8.12
2. Make sure all validations are OK and start the installation
3.

Actual results:
After the installation was started and the nodes were in "Preparing for installation", the cluster was reset and returned to the pre-installation phase.
This seems to be caused by some agents that failed the following validation:
Container images availability: Failed to fetch container images needed for installation from quay.io/openshift-release-dev/ocp-release:4.8.12-x86_64.

Expected results:
Validation to fail before the user starts the installation

Additional info:
There were no installation logs to download, since the cluster moved back to the pre-installation state.
We do try to cache some of the images before we start the installation, but network issues can happen at any time. I think that enabling the user to retry is a good solution; the question is whether the failure is visible. Do you see any events related to it, or did you have to go into the agent logs? Besides making it visible, I don't see a really good solution. @atraeger @rfreiman what do you think?
This is the design that we discussed with UX. I believe the UI has an open task on making the reason more visible.
@tjelinek @jkilzi Is there a ticket that we can link?
Please take a look at https://issues.redhat.com/browse/MGMT-7943; I couldn't find anything more specific. Please let me know if we need to open such a task.
@jkilzi I assigned this ticket to you. If it's OK with QE, you can probably close this ticket.
Reproduced on Assisted-ui-lib version: 2.0.6
@yobshans It seems that when this bug was opened, the expectation was to receive the container-images-availability validation failure before the installation begins. This behavior was changed as part of https://issues.redhat.com/browse/MGMT-7943. The desired behavior now is to take you back to the "Review and create" step and show a warning about what happened. The container-images-availability validation is not designed to run before the installation begins, so the expected behavior in this bug's description must be aligned with the behavior described above. Moving back to ON_QA.
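For reference, the flow described in the comment above can be sketched roughly as follows. This is only an illustrative model, not the assisted-service or UI code: `ClusterStatus`, `Cluster`, `start_installation`, and the per-host result map are all hypothetical names made up for this sketch.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Dict, List


class ClusterStatus(Enum):
    # Hypothetical status names, chosen for this sketch only.
    READY = "ready"                       # UI shows "Review and create"
    PREPARING = "preparing-for-installation"
    INSTALLING = "installing"


@dataclass
class Cluster:
    status: ClusterStatus = ClusterStatus.READY
    warnings: List[str] = field(default_factory=list)


def start_installation(cluster: Cluster, image_pull_ok: Dict[str, bool]) -> Cluster:
    """Model the start of an installation.

    image_pull_ok maps a host name to True when that host managed to
    fetch the release image during the preparation phase (assumed input).
    """
    cluster.status = ClusterStatus.PREPARING
    failed = sorted(host for host, ok in image_pull_ok.items() if not ok)
    if failed:
        # The validation only runs after the user clicks install, so a
        # failure sends the cluster back and records a visible warning.
        cluster.status = ClusterStatus.READY
        cluster.warnings.append(
            "container-images-availability failed on: " + ", ".join(failed)
        )
        return cluster
    cluster.status = ClusterStatus.INSTALLING
    return cluster
```

In this model the user can simply call `start_installation` again to retry; the warning list is what would need to be surfaced on the "Review and create" step.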
Not reproducible on Assisted-ui-lib version: 2.0.9