Bug 1972598
| Summary: | [master] Install retry per recreating ACI, BMH error status is not cleared | ||||||
|---|---|---|---|---|---|---|---|
| Product: | OpenShift Container Platform | Reporter: | Nir Magnezi <nmagnezi> | ||||
| Component: | assisted-installer | Assignee: | Nir Magnezi <nmagnezi> | ||||
| assisted-installer sub component: | assisted-service | QA Contact: | Yuri Obshansky <yobshans> | ||||
| Status: | CLOSED ERRATA | Docs Contact: | |||||
| Severity: | high | ||||||
| Priority: | unspecified | CC: | aos-bugs, fpercoco | ||||
| Version: | 4.8 | Keywords: | Triaged | ||||
| Target Milestone: | --- | ||||||
| Target Release: | 4.9.0 | ||||||
| Hardware: | Unspecified | ||||||
| OS: | Unspecified | ||||||
| Whiteboard: | AI-Team-Hive KNI-EDGE-4.8 | ||||||
| Fixed In Version: | Doc Type: | If docs needed, set a value | |||||
| Doc Text: | Story Points: | --- | |||||
| Clone Of: | |||||||
| : | 1975805 (view as bug list) | Environment: | |||||
| Last Closed: | 2021-10-18 17:34:35 UTC | Type: | Bug | ||||
| Regression: | --- | Mount Type: | --- | ||||
| Documentation: | --- | CRM: | |||||
| Verified Versions: | Category: | --- | |||||
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |||||
| Cloudforms Team: | --- | Target Upstream Version: | |||||
| Embargoed: | |||||||
| Bug Depends On: | |||||||
| Bug Blocks: | 1975805 | ||||||
| Attachments: |
Description
Nir Magnezi
2021-06-16 10:03:59 UTC
I looked into this, here's the summary:

1. The environment was using an older version of Assisted Service, which was missing a couple of PRs.
2. After updating the assisted-service container (manually modified the Operator Subscription), I was able to retry a deployment (I will attach a screenshot of the BMH's events showing the deprovision/provision of the image when the deployment was retried).

@nmagnezi do you want to give this another go before closing the issue?

Created attachment 1791711
bmh events
BMH's events showing re-provision of an InfraEnv URL
(In reply to Flavio Percoco from comment #2)
> I looked into this, here's the summary:
>
> 1. The environment was using an older version of Assisted Service, which was
> missing a couple of PRs
> 2. After updating the assisted-service container (manually modified the
> Operator Subscription), I was able to retry a deployment (I will attach a
> screenshot of the BMH's events showing the deprovision/provision of the
> image when the deployment was retried).
>
> @nmagnezi do you want to give this another go before closing the
> issue?

I tried this again. What I see now is that the BMH clears the error status, but it didn't get the new image URL, so there was no re-install.

log: https://gist.github.com/nmagnezi/a49d34d6cf2a8cc0fc110621fde43642

Let's follow up to see whether I did something else, or something wrong, that caused this before we close this bug.

I attempted this again because I had simply forgotten to remove the 'detached' label. However, now it fails with a 404:
https://gist.github.com/nmagnezi/2395564774afa4f5a812ac5cf4e3c0db#file-bmh-yaml-L317

SVC log: https://gist.github.com/nmagnezi/d2f2040ce3d391a823b1e6b3f6bfc888#file-retry-go-L83

For QE: The solution here is to document how to retry an installation: the user needs to recreate both the BMH(s) and the ACI.

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA.

For information on the advisory (Moderate: OpenShift Container Platform 4.9.0 bug fix and security update), and where to find the updated files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2021:3759
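The retry procedure noted for QE above (recreate both the BMH(s) and the ACI) could be scripted roughly as follows. This is only a sketch under stated assumptions: the namespace, resource names, and manifest file names are hypothetical placeholders, and the delete-then-apply ordering is an assumption rather than something this bug report specifies. The script only builds and prints `oc` command lines; it does not run them.

```python
import shlex


def retry_commands(namespace, aci_name, bmh_names, bmh_manifest, aci_manifest):
    """Build the oc command lines for one retry attempt.

    All arguments are hypothetical placeholders supplied by the caller.
    The ordering (delete ACI, delete BMHs, re-apply BMHs, re-apply ACI)
    is an assumption, not taken from the bug report.
    """
    cmds = []
    # Remove the failed AgentClusterInstall (ACI).
    cmds.append(["oc", "delete", "agentclusterinstall", aci_name, "-n", namespace])
    # Remove every BareMetalHost (BMH) that took part in the failed install.
    for bmh in bmh_names:
        cmds.append(["oc", "delete", "baremetalhost", bmh, "-n", namespace])
    # Recreate the BMH(s) and then the ACI from their saved manifests.
    cmds.append(["oc", "apply", "-f", bmh_manifest, "-n", namespace])
    cmds.append(["oc", "apply", "-f", aci_manifest, "-n", namespace])
    return cmds


if __name__ == "__main__":
    # Example values only; replace with the names used in your environment.
    for cmd in retry_commands(
        namespace="example-ns",
        aci_name="example-aci",
        bmh_names=["example-bmh-0"],
        bmh_manifest="bmh.yaml",
        aci_manifest="aci.yaml",
    ):
        print(shlex.join(cmd))
```

Printing the commands instead of executing them leaves room to review each step, for example to confirm that the 'detached' label mentioned above has been removed before the BMH is recreated.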