Bug 1569293
| Summary: | Need to add deleted compute nodes back to the overcloud stack in the undercloud | | |
|---|---|---|---|
| Product: | [Community] RDO | Reporter: | David Manchado <dmanchad> |
| Component: | openstack-tripleo | Assignee: | James Slagle <jslagle> |
| Status: | CLOSED EOL | QA Contact: | Shai Revivo <srevivo> |
| Severity: | urgent | Docs Contact: | |
| Priority: | unspecified | | |
| Version: | Ocata | CC: | kforde |
| Target Milestone: | --- | | |
| Target Release: | trunk | | |
| Hardware: | x86_64 | | |
| OS: | Linux | | |
| Whiteboard: | | | |
| Fixed In Version: | | Doc Type: | If docs needed, set a value |
| Doc Text: | | Story Points: | --- |
| Clone Of: | | Environment: | |
| Last Closed: | 2022-06-15 20:08:35 UTC | Type: | Bug |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | | Category: | --- |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | | | |
Description
David Manchado
2018-04-19 00:29:18 UTC
We have an undercloud snapshot taken before the last minor update (3 weeks ago). I think restoring it and then running a minor update would be the safest recovery path. We replicated the issue on staging, and reverting to the snapshot seems to do the trick, so we might do the same on the production environment. After restoring from the snapshot, everything seems to be OK (checked with openstack baremetal node list and openstack server list).
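For reference, the "everything seems to be OK" check could be made a bit more systematic by cross-checking the two listings against each other. The following is only a rough sketch, not something we actually ran: it assumes both commands are run with `-f json`, and that the resulting records carry an `Instance UUID` key (baremetal nodes) and an `ID` key (servers). The function name `cross_check` and the result keys are made up for illustration.

```python
def cross_check(baremetal_nodes, servers):
    """Compare Ironic nodes against Nova servers.

    Both arguments are lists of dicts, e.g. the parsed output of
    `openstack baremetal node list -f json` and
    `openstack server list -f json` (key names are an assumption).
    """
    server_ids = {s["ID"] for s in servers}
    # Nodes without a deployed instance have an empty/None Instance UUID.
    node_instances = {
        n["Instance UUID"] for n in baremetal_nodes if n.get("Instance UUID")
    }
    return {
        # Instances that a node points at but Nova does not list.
        "instances_missing_from_nova": sorted(node_instances - server_ids),
        # Servers Nova lists that no baremetal node is associated with.
        "servers_without_node": sorted(server_ids - node_instances),
    }
```

Both lists coming back empty would suggest the restored undercloud has a consistent view of nodes and servers; it says nothing about the state of the Heat stack itself.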
Since the snapshot was taken right before the last minor update, we then ran a minor update on the undercloud, which succeeded.
We want to run a deploy to confirm everything is OK before moving on, but we have hit the following issues:
overcloud.Controller.1.UpdateDeployment:
  resource_type: OS::Heat::SoftwareDeployment
  physical_resource_id: 4873582f-4633-42fa-bac5-d3cb6b3bb65d
  status: UPDATE_FAILED
  status_reason: |
    UPDATE aborted
  deploy_stdout: |
    Started yum_update.sh on server 7de8d1ee-7cc9-4811-a3a1-5f878469feb4 at Thu Jan 25 10:05:03 UTC 2018
    Not running due to unset update_identifier
  deploy_stderr: |
overcloud.Controller.0.UpdateDeployment:
  resource_type: OS::Heat::SoftwareDeployment
  physical_resource_id: 2bdfe429-726b-49af-a303-3870ad2c2848
  status: UPDATE_FAILED
  status_reason: |
    UPDATE aborted
  deploy_stdout: |
    Started yum_update.sh on server 60e7413f-4ff9-45ff-a50c-645be4610d7f at Thu Jan 25 10:05:59 UTC 2018
    Not running due to unset update_identifier
  deploy_stderr: |
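In case it helps with triage, here is a minimal sketch of how a failure listing like the one above could be parsed to pick out the deployments that report "Not running due to unset update_identifier". It assumes the text has exactly the shape quoted above (a resource name line followed by the six fields shown); the helper names `parse_failures` and `skipped_updates` are hypothetical, and a block line that happens to look like one of the known fields would be misread.

```python
import re

# Field names as they appear in the output quoted above.
KNOWN_KEYS = {
    "resource_type", "physical_resource_id", "status",
    "status_reason", "deploy_stdout", "deploy_stderr",
}
KEY_RE = re.compile(r"^(\w+):\s*(.*)$")


def parse_failures(text):
    """Parse the failure listing into {resource_name: {field: value}}."""
    failures = {}
    current = None      # fields of the resource being parsed
    block_key = None    # field currently collecting block ("|") lines
    for raw in text.splitlines():
        line = raw.strip()  # tolerate output with or without indentation
        if not line:
            continue
        # A resource header is a single token ending in ":" (no spaces).
        if line.endswith(":") and " " not in line:
            current = failures.setdefault(line[:-1], {})
            block_key = None
            continue
        if current is None:
            continue
        match = KEY_RE.match(line)
        if match and match.group(1) in KNOWN_KEYS:
            key, value = match.groups()
            if value == "|":
                current[key] = ""   # block field, content follows
                block_key = key
            else:
                current[key] = value
                block_key = None
        elif block_key is not None:
            sep = "\n" if current[block_key] else ""
            current[block_key] += sep + line
    return failures


def skipped_updates(failures):
    """Names of resources whose stdout reports an unset update_identifier."""
    marker = "Not running due to unset update_identifier"
    return sorted(
        name for name, fields in failures.items()
        if marker in fields.get("deploy_stdout", "")
    )
```

Run against the two entries above, `skipped_updates(parse_failures(text))` should list both Controller UpdateDeployment resources, which matches what we see: yum_update.sh starts on each controller and exits without doing anything.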
So just wondering:
* Should a deploy be expected to succeed in this situation?
* Should we go for an overcloud minor update?
* Should we run openstack overcloud deploy --update-plan-only first and then the deploy?