Bug 1392995
| Summary: | Replacing a Ceph storage node fails with StackValidationFailed: resources.CephStorageAllNodesDeployment: Property error: CephStorageAllNodesDeployment.Properties.input_values: The Referenced Attribute (CephStorage resource.0.hostname) is incorrect. | | |
|---|---|---|---|
| Product: | Red Hat OpenStack | Reporter: | Marius Cornea <mcornea> |
| Component: | openstack-tripleo-heat-templates | Assignee: | Steven Hardy <shardy> |
| Status: | CLOSED ERRATA | QA Contact: | Marius Cornea <mcornea> |
| Severity: | urgent | Docs Contact: | |
| Priority: | unspecified | | |
| Version: | 10.0 (Newton) | CC: | bcrochet, brad, dbecker, jcoufal, jefbrown, jschluet, jslagle, mburns, mcornea, morazi, pgrist, rhel-osp-director-maint, sasha, sclewis, shardy |
| Target Milestone: | rc | Keywords: | Triaged |
| Target Release: | 10.0 (Newton) | | |
| Hardware: | Unspecified | | |
| OS: | Unspecified | | |
| Whiteboard: | | | |
| Fixed In Version: | openstack-tripleo-heat-templates-5.0.0-1.7.el7ost | Doc Type: | If docs needed, set a value |
| Doc Text: | | Story Points: | --- |
| Clone Of: | | Environment: | |
| Last Closed: | 2016-12-14 16:31:00 UTC | Type: | Bug |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | | Category: | --- |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | | | |
| Attachments: | Logs and templates (attachment 1218844) | | |
Description (Marius Cornea, 2016-11-08 16:20:19 UTC)
Can you provide:

- all your custom templates
- heat-api.log and heat-engine.log from the undercloud
- the plan contents (download the overcloud container contents from Swift and tgz that)

I'd also be interested in which Ceph node the UUID 03915d83-6026-4a4f-9e93-a3807c9e0d8e corresponds to. Is it the first one? Does the issue reproduce if you try to delete the last Ceph node instead?

Also, for OSP 10, I don't think you have to pass --templates and all the -e's to the node delete command.

(In reply to James Slagle from comment #2)
> Also, for OSP 10, I don't think you have to pass --templates and all the
> -e's to the node delete command.

Brad, can you confirm this bit ^?

(In reply to James Slagle from comment #3)
> Brad, can you confirm this bit ^?

Checked with him on IRC and he confirmed that you don't need to pass --templates or the -e's anymore to the openstack overcloud node delete command (a sketch of the simplified invocation is shown below).
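As an illustration of that simplified invocation, here is a minimal sketch assuming a stack named overcloud and a placeholder <NOVA_SERVER_UUID> for the Ceph node being removed; check the exact flags against your installed python-tripleoclient version.

```
# Delete an overcloud node without re-passing --templates or the -e
# environment files; the plan already stored on the undercloud is reused.
openstack overcloud node delete --stack overcloud <NOVA_SERVER_UUID>
```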
Created attachment 1218844 [details]: Logs and templates
This is because we now set the bootstrap node for all roles (to enable deployment of any puppet profile which expects to detect the first node in the cluster, aka the bootstrap node). Previously only the Controller set this, but now we have a hard-coded reference to node "0" here in the overcloud template: https://github.com/openstack/tripleo-heat-templates/blob/master/overcloud.j2.yaml#L234

    input_values:
      bootstrap_nodeid: {get_attr: [{{role.name}}, resource.0.hostname]}
      bootstrap_nodeid_ip: {get_attr: [{{role.name}}, resource.0.ip_address]}

We need some way for the node delete workflow to change this index when replacing node "0", or another way to detect the first node in the group without using the node name. This looks like an index, but I think it's referring to the resource name in the resource group, so it would be e.g. "1" after this removal; ideally we'd use a list lookup here instead, and perhaps that's a possible way to fix this (a sketch of that idea is included at the end of this report).

https://review.openstack.org/#/c/395699/ posted upstream, which I believe resolves this issue; I've done some local testing but feedback is welcome.

(In reply to Steven Hardy from comment #7)
> https://review.openstack.org/#/c/395699/ posted upstream which I believe
> resolves this issue, done some local testing but feedback welcome.

Tested it on my env as well and it looks good.

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.

https://rhn.redhat.com/errata/RHEA-2016-2948.html
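For reference, here is a minimal sketch of the list-lookup idea mentioned in the analysis above: instead of addressing the group member named "0", read the role's hostname and ip_address attributes as lists from the ResourceGroup and take the first entry with Heat's yaql function. This illustrates the approach under those assumptions; it is not necessarily the exact change merged in the upstream review.

```yaml
# Sketch: take the first element of the group's attribute list rather than
# hard-coding resource.0, so the reference survives removal of node "0".
input_values:
  bootstrap_nodeid:
    yaql:
      expression: coalesce($.data, []).first(null)
      data: {get_attr: [{{role.name}}, hostname]}
  bootstrap_nodeid_ip:
    yaql:
      expression: coalesce($.data, []).first(null)
      data: {get_attr: [{{role.name}}, ip_address]}
```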