Bug 1218692
Summary: | Openstack-Heat: Attempting to scale up the overcloud with more compute nodes ending with : UPDATE_FAILED . | ||
---|---|---|---|
Product: | Red Hat OpenStack | Reporter: | Omri Hochman <ohochman> |
Component: | openstack-heat | Assignee: | Zane Bitter <zbitter> |
Status: | CLOSED ERRATA | QA Contact: | Amit Ugol <augol> |
Severity: | high | Docs Contact: | |
Priority: | high | ||
Version: | 7.0 (Kilo) | CC: | ddomingo, jprovazn, kbasil, mburns, rlandy, rybrown, sasha, sbaker, shardy, yeylon |
Target Milestone: | ga | Keywords: | Triaged |
Target Release: | 7.0 (Kilo) | ||
Hardware: | x86_64 | ||
OS: | Linux | ||
Whiteboard: | |||
Fixed In Version: | openstack-heat-2015.1.0-3.el7ost | Doc Type: | Bug Fix |
Doc Text: |
In previous releases, changes to the absolute path of a template for a template resource (as in, a resource implicitly backed by a stack) were not recognized by the Orchestration service. This prevented nested stacks backing a template resource from being updated whenever that resource's template was renamed or moved.
With this release, the Orchestration service can now detect such changes, thereby ensuring that nested stacks are updated accordingly.
|
Last Closed: | 2015-08-05 13:23:15 UTC | Type: | Bug |
Description
Omri Hochman
2015-05-05 15:06:28 UTC
So the exception at least looks very similar to the one in bug 1215511, but in this case you're definitely using python-heatclient 0.5.0. And I assume you also have the fix for bug 1212625, because that has been in the midstream for a while. So there must be something else going on :/

Database spelunking confirms that this is *not* the same as bug 1215511 - the templates for both the regular and backup stacks contain the correct key in the files section. Inspection of the code also confirms that the fix for bug 1212625 is indeed present.

Steve Baker pointed out that this could be caused by bug 1212740. Before the tripleo-common ScaleManager calls stacks.update it needs to prepare the environment and files by calling template_utils.get_template_contents and template_utils.process_multiple_environments_and_files

http://git.openstack.org/cgit/openstack/python-heatclient/tree/heatclient/v1/shell.py#n449

Otherwise files will be missing in the request, and file paths will not be normalised.

As a (nasty) workaround for BZ 1212740 scale-out code does filename replacement before sending data to heat. Albeit this is not nice, this worked with older rpms last week.
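To make the "files missing, paths not normalised" point concrete, here is a minimal stdlib-only sketch of the kind of preparation described above. It is an illustrative stand-in, not python-heatclient's implementation: the helper names (collect_get_files, prepare_update, demo), the sample template and the placeholder file content are all assumptions made for the example.

```python
import tempfile
from pathlib import Path
from urllib.parse import unquote, urljoin, urlparse


def collect_get_files(node, base_url, files):
    """Rewrite get_file paths to absolute file:// URLs and load their content.

    Hypothetical helper for illustration only (not heatclient code).
    """
    if isinstance(node, dict):
        for key, value in node.items():
            if key == "get_file" and isinstance(value, str):
                # Resolve the path relative to the template's own location.
                url = urljoin(base_url, value)
                files[url] = Path(unquote(urlparse(url).path)).read_text()
                # The template now refers to the same key used in `files`.
                node[key] = url
            else:
                collect_get_files(value, base_url, files)
    elif isinstance(node, list):
        for item in node:
            collect_get_files(item, base_url, files)


def prepare_update(template, template_path):
    """Return (template, files) with matching, normalised get_file keys."""
    files = {}
    base_url = Path(template_path).resolve().as_uri()
    collect_get_files(template, base_url, files)
    return template, files


def demo():
    """Build a throwaway template tree and prepare it (sample data only)."""
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp)
        (root / "hieradata").mkdir()
        (root / "hieradata" / "ceph.yaml").write_text("ceph: {}\n")
        template = {"resources": {"ControllerConfig": {"properties": {
            "config": {"get_file": "hieradata/ceph.yaml"}}}}}
        return prepare_update(template, root / "plan.yaml")
```

Calling demo() returns a template whose get_file value is an absolute file:// URL, plus a files dict keyed by that same URL. If the template and the files dict are prepared against different directories, the keys no longer match.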
I will submit a simple patch which temporarily saves template files so get_template_contents can be used to process/prepare them, but I'm not sure the regression is caused by this. If I do a simple test:

1) instack-deploy-overcloud --tuskar
2) cp -a /home/stach/tuskar_templates /home/stach/tuskar_templates.2
3) heat stack-update -f tuskar_templates.2/plan.yaml -e tuskar_templates.2/environment.yaml overcloud

Then stack update fails anyway, in stack events I can see:

| Controller | 2dc68516-f3e9-4757-bb2d-04e9d5080dba | ResourceUnknownStatus: Resource failed - Unknown status FAILED due to "StackValidationFailed: Property error : : resources.ControllerConfig.properties.config: : No content found in the "files" section for get_file path: file:///home/stack/tuskar_template | UPDATE_FAILED | 2015-05-06T07:14:58Z |
| Compute | 91d3476d-68b7-4394-b48d-08791ff140e1 | ResourceUnknownStatus: Resource failed - Unknown status FAILED due to "StackValidationFailed: No content found in the "files" section for get_file path: file:///home/stack/tuskar_templates/hieradata/ceph.yaml" | UPDATE_FAILED | 2015-05-06T07:15:02Z |

I've updated the scale out code to use get_template_contents and process_multiple_environments_and_files to make sure the problem is not on client side and I get the same error as in Comment #7.

So it looks like bug 1212625 is not fixed, even with the patch in place. It's again looking for the old path (tuskar_templates, not tuskar_templates.2) and adding extra debug shows that it's doing so from a files collection where everything is named with the new path.

OK, we tracked down the problem. In this case, it was that we were passing an old version of a template (with old normalised paths) to stack update for a template resource (which the members of the ResourceGroups are).
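The mismatch can be reproduced in a toy model: a template that still carries the old normalised path is validated against a files collection keyed by the new path. This is only a sketch of the symptom, not Heat's code. The exception class and resolve_get_file are hypothetical stand-ins, the tuskar_templates.2 URL is constructed for the example, and the file content is a placeholder.

```python
class StackValidationFailed(Exception):
    """Stand-in for the error named in the stack events (not Heat's class)."""


def resolve_get_file(path, files):
    """Toy get_file lookup: return the content registered under `path`."""
    if path not in files:
        raise StackValidationFailed(
            'No content found in the "files" section for get_file path: %s'
            % path)
    return files[path]


# Path seen in the events above, and an assumed equivalent in the copied tree.
OLD = "file:///home/stack/tuskar_templates/hieradata/ceph.yaml"
NEW = "file:///home/stack/tuskar_templates.2/hieradata/ceph.yaml"

# The update request only carries files named with the new path.
files = {NEW: "ceph: {}\n"}


def try_resolve(path):
    """Return the file content, or the validation error text on failure."""
    try:
        return resolve_get_file(path, files)
    except StackValidationFailed as exc:
        return str(exc)
```

Here try_resolve(NEW) returns the placeholder content, while try_resolve(OLD) returns the "No content found" message, even though every entry in files is correctly named for the new location.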
Basically TemplateResource failed to account for the possibility that the template name had changed in the environment on an update, and the bug was further masked by the fact that it falls back to just using the current template when it really shouldn't.

So this is completely independent of bug 1212625, despite having almost identical symptoms, and fixes for both are required.

Fix posted upstream: https://review.openstack.org/#/c/180843/

Backport to Kilo proposed upstream and cherry-picked into rdo-management mgt-kilo branch.

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHEA-2015:1548