Bug 1303084
| Summary: | String values for comma_delimited_list parameters can fail stack-update | ||||||||
|---|---|---|---|---|---|---|---|---|---|
| Product: | Red Hat OpenStack | Reporter: | Jiri Stransky <jstransk> | ||||||
| Component: | openstack-heat | Assignee: | Crag Wolfe <cwolfe> | ||||||
| Status: | CLOSED ERRATA | QA Contact: | Jiri Stransky <jstransk> | ||||||
| Severity: | unspecified | Docs Contact: | |||||||
| Priority: | urgent | ||||||||
| Version: | 8.0 (Liberty) | CC: | cwolfe, jcoufal, jschluet, jstransk, mburns, mlopes, rhel-osp-director-maint, sbaker, shardy, yeylon, zbitter | ||||||
| Target Milestone: | ga | ||||||||
| Target Release: | 8.0 (Liberty) | ||||||||
| Hardware: | Unspecified | ||||||||
| OS: | Unspecified | ||||||||
| Whiteboard: | |||||||||
| Fixed In Version: | openstack-heat-5.0.1-3.el7ost | Doc Type: | Bug Fix | ||||||
| Doc Text: | Previously, heat attempted to validate old property values against the current property definitions. Consequently, during director upgrades where a property definition changed type, the process failed with a 'TypeError' when heat tried to validate the old property value. With this fix, heat no longer tries to validate old property values. As a result, heat now handles property schema definition changes gracefully by validating only new property values. | Story Points: | --- | ||||||
| Clone Of: | Environment: | ||||||||
| Last Closed: | 2016-04-07 21:26:49 UTC | Type: | Bug | ||||||
| Regression: | --- | Mount Type: | --- | ||||||
| Documentation: | --- | CRM: | |||||||
| Verified Versions: | Category: | --- | |||||||
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |||||||
| Cloudforms Team: | --- | Target Upstream Version: | |||||||
| Embargoed: | |||||||||
| Attachments: | |||||||||

Description
Jiri Stransky
2016-01-29 13:56:41 UTC
Another parameter where this happens in TripleO is NeutronTunnelIdRanges or NeutronVniRanges. On the first upgrade attempt I see:
2016-02-01 09:40:24 [overcloud-Controller-m2dyrff7g74p]: UPDATE_FAILED TypeError: resources[0]: "u'1:1000'" is not a list
2016-02-01 09:40:24 [0]: UPDATE_FAILED TypeError: resources[0]: "u'1:1000'" is not a list
2016-02-01 09:40:25 [Controller]: UPDATE_FAILED resources.Controller: TypeError: resources[0]: "u'1:1000'" is not a list
2016-02-01 09:40:25 [overcloud-Compute-yaggukg6qibn]: UPDATE_FAILED TypeError: resources[0]: "u'1:1000'" is not a list
2016-02-01 09:40:26 [Compute]: UPDATE_FAILED resources.Compute: TypeError: resources[0]: "u'1:1000'" is not a list
2016-02-01 09:40:27 [BlockStorage]: UPDATE_COMPLETE state changed
2016-02-01 09:40:27 [overcloud]: UPDATE_FAILED resources.Controller: TypeError: resources[0]: "u'1:1000'" is not a list
However, this error goes away on the second update attempt. When I run heat stack-show on the overcloud between the attempts, I see:
"NeutronTunnelIdRanges": "[u'1:1000']",
"NeutronVniRanges": "[u'1:1000']",
so it looks like Heat has updated these to arrays by itself, and the second upgrade attempt doesn't stop on this issue.
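To make the type mismatch concrete, here is a minimal sketch of normalising a comma-delimited value. It is not Heat's code: `as_comma_delimited_list` is a hypothetical helper written only for illustration, and it assumes a plain comma-separated string is the old-style form. It shows that the old string `1:1000` and the new list `['1:1000']` carry the same data, so the failure is about the stored type rather than the content.

```python
def as_comma_delimited_list(value):
    """Hypothetical helper: normalise a string or list into a list of strings."""
    if isinstance(value, str):
        # Split "a,b" into ["a", "b"]; empty items are dropped.
        return [item.strip() for item in value.split(",") if item.strip()]
    if isinstance(value, (list, tuple)):
        return [str(item) for item in value]
    raise TypeError("%r is not a string or list" % (value,))


old_style = as_comma_delimited_list("1:1000")    # value stored before the upgrade
new_style = as_comma_delimited_list(["1:1000"])  # value expected after the upgrade
```

Both calls produce the same list, which matches the observation that the second update attempt passes once the stored value has become a list.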
A corresponding stack trace from heat-engine.log:
2016-02-01 04:40:24.937 15707 DEBUG heat.engine.scheduler [-] Task _resource_update from Stack "overcloud" [58d13951-8a07-44be-8134-8b49e76aea04] Update running step /usr/lib/python2.7/site-packages/heat/engine/scheduler.py:214
2016-02-01 04:40:24.937 15707 DEBUG heat.engine.scheduler [-] Task _run_to_completion from ResourceGroup "Controller" [22b083a5-6b63-48fa-bf62-1e57b75b0731] Stack "overcloud" [58d13951-8a07-44be-8134-8b49e76aea04] running step /usr/lib/python2.7/site-packages/heat/engine/scheduler.py:214
2016-02-01 04:40:24.974 15707 INFO heat.engine.resource [-] UPDATE: ResourceGroup "Controller" [22b083a5-6b63-48fa-bf62-1e57b75b0731] Stack "overcloud" [58d13951-8a07-44be-8134-8b49e76aea04]
2016-02-01 04:40:24.974 15707 ERROR heat.engine.resource Traceback (most recent call last):
2016-02-01 04:40:24.974 15707 ERROR heat.engine.resource File "/usr/lib/python2.7/site-packages/heat/engine/resource.py", line 620, in _action_recorder
2016-02-01 04:40:24.974 15707 ERROR heat.engine.resource yield
2016-02-01 04:40:24.974 15707 ERROR heat.engine.resource File "/usr/lib/python2.7/site-packages/heat/engine/resource.py", line 949, in update
2016-02-01 04:40:24.974 15707 ERROR heat.engine.resource prop_diff])
2016-02-01 04:40:24.974 15707 ERROR heat.engine.resource File "/usr/lib/python2.7/site-packages/heat/engine/scheduler.py", line 309, in wrapper
2016-02-01 04:40:24.974 15707 ERROR heat.engine.resource step = next(subtask)
2016-02-01 04:40:24.974 15707 ERROR heat.engine.resource File "/usr/lib/python2.7/site-packages/heat/engine/resource.py", line 664, in action_handler_task
2016-02-01 04:40:24.974 15707 ERROR heat.engine.resource while not check(handler_data):
2016-02-01 04:40:24.974 15707 ERROR heat.engine.resource File "/usr/lib/python2.7/site-packages/heat/engine/resources/openstack/heat/resource_group.py", line 400, in check_update_complete
2016-02-01 04:40:24.974 15707 ERROR heat.engine.resource if not checker.step():
2016-02-01 04:40:24.974 15707 ERROR heat.engine.resource File "/usr/lib/python2.7/site-packages/heat/engine/scheduler.py", line 217, in step
2016-02-01 04:40:24.974 15707 ERROR heat.engine.resource next(self._runner)
2016-02-01 04:40:24.974 15707 ERROR heat.engine.resource File "/usr/lib/python2.7/site-packages/heat/engine/resources/openstack/heat/resource_group.py", line 388, in _run_to_completion
2016-02-01 04:40:24.974 15707 ERROR heat.engine.resource self).check_update_complete(updater):
2016-02-01 04:40:24.974 15707 ERROR heat.engine.resource File "/usr/lib/python2.7/site-packages/heat/engine/resources/stack_resource.py", line 442, in check_update_complete
2016-02-01 04:40:24.974 15707 ERROR heat.engine.resource cookie=cookie)
2016-02-01 04:40:24.974 15707 ERROR heat.engine.resource File "/usr/lib/python2.7/site-packages/heat/engine/resources/stack_resource.py", line 372, in _check_status_complete
2016-02-01 04:40:24.974 15707 ERROR heat.engine.resource action=action)
2016-02-01 04:40:24.974 15707 ERROR heat.engine.resource ResourceFailure: resources.Controller: TypeError: resources[0]: "u'1:1000'" is not a list
2016-02-01 04:40:24.974 15707 ERROR heat.engine.resource

Not yet proven, but may be worth trying this patch: https://review.openstack.org/#/c/275544/1/heat/engine/properties.py

Created attachment 1121612 [details]
Nova 404 errors seen with last deploy attempt

I was able to reproduce the issue. Also, I tried the patch to properties.py from Comment 3 on a fresh install (i.e., first attempt at "openstack overcloud deploy" in the environment) and that seemed to fix the issue. However, I ran into another issue that I don't think is related during my last deploy attempt -- uploaded relevant details in previous comment.

I'm pretty sure this is already resolved by the fix for bug 1310879. (I also left a comment to this effect on the upstream bug.) That also explains why it doesn't fail every time. It should happen when:
* There was a previous update failure; and
* Said update failure occurred _prior_ to the resource in question being updated
Can you retest with the openstack-heat-5.0.1-2.el7ost build to confirm that it's fixed?

Retested with unmodified openstack-heat-engine-5.0.1-2.el7ost.noarch and got the error again:
2016-03-01 15:08:47 [overcloud-Controller-r62lbjjc6oda]: UPDATE_IN_PROGRESS Stack UPDATE started
2016-03-01 15:08:47 [0]: UPDATE_IN_PROGRESS state changed
2016-03-01 15:08:47 [0]: UPDATE_FAILED TypeError: resources[0]: "u'1:1000'" is not a list
2016-03-01 15:08:48 [overcloud-BlockStorage-pq2x3sffw2if]: UPDATE_COMPLETE Stack UPDATE completed successfully
2016-03-01 15:08:48 [overcloud-Controller-r62lbjjc6oda]: UPDATE_FAILED TypeError: resources[0]: "u'1:1000'" is not a list
2016-03-01 15:08:48 [overcloud-Compute-armrkngq7g4u]: UPDATE_IN_PROGRESS Stack UPDATE started
2016-03-01 15:08:48 [0]: UPDATE_IN_PROGRESS state changed
2016-03-01 15:08:48 [0]: UPDATE_FAILED TypeError: resources[0]: "u'1:1000'" is not a list
2016-03-01 15:08:50 [overcloud-Compute-armrkngq7g4u]: UPDATE_FAILED TypeError: resources[0]: "u'1:1000'" is not a list
Stack overcloud UPDATE_FAILED
Heat Stack update failed.
Looks like we do need Crag's patch too. After applying the patch and restarting heat-engine, I no longer hit this error.

Ah, I've just noticed that in the Launchpad bug the exception is ValueError, but here it is TypeError. We may just need to catch both. Can you paste the traceback of the exception from heat-engine.log?

Actually, I'll assume it's the same traceback from comment #2. It looks like we may have a couple of different errors here, one of which should already be fixed and the other we're still hitting.

Ignore comment #9; the traceback in comment #2 is from the parent stack, so we can't actually see what the proximate cause is. It's still likely to be the same issue but with TypeError instead of ValueError. Can you upload the whole section of the log file covering the update that failed?

Created attachment 1132015 [details]
entire heat-engine.log with upgrade error
Stack trace from attached log:
2016-03-01 14:12:14.662 29747 INFO heat.engine.resource [-] UPDATE: TemplateResource "0" [17065956-ece1-4717-a38b-d67a986c312c] Stack "overcloud-Controller-rnd6grj6l3t7" [e7e9a121-50fb-467e-b140-2a2ad0d22b12]
2016-03-01 14:12:14.662 29747 ERROR heat.engine.resource Traceback (most recent call last):
2016-03-01 14:12:14.662 29747 ERROR heat.engine.resource File "/usr/lib/python2.7/site-packages/heat/engine/resource.py", line 638, in _action_recorder
2016-03-01 14:12:14.662 29747 ERROR heat.engine.resource yield
2016-03-01 14:12:14.662 29747 ERROR heat.engine.resource File "/usr/lib/python2.7/site-packages/heat/engine/resource.py", line 965, in update
2016-03-01 14:12:14.662 29747 ERROR heat.engine.resource before_props)
2016-03-01 14:12:14.662 29747 ERROR heat.engine.resource File "/usr/lib/python2.7/site-packages/heat/engine/resource.py", line 512, in update_template_diff_properties
2016-03-01 14:12:14.662 29747 ERROR heat.engine.resource changed_properties_set = set(k for k in after_props if prop_changed(k))
2016-03-01 14:12:14.662 29747 ERROR heat.engine.resource File "/usr/lib/python2.7/site-packages/heat/engine/resource.py", line 512, in <genexpr>
2016-03-01 14:12:14.662 29747 ERROR heat.engine.resource changed_properties_set = set(k for k in after_props if prop_changed(k))
2016-03-01 14:12:14.662 29747 ERROR heat.engine.resource File "/usr/lib/python2.7/site-packages/heat/engine/resource.py", line 497, in prop_changed
2016-03-01 14:12:14.662 29747 ERROR heat.engine.resource before = before_props.get(key)
2016-03-01 14:12:14.662 29747 ERROR heat.engine.resource File "/usr/lib64/python2.7/_abcoll.py", line 363, in get
2016-03-01 14:12:14.662 29747 ERROR heat.engine.resource return self[key]
2016-03-01 14:12:14.662 29747 ERROR heat.engine.resource File "/usr/lib/python2.7/site-packages/heat/engine/properties.py", line 456, in __getitem__
2016-03-01 14:12:14.662 29747 ERROR heat.engine.resource return self._get_property_value(key)
2016-03-01 14:12:14.662 29747 ERROR heat.engine.resource File "/usr/lib/python2.7/site-packages/heat/engine/properties.py", line 451, in _get_property_value
2016-03-01 14:12:14.662 29747 ERROR heat.engine.resource return prop.get_value(None, validate)
2016-03-01 14:12:14.662 29747 ERROR heat.engine.resource File "/usr/lib/python2.7/site-packages/heat/engine/properties.py", line 326, in get_value
2016-03-01 14:12:14.662 29747 ERROR heat.engine.resource _value = self._get_list(value, validate)
2016-03-01 14:12:14.662 29747 ERROR heat.engine.resource File "/usr/lib/python2.7/site-packages/heat/engine/properties.py", line 296, in _get_list
2016-03-01 14:12:14.662 29747 ERROR heat.engine.resource raise TypeError(_('"%s" is not a list') % repr(value))
2016-03-01 14:12:14.662 29747 ERROR heat.engine.resource TypeError: "u'1:1000'" is not a list
2016-03-01 14:12:14.662 29747 ERROR heat.engine.resource
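
The traceback shows the error being raised while the old ("before") property value is read during the property diff, not while the new value is applied. The following is a minimal sketch of that failure mode and of the behaviour the Doc Text describes (old values no longer validated). It is a toy model, not Heat's implementation: `ListProperty` and the `validate_before` flag are assumptions made for illustration, and only the name `prop_changed` is borrowed from the traceback.

```python
class ListProperty:
    """Toy stand-in for a property whose schema type is a list (not Heat's class)."""

    def get_value(self, value, validate=True):
        # Reject anything that is not a list, but only when validation is requested.
        if validate and not isinstance(value, list):
            raise TypeError('"%s" is not a list' % repr(value))
        return value


def prop_changed(prop, before, after, validate_before):
    """Compare the stored (old) and incoming (new) values of one property."""
    old = prop.get_value(before, validate=validate_before)
    new = prop.get_value(after, validate=True)  # new values are always validated
    return old != new


prop = ListProperty()
stored = "1:1000"      # saved under the old definition, where the type was a string
incoming = ["1:1000"]  # supplied under the new definition, where the type is a list

# Validating the old value against the new schema reproduces the TypeError.
try:
    prop_changed(prop, stored, incoming, validate_before=True)
    failed = False
except TypeError:
    failed = True

# Skipping validation of the old value lets the diff complete and report a change.
changed = prop_changed(prop, stored, incoming, validate_before=False)
```

In this sketch the property is simply reported as changed once the old value is read without validation, so the update can proceed and store the new list value.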

This looks like the better approach: https://review.openstack.org/#/c/286874/1 -- works for me.

openstack-heat-5.0.1-3.el7ost works for me too, no patching needed anymore, thanks!

I ran overcloud major upgrade several times with openstack-heat-5.0.1-3.el7ost and never hit this issue again. Marking verified.

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.
https://rhn.redhat.com/errata/RHEA-2016-0603.html