| Summary: | OSP 8 to 9 upgrade with Director - failures with step 3.4.2. Installing Aodh | | |
|---|---|---|---|
| Product: | Red Hat OpenStack | Reporter: | Matt Flusche <mflusche> |
| Component: | rhosp-director | Assignee: | Sofer Athlan-Guyot <sathlang> |
| Status: | CLOSED NOTABUG | QA Contact: | Omri Hochman <ohochman> |
| Severity: | urgent | Docs Contact: | |
| Priority: | urgent | | |
| Version: | 9.0 (Mitaka) | CC: | augol, dbecker, emacchi, mburns, mflusche, morazi, rhel-osp-director-maint, tvignaud |
| Target Milestone: | --- | Keywords: | Triaged |
| Target Release: | 9.0 (Mitaka) | | |
| Hardware: | x86_64 | | |
| OS: | Linux | | |
| Whiteboard: | | | |
| Fixed In Version: | | Doc Type: | If docs needed, set a value |
| Doc Text: | | Story Points: | --- |
| Clone Of: | | Environment: | |
| Last Closed: | 2016-12-13 20:40:53 UTC | Type: | Bug |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | | Category: | --- |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |

Description of problem:

During an OSP 8 to OSP 9 upgrade (with Director), if the heat stack update in step "3.4.2. Installing Aodh" (Doc: Upgrading Red Hat OpenStack Platform) fails or hangs, several issues can occur.

- Can't list nested stacks from Director

    $ source ~stack/stackrc
    $ heat stack-list -n
    WARNING (shell) "heat stack-list" is deprecated, please use "openstack stack list" instead
    ERROR: could not convert string to float:

This issue is caused by a mismatch between the OSP8 and OSP9 templates, specifically in the NeutronTenantMtu heat parameter.

OSP8 heat template parameter:

    NeutronTenantMtu
      type: number
      default: 1400

OSP9 heat template parameter:

    NeutronTenantMtu
      type: string
      default: ''

Tracing this issue down: in the heat db I see a mismatch in the raw_template table for the nested stack associated with the OS::TripleO::Controller (or Compute) resources. The template data contains the (OSP8) type: number and the environment data contains a default (OSP9) null ('') value. This causes the following exception in heat-engine.log:

    2016-09-23 17:26:14.216 12320 ERROR oslo_messaging.rpc.dispatcher [req-64a7ef3c-16bc-4ca5-bee1-87b4abb8d678 - admin - default default] Exception during message handling: could not convert string to float:
    2016-09-23 17:26:14.216 12320 ERROR oslo_messaging.rpc.dispatcher Traceback (most recent call last):
    2016-09-23 17:26:14.216 12320 ERROR oslo_messaging.rpc.dispatcher   File "/usr/lib/python2.7/site-packages/oslo_messaging/rpc/dispatcher.py", line 138, in _dispatch_and_reply
    2016-09-23 17:26:14.216 12320 ERROR oslo_messaging.rpc.dispatcher     incoming.message))
    2016-09-23 17:26:14.216 12320 ERROR oslo_messaging.rpc.dispatcher   File "/usr/lib/python2.7/site-packages/oslo_messaging/rpc/dispatcher.py", line 183, in _dispatch
    2016-09-23 17:26:14.216 12320 ERROR oslo_messaging.rpc.dispatcher     return self._do_dispatch(endpoint, method, ctxt, args)
    2016-09-23 17:26:14.216 12320 ERROR oslo_messaging.rpc.dispatcher   File "/usr/lib/python2.7/site-packages/oslo_messaging/rpc/dispatcher.py", line 127, in _do_dispatch
    2016-09-23 17:26:14.216 12320 ERROR oslo_messaging.rpc.dispatcher     result = func(ctxt, **new_args)
    2016-09-23 17:26:14.216 12320 ERROR oslo_messaging.rpc.dispatcher   File "/usr/lib/python2.7/site-packages/osprofiler/profiler.py", line 117, in wrapper
    2016-09-23 17:26:14.216 12320 ERROR oslo_messaging.rpc.dispatcher     return f(*args, **kwargs)
    2016-09-23 17:26:14.216 12320 ERROR oslo_messaging.rpc.dispatcher   File "/usr/lib/python2.7/site-packages/heat/common/context.py", line 329, in wrapped
    2016-09-23 17:26:14.216 12320 ERROR oslo_messaging.rpc.dispatcher     return func(self, ctx, *args, **kwargs)
    2016-09-23 17:26:14.216 12320 ERROR oslo_messaging.rpc.dispatcher   File "/usr/lib/python2.7/site-packages/heat/engine/service.py", line 518, in show_stack
    2016-09-23 17:26:14.216 12320 ERROR oslo_messaging.rpc.dispatcher     stack, resolve_outputs=resolve_outputs) for stack in stacks]
    2016-09-23 17:26:14.216 12320 ERROR oslo_messaging.rpc.dispatcher   File "/usr/lib/python2.7/site-packages/heat/engine/api.py", line 221, in format_stack
    2016-09-23 17:26:14.216 12320 ERROR oslo_messaging.rpc.dispatcher     rpc_api.STACK_PARAMETERS: stack.parameters.map(six.text_type),
    2016-09-23 17:26:14.216 12320 ERROR oslo_messaging.rpc.dispatcher   File "/usr/lib/python2.7/site-packages/heat/engine/parameters.py", line 540, in map
    2016-09-23 17:26:14.216 12320 ERROR oslo_messaging.rpc.dispatcher     for n, p in six.iteritems(self.params) if filter_func(p))
    2016-09-23 17:26:14.216 12320 ERROR oslo_messaging.rpc.dispatcher   File "/usr/lib/python2.7/site-packages/heat/engine/parameters.py", line 540, in <genexpr>
    2016-09-23 17:26:14.216 12320 ERROR oslo_messaging.rpc.dispatcher     for n, p in six.iteritems(self.params) if filter_func(p))
    2016-09-23 17:26:14.216 12320 ERROR oslo_messaging.rpc.dispatcher   File "/usr/lib/python2.7/site-packages/heat/engine/parameters.py", line 290, in __str__
    2016-09-23 17:26:14.216 12320 ERROR oslo_messaging.rpc.dispatcher     value = self.value()
    2016-09-23 17:26:14.216 12320 ERROR oslo_messaging.rpc.dispatcher   File "/usr/lib/python2.7/site-packages/heat/engine/parameters.py", line 316, in value
    2016-09-23 17:26:14.216 12320 ERROR oslo_messaging.rpc.dispatcher     return Schema.str_to_num(super(NumberParam, self).value())
    2016-09-23 17:26:14.216 12320 ERROR oslo_messaging.rpc.dispatcher   File "/usr/lib/python2.7/site-packages/heat/engine/constraints.py", line 181, in str_to_num
    2016-09-23 17:26:14.216 12320 ERROR oslo_messaging.rpc.dispatcher     return float(value)
    2016-09-23 17:26:14.216 12320 ERROR oslo_messaging.rpc.dispatcher ValueError: could not convert string to float:
    2016-09-23 17:26:14.216 12320 ERROR oslo_messaging.rpc.dispatcher
    2016-09-23 17:26:14.218 12320 ERROR oslo_messaging._drivers.common [req-64a7ef3c-16bc-4ca5-bee1-87b4abb8d678 - admin - default default] Returning exception could not convert string to float: to caller

The next issue was observed when trying to run another deployment after the initial failure.

- Metadata seems to be broken on all the overcloud nodes. Any software deployment will hang indefinitely.

- I noticed that the os-collect-config configuration changes during this initial upgrade step. In OSP8 it uses a heat-cfn endpoint, and in OSP9 a swift container is accessed to pull heat software deployment data. It seems that something in the conversion may be causing issues here. In this broken condition, os-collect-config loops on the software deployment data and re-applies the same configuration every minute, even when no heat stack update is running.

Version-Release number of selected component (if applicable):
Current OSP 9 bits

How reproducible:
100%

Steps to Reproduce:
1. Create OSP8 deployment
2. Upgrade Director to OSP9 (or use OSP9 director to deploy OSP8 and skip this step).
3. Shut down os-collect-config on an overcloud node to create a deployment failure (timeout).
4. Run step 3.4.2. Installing Aodh from the upgrade guide.
5. Let deployment fail
6. Observe issue with: heat stack-list -n
7. Restart os-collect-config on overcloud node(s)
8. Restart deployment
9. Observe that heat software deployments will hang and never complete.

Actual results:
The deployment breaks and is not recoverable by re-applying the stack update after resolving the failure cause.

Expected results:
heat stack update should recover and complete the upgrade step.

Additional info:
I believe setting the NeutronTenantMtu parameter will resolve the "convert string to float" issue, but os-collect-config is still broken.
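
For illustration only, the NeutronTenantMtu type mismatch described above can be sketched in a few lines of Python. This is not Heat's code: str_to_num below is a simplified stand-in for the conversion named in the traceback, find_mismatched_numbers is a hypothetical helper, and the two dictionaries are hand-written approximations of the OSP8 template schema and the OSP9 environment data quoted in this report.

```python
def str_to_num(value):
    """Simplified stand-in for a string-to-number conversion (not Heat's code)."""
    try:
        return int(value)
    except ValueError:
        # An empty string fails here too, which is the error seen in the report.
        return float(value)


def find_mismatched_numbers(template_params, env_values):
    """Hypothetical helper: list number-typed parameters whose value is not numeric."""
    bad = []
    for name, schema in template_params.items():
        if schema.get("type") != "number":
            continue
        # The environment value wins over the template default when present.
        value = env_values.get(name, schema.get("default"))
        try:
            str_to_num(value)
        except (TypeError, ValueError):
            bad.append(name)
    return bad


# OSP8 template schema for the parameter, as quoted in the report.
osp8_template = {"NeutronTenantMtu": {"type": "number", "default": 1400}}
# OSP9 environment data carrying the new empty-string default.
osp9_environment = {"NeutronTenantMtu": ""}

mismatched = find_mismatched_numbers(osp8_template, osp9_environment)
print(mismatched)
```

With an explicit numeric value in the environment (for example "1400"), the same check reports nothing, which is consistent with the suggestion under Additional info that setting NeutronTenantMtu would avoid the conversion error.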