Bug 1259255 - heat results in error state when parameter changes and there is an underlying issue during stack-update
Status: CLOSED ERRATA
Product: Red Hat OpenStack
Classification: Red Hat
Component: openstack-heat
Version: 6.0 (Juno)
Hardware: Unspecified
OS: Unspecified
Priority: medium
Severity: medium
Target Milestone: ---
Target Release: 6.0 (Juno)
Assigned To: Zane Bitter
QA Contact: Amit Ugol
Keywords: Triaged, ZStream
Duplicates: 1263091
Reported: 2015-09-02 05:52 EDT by Martin Schuppert
Modified: 2016-04-26 09:32 EDT (History)
Fixed In Version: openstack-heat-2014.2.3-9.el7ost
Doc Type: Bug Fix
Last Closed: 2016-01-14 08:53:05 EST
Type: Bug


Attachments
orig template (1.11 KB, text/plain), 2015-09-02 05:55 EDT, Martin Schuppert
orig params (46 bytes, text/plain), 2015-09-02 05:55 EDT, Martin Schuppert
stack change (1.05 KB, text/plain), 2015-09-02 05:55 EDT, Martin Schuppert
stack change params (29 bytes, text/plain), 2015-09-02 05:56 EDT, Martin Schuppert


External Trackers
OpenStack gerrit 230155

Description Martin Schuppert 2015-09-02 05:52:07 EDT
Description of problem:

When the parameters of a stack change and the stack-update fails because of an underlying issue, e.g. in nova, heat remains in an error state and heat stack-list returns an error message until the problem stack is deleted.

Version-Release number of selected component (if applicable):
OSP6
openstack-heat-api-2014.2.3-1.el7ost
openstack-heat-engine-2014.2.3-1.el7ost.noarch

How reproducible:
always

Steps to Reproduce:

1) Create a stack with two parameters, provided in orig-env.yaml:
  username: nnnn
  volume_size: 2

# heat stack-create -f orig.yaml -e orig-env.yaml stack1
# heat stack-list
+--------------------------------------+------------+--------------------+----------------------+
| id                                   | stack_name | stack_status       | creation_time        |
+--------------------------------------+------------+--------------------+----------------------+
| 2e748f95-1082-4a8e-be2e-9e6f99bcd5c1 | stack1     | CREATE_IN_PROGRESS | 2015-09-02T07:49:32Z |
+--------------------------------------+------------+--------------------+----------------------+

# heat stack-list
+--------------------------------------+------------+-----------------+----------------------+
| id                                   | stack_name | stack_status    | creation_time        |
+--------------------------------------+------------+-----------------+----------------------+
| 2e748f95-1082-4a8e-be2e-9e6f99bcd5c1 | stack1     | CREATE_COMPLETE | 2015-09-02T07:49:32Z |
+--------------------------------------+------------+-----------------+----------------------+

# heat stack-show stack1
+----------------------+-------------------------------------------------------------------------------------------------------------------------+
| Property             | Value                                                                                                                   |
+----------------------+-------------------------------------------------------------------------------------------------------------------------+
| capabilities         | []                                                                                                                      |
| creation_time        | 2015-09-02T07:49:32Z                                                                                                    |
| description          | No description                                                                                                          |
| disable_rollback     | True                                                                                                                    |
| id                   | 2e748f95-1082-4a8e-be2e-9e6f99bcd5c1                                                                                    |
| links                | http://192.168.200.5:8004/v1/b3788aa2186d49af89f4c6d8425e462d/stacks/stack1/2e748f95-1082-4a8e-be2e-9e6f99bcd5c1 (self) |
| notification_topics  | []                                                                                                                      |
| outputs              | []                                                                                                                      |
| parameters           | {                                                                                                                       |
|                      |   "username": "nnnn",                                                                                                   |
|                      |   "volume_size": "2",                                                                                                   |
|                      |   "OS::stack_id": "2e748f95-1082-4a8e-be2e-9e6f99bcd5c1",                                                               |
|                      |   "OS::stack_name": "stack1"                                                                                            |
|                      | }                                                                                                                       |
| parent               | None                                                                                                                    |
| stack_name           | stack1                                                                                                                  |
| stack_owner          | admin                                                                                                                   |
| stack_status         | CREATE_COMPLETE                                                                                                         |
| stack_status_reason  | Stack CREATE completed successfully                                                                                     |
| template_description | No description                                                                                                          |
| timeout_mins         | None                                                                                                                    |
| updated_time         | None                                                                                                                    |
+----------------------+-------------------------------------------------------------------------------------------------------------------------+

2) Modify the stack template so that it has only one parameter:
  volume_size: 2

# heat stack-update -f mytemplate.yaml -e env.yaml stack1
+--------------------------------------+------------+--------------------+----------------------+
| id                                   | stack_name | stack_status       | creation_time        |
+--------------------------------------+------------+--------------------+----------------------+
| 2e748f95-1082-4a8e-be2e-9e6f99bcd5c1 | stack1     | UPDATE_IN_PROGRESS | 2015-09-02T07:49:32Z |
+--------------------------------------+------------+--------------------+----------------------+

# heat stack-list
ERROR: The Parameter (username) was not provided.
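
The attachments are not reproduced in this report, so the following is only a toy model of the symptom, not Heat's actual code. It assumes that the environment files use a top-level `parameters` key, that `username` has no default in the original template, and that while the update is pending the stack is evaluated as the old template combined with the new environment. All names (`render_env`, `check_parameters`, `MissingParameter`) and the parameter schemas are hypothetical.

```python
class MissingParameter(Exception):
    """Hypothetical stand-in for the error that heat stack-list reports."""


def render_env(parameters):
    """Render a minimal environment file (assumed layout, not the attachment)."""
    lines = ["parameters:"]
    lines += ["  %s: %s" % (name, value) for name, value in parameters.items()]
    return "\n".join(lines) + "\n"


def check_parameters(template_params, env_params):
    """Raise if a parameter without a default is absent from the environment."""
    for name, schema in template_params.items():
        if name not in env_params and "default" not in schema:
            raise MissingParameter(
                "The Parameter (%s) was not provided." % name)


# Step 1: two parameters. Step 2: username is dropped.
orig_env = {"username": "nnnn", "volume_size": 2}
new_env = {"volume_size": 2}

# Assumed parameter schemas; the real templates are in the attachments.
orig_template = {"username": {"type": "string"},
                 "volume_size": {"type": "number"}}
new_template = {"volume_size": {"type": "number"}}

check_parameters(orig_template, orig_env)  # consistent pair: no error
check_parameters(new_template, new_env)    # consistent pair: no error

# Mixed state assumed to exist during (or after a failed) update.
try:
    check_parameters(orig_template, new_env)
    message = None
except MissingParameter as exc:
    message = str(exc)

print(message)
```

Under these assumptions the mixed state produces the same message as the CLI. Rendered this way, the two environments come to 46 and 29 bytes, which matches the sizes of the "orig params" and "stack change params" attachments, although that alone does not confirm their contents.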

The root cause appears to be an underlying problem that gets hidden, so that only the parameter ERROR is displayed. Two such situations have been seen. The first is when the SSH keys for nova have not been distributed, which results in a resize error with 'Host key verification failed':

# nova show 635df74f-bdcb-47d1-8d0c-cec5cd23b0f9
+--------------------------------------+-----------------------------------------------------------------------------------------------------------------------------------+
| Property                             | Value                                                                                                                             |
+--------------------------------------+-----------------------------------------------------------------------------------------------------------------------------------+
| NEW_EXTERNAL_NET network             | 220.0.0.100                                                                                                                       |
| OS-DCF:diskConfig                    | MANUAL                                                                                                                            |
| OS-EXT-AZ:availability_zone          | nova                                                                                                                              |
| OS-EXT-SRV-ATTR:host                 | compute1                                                                                                                          |
| OS-EXT-SRV-ATTR:hypervisor_hostname  | compute1.storage-pub                                                                                                              |
| OS-EXT-SRV-ATTR:instance_name        | instance-00000492                                                                                                                 |
| OS-EXT-STS:power_state               | 1                                                                                                                                 |
| OS-EXT-STS:task_state                | -                                                                                                                                 |
| OS-EXT-STS:vm_state                  | error                                                                                                                             |
| OS-SRV-USG:launched_at               | 2015-09-02T08:36:27.000000                                                                                                        |
| OS-SRV-USG:terminated_at             | -                                                                                                                                 |
| accessIPv4                           |                                                                                                                                   |
| accessIPv6                           |                                                                                                                                   |
| config_drive                         |                                                                                                                                   |
| created                              | 2015-09-02T08:36:00Z                                                                                                              |
| fault                                | {"message": "Unexpected error while running command.                                                                              |
|                                      | Command: ssh 192.168.101.137 mkdir -p /var/lib/nova/instances/635df74f-bdcb-47d1-8d0c-cec5cd23b0f9                                |
|                                      | Exit code: 255                                                                                                                    |
|                                      | Stdout: u''                                                                                                                       |
|                                      | Stderr: u'Host key verification failed.\\r\                                                                                       |
|                                      | '", "code": 500, "details": "  File \"/usr/lib/python2.7/site-packages/nova/compute/manager.py\", line 325, in decorated_function |
|                                      |     return function(self, context, *args, **kwargs)                                                                               |
|                                      |   File \"/usr/lib/python2.7/site-packages/nova/compute/manager.py\", line 3833, in resize_instance                                |
|                                      |     timeout, retry_interval)                                                                                                      |
|                                      |   File \"/usr/lib/python2.7/site-packages/nova/virt/libvirt/driver.py\", line 6406, in migrate_disk_and_power_off                 |
|                                      |     utils.execute('ssh', dest, 'mkdir', '-p', inst_base)                                                                          |
|                                      |   File \"/usr/lib/python2.7/site-packages/nova/utils.py\", line 194, in execute                                                   |
|                                      |     return processutils.execute(*cmd, **kwargs)                                                                                   |
|                                      |   File \"/usr/lib/python2.7/site-packages/nova/openstack/common/processutils.py\", line 203, in execute                           |
|                                      |     cmd=sanitized_cmd)                                                                                                            |
|                                      | ", "created": "2015-09-02T08:40:54Z"}                                                                                             |
| flavor                               | m1.tiny (1)                                                                                                                       |
| hostId                               | 968a4c0f3ff716ae5039b360ed262b49757ab3654f493231940571e3                                                                          |
| id                                   | 635df74f-bdcb-47d1-8d0c-cec5cd23b0f9                                                                                              |
| image                                | cirros (11a7a0da-37a7-4299-b26a-4d7c97715261)                                                                                     |
| key_name                             | -                                                                                                                                 |
| metadata                             | {}                                                                                                                                |
| name                                 | Server1                                                                                                                           |
| os-extended-volumes:volumes_attached | []                                                                                                                                |
| security_groups                      | default                                                                                                                           |
| status                               | ERROR                                                                                                                             |
| tenant_id                            | b3788aa2186d49af89f4c6d8425e462d                                                                                                  |
| updated                              | 2015-09-02T08:40:54Z                                                                                                              |
| user_id                              | 7b9efba9746a45349b8846f1f6f3d902                                                                                                  |
+--------------------------------------+-----------------------------------------------------------------------------------------------------------------------------------+
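
The fault above comes from nova's `ssh <dest> mkdir -p` step exiting with code 255. A minimal sketch of a non-interactive probe for the same condition follows; it assumes it is run on the source compute node as the user the nova compute service runs as, and the helper names `build_ssh_probe` and `probe` are illustrative. `BatchMode`, `StrictHostKeyChecking` and `ConnectTimeout` are standard OpenSSH client options.

```python
import subprocess


def build_ssh_probe(dest):
    """Build an ssh command line that fails instead of prompting."""
    return [
        "ssh",
        "-o", "BatchMode=yes",              # never ask for a password
        "-o", "StrictHostKeyChecking=yes",  # fail on unknown host keys
        "-o", "ConnectTimeout=5",
        dest,
        "true",                             # harmless remote command
    ]


def probe(dest, timeout=15):
    """Run the probe and return (exit_code, stderr)."""
    result = subprocess.run(
        build_ssh_probe(dest),
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
        universal_newlines=True,
        timeout=timeout,
    )
    return result.returncode, result.stderr.strip()


# Example (not executed here): probe("192.168.101.137")
```

An exit code of 255 together with "Host key verification failed." in stderr would point at the same key distribution problem that the nova fault reports.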

After configuring nova so that it can log in to the compute nodes to support resize, as explained in [1], we still see the error message when running stack-list while the update is in progress:

# heat stack-list
ERROR: The Parameter (username) was not provided.

But rebuild works on the nova side:
# nova list
+--------------------------------------+---------+--------+------------+-------------+------------------------------+
| ID                                   | Name    | Status | Task State | Power State | Networks                     |
+--------------------------------------+---------+--------+------------+-------------+------------------------------+
| 0527d78a-11b1-41bb-8ab5-5583f6db949e | Server1 | BUILD  | spawning   | NOSTATE     | NEW_EXTERNAL_NET=220.0.0.103 |
| 635df74f-bdcb-47d1-8d0c-cec5cd23b0f9 | Server1 | ACTIVE | -          | Running     |                              |
+--------------------------------------+---------+--------+------------+-------------+------------------------------+

# nova list
+--------------------------------------+---------+--------+------------+-------------+------------------------------+
| ID                                   | Name    | Status | Task State | Power State | Networks                     |
+--------------------------------------+---------+--------+------------+-------------+------------------------------+
| 0527d78a-11b1-41bb-8ab5-5583f6db949e | Server1 | ACTIVE | -          | Running     | NEW_EXTERNAL_NET=220.0.0.103 |
| 635df74f-bdcb-47d1-8d0c-cec5cd23b0f9 | Server1 | ACTIVE | -          | Running     |                              |
+--------------------------------------+---------+--------+------------+-------------+------------------------------+

# nova list
+--------------------------------------+---------+--------+------------+-------------+------------------------------+
| ID                                   | Name    | Status | Task State | Power State | Networks                     |
+--------------------------------------+---------+--------+------------+-------------+------------------------------+
| 0527d78a-11b1-41bb-8ab5-5583f6db949e | Server1 | ACTIVE | -          | Running     | NEW_EXTERNAL_NET=220.0.0.103 |
+--------------------------------------+---------+--------+------------+-------------+------------------------------+

After that, stack-list returns the list again, now with UPDATE_COMPLETE:
# heat stack-list
+--------------------------------------+------------+-----------------+----------------------+
| id                                   | stack_name | stack_status    | creation_time        |
+--------------------------------------+------------+-----------------+----------------------+
| 7c6eb5ad-1094-45c5-bd75-f8ec58a3c8a6 | stack1     | UPDATE_COMPLETE | 2015-09-02T08:35:53Z |
+--------------------------------------+------------+-----------------+----------------------+

=> It seems that heat gets stuck when there is a parameter change combined with an underlying issue, such as the failed nova resize, and shows the following until the stack is deleted:

# heat stack-list
ERROR: The Parameter (username) was not provided.
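
To tell this symptom apart from a normal stack listing in a script, the CLI output can be matched against the error text shown above. This is a sketch only; `missing_parameter` is a hypothetical helper and the pattern assumes the message format stays exactly as quoted in this report.

```python
import re

# Matches the error line exactly as quoted in this report.
MISSING_PARAM = re.compile(
    r"^ERROR: The Parameter \((?P<name>[^)]+)\) was not provided\.$",
    re.MULTILINE,
)


def missing_parameter(cli_output):
    """Return the parameter named in the error, or None if there is none."""
    match = MISSING_PARAM.search(cli_output)
    return match.group("name") if match else None


broken = "ERROR: The Parameter (username) was not provided.\n"
healthy = "| 7c6eb5ad-1094-45c5-bd75-f8ec58a3c8a6 | stack1 | UPDATE_COMPLETE |\n"

print(missing_parameter(broken))
print(missing_parameter(healthy))
```

The returned name identifies which parameter was dropped between the original and the updated environment, which helps locate the affected stack when several exist.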

[1] https://access.redhat.com/solutions/1326953
Comment 3 Martin Schuppert 2015-09-02 05:54:00 EDT
The second way to reproduce this is to change the flavor in the new template to a flavor that does not exist in nova.

2015-09-02 11:43:09.583 2893 INFO heat.engine.resource [-] updating Server "external_server_1" [b8fddfcc-5375-4f77-945a-a75f9a2f51a2] Stack "stack1" [28d647bd-67d6-41a0-a822-a9ace5489a16]
2015-09-02 11:43:09.606 2893 DEBUG glanceclient.common.http [-] curl -i -X GET -H 'Accept-Encoding: gzip, deflate' -H 'Accept: */*' -H 'User-Agent: python-glanceclient' -H 'Connection: keep-alive' -H 'X-Auth-Token: {SHA1}6fa1275045afdead33e580e56a723f5f50273540' -H 'Content-Type: application/octet-stream' http://192.168.200.5:9292/v1/images/detail?limit=20&name=cirros log_curl_request /usr/lib/python2.7/site-packages/glanceclient/common/http.py:120
2015-09-02 11:43:09.820 2893 DEBUG glanceclient.common.http [-] 
HTTP/1.1 200 OK
date: Wed, 02 Sep 2015 09:43:09 GMT
connection: keep-alive
content-type: application/json; charset=UTF-8
content-length: 481
x-openstack-request-id: req-198c1ece-2f6a-4f00-a52c-1980ab756365

{"images": [{"status": "active", "deleted_at": null, "name": "cirros", "deleted": false, "container_format": "bare", "created_at": "2015-08-21T16:19:06", "disk_format": "qcow2", "updated_at": "2015-08-21T16:19:11", "min_disk": 0, "protected": false, "id": "11a7a0da-37a7-4299-b26a-4d7c97715261", "min_ram": 0, "checksum": "133eae9fb1c98f45894a4e60d8736619", "owner": "53a66750dda6498c9c3989da706c1113", "is_public": true, "virtual_size": null, "properties": {}, "size": 13200896}]}
 log_http_response /usr/lib/python2.7/site-packages/glanceclient/common/http.py:133
2015-09-02 11:43:10.079 2893 INFO heat.engine.resource [-] UPDATE: Server "external_server_1" [b8fddfcc-5375-4f77-945a-a75f9a2f51a2] Stack "stack1" [28d647bd-67d6-41a0-a822-a9ace5489a16]
2015-09-02 11:43:10.079 2893 TRACE heat.engine.resource Traceback (most recent call last):
2015-09-02 11:43:10.079 2893 TRACE heat.engine.resource   File "/usr/lib/python2.7/site-packages/heat/engine/resource.py", line 439, in _action_recorder
2015-09-02 11:43:10.079 2893 TRACE heat.engine.resource     yield
2015-09-02 11:43:10.079 2893 TRACE heat.engine.resource   File "/usr/lib/python2.7/site-packages/heat/engine/resource.py", line 694, in update
2015-09-02 11:43:10.079 2893 TRACE heat.engine.resource     args=[after, tmpl_diff, prop_diff])
2015-09-02 11:43:10.079 2893 TRACE heat.engine.resource   File "/usr/lib/python2.7/site-packages/heat/engine/scheduler.py", line 286, in wrapper
2015-09-02 11:43:10.079 2893 TRACE heat.engine.resource     step = next(subtask)
2015-09-02 11:43:10.079 2893 TRACE heat.engine.resource   File "/usr/lib/python2.7/site-packages/heat/engine/resource.py", line 480, in action_handler_task
2015-09-02 11:43:10.079 2893 TRACE heat.engine.resource     handler_data = handler(*args)
2015-09-02 11:43:10.079 2893 TRACE heat.engine.resource   File "/usr/lib/python2.7/site-packages/heat/engine/resources/server.py", line 759, in handle_update
2015-09-02 11:43:10.079 2893 TRACE heat.engine.resource     flavor_id = self.client_plugin().get_flavor_id(flavor)
2015-09-02 11:43:10.079 2893 TRACE heat.engine.resource   File "/usr/lib/python2.7/site-packages/heat/engine/clients/os/nova.py", line 154, in get_flavor_id
2015-09-02 11:43:10.079 2893 TRACE heat.engine.resource     raise exception.FlavorMissing(flavor_id=flavor)
2015-09-02 11:43:10.079 2893 TRACE heat.engine.resource FlavorMissing: The Flavor ID (dd) could not be found.
2015-09-02 11:43:10.079 2893 TRACE heat.engine.resource 

# heat stack-list
ERROR: The Parameter (username) was not provided.
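
Because the CLI reports only the parameter error, the underlying failure has to be dug out of the heat-engine log. The sketch below assumes log lines shaped like the excerpt above (date, time, PID, level, logger, message); `find_exceptions` and both patterns are illustrative and not part of Heat.

```python
import re

# "<date> <time> <pid> TRACE <logger> <message>"
TRACE_LINE = re.compile(r"^\S+ \S+ \d+ TRACE (?P<logger>\S+) (?P<msg>.*)$")
# Final traceback line: "ExceptionName: detail", with no leading indentation.
EXC_LINE = re.compile(r"^(?P<exc>[A-Za-z_][\w.]*): (?P<detail>.+)$")


def find_exceptions(log_text):
    """Return (logger, exception, detail) for each traceback summary line."""
    found = []
    for line in log_text.splitlines():
        trace = TRACE_LINE.match(line)
        if not trace:
            continue
        exc = EXC_LINE.match(trace.group("msg"))
        if exc:
            found.append(
                (trace.group("logger"), exc.group("exc"), exc.group("detail")))
    return found


sample = "\n".join([
    "2015-09-02 11:43:10.079 2893 TRACE heat.engine.resource "
    "Traceback (most recent call last):",
    "2015-09-02 11:43:10.079 2893 TRACE heat.engine.resource "
    "    raise exception.FlavorMissing(flavor_id=flavor)",
    "2015-09-02 11:43:10.079 2893 TRACE heat.engine.resource "
    "FlavorMissing: The Flavor ID (dd) could not be found.",
])

for logger, exc, detail in find_exceptions(sample):
    print("%s: %s (%s)" % (exc, detail, logger))
```

Indented traceback frames and the "Traceback (most recent call last):" header are skipped, so only the summary line that names the real cause is returned.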
Comment 4 Martin Schuppert 2015-09-02 05:55:08 EDT
Created attachment 1069331 [details]
orig template
Comment 5 Martin Schuppert 2015-09-02 05:55:31 EDT
Created attachment 1069332 [details]
orig params
Comment 6 Martin Schuppert 2015-09-02 05:55:55 EDT
Created attachment 1069333 [details]
stack change
Comment 7 Martin Schuppert 2015-09-02 05:56:21 EDT
Created attachment 1069334 [details]
stack change params
Comment 9 Zane Bitter 2015-09-02 08:41:42 EDT
I just raised bug 1258967 yesterday to fix this issue in RHOS 7 (it's already fixed upstream in Liberty); the fix is relatively simple, so it may be possible to backport to RHOS 6 also, but I suspect it relies on some other changes between Juno and Kilo so it will require some investigation.
Comment 10 Zane Bitter 2015-10-01 17:39:17 EDT
*** Bug 1263091 has been marked as a duplicate of this bug. ***
Comment 11 Zane Bitter 2015-10-06 22:23:44 EDT
Upstream backport looks good.
Comment 16 Amit Ugol 2016-01-04 10:36:57 EST
I tried the exact same steps to reproduce and was unable to issue an update unless I had all the default parameters as with the original create. This bypasses any subsequent issues that might occur later.
Comment 18 errata-xmlrpc 2016-01-14 08:53:05 EST
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://rhn.redhat.com/errata/RHBA-2016-0037.html
