Bug 1278544
| Summary: | Unrecoverable heat stack in UPDATE_FAILED | ||
|---|---|---|---|
| Product: | Red Hat OpenStack | Reporter: | James Slagle <jslagle> |
| Component: | openstack-heat | Assignee: | Steve Baker <sbaker> |
| Status: | CLOSED ERRATA | QA Contact: | Amit Ugol <augol> |
| Severity: | unspecified | Docs Contact: | |
| Priority: | unspecified | ||
| Version: | 7.0 (Kilo) | CC: | kbasil, mburns, ohochman, rbiba, rhel-osp-director-maint, sbaker, sclewis, shardy, yeylon, zbitter |
| Target Milestone: | z3 | Keywords: | ZStream |
| Target Release: | 7.0 (Kilo) | ||
| Hardware: | Unspecified | ||
| OS: | Unspecified | ||
| Whiteboard: | |||
| Fixed In Version: | openstack-heat-2015.1.2-2.el7ost | Doc Type: | Bug Fix |
| Doc Text: | After a failed stack update, Heat was ignoring the contents of the new environment when reading backed up resources; that is, those that were set aside while their replacements were being created. In particular, it was not picking up any new resource type aliases. As a consequence, if a new resource was successfully created using a new type alias in the environment before the update failed, further attempts to update the stack failed due to the inability to load a resource with an unknown type alias. With this update, backup resources are now stored with a merged combination of the old and new environments. As a result, after an update failure in this scenario, a subsequent update can now recover the stack. | Story Points: | --- |
| Clone Of: | Environment: | ||
| Last Closed: | 2015-12-21 17:03:14 UTC | Type: | Bug |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | Category: | --- | |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | |||
| Bug Depends On: | |||
| Bug Blocks: | 1278975 | ||
|
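The Doc Text above describes the fix as storing backup resources with a merged combination of the old and new environments. A minimal Python sketch of that idea follows; it is not Heat's actual code. The `resource_registry` key is the section of a Heat environment that maps type aliases to implementations, while the helper names `merge_environments` and `resolve_alias`, the template file names, and the choice to let the new environment win on a conflicting alias are assumptions made for illustration.

```python
def merge_environments(old_env, new_env):
    """Combine two environments' resource registries (illustrative helper).

    Assumption: on a conflicting alias the new environment wins.
    """
    merged = dict(old_env.get("resource_registry", {}))
    merged.update(new_env.get("resource_registry", {}))
    return {"resource_registry": merged}


def resolve_alias(env, type_alias):
    """Look up a type alias; an alias missing from the registry is an error."""
    registry = env.get("resource_registry", {})
    if type_alias not in registry:
        raise KeyError("unknown resource type alias: " + type_alias)
    return registry[type_alias]


# Hypothetical environments: the update introduces a new type alias.
old_env = {"resource_registry": {"OS::TripleO::Controller": "controller.yaml"}}
new_env = {"resource_registry": {"OS::TripleO::NodeExtraConfig": "node-extra.yaml"}}

# A backup resource stored with the merged environment can resolve both aliases.
merged_env = merge_environments(old_env, new_env)
```

With only `old_env`, looking up the new alias raises `KeyError`, which mirrors the failure mode in the Doc Text; with `merged_env`, both the old and the new alias resolve.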
Description
James Slagle
2015-11-05 18:24:21 UTC
The controller resources never seem to go back into UPDATE_IN_PROGRESS:

[stack@instack ~]$ heat resource-list c9163ce9-c8ba-4514-871f-e289914c43f9
+---------------+--------------------------------------+-------------------------+-----------------+----------------------+
| resource_name | physical_resource_id | resource_type | resource_status | updated_time |
+---------------+--------------------------------------+-------------------------+-----------------+----------------------+
| 1 | 7e8d9da6-cd45-4047-bbc8-e6f266c8de12 | OS::TripleO::Controller | UPDATE_FAILED | 2015-11-05T15:37:03Z |
| 2 | 87550c9b-25bb-41a5-8946-27806bc3fa1c | OS::TripleO::Controller | UPDATE_FAILED | 2015-11-05T15:38:06Z |
| 0 | d70d6756-61d5-4e57-8237-29f048f7d0ce | OS::TripleO::Controller | UPDATE_FAILED | 2015-11-05T15:38:42Z |
+---------------+--------------------------------------+-------------------------+-----------------+----------------------+

Here's an event-list for one of the controllers:

[stack@instack ~]$ heat event-list 7e8d9da6-cd45-4047-bbc8-e6f266c8de12
+--------------------------------------------------+--------------------------------------+------------------------------------------------+--------------------+----------------------+
| resource_name | id | resource_status_reason | resource_status | event_time |
+--------------------------------------------------+--------------------------------------+------------------------------------------------+--------------------+----------------------+
| overcloud-Controller-hja56vtbtibv-1-dt6boa7xta3j | 17e78a45-521b-40f4-9fb3-11f443f01f22 | Stack CREATE started | CREATE_IN_PROGRESS | 2015-11-05T03:16:26Z |
| NodeUserData | 53c537f2-b031-4221-a7d0-96d933ac4759 | state changed | CREATE_IN_PROGRESS | 2015-11-05T03:16:26Z |
| UpdateConfig | 15309b4f-c4f7-47d0-b7c2-b66ceba6f4fd | state changed | CREATE_IN_PROGRESS | 2015-11-05T03:16:28Z |
| NodeUserData | df94bdf8-ca4f-4a16-a949-1eb3404cc6a2 | state changed | CREATE_COMPLETE |
2015-11-05T03:16:31Z | | Controller | a7d0f6a4-9766-46ac-a849-0940378e65d6 | state changed | CREATE_IN_PROGRESS | 2015-11-05T03:16:32Z | | UpdateConfig | ad90f69b-8e0c-4f9b-a2a8-bdb1733ca1b8 | state changed | CREATE_COMPLETE | 2015-11-05T03:16:37Z | | Controller | 8eb2cbfb-08d8-47a0-98d4-0871865bd681 | state changed | CREATE_COMPLETE | 2015-11-05T03:24:07Z | | StorageMgmtPort | e09678c5-97cf-4a91-8a7b-2cc67fb8fd52 | state changed | CREATE_IN_PROGRESS | 2015-11-05T03:24:08Z | | ExternalPort | 51be2123-647a-456f-b1c3-30a5840feed8 | state changed | CREATE_IN_PROGRESS | 2015-11-05T03:24:15Z | | StoragePort | 7fc925bb-7cf5-48b5-b0ea-4215d52c3a6c | state changed | CREATE_IN_PROGRESS | 2015-11-05T03:24:19Z | | UpdateDeployment | 2826d707-8ad9-4bd0-9157-661d5cf03fc9 | state changed | CREATE_IN_PROGRESS | 2015-11-05T03:24:29Z | | InternalApiPort | fa6ffe85-7bf3-4760-871d-4abc7dc23798 | state changed | CREATE_IN_PROGRESS | 2015-11-05T03:24:38Z | | TenantPort | 7dc2a63e-cb1f-4870-99ae-697a9f8be223 | state changed | CREATE_IN_PROGRESS | 2015-11-05T03:24:45Z | | StorageMgmtPort | d4913db7-44b8-44c4-912c-5b1708b533d1 | state changed | CREATE_COMPLETE | 2015-11-05T03:24:50Z | | InternalApiPort | 1e71f067-1c66-4366-a127-9fdeb0a105e7 | state changed | CREATE_COMPLETE | 2015-11-05T03:24:55Z | | ExternalPort | 2cf72ce4-f1eb-4898-8168-31ed4c49c84a | state changed | CREATE_COMPLETE | 2015-11-05T03:24:56Z | | TenantPort | 25b396f6-483c-4ca4-961f-ca8c3284cb14 | state changed | CREATE_COMPLETE | 2015-11-05T03:24:57Z | | StoragePort | 8c5c1200-ba22-426b-b537-171234064644 | state changed | CREATE_COMPLETE | 2015-11-05T03:24:58Z | | NetworkConfig | 8871608e-576e-4974-ae19-913a81cc4ab9 | state changed | CREATE_IN_PROGRESS | 2015-11-05T03:25:01Z | | NetIpMap | f1d85348-9e6e-4382-bb6a-d64753897b89 | state changed | CREATE_IN_PROGRESS | 2015-11-05T03:25:06Z | | NetIpSubnetMap | d910b261-aae0-4c48-8615-9ea94eeda067 | state changed | CREATE_IN_PROGRESS | 2015-11-05T03:25:08Z | | NetIpMap | 
9962587a-b5c8-4f91-baff-f6118d7fd7d6 | state changed | CREATE_COMPLETE | 2015-11-05T03:25:13Z | | NetIpSubnetMap | 34f8a077-b027-49ed-ab14-a64ac0abe9e1 | state changed | CREATE_COMPLETE | 2015-11-05T03:25:14Z | | NetworkConfig | 01bd7785-b4f5-4fd3-a801-c1e4ba02accc | state changed | CREATE_COMPLETE | 2015-11-05T03:25:14Z | | ControllerConfig | e1868780-1319-4b20-ae9c-897f85de8446 | state changed | CREATE_IN_PROGRESS | 2015-11-05T03:25:14Z | | NetworkDeployment | e1fbe77f-7b74-4826-9318-0f0833e9beab | state changed | CREATE_IN_PROGRESS | 2015-11-05T03:25:14Z | | ControllerConfig | 033c899a-2123-49dc-ad86-b612f755ce25 | state changed | CREATE_COMPLETE | 2015-11-05T03:25:17Z | | UpdateDeployment | 8911cc70-09e1-4296-9071-ed975585faf2 | Signal: deployment succeeded | SIGNAL_IN_PROGRESS | 2015-11-05T03:29:26Z | | NetworkDeployment | e7820c39-40a0-4a5c-b627-080c03c884fd | Signal: deployment succeeded | SIGNAL_IN_PROGRESS | 2015-11-05T03:29:29Z | | UpdateDeployment | cf45cec3-6b39-4305-b96e-2ea05502626d | state changed | CREATE_COMPLETE | 2015-11-05T03:29:30Z | | NetworkDeployment | 6d8b44b4-38ad-4431-aef3-d11bb813d2b6 | state changed | CREATE_COMPLETE | 2015-11-05T03:29:31Z | | ControllerDeployment | 7a54d409-9d5d-49e1-b43b-1922b0e9938a | state changed | CREATE_IN_PROGRESS | 2015-11-05T03:29:33Z | | ControllerDeployment | 0c834309-aac7-4e90-9d40-1e28213fb4ce | Signal: deployment succeeded | SIGNAL_IN_PROGRESS | 2015-11-05T03:30:26Z | | NetworkDeployment | db5dfdbb-f019-42b4-bc17-51ceba9b5e59 | Unknown | SIGNAL_COMPLETE | 2015-11-05T03:30:27Z | | ControllerDeployment | 3fbe851d-fa61-4d0f-8ea9-a3daf860cd92 | state changed | CREATE_COMPLETE | 2015-11-05T03:30:28Z | | ControllerExtraConfigPre | befd264c-628e-4e67-b830-87967fac8d02 | state changed | CREATE_IN_PROGRESS | 2015-11-05T03:30:29Z | | ControllerExtraConfigPre | d9d6c72f-0811-48b7-9792-03dd11a7d97a | state changed | CREATE_COMPLETE | 2015-11-05T03:30:36Z | | overcloud-Controller-hja56vtbtibv-1-dt6boa7xta3j | 
ebf2fe96-f636-4a24-bd29-750669cc16da | Stack CREATE completed successfully | CREATE_COMPLETE | 2015-11-05T03:30:36Z | | ControllerDeployment | ecca66da-b87a-4202-a6b8-6105c7fd0062 | Unknown | SIGNAL_COMPLETE | 2015-11-05T03:31:43Z | | NetworkDeployment | 7fcf443d-562f-4a24-8687-a6b1bd969101 | Unknown | SIGNAL_COMPLETE | 2015-11-05T03:31:44Z | | ControllerDeployment | 2ddb77c0-5e6f-4d5b-af4f-b7f150bbe5d1 | Unknown | SIGNAL_COMPLETE | 2015-11-05T03:34:42Z | | NetworkDeployment | 9ff5ebb2-b6c5-419f-b1c6-dc448da13f08 | Unknown | SIGNAL_COMPLETE | 2015-11-05T03:34:43Z | | ControllerDeployment | c13aff29-09ff-436d-92c3-5623df222a15 | Unknown | SIGNAL_COMPLETE | 2015-11-05T03:37:36Z | | NetworkDeployment | 4796a100-de14-4e5f-b7ba-3a41041b089a | Unknown | SIGNAL_COMPLETE | 2015-11-05T03:37:36Z | | ControllerDeployment | cd37b9d0-88ed-428a-a025-9d4fe2ac1ba6 | Unknown | SIGNAL_COMPLETE | 2015-11-05T03:39:01Z | | NetworkDeployment | 477f572f-a422-41d5-8057-3277246848cc | Unknown | SIGNAL_COMPLETE | 2015-11-05T03:39:02Z | | ControllerDeployment | 07f2364f-2db6-4246-84ca-beb51b03ca56 | Unknown | SIGNAL_COMPLETE | 2015-11-05T03:42:28Z | | NetworkDeployment | 44cad5bb-40f8-4911-a4fc-997ad13acc8b | Unknown | SIGNAL_COMPLETE | 2015-11-05T03:42:30Z | | ControllerDeployment | 23125460-37a0-42c5-ab2f-7d185ff8e491 | Unknown | SIGNAL_COMPLETE | 2015-11-05T03:45:37Z | | NetworkDeployment | 50408a12-c1c8-4fdf-a273-5d5026f52a54 | Unknown | SIGNAL_COMPLETE | 2015-11-05T03:45:38Z | | ControllerDeployment | a7fac72c-cae6-437f-8b5a-aa667be56df8 | Unknown | SIGNAL_COMPLETE | 2015-11-05T03:48:44Z | | NetworkDeployment | 211d2bbf-792e-44c5-88f4-8b470e0e8df2 | Unknown | SIGNAL_COMPLETE | 2015-11-05T03:48:44Z | | ControllerDeployment | 542686b7-0961-414a-9249-a8c8c1aac12d | Unknown | SIGNAL_COMPLETE | 2015-11-05T03:52:39Z | | NetworkDeployment | fc53abc0-f7b7-4830-b7d4-4999fdbd0e93 | Unknown | SIGNAL_COMPLETE | 2015-11-05T03:52:39Z | | overcloud-Controller-hja56vtbtibv-1-dt6boa7xta3j | 
ad5eb07f-77a3-49d7-8648-6eecbdee078d | Stack UPDATE started | UPDATE_IN_PROGRESS | 2015-11-05T15:38:06Z | | NodeUserData | 8cf6ff67-ddbf-46fe-bb2e-844e7de8cb36 | state changed | UPDATE_IN_PROGRESS | 2015-11-05T15:38:13Z | | UpdateConfig | 64d3556e-1255-4afc-97f2-24ea104b3de3 | state changed | UPDATE_IN_PROGRESS | 2015-11-05T15:38:16Z | | NodeUserData | 6d3dc8e5-f77d-43ec-84dc-5e6c80d48310 | state changed | UPDATE_COMPLETE | 2015-11-05T15:38:25Z | | StorageMgmtPort | 3727d758-717a-467a-8130-47f208c3b1c2 | state changed | UPDATE_IN_PROGRESS | 2015-11-05T15:38:27Z | | StoragePort | 171dd65a-424b-40ad-ad79-0f3ae7c65738 | state changed | UPDATE_IN_PROGRESS | 2015-11-05T15:38:29Z | | InternalApiPort | 7a026c8c-4a8e-419c-9bce-ee02e43b6c35 | state changed | UPDATE_IN_PROGRESS | 2015-11-05T15:38:35Z | | ExternalPort | 3337ec1d-2533-4c89-8fce-bac27f252631 | state changed | UPDATE_IN_PROGRESS | 2015-11-05T15:38:43Z | | TenantPort | 328127d9-4a96-4a3c-a0c5-e763854cb412 | state changed | UPDATE_IN_PROGRESS | 2015-11-05T15:38:56Z | | StorageMgmtPort | 84922ac0-836b-48bb-b01d-72304e4f2069 | state changed | UPDATE_COMPLETE | 2015-11-05T15:39:12Z | | UpdateConfig | 9bef6715-5703-4ea6-97f1-738af23c35de | state changed | UPDATE_COMPLETE | 2015-11-05T15:39:12Z | | ExternalPort | 8e27061a-dd0c-4c96-b5b0-8610f8a67710 | state changed | UPDATE_COMPLETE | 2015-11-05T15:39:13Z | | StoragePort | 5dded725-3fde-4791-a517-76b88ff44763 | state changed | UPDATE_COMPLETE | 2015-11-05T15:39:14Z | | InternalApiPort | 8d05c966-3449-4d99-9fc1-da33736cbb59 | state changed | UPDATE_COMPLETE | 2015-11-05T15:39:15Z | | TenantPort | 8c56d989-4d98-4be6-bf53-3325ef5784a5 | state changed | UPDATE_COMPLETE | 2015-11-05T15:39:16Z | | NetIpSubnetMap | 1b48901e-6889-442d-9075-f77addbf8e65 | state changed | CREATE_IN_PROGRESS | 2015-11-05T15:39:19Z | | UpdateDeployment | 621b20e4-4c0b-4451-9e29-b888e67c64be | UPDATE paused until Hook pre-update is cleared | CREATE_COMPLETE | 2015-11-05T15:39:31Z | | NetIpMap | 
ce71b163-dbe5-4c59-bb03-8f94755011f3 | state changed | UPDATE_IN_PROGRESS | 2015-11-05T15:39:33Z | | NetworkConfig | afd816cb-ad23-4948-b3d8-2e91bbaad304 | state changed | UPDATE_IN_PROGRESS | 2015-11-05T15:39:42Z | | NetIpSubnetMap | d78bf9f4-e002-4e33-a1bf-60d52c79baaf | state changed | CREATE_COMPLETE | 2015-11-05T15:39:51Z | | NetIpMap | 502eb3f4-4848-47c6-939e-9e840b8a84ee | state changed | UPDATE_COMPLETE | 2015-11-05T15:39:52Z | | ControllerConfig | b33d29bd-fa6a-4e69-b6e5-58e5c2f59f29 | state changed | UPDATE_IN_PROGRESS | 2015-11-05T15:39:54Z | | ControllerConfig | 8c0f795a-4c6d-4908-851a-29bfe3fea1fb | state changed | CREATE_IN_PROGRESS | 2015-11-05T15:39:56Z | | ControllerConfig | 1842aca0-7a6c-4270-ad2a-4396043e553b | state changed | CREATE_COMPLETE | 2015-11-05T15:40:00Z | | NetworkConfig | 04b2330e-43fe-4a0b-82c3-8d18bffabe1d | state changed | UPDATE_COMPLETE | 2015-11-05T15:40:05Z | | NetworkDeployment | 956f8eda-bb13-47f0-96d7-ffd503f4f1a2 | state changed | UPDATE_IN_PROGRESS | 2015-11-05T15:40:07Z | | ControllerDeployment | bf13eacb-2b5b-4780-a6a5-cd9c5fe78a67 | Unknown | SIGNAL_COMPLETE | 2015-11-05T15:42:33Z | | NetworkDeployment | e26f2a08-34c8-4d04-9efd-92e6ea96adb4 | Signal: deployment succeeded | SIGNAL_IN_PROGRESS | 2015-11-05T15:42:34Z | | NetworkDeployment | c2eb6e5f-3b3a-4eb9-bf04-6369292b09e8 | state changed | UPDATE_COMPLETE | 2015-11-05T15:42:35Z | | UpdateDeployment | 929c7e72-ec34-465c-886f-0b20f480cabd | Hook pre-update is cleared | CREATE_COMPLETE | 2015-11-05T17:00:43Z | | UpdateDeployment | 912316dd-38e3-434c-8e84-9399b0705fe4 | state changed | UPDATE_IN_PROGRESS | 2015-11-05T17:00:44Z | | UpdateDeployment | c5fc528d-90f5-44fb-903a-8e36ef20658b | Signal: deployment succeeded | SIGNAL_IN_PROGRESS | 2015-11-05T17:14:25Z | | UpdateDeployment | 0e32df7d-7363-41c6-a399-9e875fe2158c | state changed | UPDATE_COMPLETE | 2015-11-05T17:14:25Z | | ControllerDeployment | 02dcfb86-26e6-4875-bfc8-d3d1d5e6f17c | state changed | 
UPDATE_IN_PROGRESS | 2015-11-05T17:14:25Z |
| ControllerDeployment | 6d98b01e-0fbc-4f68-b9fd-8abc69818dbf | Signal: deployment succeeded | SIGNAL_IN_PROGRESS | 2015-11-05T17:14:28Z |
| NetworkDeployment | 868fb6f1-09ac-49b0-9056-3dad1b055bb2 | Unknown | SIGNAL_COMPLETE | 2015-11-05T17:14:28Z |
| ControllerDeployment | ceaddd04-fa76-426d-8193-f43b59a427dc | state changed | UPDATE_COMPLETE | 2015-11-05T17:14:29Z |
| ControllerExtraConfigPre | 2f58f504-22f9-4af8-a5df-1a9578eb2ed3 | state changed | UPDATE_IN_PROGRESS | 2015-11-05T17:14:29Z |
| ControllerExtraConfigPre | 91c58a9e-a163-4b48-bbcc-745a6baeb5f5 | state changed | UPDATE_COMPLETE | 2015-11-05T17:14:32Z |
| NodeExtraConfig | 9f8d9703-a951-4b1b-865a-5ef53e8fa71b | state changed | CREATE_IN_PROGRESS | 2015-11-05T17:14:32Z |
| NodeExtraConfig | 46746abe-cace-4d5c-9996-3acd15aecd5b | state changed | CREATE_COMPLETE | 2015-11-05T17:14:35Z |
| NetworkDeployment | 94abc64c-b24b-45e4-8c9d-807d83569894 | Unknown | SIGNAL_COMPLETE | 2015-11-05T17:15:11Z |
| ControllerDeployment | 962d8bf0-3809-46ee-a7f9-3ed95e6d4a37 | Unknown | SIGNAL_COMPLETE | 2015-11-05T17:15:11Z |
+--------------------------------------------------+--------------------------------------+------------------------------------------------+--------------------+----------------------+

No new breakpoints ever get added:

[stack@instack ~]$ heat hook-poll -n 5 overcloud
+----+------------------------+-----------------+------------+------------+
| id | resource_status_reason | resource_status | event_time | stack_name |
+----+------------------------+-----------------+------------+------------+
+----+------------------------+-----------------+------------+------------+

I've uploaded the heat logs here: http://file.rdu.redhat.com/~jslagle/bug-1278544/

note that the output from the client is just a seemingly infinite list of "IN_PROGRESS"

It's weird, b/c it appears nothing is in progress:

[stack@instack ~]$ heat stack-list heat
+--------------------------------------+------------+--------------------+----------------------+ | id | stack_name | stack_status | creation_time | +--------------------------------------+------------+--------------------+----------------------+ | d8eae9e6-c64e-4ce6-aef7-244979bfc0f1 | overcloud | UPDATE_IN_PROGRESS | 2015-11-05T03:15:41Z | +--------------------------------------+------------+--------------------+----------------------+ [stack@instack ~]$ heat resource-list overcloud +-----------------------------------+-----------------------------------------------+---------------------------------------------------+-----------------+----------------------+ | resource_name | physical_resource_id | resource_type | resource_status | updated_time | +-----------------------------------+-----------------------------------------------+---------------------------------------------------+-----------------+----------------------+ | BlockStorageAllNodesDeployment | c7586f0e-4c6f-40e3-b20a-dc92298589cc | OS::Heat::StructuredDeployments | CREATE_COMPLETE | 2015-11-05T03:15:42Z | | BlockStorageNodesPostDeployment | d37e0e34-9654-4543-b286-2668fec896a4 | OS::TripleO::BlockStoragePostDeployment | CREATE_COMPLETE | 2015-11-05T03:15:42Z | | CephClusterConfig | 33e939b1-54a6-48ff-a235-18459c6ec36c | OS::TripleO::CephClusterConfig::SoftwareConfig | CREATE_COMPLETE | 2015-11-05T03:15:42Z | | CephStorageAllNodesDeployment | ea28a95a-aaa8-4b4f-b1b0-0d4b6a7d4737 | OS::Heat::StructuredDeployments | CREATE_COMPLETE | 2015-11-05T03:15:42Z | | CephStorageCephDeployment | 58f40370-606b-4ec8-be06-070b24373778 | OS::Heat::StructuredDeployments | CREATE_COMPLETE | 2015-11-05T03:15:42Z | | CephStorageNodesPostDeployment | 5224cd58-9723-4679-8185-d06215a7f3d7 | OS::TripleO::CephStoragePostDeployment | CREATE_COMPLETE | 2015-11-05T03:15:42Z | | ComputeAllNodesDeployment | 9a86b367-f991-47dd-bc0f-9b857d0bfdcd | OS::Heat::StructuredDeployments | CREATE_COMPLETE | 2015-11-05T03:15:42Z | | 
ComputeCephDeployment | e6bf8562-b529-4123-bdb0-ac10a63fbba9 | OS::Heat::StructuredDeployments | CREATE_COMPLETE | 2015-11-05T03:15:42Z | | ComputeNodesPostDeployment | 22db48bb-6dc2-4c4c-ad3d-58e299fc3500 | OS::TripleO::ComputePostDeployment | CREATE_COMPLETE | 2015-11-05T03:15:42Z | | ControllerAllNodesDeployment | 443ac7d7-837e-401b-ae14-045bddc08e86 | OS::Heat::StructuredDeployments | CREATE_COMPLETE | 2015-11-05T03:15:42Z | | ControllerBootstrapNodeConfig | 4163b982-a79f-416e-b123-afacddba8e27 | OS::TripleO::BootstrapNode::SoftwareConfig | CREATE_COMPLETE | 2015-11-05T03:15:42Z | | ControllerBootstrapNodeDeployment | 87534065-3b51-461c-8ddf-8148dfe2f198 | OS::Heat::StructuredDeployments | CREATE_COMPLETE | 2015-11-05T03:15:42Z | | ControllerCephDeployment | e787564e-fa56-4e7c-8f5e-d9a9304132fa | OS::Heat::StructuredDeployments | CREATE_COMPLETE | 2015-11-05T03:15:42Z | | ControllerClusterConfig | f3dc9941-ce27-4713-955c-1dca9ced8d23 | OS::Heat::StructuredConfig | CREATE_COMPLETE | 2015-11-05T03:15:42Z | | ControllerClusterDeployment | 853073c9-f694-4592-bd87-c7bdfee95d96 | OS::Heat::StructuredDeployments | CREATE_COMPLETE | 2015-11-05T03:15:42Z | | ControllerIpListMap | 85630f7d-c4cf-4d37-9bd4-af9234b91737 | OS::TripleO::Network::Ports::NetIpListMap | CREATE_COMPLETE | 2015-11-05T03:15:42Z | | ControllerNodesPostDeployment | afe25c74-a231-4df6-bb61-c84697d8277d | OS::TripleO::ControllerPostDeployment | CREATE_COMPLETE | 2015-11-05T03:15:42Z | | ControllerSwiftDeployment | dc3bbe28-6a44-462e-b0e6-f8773e96fe04 | OS::Heat::StructuredDeployments | CREATE_COMPLETE | 2015-11-05T03:15:42Z | | HeatAuthEncryptionKey | overcloud-HeatAuthEncryptionKey-6ljv5gid5hxi | OS::Heat::RandomString | CREATE_COMPLETE | 2015-11-05T03:15:42Z | | HorizonSecret | overcloud-HorizonSecret-6x5wuqwd6dq6 | OS::Heat::RandomString | CREATE_COMPLETE | 2015-11-05T03:15:42Z | | MysqlClusterUniquePart | overcloud-MysqlClusterUniquePart-kesbckxdev67 | OS::Heat::RandomString | CREATE_COMPLETE | 
2015-11-05T03:15:42Z | | MysqlRootPassword | overcloud-MysqlRootPassword-e3olccywlv67 | OS::Heat::RandomString | CREATE_COMPLETE | 2015-11-05T03:15:42Z | | ObjectStorageAllNodesDeployment | 7c2529c8-f4f3-4e41-869e-b3e3af936829 | OS::Heat::StructuredDeployments | CREATE_COMPLETE | 2015-11-05T03:15:42Z | | ObjectStorageNodesPostDeployment | a15a83fe-91fe-4757-bc2a-31846c48bcdd | OS::TripleO::ObjectStoragePostDeployment | CREATE_COMPLETE | 2015-11-05T03:15:42Z | | ObjectStorageSwiftDeployment | 5a09f483-79ce-4454-bbe1-52f5a3017bed | OS::Heat::StructuredDeployments | CREATE_COMPLETE | 2015-11-05T03:15:42Z | | PcsdPassword | overcloud-PcsdPassword-otnh56wmmlyo | OS::Heat::RandomString | CREATE_COMPLETE | 2015-11-05T03:15:42Z | | RabbitCookie | overcloud-RabbitCookie-suxlvfl5cd3m | OS::Heat::RandomString | CREATE_COMPLETE | 2015-11-05T03:15:42Z | | SwiftDevicesAndProxyConfig | e49da330-9376-4046-bd8e-116b43d7a2b1 | OS::TripleO::SwiftDevicesAndProxy::SoftwareConfig | CREATE_COMPLETE | 2015-11-05T03:15:42Z | | VipDeployment | 53768f74-70c6-40e5-ab27-ec374616edb1 | OS::Heat::StructuredDeployments | CREATE_COMPLETE | 2015-11-05T03:15:42Z | | allNodesConfig | e95a6953-b590-49a4-a993-d5fbfc0efd8c | OS::TripleO::AllNodes::SoftwareConfig | CREATE_COMPLETE | 2015-11-05T03:15:42Z | | Controller | c9163ce9-c8ba-4514-871f-e289914c43f9 | OS::Heat::ResourceGroup | UPDATE_FAILED | 2015-11-05T15:36:57Z | | BlockStorage | 7075b578-1939-46d0-935d-3dbf6969673f | OS::Heat::ResourceGroup | UPDATE_COMPLETE | 2015-11-05T15:37:03Z | | VipConfig | 58d0557a-7068-4221-bf6a-6e5f534df6c4 | OS::TripleO::VipConfig | UPDATE_COMPLETE | 2015-11-05T16:26:10Z | | Networks | 8976fd6f-06a7-4ff6-8d0f-657563160a5d | OS::TripleO::Network | UPDATE_COMPLETE | 2015-11-05T16:26:11Z | | ControlVirtualIP | 6b1b4ac1-0dd3-40f6-a7ac-cf6333983092 | OS::TripleO::Network::Ports::CtlplaneVipPort | UPDATE_COMPLETE | 2015-11-05T16:26:41Z | | ObjectStorage | 6d3259eb-93fe-4932-b652-1b69eb589e43 | OS::Heat::ResourceGroup | 
UPDATE_COMPLETE | 2015-11-05T16:26:46Z |
| CephStorage | c5eaa74d-9a34-4fc8-8490-1a33bb4e3704 | OS::Heat::ResourceGroup | UPDATE_COMPLETE | 2015-11-05T16:26:48Z |
| StorageMgmtVirtualIP | 9875176b-f7f7-4765-9913-ad6b3fb751a0 | OS::TripleO::Network::Ports::StorageMgmtVipPort | UPDATE_COMPLETE | 2015-11-05T16:26:53Z |
| RedisVirtualIP | 3d9d95c3-d7e3-4cc0-8891-9f131ddde6ad | OS::TripleO::Network::Ports::RedisVipPort | UPDATE_COMPLETE | 2015-11-05T16:26:55Z |
| PublicVirtualIP | 109a58f9-2ce4-4f33-8172-5b442fee7dfb | OS::TripleO::Network::Ports::ExternalVipPort | UPDATE_COMPLETE | 2015-11-05T16:27:02Z |
| StorageVirtualIP | c08bfd65-9831-465b-a7a7-a43a330f2536 | OS::TripleO::Network::Ports::StorageVipPort | UPDATE_COMPLETE | 2015-11-05T16:27:09Z |
| InternalApiVirtualIP | a6dffc7e-59a5-4594-91c7-c4c7bd8c64a8 | OS::TripleO::Network::Ports::InternalApiVipPort | UPDATE_COMPLETE | 2015-11-05T16:27:12Z |
| VipMap | 7868bf41-836a-4508-acfd-291e8caede35 | OS::TripleO::Network::Ports::NetVipMap | UPDATE_COMPLETE | 2015-11-05T16:27:16Z |
| Compute | eb7dcb92-07d4-4f2e-8839-4f24c8bf7b53 | OS::Heat::ResourceGroup | UPDATE_FAILED | 2015-11-05T16:27:20Z |
+-----------------------------------+-----------------------------------------------+---------------------------------------------------+-----------------+----------------------+

Even doing a heat resource-list -n 10 overcloud, nothing shows IN_PROGRESS

i tailed the heat-engine.log for a minute or so and saved that to a file: http://file.rdu.redhat.com/~jslagle/bug-1278544/heat-engine-tailed.log

it's just the same pattern over and over again. I wonder if it's stuck in an infinite loop?

If a stack gets stuck in IN_PROGRESS, it's almost certainly because there was an uncaught exception (https://bugs.launchpad.net/heat/+bug/1492433), and because it's uncaught it's also not logged in heat-engine.log (https://bugs.launchpad.net/heat/+bug/1492427).
You should be able to find the traceback in the journal (thanks systemd!), and from there we can diagnose the bug.

BTW the workaround would be to restart heat-engine and then use the workaround for bug 1267364. That will get your stacks back to the FAILED state so that they're not stuck IN_PROGRESS any more.

When you say you're upgrading from 7.0 to the latest poodle, did you start by upgrading the undercloud to latest poodle first?

(In reply to Steve Baker from comment #11)
> When you say you're upgrading from 7.0 to the latest poodle, did you start
> by upgrading the undercloud to latest poodle first?

Yes, first step in updating from 7.0 is to update the undercloud, and make sure services are restarted (should happen automatically via package updates). I didn't have the newest heat build with the 2 patches for bug 1267364 though; looks like that was done yesterday just shortly after i had updated the undercloud ;).

I updated to those today, and restarted heat-engine. The stack is still stuck in UPDATE_IN_PROGRESS. Is that still expected? It wasn't clear to me from the bug if the 2 patches remove the need for the manual db sql workaround, or if I still have to do that.

(In reply to Zane Bitter from comment #9)
> If a stack gets stuck in IN_PROGRESS, it's almost certainly because there
> was an uncaught exception (https://bugs.launchpad.net/heat/+bug/1492433),
> and because it's uncaught it's also not logged in heat-engine.log
> (https://bugs.launchpad.net/heat/+bug/1492427). You should be able to find
> the traceback in the journal (thanks systemd!), and from there we can
> diagnose the bug.

it looks like the journal got rotated, and we weren't saving old journal files (no /var/log/journal) in 7.0. If this happens again, what unit should I look at to see the traceback? Would it be openstack-heat-engine?

using the newest heat build with the 2 patches from https://bugzilla.redhat.com/show_bug.cgi?id=1267364 and restarting heat-engine, this issue still remains. zaneb indicated the manual sql workaround shouldn't be needed anymore. so there must be something else going on.

> I updated to those today, and restarted heat-engine. The stack is still stuck in UPDATE_IN_PROGRESS. Is that still expected?

No, not expected. At startup, Heat goes through all stacks that are IN_PROGRESS and tries to break their locks (i.e. we ping the engine that owns the lock, and if it doesn't reply we steal it) and move them to FAILED. This was broken for nested stacks, and didn't move the member resources to FAILED, and that's what the patches for bug 1267364 fixed.

> what unit should I look at to see the traceback? Would it be openstack-heat-engine?

Yes.

i've reproduced this now and captured the actual Heat exception in a new bug: https://bugzilla.redhat.com/show_bug.cgi?id=1278975

given the traceback there, my initial suspicion is that python-rdomanager-oscplugin did something wrong (didn't send the environment I asked it to, or something else).

Still, I think this existing bug is valid. We ought to be able to recover the stack somehow in such a situation. Even if we have to bounce the heat-engine service.

i tried the sql from https://bugzilla.redhat.com/show_bug.cgi?id=1267364 :

UPDATE stack SET status="FAILED" WHERE status="IN_PROGRESS" AND action="UPDATE";
UPDATE resource SET status="FAILED" WHERE status="IN_PROGRESS" AND action="UPDATE";

obviously my stack is in UPDATE_FAILED now :). going to see if i can figure out what causes https://bugzilla.redhat.com/show_bug.cgi?id=1278975 now

Are there any possible workarounds? I also have a stack in progress, but no resources in progress and no hooks waiting to be cleared.
[stack@instack ~]$ heat resource-list overcloud -n10 | grep PROG
[stack@instack ~]$ heat hook-poll overcloud -n10
+----+------------------------+-----------------+------------+------------+
| id | resource_status_reason | resource_status | event_time | stack_name |
+----+------------------------+-----------------+------------+------------+
+----+------------------------+-----------------+------------+------------+
[stack@instack ~]$ heat stack-list
+--------------------------------------+------------+--------------------+----------------------+
| id | stack_name | stack_status | creation_time |
+--------------------------------------+------------+--------------------+----------------------+
| 6de1d453-b12d-421f-96be-d42b8bf93f5a | overcloud | UPDATE_IN_PROGRESS | 2015-11-11T12:42:36Z |
+--------------------------------------+------------+--------------------+----------------------+

Works better now.

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2015:2680 |
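The startup recovery described in the comments above (Heat goes through all IN_PROGRESS stacks, pings the engine that owns each lock, steals the lock if there is no reply, and moves the stack to FAILED) can be sketched roughly as below. This is a simplified in-memory model, not Heat's implementation: the `Stack` class, the `engine_is_alive` callback, and `reset_stale_stacks` are hypothetical names. The sketch also resets member resources, which is the behaviour the thread says the patches for bug 1267364 restored for nested stacks.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Optional


@dataclass
class Stack:
    """Toy stand-in for a stack record (hypothetical, for illustration only)."""
    name: str
    action: str                       # e.g. "UPDATE"
    status: str                       # e.g. "IN_PROGRESS", "FAILED", "COMPLETE"
    lock_owner: Optional[str] = None  # id of the engine holding the lock
    resources: Dict[str, str] = field(default_factory=dict)  # name -> status


def reset_stale_stacks(stacks: List[Stack],
                       engine_is_alive: Callable[[str], bool]) -> List[str]:
    """Move IN_PROGRESS stacks whose lock owner does not answer to FAILED."""
    recovered = []
    for stack in stacks:
        if stack.status != "IN_PROGRESS":
            continue
        if stack.lock_owner is not None and engine_is_alive(stack.lock_owner):
            continue  # a live engine still owns the lock; leave the stack alone
        stack.lock_owner = None       # "steal" the lock from the dead engine
        stack.status = "FAILED"
        for res_name in list(stack.resources):
            if stack.resources[res_name] == "IN_PROGRESS":
                stack.resources[res_name] = "FAILED"
        recovered.append(stack.name)
    return recovered


# Hypothetical data: one stack locked by a dead engine, one by a live engine.
stale = Stack("overcloud", "UPDATE", "IN_PROGRESS", lock_owner="engine-dead",
              resources={"Controller": "IN_PROGRESS", "Networks": "COMPLETE"})
busy = Stack("other", "UPDATE", "IN_PROGRESS", lock_owner="engine-live")

recovered_names = reset_stale_stacks([stale, busy],
                                     lambda engine_id: engine_id == "engine-live")
```

In this model the stack stays in action UPDATE with status FAILED, the same end state (UPDATE_FAILED) that the manual SQL from bug 1267364 produced in the thread.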