Bug 1389040

Summary: [OSP-Director-10] upgrade from OSP 9 to OSP 10 fails because of stopped resources on the cluster (during: UPGRADE CONTROLLER AND BLOCKSTORAGE).
Product: Red Hat OpenStack
Reporter: mlammon
Component: openstack-tripleo-heat-templates
Assignee: Michele Baldessari <michele>
Status: CLOSED ERRATA
QA Contact: Omri Hochman <ohochman>
Severity: urgent
Priority: urgent
Version: 10.0 (Newton)
CC: dbecker, ipilcher, jcoufal, jschluet, mandreou, mburns, michele, mlammon, morazi, ohochman, rhel-osp-director-maint, sasha, sathlang
Target Milestone: rc
Keywords: Triaged
Target Release: 10.0 (Newton)
Hardware: x86_64
OS: Linux
Fixed In Version: openstack-tripleo-heat-templates-5.1.0-3.el7ost
Last Closed: 2016-12-14 16:25:56 UTC
Type: Bug
Attachments:
heat-engine.log (flags: none)
full messages (flags: none)

Description mlammon 2016-10-26 17:23:55 UTC
[OSP-Director-10] upgrade from OSP 9 to OSP 10 fails because of stopped resources on the cluster failure. 

Environment:
instack-undercloud-5.0.0-2.el7ost.noarch
instack-5.0.0-1.el7ost.noarch
openstack-heat-api-cfn-7.0.0-4.el7ost.noarch
openstack-heat-common-7.0.0-4.el7ost.noarch
openstack-heat-api-7.0.0-4.el7ost.noarch
openstack-heat-templates-0.0.1-0.20161011152629.40a4ed0.el7ost.noarch
openstack-heat-engine-7.0.0-4.el7ost.noarch
openstack-tripleo-heat-templates-5.0.0-0.8.0rc3.el7ost.noarch
openstack-tripleo-heat-templates-compat-2.0.0-34.3.el7ost.noarch


Description of the problem:
The upgrade from OSP 9 to OSP 10 fails because of stopped resources on the cluster. The failure occurs during the UPGRADE CONTROLLER AND BLOCKSTORAGE step. The pcs resources were checked before this step and all of them were in a good state.
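
For anyone reproducing this, the pre-step pcs check mentioned above could be scripted roughly as follows. This is only a sketch: stopped_resources is a hypothetical helper name, and it assumes that pcs status marks such resources with the word "Stopped". It reads the status text from stdin so it can be tried without a cluster.

```shell
# Hypothetical helper (not part of pcs or TripleO): reads `pcs status` text on
# stdin and prints every line that reports a resource as Stopped.
stopped_resources() {
    # grep exits 1 when nothing matches; treat that as "no stopped resources".
    grep -w 'Stopped' || true
}

# Assumed usage on a controller node:
#   pcs status | stopped_resources
```

Empty output would mean no resource was reported as Stopped at the time of the check.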

Steps:
(1) Deploy OSP 9 
(2) Attempt to upgrade the undercloud

Results:
openstack undercloud upgrade completes successfully. The overcloud stack update then fails at the UPGRADE CONTROLLER AND BLOCKSTORAGE step.

Errors:
[stack@undercloud-0 ~]$ heat stack-list
WARNING (shell) "heat stack-list" is deprecated, please use "openstack stack list" instead
+--------------------------------------+------------+---------------+----------------------+----------------------+
| id                                   | stack_name | stack_status  | creation_time        | updated_time         |
+--------------------------------------+------------+---------------+----------------------+----------------------+
| ee358f54-d537-4fed-be1b-615f50013a07 | overcloud  | UPDATE_FAILED | 2016-10-26T06:28:33Z | 2016-10-26T09:20:58Z |
+--------------------------------------+------------+---------------+----------------------+----------------------+
[stack@undercloud-0 ~]$ heat resource-list ee358f54-d537-4fed-be1b-615f50013a07
WARNING (shell) "heat resource-list" is deprecated, please use "openstack stack resource list" instead
+-------------------------------------------+----------------------------------------------+-------------------------------------------------+-----------------+----------------------+
| resource_name                             | physical_resource_id                         | resource_type                                   | resource_status | updated_time         |
+-------------------------------------------+----------------------------------------------+-------------------------------------------------+-----------------+----------------------+
| HeatAuthEncryptionKey                     | overcloud-HeatAuthEncryptionKey-4a7kbhpfwasr | OS::Heat::RandomString                          | CREATE_COMPLETE | 2016-10-26T06:28:33Z |
| HorizonSecret                             | overcloud-HorizonSecret-r4fbtwduyzx2         | OS::Heat::RandomString                          | CREATE_COMPLETE | 2016-10-26T06:28:33Z |
| PcsdPassword                              | overcloud-PcsdPassword-wbxjyekdkxa2          | OS::Heat::RandomString                          | CREATE_COMPLETE | 2016-10-26T06:28:33Z |
| RabbitCookie                              | overcloud-RabbitCookie-ktxqfrvxbay6          | OS::Heat::RandomString                          | CREATE_COMPLETE | 2016-10-26T06:28:33Z |
| ControlVirtualIP                          | 68a5d255-4e17-4a1b-8b28-063dcd5fb88f         | OS::Neutron::Port                               | CREATE_COMPLETE | 2016-10-26T06:28:34Z |
| MysqlRootPassword                         | overcloud-MysqlRootPassword-ecdxbntwogmx     | OS::Heat::RandomString                          | CREATE_COMPLETE | 2016-10-26T06:28:34Z |
| AllNodesDeploySteps                       | cfc279a0-14e1-4598-9148-5fa9e9be63fe         | OS::TripleO::PostDeploySteps                    | CREATE_COMPLETE | 2016-10-26T08:55:53Z |
| AllNodesExtraConfig                       | a203bf7f-bfb1-4bdc-824a-7affba41cfc7         | OS::TripleO::AllNodesExtraConfig                | CREATE_COMPLETE | 2016-10-26T09:03:38Z |
| Networks                                  | 462ab489-6cd7-446a-b14f-5217d6672d4b         | OS::TripleO::Network                            | UPDATE_COMPLETE | 2016-10-26T09:21:11Z |
| ServiceNetMap                             | 0e5d3fe3-e42a-415e-8d82-436b21faa3ed         | OS::TripleO::ServiceNetMap                      | UPDATE_COMPLETE | 2016-10-26T09:21:11Z |
| DefaultPasswords                          | 6c107d0d-d7c8-4a8b-b939-cbfdb9bda9f7         | OS::TripleO::DefaultPasswords                   | UPDATE_COMPLETE | 2016-10-26T09:21:12Z |
| StorageVirtualIP                          | 2966a9ba-2e2b-41ba-bb05-d6146440d6e8         | OS::TripleO::Network::Ports::StorageVipPort     | UPDATE_COMPLETE | 2016-10-26T09:21:19Z |
| PublicVirtualIP                           | 23300c72-0e41-4c2c-9d1c-ba334e3edd14         | OS::TripleO::Network::Ports::ExternalVipPort    | UPDATE_COMPLETE | 2016-10-26T09:21:20Z |
| InternalApiVirtualIP                      | fb01e7bf-197b-4716-ba94-be222aa9d908         | OS::TripleO::Network::Ports::InternalApiVipPort | UPDATE_COMPLETE | 2016-10-26T09:21:21Z |
| StorageMgmtVirtualIP                      | f532a479-6b68-494d-9f9f-62e30d84c5b7         | OS::TripleO::Network::Ports::StorageMgmtVipPort | UPDATE_COMPLETE | 2016-10-26T09:21:22Z |
| RedisVirtualIP                            | ae833ff7-a95b-4d4c-9ca2-cd005ab70b88         | OS::TripleO::Network::Ports::RedisVipPort       | UPDATE_COMPLETE | 2016-10-26T09:21:23Z |
| VipMap                                    | 5b3c93ac-31b6-4a1b-9e34-85fbb1b8bf18         | OS::TripleO::Network::Ports::NetVipMap          | UPDATE_COMPLETE | 2016-10-26T09:21:25Z |
| EndpointMap                               | 0eaa45f6-e23e-4eca-beb9-e55d9b938426         | OS::TripleO::EndpointMap                        | UPDATE_COMPLETE | 2016-10-26T09:21:29Z |
| BlockStorageServiceChain                  | 9e3fb6ed-5325-444d-869f-41013472f772         | OS::TripleO::Services                           | UPDATE_COMPLETE | 2016-10-26T09:21:32Z |
| ControllerServiceChain                    | 0e1cf836-2492-4537-844d-50bbb6d1c4ca         | OS::TripleO::Services                           | UPDATE_COMPLETE | 2016-10-26T09:21:34Z |
| ObjectStorageServiceChain                 | 8921434f-fbfc-40a2-b5d0-6401cb19e3df         | OS::TripleO::Services                           | UPDATE_COMPLETE | 2016-10-26T09:21:40Z |
| CephStorageServiceChain                   | 0bfc8dbf-f5bb-4c00-a1da-aba0ecb4d7a5         | OS::TripleO::Services                           | UPDATE_COMPLETE | 2016-10-26T09:21:42Z |
| ComputeServiceChain                       | a3ae0b60-9639-4e86-91e9-c9497895b283         | OS::TripleO::Services                           | UPDATE_COMPLETE | 2016-10-26T09:21:44Z |
| Compute                                   | 3af842ba-0d12-4ee4-a801-951f15f66853         | OS::Heat::ResourceGroup                         | UPDATE_COMPLETE | 2016-10-26T09:23:36Z |
| ObjectStorage                             | eb7ad4fe-9213-419a-97df-4b8869abf843         | OS::Heat::ResourceGroup                         | UPDATE_COMPLETE | 2016-10-26T09:23:58Z |
| CephStorage                               | c856e0e6-8bde-4551-a34c-71793cc11034         | OS::Heat::ResourceGroup                         | UPDATE_COMPLETE | 2016-10-26T09:23:59Z |
| BlockStorage                              | 93014158-6a6f-48a3-9621-0ff9fa6f8815         | OS::Heat::ResourceGroup                         | UPDATE_COMPLETE | 2016-10-26T09:24:04Z |
| Controller                                | 94b62e92-0b6a-40bc-be34-60665c6f68be         | OS::Heat::ResourceGroup                         | UPDATE_COMPLETE | 2016-10-26T09:24:05Z |
| ObjectStorageIpListMap                    | 08d24ec3-beae-4819-be45-4f3da2212b97         | OS::TripleO::Network::Ports::NetIpListMap       | UPDATE_COMPLETE | 2016-10-26T09:24:14Z |
| BlockStorageIpListMap                     | a05398f0-1f91-4102-8106-9e0c4d3481cc         | OS::TripleO::Network::Ports::NetIpListMap       | UPDATE_COMPLETE | 2016-10-26T09:24:15Z |
| ComputeIpListMap                          | fb395bbf-09b7-40f3-abae-b203ea2a4a0a         | OS::TripleO::Network::Ports::NetIpListMap       | UPDATE_COMPLETE | 2016-10-26T09:25:09Z |
| CephStorageIpListMap                      | 6c4deaf8-b0bb-4d5e-95f8-4ce06eebc0a7         | OS::TripleO::Network::Ports::NetIpListMap       | UPDATE_COMPLETE | 2016-10-26T09:25:14Z |
| ControllerIpListMap                       | 6c0c8a66-cbce-4460-967b-5dfcdd22485d         | OS::TripleO::Network::Ports::NetIpListMap       | UPDATE_COMPLETE | 2016-10-26T09:25:30Z |
| AllNodesValidationConfig                  | 472ffb24-f8e2-4577-83e5-828dc4b1eb1b         | OS::TripleO::AllNodes::Validation               | UPDATE_COMPLETE | 2016-10-26T09:25:38Z |
| hostsConfig                               | 5dd3177e-f6cc-44a9-a7a1-40870e4c370f         | OS::TripleO::Hosts::SoftwareConfig              | UPDATE_COMPLETE | 2016-10-26T09:25:39Z |
| ControllerHostsDeployment                 | 3e203fae-c2d6-4836-8dc4-79fadfab3eeb         | OS::Heat::StructuredDeployments                 | UPDATE_COMPLETE | 2016-10-26T09:25:41Z |
| CephStorageHostsDeployment                | 3a9b01c6-4f72-42c9-b1f1-668d34310d2b         | OS::Heat::StructuredDeployments                 | UPDATE_COMPLETE | 2016-10-26T09:25:42Z |
| ObjectStorageHostsDeployment              | 1c6510a3-a97c-48a5-8883-f244865a72cd         | OS::Heat::StructuredDeployments                 | UPDATE_COMPLETE | 2016-10-26T09:25:42Z |
| BlockStorageHostsDeployment               | 51961fc2-f2b6-48b7-ab7f-2b8f083d743d         | OS::Heat::StructuredDeployments                 | UPDATE_COMPLETE | 2016-10-26T09:25:43Z |
| ComputeHostsDeployment                    | 7a3797f3-bc0e-4618-9d18-e2754806ba24         | OS::Heat::StructuredDeployments                 | UPDATE_COMPLETE | 2016-10-26T09:25:43Z |
| allNodesConfig                            | a9cb2ef1-9ac0-4560-8017-9bffbd8a7f63         | OS::TripleO::AllNodes::SoftwareConfig           | UPDATE_COMPLETE | 2016-10-26T09:25:44Z |
| ControllerAllNodesDeployment              | 84113895-64a1-47d6-bf45-685d364bae58         | OS::Heat::StructuredDeployments                 | UPDATE_COMPLETE | 2016-10-26T09:25:58Z |
| ComputeAllNodesDeployment                 | 2ec1ff8b-9264-4132-899b-a16a811460f5         | OS::Heat::StructuredDeployments                 | UPDATE_COMPLETE | 2016-10-26T09:25:59Z |
| ObjectStorageAllNodesDeployment           | 82311408-f82b-4526-b795-30dbaa14ec0a         | OS::Heat::StructuredDeployments                 | UPDATE_COMPLETE | 2016-10-26T09:25:59Z |
| CephStorageAllNodesDeployment             | d724dea5-5b55-4279-b100-7f90232d569f         | OS::Heat::StructuredDeployments                 | UPDATE_COMPLETE | 2016-10-26T09:26:00Z |
| BlockStorageAllNodesDeployment            | fc958448-26aa-47e0-8876-778830e28dda         | OS::Heat::StructuredDeployments                 | UPDATE_COMPLETE | 2016-10-26T09:26:01Z |
| ObjectStorageAllNodesValidationDeployment | 85297120-f7be-47b3-bbb1-ed943b86e1c1         | OS::Heat::StructuredDeployments                 | UPDATE_COMPLETE | 2016-10-26T09:26:03Z |
| BlockStorageAllNodesValidationDeployment  | 5308dd18-457a-49f8-b951-3308bd706636         | OS::Heat::StructuredDeployments                 | UPDATE_COMPLETE | 2016-10-26T09:26:05Z |
| ComputeAllNodesValidationDeployment       | e533d1f0-2289-4bb7-9a66-f7bd71d3f991         | OS::Heat::StructuredDeployments                 | UPDATE_COMPLETE | 2016-10-26T09:26:49Z |
| CephStorageAllNodesValidationDeployment   | ca7e8e43-2805-444e-9ae5-7f24d30f49d7         | OS::Heat::StructuredDeployments                 | UPDATE_COMPLETE | 2016-10-26T09:27:04Z |
| ControllerAllNodesValidationDeployment    | cedeb793-6b4f-405c-88f9-3da7483eedf6         | OS::Heat::StructuredDeployments                 | UPDATE_COMPLETE | 2016-10-26T09:27:36Z |
| UpdateWorkflow                            | 70a67c40-65f5-4359-92e9-fd41ac6cf2d6         | OS::TripleO::Tasks::UpdateWorkflow              | UPDATE_FAILED   | 2016-10-26T09:27:37Z |
+-------------------------------------------+----------------------------------------------+-------------------------------------------------+-----------------+----------------------+
[stack@undercloud-0 ~]$ heat resource-list 70a67c40-65f5-4359-92e9-fd41ac6cf2d6
WARNING (shell) "heat resource-list" is deprecated, please use "openstack stack resource list" instead
+--------------------------------------------+--------------------------------------+-----------------------------------+-----------------+----------------------+
| resource_name                              | physical_resource_id                 | resource_type                     | resource_status | updated_time         |
+--------------------------------------------+--------------------------------------+-----------------------------------+-----------------+----------------------+
| ControllerPacemakerUpgradeConfig_Step2     | 6f67c319-c228-49d7-bb7b-81b4c5553c8c | OS::Heat::SoftwareConfig          | CREATE_COMPLETE | 2016-10-26T09:27:40Z |
| ControllerPacemakerUpgradeConfig_Step1     | 7b0da71c-e7ee-4671-8af5-e43be2a32d08 | OS::Heat::SoftwareConfig          | CREATE_COMPLETE | 2016-10-26T09:27:42Z |
| ControllerPacemakerUpgradeConfig_Step5     | 4636b4c4-243e-4e00-bad2-1ae244fe2770 | OS::Heat::SoftwareConfig          | CREATE_COMPLETE | 2016-10-26T09:27:42Z |
| ControllerPacemakerUpgradeConfig_Step3     | 4a02e0ee-0961-4095-aa38-65ca4007ebe2 | OS::Heat::SoftwareConfig          | CREATE_COMPLETE | 2016-10-26T09:27:43Z |
| ControllerPacemakerUpgradeConfig_Step4     | baf622bf-60f7-4ec3-bbef-d8f7005e09fe | OS::Heat::SoftwareConfig          | CREATE_COMPLETE | 2016-10-26T09:27:43Z |
| CephMonUpgradeConfig                       | 8b3c144a-0222-4228-9b2e-d0c30da33f33 | OS::Heat::SoftwareConfig          | CREATE_COMPLETE | 2016-10-26T09:27:44Z |
| CephMonUpgradeDeployment                   | 7accff7e-e478-4f33-8efc-54253b1f7617 | OS::Heat::SoftwareDeploymentGroup | CREATE_COMPLETE | 2016-10-26T09:27:46Z |
| ControllerPacemakerUpgradeDeployment_Step1 | 610201d7-a081-4d3b-b319-1b09801dd9cd | OS::Heat::SoftwareDeploymentGroup | CREATE_FAILED   | 2016-10-26T09:36:32Z |
+--------------------------------------------+--------------------------------------+-----------------------------------+-----------------+----------------------+
[stack@undercloud-0 ~]$ heat resource-show 610201d7-a081-4d3b-b319-1b09801dd9cd 0
WARNING (shell) "heat resource-show" is deprecated, please use "openstack stack resource show" instead
+------------------------+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| Property               | Value                                                                                                                                                                                                                  |
+------------------------+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| attributes             | {                                                                                                                                                                                                                      |
|                        |   "deploy_stdout": null,                                                                                                                                                                                               |
|                        |   "deploy_stderr": null,                                                                                                                                                                                               |
|                        |   "deploy_status_code": null                                                                                                                                                                                           |
|                        | }                                                                                                                                                                                                                      |
| creation_time          | 2016-10-26T09:36:33Z                                                                                                                                                                                                   |
| description            |                                                                                                                                                                                                                        |
| links                  | http://192.0.2.1:8004/v1/6d46c171345148948580de86d5a4db02/stacks/overcloud-UpdateWorkflow-szz6sourtsch-ControllerPacemakerUpgradeDeployment_Step1-mvgaujtjf4r6/610201d7-a081-4d3b-b319-1b09801dd9cd/resources/0 (self) |
|                        | http://192.0.2.1:8004/v1/6d46c171345148948580de86d5a4db02/stacks/overcloud-UpdateWorkflow-szz6sourtsch-ControllerPacemakerUpgradeDeployment_Step1-mvgaujtjf4r6/610201d7-a081-4d3b-b319-1b09801dd9cd (stack)            |
| logical_resource_id    | 0                                                                                                                                                                                                                      |
| parent_resource        | ControllerPacemakerUpgradeDeployment_Step1                                                                                                                                                                             |
| physical_resource_id   | bb8a7d6c-3f71-4c04-b2c2-83ce68086a8d                                                                                                                                                                                   |
| required_by            |                                                                                                                                                                                                                        |
| resource_name          | 0                                                                                                                                                                                                                      |
| resource_status        | CREATE_FAILED                                                                                                                                                                                                          |
| resource_status_reason | Error: resources[0]: Deployment to server failed: deploy_status_code : Deployment exited with non-zero status code: 1                                                                                                  |
| resource_type          | OS::Heat::SoftwareDeployment                                                                                                                                                                                           |
| updated_time           | 2016-10-26T09:36:33Z                                                                                                                                                                                                   |
+------------------------+---------------------------------------------------------------------------------------------------------------------------------------------------------------

[stack@undercloud-0 ~]$ heat resource-list overcloud -n 5 | grep -v COMPLETE
WARNING (shell) "heat resource-list" is deprecated, please use "openstack stack resource list" instead
+--------------------------------------------+---------------------------------------------------------------------------------+---------------------------------------------------------------------------------------------------------------------+-----------------+----------------------+----------------------------------------------------------------------------------------------------------------------+
| resource_name                              | physical_resource_id                                                            | resource_type                                                                                                       | resource_status | updated_time         | stack_name                                                                                                           |
+--------------------------------------------+---------------------------------------------------------------------------------+---------------------------------------------------------------------------------------------------------------------+-----------------+----------------------+----------------------------------------------------------------------------------------------------------------------+
| UpdateWorkflow                             | 70a67c40-65f5-4359-92e9-fd41ac6cf2d6                                            | OS::TripleO::Tasks::UpdateWorkflow                                                                                  | UPDATE_FAILED   | 2016-10-26T09:27:37Z | overcloud                                                                                                            |
| ControllerPacemakerUpgradeDeployment_Step1 | 610201d7-a081-4d3b-b319-1b09801dd9cd                                            | OS::Heat::SoftwareDeploymentGroup                                                                                   | CREATE_FAILED   | 2016-10-26T09:36:32Z | overcloud-UpdateWorkflow-szz6sourtsch                                                                                |
| 0                                          | bb8a7d6c-3f71-4c04-b2c2-83ce68086a8d                                            | OS::Heat::SoftwareDeployment                                                                                        | CREATE_FAILED   | 2016-10-26T09:36:33Z | overcloud-UpdateWorkflow-szz6sourtsch-ControllerPacemakerUpgradeDeployment_Step1-mvgaujtjf4r6                        |
| 1                                          | 818b242e-9a0c-4cd7-b8f8-0e29755159a9                                            | OS::Heat::SoftwareDeployment                                                                                        | CREATE_FAILED   | 2016-10-26T09:36:33Z | overcloud-UpdateWorkflow-szz6sourtsch-ControllerPacemakerUpgradeDeployment_Step1-mvgaujtjf4r6                        |
| 2                                          | 20763fad-c640-4424-84eb-45e94c1c09d5                                            | OS::Heat::SoftwareDeployment                                                                                        | CREATE_FAILED   | 2016-10-26T09:36:33Z | overcloud-UpdateWorkflow-szz6sourtsch-ControllerPacemakerUpgradeDeployment_Step1-mvgaujtjf4r6                        |
+--------------------------------------------+---------------------------------------------------------------------------------+---------------------------------------------------------------------------------------------------------------------+-----------------+----------------------+----------------------------------------------------------------------------------------------------------------------+
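
The grep -v COMPLETE filter above keeps the table borders and header. As a rough alternative for triage, the sketch below prints only the name and status of rows whose resource_status does not end in _COMPLETE. failed_resources is a hypothetical helper name, and the script assumes the column order shown in the tables in this report (resource_status as the fourth column).

```shell
# Hypothetical helper: reads a heat resource-list table on stdin and prints
# "resource_name resource_status" for every row that is not *_COMPLETE.
failed_resources() {
    awk -F'|' '
        # Data rows contain "|" separators; borders and the WARNING line do not.
        NF > 5 && $2 !~ /resource_name/ {
            name = $2; status = $5
            gsub(/ /, "", name); gsub(/ /, "", status)
            if (status !~ /_COMPLETE$/) print name, status
        }'
}

# Assumed usage on the undercloud:
#   heat resource-list overcloud -n 5 | failed_resources
```

Reading from stdin keeps the helper independent of which client produced the table, so it should also work with the output of openstack stack resource list if the column order is the same.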

[stack@undercloud-0 ~]$ heat deployment-show 20763fad-c640-4424-84eb-45e94c1c09d5
WARNING (shell) "heat deployment-show" is deprecated, please use "openstack software deployment show" instead
{
  "status": "FAILED",
  "server_id": "d9393a78-77bc-48f2-b1ed-0b73a0f6d1fb",
  "config_id": "db71a425-93ab-4807-a8cf-58c9f7b66ed8",
  "output_values": {
    "deploy_stdout": "mysql upgrade required: 0\nWed Oct 26 09:37:47 UTC 2016 db71a425-93ab-4807-a8cf-58c9f7b66ed8 tripleo-upgrade controller-2 Going to systemctl stop httpd\nWed Oct 26 09:37:49 UTC 2016 db71a425-93ab-4807-a8cf-58c9f7b66ed8 tripleo-upgrade controller-2 Going to systemctl stop memcached\nWed Oct 26 09:37:49 UTC 2016 db71a425-93ab-4807-a8cf-58c9f7b66ed8 tripleo-upgrade controller-2 Going to systemctl stop mongod\nWed Oct 26 09:37:49 UTC 2016 db71a425-93ab-4807-a8cf-58c9f7b66ed8 tripleo-upgrade controller-2 Going to systemctl stop neutron-dhcp-agent\nWed Oct 26 09:37:52 UTC 2016 db71a425-93ab-4807-a8cf-58c9f7b66ed8 tripleo-upgrade controller-2 Going to systemctl stop neutron-l3-agent\nWed Oct 26 09:38:01 UTC 2016 db71a425-93ab-4807-a8cf-58c9f7b66ed8 tripleo-upgrade controller-2 Going to systemctl stop neutron-metadata-agent\nWed Oct 26 09:38:01 UTC 2016 db71a425-93ab-4807-a8cf-58c9f7b66ed8 tripleo-upgrade controller-2 Going to systemctl stop neutron-netns-cleanup\nWed Oct 26 09:38:01 UTC 2016 db71a425-93ab-4807-a8cf-58c9f7b66ed8 tripleo-upgrade controller-2 Going to systemctl stop neutron-openvswitch-agent\nWed Oct 26 09:38:03 UTC 2016 db71a425-93ab-4807-a8cf-58c9f7b66ed8 tripleo-upgrade controller-2 Going to systemctl stop neutron-ovs-cleanup\nWed Oct 26 09:38:03 UTC 2016 db71a425-93ab-4807-a8cf-58c9f7b66ed8 tripleo-upgrade controller-2 Going to systemctl stop neutron-server\nWed Oct 26 09:38:22 UTC 2016 db71a425-93ab-4807-a8cf-58c9f7b66ed8 tripleo-upgrade controller-2 Going to systemctl stop openstack-aodh-evaluator\nWed Oct 26 09:38:30 UTC 2016 db71a425-93ab-4807-a8cf-58c9f7b66ed8 tripleo-upgrade controller-2 Going to systemctl stop openstack-aodh-listener\nWed Oct 26 09:38:30 UTC 2016 db71a425-93ab-4807-a8cf-58c9f7b66ed8 tripleo-upgrade controller-2 Going to systemctl stop openstack-aodh-notifier\nWed Oct 26 09:38:31 UTC 2016 db71a425-93ab-4807-a8cf-58c9f7b66ed8 tripleo-upgrade controller-2 Going to systemctl stop openstack-ceilometer-central\nWed 
Oct 26 09:38:31 UTC 2016 db71a425-93ab-4807-a8cf-58c9f7b66ed8 tripleo-upgrade controller-2 Going to systemctl stop openstack-ceilometer-collector\nWed Oct 26 09:38:31 UTC 2016 db71a425-93ab-4807-a8cf-58c9f7b66ed8 tripleo-upgrade controller-2 Going to systemctl stop openstack-ceilometer-notification\nWed Oct 26 09:38:49 UTC 2016 db71a425-93ab-4807-a8cf-58c9f7b66ed8 tripleo-upgrade controller-2 Going to systemctl stop openstack-cinder-api\nWed Oct 26 09:38:49 UTC 2016 db71a425-93ab-4807-a8cf-58c9f7b66ed8 tripleo-upgrade controller-2 Going to systemctl stop openstack-cinder-scheduler\nWed Oct 26 09:39:04 UTC 2016 db71a425-93ab-4807-a8cf-58c9f7b66ed8 tripleo-upgrade controller-2 Going to systemctl stop openstack-glance-api\nWed Oct 26 09:39:04 UTC 2016 db71a425-93ab-4807-a8cf-58c9f7b66ed8 tripleo-upgrade controller-2 Going to systemctl stop openstack-glance-registry\nWed Oct 26 09:39:04 UTC 2016 db71a425-93ab-4807-a8cf-58c9f7b66ed8 tripleo-upgrade controller-2 Going to systemctl stop openstack-gnocchi-metricd\nWed Oct 26 09:39:04 UTC 2016 db71a425-93ab-4807-a8cf-58c9f7b66ed8 tripleo-upgrade controller-2 Going to systemctl stop openstack-gnocchi-statsd\nWed Oct 26 09:39:05 UTC 2016 db71a425-93ab-4807-a8cf-58c9f7b66ed8 tripleo-upgrade controller-2 Going to systemctl stop openstack-heat-api-cfn\nWed Oct 26 09:39:05 UTC 2016 db71a425-93ab-4807-a8cf-58c9f7b66ed8 tripleo-upgrade controller-2 Going to systemctl stop openstack-heat-api\nWed Oct 26 09:39:05 UTC 2016 db71a425-93ab-4807-a8cf-58c9f7b66ed8 tripleo-upgrade controller-2 Going to systemctl stop openstack-heat-api-cloudwatch\nWed Oct 26 09:39:05 UTC 2016 db71a425-93ab-4807-a8cf-58c9f7b66ed8 tripleo-upgrade controller-2 Going to systemctl stop openstack-heat-engine\nWed Oct 26 09:39:05 UTC 2016 db71a425-93ab-4807-a8cf-58c9f7b66ed8 tripleo-upgrade controller-2 Going to systemctl stop openstack-nova-api\nWed Oct 26 09:39:07 UTC 2016 db71a425-93ab-4807-a8cf-58c9f7b66ed8 tripleo-upgrade controller-2 Going to systemctl stop 
openstack-nova-conductor\nWed Oct 26 09:39:07 UTC 2016 db71a425-93ab-4807-a8cf-58c9f7b66ed8 tripleo-upgrade controller-2 Going to systemctl stop openstack-nova-consoleauth\nWed Oct 26 09:39:13 UTC 2016 db71a425-93ab-4807-a8cf-58c9f7b66ed8 tripleo-upgrade controller-2 Going to systemctl stop openstack-nova-novncproxy\nWed Oct 26 09:39:14 UTC 2016 db71a425-93ab-4807-a8cf-58c9f7b66ed8 tripleo-upgrade controller-2 Going to systemctl stop openstack-nova-scheduler\nWed Oct 26 09:39:29 UTC 2016 db71a425-93ab-4807-a8cf-58c9f7b66ed8 tripleo-upgrade controller-2 Going to systemctl stop openstack-sahara-api\nWed Oct 26 09:39:29 UTC 2016 db71a425-93ab-4807-a8cf-58c9f7b66ed8 tripleo-upgrade controller-2 Going to systemctl stop openstack-sahara-engine\nWed Oct 26 09:39:30 UTC 2016 db71a425-93ab-4807-a8cf-58c9f7b66ed8 tripleo-upgrade controller-2 Going to systemctl stop openstack-swift-account-auditor.service\nWed Oct 26 09:39:30 UTC 2016 db71a425-93ab-4807-a8cf-58c9f7b66ed8 tripleo-upgrade controller-2 Going to systemctl stop openstack-swift-account-reaper.service\nWed Oct 26 09:39:30 UTC 2016 db71a425-93ab-4807-a8cf-58c9f7b66ed8 tripleo-upgrade controller-2 Going to systemctl stop openstack-swift-account-replicator.service\nWed Oct 26 09:39:30 UTC 2016 db71a425-93ab-4807-a8cf-58c9f7b66ed8 tripleo-upgrade controller-2 Going to systemctl stop openstack-swift-account.service\nWed Oct 26 09:39:30 UTC 2016 db71a425-93ab-4807-a8cf-58c9f7b66ed8 tripleo-upgrade controller-2 Going to systemctl stop openstack-swift-container-auditor.service\nWed Oct 26 09:39:30 UTC 2016 db71a425-93ab-4807-a8cf-58c9f7b66ed8 tripleo-upgrade controller-2 Going to systemctl stop openstack-swift-container-replicator.service\nWed Oct 26 09:39:31 UTC 2016 db71a425-93ab-4807-a8cf-58c9f7b66ed8 tripleo-upgrade controller-2 Going to systemctl stop openstack-swift-container-updater.service\nWed Oct 26 09:39:31 UTC 2016 db71a425-93ab-4807-a8cf-58c9f7b66ed8 tripleo-upgrade controller-2 Going to systemctl stop 
openstack-swift-container.service\nWed Oct 26 09:39:31 UTC 2016 db71a425-93ab-4807-a8cf-58c9f7b66ed8 tripleo-upgrade controller-2 Going to systemctl stop openstack-swift-object-auditor.service\nWed Oct 26 09:39:31 UTC 2016 db71a425-93ab-4807-a8cf-58c9f7b66ed8 tripleo-upgrade controller-2 Going to systemctl stop openstack-swift-object-replicator.service\nWed Oct 26 09:39:31 UTC 2016 db71a425-93ab-4807-a8cf-58c9f7b66ed8 tripleo-upgrade controller-2 Going to systemctl stop openstack-swift-object-updater.service\nWed Oct 26 09:39:31 UTC 2016 db71a425-93ab-4807-a8cf-58c9f7b66ed8 tripleo-upgrade controller-2 Going to systemctl stop openstack-swift-object.service\nWed Oct 26 09:39:31 UTC 2016 db71a425-93ab-4807-a8cf-58c9f7b66ed8 tripleo-upgrade controller-2 Going to systemctl stop openstack-swift-proxy.service\nWed Oct 26 09:39:31 UTC 2016 db71a425-93ab-4807-a8cf-58c9f7b66ed8 tripleo-upgrade controller-2 Going to systemctl stop openstack-swift-account-reaper\nWed Oct 26 09:39:32 UTC 2016 db71a425-93ab-4807-a8cf-58c9f7b66ed8 tripleo-upgrade controller-2 Going to systemctl stop openstack-swift-account-replicator\nWed Oct 26 09:39:32 UTC 2016 db71a425-93ab-4807-a8cf-58c9f7b66ed8 tripleo-upgrade controller-2 Going to systemctl stop openstack-swift-account\nWed Oct 26 09:39:32 UTC 2016 db71a425-93ab-4807-a8cf-58c9f7b66ed8 tripleo-upgrade controller-2 Going to systemctl stop openstack-swift-container-auditor\nWed Oct 26 09:39:32 UTC 2016 db71a425-93ab-4807-a8cf-58c9f7b66ed8 tripleo-upgrade controller-2 Going to systemctl stop openstack-swift-container-replicator\nWed Oct 26 09:39:32 UTC 2016 db71a425-93ab-4807-a8cf-58c9f7b66ed8 tripleo-upgrade controller-2 Going to systemctl stop openstack-swift-container-updater\nWed Oct 26 09:39:32 UTC 2016 db71a425-93ab-4807-a8cf-58c9f7b66ed8 tripleo-upgrade controller-2 Going to systemctl stop openstack-swift-container\nWed Oct 26 09:39:32 UTC 2016 db71a425-93ab-4807-a8cf-58c9f7b66ed8 tripleo-upgrade controller-2 Going to systemctl stop 
openstack-swift-object-auditor\nWed Oct 26 09:39:32 UTC 2016 db71a425-93ab-4807-a8cf-58c9f7b66ed8 tripleo-upgrade controller-2 Going to systemctl stop openstack-swift-object-replicator\nWed Oct 26 09:39:32 UTC 2016 db71a425-93ab-4807-a8cf-58c9f7b66ed8 tripleo-upgrade controller-2 Going to systemctl stop openstack-swift-object-updater\nWed Oct 26 09:39:33 UTC 2016 db71a425-93ab-4807-a8cf-58c9f7b66ed8 tripleo-upgrade controller-2 Going to systemctl stop openstack-swift-object\nWed Oct 26 09:39:33 UTC 2016 db71a425-93ab-4807-a8cf-58c9f7b66ed8 tripleo-upgrade controller-2 Going to systemctl stop openstack-swift-proxy\nactive\nactive\nactive\nactive\nactive\nactive\nactive\nactive\nactive\nactive\nactive\nactive\nactive\nactive\nactive\nactive\nactive\nactive\nactive\nactive\nactive\nactive\nactive\nactive\nactive\nactive\nactive\nactive\nactive\nactive\nactive\nactive\nactive\nactive\nactive\nactive\nactive\nactive\nactive\nactive\nactive\nactive\nactive\nactive\nactive\nactive\nactive\nactive\nactive\nactive\nactive\nactive\nactive\nactive\nactive\nactive\nactive\nactive\nactive\nactive\nactive\nactive\nactive\nactive\nactive\nactive\nactive\nactive\nactive\nactive\nactive\nactive\nactive\nactive\nactive\nactive\nactive\nactive\nactive\nactive\nactive\nactive\nactive\nactive\nactive\nactive\nactive\nactive\nactive\nactive\nactive\nactive\nactive\nactive\nactive\nactive\nactive\nactive\nactive\nactive\nactive\nactive\nactive\nactive\nactive\nactive\nactive\nactive\nactive\nactive\nactive\nactive\nactive\nactive\nactive\nactive\nactive\nactive\nactive\nactive\nactive\nactive\nactive\nactive\nactive\nactive\nactive\nactive\nactive\nactive\nactive\nactive\nactive\nactive\nactive\nactive\nactive\nactive\nactive\nactive\nactive\nactive\nactive\nactive\nactive\nactive\nactive\nactive\nactive\nactive\nactive\nactive\nactive\nactive\nactive\nactive\nactive\nactive\nactive\nactive\nactive\nactive\nactive\nactive\nactive\nactive\nactive\nactive\nactive\nactive\nactive\nactive\nac
tive\nactive\nactive\nactive\nactive\nactive\nactive\nactive\nactive\nactive\nactive\nactive\nactive\nactive\nactive\nactive\nactive\nactive\nactive\nactive\nactive\nactive\nactive\nactive\nactive\nactive\nactive\nactive\nactive\nactive\nactive\nactive\nactive\nactive\nactive\nactive\nactive\nactive\nactive\nactive\nactive\nactive\nactive\nactive\nactive\nactive\nactive\nactive\nactive\nactive\nactive\nactive\nactive\nactive\nactive\nactive\nactive\nactive\nactive\nactive\nactive\nactive\nactive\nactive\nactive\nactive\nactive\nactive\nactive\nactive\nactive\nactive\nactive\nactive\nactive\nactive\nactive\nactive\nactive\nactive\nactive\nactive\nactive\nactive\nactive\nactive\nactive\nactive\nactive\nactive\nactive\nactive\nactive\nactive\nactive\nactive\nactive\nactive\nactive\nactive\nactive\nactive\nactive\nactive\nactive\nactive\nactive\nactive\nactive\nactive\nactive\nactive\nactive\nactive\nactive\nactive\nactive\nactive\nactive\nactive\nactive\nactive\nactive\nactive\nactive\nactive\nactive\nactive\nactive\nactive\nactive\nactive\nactive\nactive\nactive\nactive\nactive\nactive\nactive\nactive\nactive\nactive\nactive\nactive\nactive\nactive\nactive\nactive\nactive\nactive\nactive\nactive\nactive\nactive\nactive\nactive\nactive\nactive\nactive\nactive\nactive\nactive\nactive\nactive\nactive\nactive\nactive\nactive\nactive\nactive\nactive\nactive\nactive\nactive\nactive\nactive\nactive\nactive\nactive\nactive\nactive\nactive\nactive\nactive\nactive\nactive\nactive\nERROR: cluster shutdown timed out\n",
    "deploy_stderr": "",
    "deploy_status_code": 1
  },
  "creation_time": "2016-10-26T09:36:36Z",
  "updated_time": "2016-10-26T10:09:38Z",
  "input_values": {
    "update_identifier": "1477466661",
    "deploy_identifier": "1477473633"
  },
  "action": "CREATE",
  "status_reason": "deploy_status_code : Deployment exited with non-zero status code: 1",
  "id": "20763fad-c640-4424-84eb-45e94c1c09d5"
}



[stack@undercloud-0 ~]$ heat deployment-show bb8a7d6c-3f71-4c04-b2c2-83ce68086a8d
WARNING (shell) "heat deployment-show" is deprecated, please use "openstack software deployment show" instead
{
  "status": "FAILED",
  "server_id": "704c9598-262d-4a02-861f-af64abb5a885",
  "config_id": "e1dd87c2-6d4d-415b-9544-c5d058153659",
  "output_values": {
    "deploy_stdout": "ERROR: upgrade cannot start with stopped resources on the cluster. Make sure that all the resources are up and running.\n",
    "deploy_stderr": "",
    "deploy_status_code": 1
  },
  "creation_time": "2016-10-26T09:36:35Z",
  "updated_time": "2016-10-26T09:38:19Z",
  "input_values": {
    "update_identifier": "1477466661",
    "deploy_identifier": "1477473633"
  },
  "action": "CREATE",
  "status_reason": "deploy_status_code : Deployment exited with non-zero status code: 1",
  "id": "bb8a7d6c-3f71-4c04-b2c2-83ce68086a8d"
}

Comment 1 mlammon 2016-10-26 17:47:52 UTC
Additional information: 

[root@controller-2 ~]# pcs status
Cluster name: tripleo_cluster
Stack: corosync
Current DC: controller-0 (version 1.1.15-11.el7-e174ec8) - partition with quorum
Last updated: Wed Oct 26 15:59:01 2016          Last change: Wed Oct 26 09:02:07 2016 by root via cibadmin on controller-0

3 nodes and 124 resources configured

Online: [ controller-0 controller-1 controller-2 ]

Full list of resources:

 ip-172.17.1.10 (ocf::heartbeat:IPaddr2):       Started controller-0
 ip-192.0.2.6   (ocf::heartbeat:IPaddr2):       Started controller-1
 ip-172.17.4.10 (ocf::heartbeat:IPaddr2):       Started controller-2
 Clone Set: haproxy-clone [haproxy]
     Started: [ controller-0 controller-1 controller-2 ]
 Master/Slave Set: galera-master [galera]
     Masters: [ controller-0 controller-1 controller-2 ]
 Clone Set: memcached-clone [memcached]
     Started: [ controller-0 controller-1 controller-2 ]
 ip-172.17.3.10 (ocf::heartbeat:IPaddr2):       Started controller-0
 ip-10.0.0.101  (ocf::heartbeat:IPaddr2):       Started controller-1
 Clone Set: rabbitmq-clone [rabbitmq]
     Started: [ controller-0 controller-1 controller-2 ]
 Clone Set: openstack-core-clone [openstack-core]
     Started: [ controller-0 controller-1 controller-2 ]
 Master/Slave Set: redis-master [redis]
     Masters: [ controller-1 ]
     Slaves: [ controller-0 controller-2 ]
 ip-172.17.1.11 (ocf::heartbeat:IPaddr2):       Started controller-2
 Clone Set: mongod-clone [mongod]
     Started: [ controller-0 controller-1 controller-2 ]
 Clone Set: openstack-aodh-evaluator-clone [openstack-aodh-evaluator]
     Started: [ controller-0 controller-1 controller-2 ]
 Clone Set: openstack-nova-scheduler-clone [openstack-nova-scheduler]
     Started: [ controller-0 controller-1 controller-2 ]
 Clone Set: neutron-l3-agent-clone [neutron-l3-agent]
     Started: [ controller-0 controller-1 controller-2 ]
 Clone Set: neutron-netns-cleanup-clone [neutron-netns-cleanup]
     Started: [ controller-0 controller-1 controller-2 ]
 Clone Set: neutron-ovs-cleanup-clone [neutron-ovs-cleanup]
     Started: [ controller-0 controller-1 controller-2 ]
 openstack-cinder-volume        (systemd:openstack-cinder-volume):      Started controller-0
 Clone Set: openstack-heat-engine-clone [openstack-heat-engine]
     Started: [ controller-0 controller-1 controller-2 ]
 Clone Set: openstack-aodh-listener-clone [openstack-aodh-listener]
     Started: [ controller-0 controller-1 controller-2 ]
 Clone Set: neutron-metadata-agent-clone [neutron-metadata-agent]
     Started: [ controller-0 controller-1 controller-2 ]
 Clone Set: openstack-gnocchi-metricd-clone [openstack-gnocchi-metricd]
     Started: [ controller-0 controller-1 controller-2 ]
 Clone Set: openstack-aodh-notifier-clone [openstack-aodh-notifier]
     Started: [ controller-0 controller-1 controller-2 ]
 Clone Set: openstack-heat-api-clone [openstack-heat-api]
     Started: [ controller-0 controller-1 controller-2 ]
 Clone Set: openstack-ceilometer-collector-clone [openstack-ceilometer-collector]
     Started: [ controller-0 controller-1 controller-2 ]
 Clone Set: openstack-glance-api-clone [openstack-glance-api]
     Started: [ controller-0 controller-1 controller-2 ]
 Clone Set: openstack-cinder-scheduler-clone [openstack-cinder-scheduler]
     Started: [ controller-0 controller-1 controller-2 ]
 Clone Set: openstack-nova-api-clone [openstack-nova-api]
     Started: [ controller-0 controller-1 controller-2 ]
 Clone Set: openstack-nova-consoleauth-clone [openstack-nova-consoleauth]
     Started: [ controller-0 controller-1 controller-2 ]
 Clone Set: openstack-sahara-api-clone [openstack-sahara-api]
     Started: [ controller-0 controller-1 controller-2 ]
 Clone Set: openstack-heat-api-cloudwatch-clone [openstack-heat-api-cloudwatch]
     Started: [ controller-0 controller-1 controller-2 ]
 Clone Set: openstack-sahara-engine-clone [openstack-sahara-engine]
     Started: [ controller-0 controller-1 controller-2 ]
 Clone Set: openstack-glance-registry-clone [openstack-glance-registry]
     Started: [ controller-0 controller-1 controller-2 ]
 Clone Set: openstack-gnocchi-statsd-clone [openstack-gnocchi-statsd]
     Started: [ controller-0 controller-1 controller-2 ]
 Clone Set: openstack-ceilometer-notification-clone [openstack-ceilometer-notification]
     Started: [ controller-0 controller-1 controller-2 ]
 Clone Set: openstack-cinder-api-clone [openstack-cinder-api]
     Started: [ controller-0 controller-1 controller-2 ]
 Clone Set: neutron-dhcp-agent-clone [neutron-dhcp-agent]
     Started: [ controller-0 controller-1 controller-2 ]
 Clone Set: neutron-openvswitch-agent-clone [neutron-openvswitch-agent]
     Started: [ controller-0 controller-1 controller-2 ]
 Clone Set: openstack-nova-novncproxy-clone [openstack-nova-novncproxy]
     Started: [ controller-0 controller-1 controller-2 ]
 Clone Set: delay-clone [delay]
     Started: [ controller-0 controller-1 controller-2 ]
 Clone Set: neutron-server-clone [neutron-server]
     Started: [ controller-0 controller-1 controller-2 ]
 Clone Set: openstack-ceilometer-central-clone [openstack-ceilometer-central]
     Started: [ controller-0 controller-1 controller-2 ]
 Clone Set: httpd-clone [httpd]
     Started: [ controller-0 controller-1 controller-2 ]
 Clone Set: openstack-heat-api-cfn-clone [openstack-heat-api-cfn]
     Started: [ controller-0 controller-1 controller-2 ]
 Clone Set: openstack-nova-conductor-clone [openstack-nova-conductor]
     Started: [ controller-0 controller-1 controller-2 ]

Failed Actions:
* memcached_monitor_60000 on controller-2 'not running' (7): call=93, status=complete, exitreason='none',
    last-rc-change='Wed Oct 26 09:38:23 2016', queued=0ms, exec=0ms
* mongod_monitor_60000 on controller-2 'not running' (7): call=244, status=complete, exitreason='none',
    last-rc-change='Wed Oct 26 09:38:13 2016', queued=0ms, exec=0ms
* openstack-aodh-evaluator_monitor_60000 on controller-2 'not running' (7): call=333, status=complete, exitreason='none',
    last-rc-change='Wed Oct 26 09:39:27 2016', queued=0ms, exec=0ms
* openstack-heat-api_monitor_60000 on controller-2 'not running' (7): call=278, status=complete, exitreason='none',
    last-rc-change='Wed Oct 26 09:39:40 2016', queued=0ms, exec=0ms
* openstack-nova-api_monitor_60000 on controller-2 'not running' (7): call=305, status=complete, exitreason='none',
    last-rc-change='Wed Oct 26 09:39:42 2016', queued=0ms, exec=0ms
* openstack-nova-consoleauth_monitor_60000 on controller-2 'not running' (7): call=295, status=complete, exitreason='none',
    last-rc-change='Wed Oct 26 09:39:57 2016', queued=0ms, exec=0ms
* openstack-ceilometer-notification_monitor_60000 on controller-2 'not running' (7): call=270, status=complete, exitreason='none',
    last-rc-change='Wed Oct 26 09:39:36 2016', queued=0ms, exec=0ms
* neutron-openvswitch-agent_monitor_60000 on controller-2 'OCF_PENDING' (196): call=310, status=complete, exitreason='none',
    last-rc-change='Wed Oct 26 09:38:01 2016', queued=0ms, exec=0ms
* openstack-nova-novncproxy_monitor_60000 on controller-2 'not running' (7): call=302, status=complete, exitreason='none',
    last-rc-change='Wed Oct 26 09:39:24 2016', queued=0ms, exec=0ms
* neutron-server_monitor_60000 on controller-2 'not running' (7): call=303, status=complete, exitreason='none',
    last-rc-change='Wed Oct 26 09:38:24 2016', queued=0ms, exec=0ms
* memcached_monitor_60000 on controller-1 'not running' (7): call=83, status=complete, exitreason='none',
    last-rc-change='Wed Oct 26 09:38:24 2016', queued=0ms, exec=0ms
* mongod_monitor_60000 on controller-1 'not running' (7): call=240, status=complete, exitreason='none',
    last-rc-change='Wed Oct 26 09:38:27 2016', queued=0ms, exec=0ms
* openstack-aodh-evaluator_monitor_60000 on controller-1 'not running' (7): call=337, status=complete, exitreason='none',
    last-rc-change='Wed Oct 26 09:39:28 2016', queued=0ms, exec=0ms
* openstack-aodh-listener_monitor_60000 on controller-1 'not running' (7): call=340, status=complete, exitreason='none',
    last-rc-change='Wed Oct 26 09:39:30 2016', queued=0ms, exec=0ms
* openstack-aodh-notifier_monitor_60000 on controller-1 'not running' (7): call=341, status=complete, exitreason='none',
    last-rc-change='Wed Oct 26 09:39:30 2016', queued=0ms, exec=0ms
* openstack-heat-api_monitor_60000 on controller-1 'not running' (7): call=284, status=complete, exitreason='none',
    last-rc-change='Wed Oct 26 09:39:39 2016', queued=0ms, exec=0ms
* openstack-nova-api_monitor_60000 on controller-1 'OCF_PENDING' (196): call=309, status=complete, exitreason='none',
    last-rc-change='Wed Oct 26 09:39:36 2016', queued=0ms, exec=0ms
* openstack-ceilometer-notification_monitor_60000 on controller-1 'not running' (7): call=274, status=complete, exitreason='none',
    last-rc-change='Wed Oct 26 09:39:36 2016', queued=0ms, exec=0ms
* neutron-dhcp-agent_monitor_60000 on controller-1 'not running' (7): call=316, status=complete, exitreason='none',
    last-rc-change='Wed Oct 26 09:39:07 2016', queued=0ms, exec=0ms
* neutron-openvswitch-agent_monitor_60000 on controller-1 'not running' (7): call=314, status=complete, exitreason='none',
    last-rc-change='Wed Oct 26 09:39:00 2016', queued=0ms, exec=0ms
* neutron-server_monitor_60000 on controller-1 'not running' (7): call=305, status=complete, exitreason='none',
    last-rc-change='Wed Oct 26 09:39:18 2016', queued=0ms, exec=0ms


Daemon Status:
  corosync: active/enabled
  pacemaker: active/enabled
  pcsd: active/enabled
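
A minimal sketch for summarizing the "Failed Actions" section above per node. It only parses `pcs status` text from stdin; the function name `list_failed_actions` is hypothetical and is not part of pcs or of the upgrade scripts.

```shell
# Print "<node> <resource>" for every entry listed under "Failed Actions:"
# in `pcs status` output read from stdin.
list_failed_actions() {
    awk '
        /^Failed Actions:/ { in_failed = 1; next }
        /^Daemon Status:/  { in_failed = 0 }
        in_failed && /^\* / {
            # $2 looks like "memcached_monitor_60000"; $4 is the node name.
            res = $2
            sub(/_[a-z]+_[0-9]+$/, "", res)   # drop the "_<op>_<interval>" suffix
            print $4, res
        }
    '
}
```

Usage, assuming it is run as root on a controller: `pcs status | list_failed_actions | sort`.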

Comment 2 mlammon 2016-10-26 17:52:21 UTC
This was the pcs status just before the "START UPGRADE CONTROLLER AND BLOCKSTORAGE" step, taken from the console of the Jenkins job:

09:18:39 
09:18:39  Stack overcloud UPDATE_COMPLETE 
09:18:39 
09:18:39 Overcloud Endpoint: http://10.0.0.101:5000/v2.0
09:18:39 Overcloud Deployed
09:18:39 clean_up DeployOvercloud: 
09:18:39 END return value: 0
09:18:42 Cluster name: tripleo_cluster
09:18:42 Stack: corosync
09:18:42 Current DC: controller-0 (version 1.1.15-11.el7-e174ec8) - partition with quorum
09:18:42 Last updated: Wed Oct 26 09:18:41 2016		Last change: Wed Oct 26 09:02:07 2016 by root via cibadmin on controller-0
09:18:42 
09:18:42 3 nodes and 124 resources configured
09:18:42 
09:18:42 Online: [ controller-0 controller-1 controller-2 ]
09:18:42 
09:18:42 Full list of resources:
09:18:42 
09:18:42  ip-172.17.1.10	(ocf::heartbeat:IPaddr2):	Started controller-0
09:18:42  ip-192.0.2.6	(ocf::heartbeat:IPaddr2):	Started controller-1
09:18:42  ip-172.17.4.10	(ocf::heartbeat:IPaddr2):	Started controller-2
09:18:42  Clone Set: haproxy-clone [haproxy]
09:18:42      Started: [ controller-0 controller-1 controller-2 ]
09:18:42  Master/Slave Set: galera-master [galera]
09:18:42      Masters: [ controller-0 controller-1 controller-2 ]
09:18:42  Clone Set: memcached-clone [memcached]
09:18:42      Started: [ controller-0 controller-1 controller-2 ]
09:18:42  ip-172.17.3.10	(ocf::heartbeat:IPaddr2):	Started controller-0
09:18:42  ip-10.0.0.101	(ocf::heartbeat:IPaddr2):	Started controller-1
09:18:42  Clone Set: rabbitmq-clone [rabbitmq]
09:18:42      Started: [ controller-0 controller-1 controller-2 ]
09:18:42  Clone Set: openstack-core-clone [openstack-core]
09:18:42      Started: [ controller-0 controller-1 controller-2 ]
09:18:42  Master/Slave Set: redis-master [redis]
09:18:42      Masters: [ controller-1 ]
09:18:42      Slaves: [ controller-0 controller-2 ]
09:18:42  ip-172.17.1.11	(ocf::heartbeat:IPaddr2):	Started controller-2
09:18:42  Clone Set: mongod-clone [mongod]
09:18:42      Started: [ controller-0 controller-1 controller-2 ]
09:18:42  Clone Set: openstack-aodh-evaluator-clone [openstack-aodh-evaluator]
09:18:42      Started: [ controller-0 controller-1 controller-2 ]
09:18:42  Clone Set: openstack-nova-scheduler-clone [openstack-nova-scheduler]
09:18:42      Started: [ controller-0 controller-1 controller-2 ]
09:18:42  Clone Set: neutron-l3-agent-clone [neutron-l3-agent]
09:18:42      Started: [ controller-0 controller-1 controller-2 ]
09:18:42  Clone Set: neutron-netns-cleanup-clone [neutron-netns-cleanup]
09:18:42      Started: [ controller-0 controller-1 controller-2 ]
09:18:42  Clone Set: neutron-ovs-cleanup-clone [neutron-ovs-cleanup]
09:18:42      Started: [ controller-0 controller-1 controller-2 ]
09:18:42  openstack-cinder-volume	(systemd:openstack-cinder-volume):	Started controller-0
09:18:42  Clone Set: openstack-heat-engine-clone [openstack-heat-engine]
09:18:42      Started: [ controller-0 controller-1 controller-2 ]
09:18:42  Clone Set: openstack-aodh-listener-clone [openstack-aodh-listener]
09:18:42      Started: [ controller-0 controller-1 controller-2 ]
09:18:42  Clone Set: neutron-metadata-agent-clone [neutron-metadata-agent]
09:18:42      Started: [ controller-0 controller-1 controller-2 ]
09:18:42  Clone Set: openstack-gnocchi-metricd-clone [openstack-gnocchi-metricd]
09:18:42      Started: [ controller-0 controller-1 controller-2 ]
09:18:42  Clone Set: openstack-aodh-notifier-clone [openstack-aodh-notifier]
09:18:42      Started: [ controller-0 controller-1 controller-2 ]
09:18:42  Clone Set: openstack-heat-api-clone [openstack-heat-api]
09:18:42      Started: [ controller-0 controller-1 controller-2 ]
09:18:42  Clone Set: openstack-ceilometer-collector-clone [openstack-ceilometer-collector]
09:18:42      Started: [ controller-0 controller-1 controller-2 ]
09:18:42  Clone Set: openstack-glance-api-clone [openstack-glance-api]
09:18:42      Started: [ controller-0 controller-1 controller-2 ]
09:18:42  Clone Set: openstack-cinder-scheduler-clone [openstack-cinder-scheduler]
09:18:42      Started: [ controller-0 controller-1 controller-2 ]
09:18:42  Clone Set: openstack-nova-api-clone [openstack-nova-api]
09:18:42      Started: [ controller-0 controller-1 controller-2 ]
09:18:42  Clone Set: openstack-nova-consoleauth-clone [openstack-nova-consoleauth]
09:18:42      Started: [ controller-0 controller-1 controller-2 ]
09:18:42  Clone Set: openstack-sahara-api-clone [openstack-sahara-api]
09:18:42      Started: [ controller-0 controller-1 controller-2 ]
09:18:42  Clone Set: openstack-heat-api-cloudwatch-clone [openstack-heat-api-cloudwatch]
09:18:42      Started: [ controller-0 controller-1 controller-2 ]
09:18:42  Clone Set: openstack-sahara-engine-clone [openstack-sahara-engine]
09:18:42      Started: [ controller-0 controller-1 controller-2 ]
09:18:42  Clone Set: openstack-glance-registry-clone [openstack-glance-registry]
09:18:42      Started: [ controller-0 controller-1 controller-2 ]
09:18:42  Clone Set: openstack-gnocchi-statsd-clone [openstack-gnocchi-statsd]
09:18:42      Started: [ controller-0 controller-1 controller-2 ]
09:18:42  Clone Set: openstack-ceilometer-notification-clone [openstack-ceilometer-notification]
09:18:42      Started: [ controller-0 controller-1 controller-2 ]
09:18:42  Clone Set: openstack-cinder-api-clone [openstack-cinder-api]
09:18:42      Started: [ controller-0 controller-1 controller-2 ]
09:18:42  Clone Set: neutron-dhcp-agent-clone [neutron-dhcp-agent]
09:18:42      Started: [ controller-0 controller-1 controller-2 ]
09:18:42  Clone Set: neutron-openvswitch-agent-clone [neutron-openvswitch-agent]
09:18:42      Started: [ controller-0 controller-1 controller-2 ]
09:18:42  Clone Set: openstack-nova-novncproxy-clone [openstack-nova-novncproxy]
09:18:42      Started: [ controller-0 controller-1 controller-2 ]
09:18:42  Clone Set: delay-clone [delay]
09:18:42      Started: [ controller-0 controller-1 controller-2 ]
09:18:42  Clone Set: neutron-server-clone [neutron-server]
09:18:42      Started: [ controller-0 controller-1 controller-2 ]
09:18:42  Clone Set: openstack-ceilometer-central-clone [openstack-ceilometer-central]
09:18:42      Started: [ controller-0 controller-1 controller-2 ]
09:18:42  Clone Set: httpd-clone [httpd]
09:18:42      Started: [ controller-0 controller-1 controller-2 ]
09:18:42  Clone Set: openstack-heat-api-cfn-clone [openstack-heat-api-cfn]
09:18:42      Started: [ controller-0 controller-1 controller-2 ]
09:18:42  Clone Set: openstack-nova-conductor-clone [openstack-nova-conductor]
09:18:42      Started: [ controller-0 controller-1 controller-2 ]
09:18:42 
09:18:42 Daemon Status:
09:18:42   corosync: active/enabled
09:18:42   pacemaker: active/enabled
09:18:42   pcsd: active/enabled
09:18:46 Checking stack status
09:18:47 WARNING (shell) "heat stack-list" is deprecated, please use "openstack stack list" instead
09:18:47 +--------------------------------------+------------+-----------------+----------------------+----------------------+
09:18:47 | id                                   | stack_name | stack_status    | creation_time        | updated_time         |
09:18:47 +--------------------------------------+------------+-----------------+----------------------+----------------------+
09:18:47 | ee358f54-d537-4fed-be1b-615f50013a07 | overcloud  | UPDATE_COMPLETE | 2016-10-26T06:28:33Z | 2016-10-26T09:06:42Z |
09:18:47 +--------------------------------------+------------+-----------------+----------------------+----------------------+
09:18:48 ### FINISH INIT COMMAND  ###
09:18:51 ### START UPGRADE CONTROLLER AND BLOCKSTORAGE ###

Comment 3 Omri Hochman 2016-10-26 17:52:21 UTC
From checking the logs of the job: there were no issues or failed pcs resources before the start of 'UPGRADE CONTROLLER AND BLOCKSTORAGE'.
The pcs issues began during that step.

The original Jenkins job console can be found here: https://rhos-jenkins.rhev-ci-vms.eng.rdu2.redhat.com/view/Director/view/9.0/job/infrared_deploy_9.0_3_control_1_compute_1ceph_no-UCSSL_no-OCSSL_then_update_RHEL_7.3_upgrade_10/23/consoleFull

Comment 5 mlammon 2016-10-26 18:33:29 UTC
Created attachment 1214385 [details]
heat-engine.log

Comment 6 Omri Hochman 2016-10-26 20:13:09 UTC
After fixing the failed resources and restarting the PCS cluster, another attempt to re-run the upgrade from the same point failed:


Strange errors in messages (full messages file attached):
http://pastebin.test.redhat.com/424646

Comment 7 Omri Hochman 2016-10-26 20:13:41 UTC
Created attachment 1214419 [details]
full messages

Comment 8 Jaromir Coufal 2016-10-27 06:52:11 UTC
Do we have an idea what the root cause might be here? Is this something new that suddenly started happening, or is it a new use case that we did not test before?

Comment 11 Sofer Athlan-Guyot 2016-10-27 15:24:12 UTC
Looking at the environment, we are trying to stop openstack-swift-proxy on
one of the controllers:

    [root@controller-0 ~]# systemctl status -l openstack-swift-proxy
    ● openstack-swift-proxy.service - OpenStack Object Storage (swift) - Proxy Server
       Loaded: loaded (/usr/lib/systemd/system/openstack-swift-proxy.service; enabled; vendor preset: disabled)
       Active: active (running) since Wed 2016-10-26 08:24:41 UTC; 1 day 6h ago
     Main PID: 2256 (swift-proxy-ser)
       CGroup: /system.slice/openstack-swift-proxy.service
               └─2256 /usr/bin/python2 /usr/bin/swift-proxy-server /etc/swift/proxy-server.conf
    
    Oct 27 15:08:17 controller-0.localdomain proxy-server[2256]: ERROR with Account server 172.17.4.13:6002/d1 re: Trying to GET /v1/AUTH_1075f161175442f99fad2e1efc031d26: Connectio
    d (txn: txe1be088ca13549a998af6-0058121861) (client_ip: 172.17.3.15)
    Oct 27 15:08:17 controller-0.localdomain proxy-server[2256]: ERROR with Account server 172.17.4.14:6002/d1 re: Trying to GET /v1/AUTH_1075f161175442f99fad2e1efc031d26: Connectio
    d (txn: txe1be088ca13549a998af6-0058121861) (client_ip: 172.17.3.15)
    Oct 27 15:08:17 controller-0.localdomain proxy-server[2256]: 172.17.3.15 172.17.3.15 27/Oct/2016/15/08/17 GET /v1/AUTH_1075f161175442f99fad2e1efc031d26%3Fformat%3Djson HTTP/1.0
    thon-swiftclient-3.0.0 e8db84cc646c4b7e... - 2 - txe1be088ca13549a998af6-0058121861 - 0.0080 - - 1477580897.777108908 1477580897.785095930 -
    Oct 27 15:08:17 controller-0.localdomain proxy-server[2256]: Deferring reject downstream
    Oct 27 15:08:17 controller-0.localdomain proxy-server[2256]: - - 27/Oct/2016/15/08/17 HEAD /v1/AUTH_900da63cc23d4700ad38384e4dc052b1 HTTP/1.0 204 - Swift - - - - tx35b5347c7cd74
    -0058121861 - 0.0071 RL - 1477580897.795131922 1477580897.802206993 -
    Oct 27 15:08:17 controller-0.localdomain proxy-server[2256]: 172.17.3.15 172.17.3.15 27/Oct/2016/15/08/17 GET /v1/AUTH_900da63cc23d4700ad38384e4dc052b1%3Fformat%3Djson HTTP/1.0
    thon-swiftclient-3.0.0 e8db84cc646c4b7e... - 2 - tx35b5347c7cd740fa89046-0058121861 - 0.0077 - - 1477580897.804646015 1477580897.812345982 -
    Oct 27 15:08:18 controller-0.localdomain proxy-server[2256]: ERROR with Account server 172.17.4.14:6002/d1 re: Trying to HEAD /v1/AUTH_1075f161175442f99fad2e1efc031d26: Connecti
    ed (txn: txd6d67a1f17444699b4305-0058121862) (client_ip: 172.17.3.15)
    Oct 27 15:08:18 controller-0.localdomain proxy-server[2256]: ERROR with Account server 172.17.4.13:6002/d1 re: Trying to HEAD /v1/AUTH_1075f161175442f99fad2e1efc031d26: Connecti
    ed (txn: txd6d67a1f17444699b4305-0058121862) (client_ip: 172.17.3.15)
    Oct 27 15:08:18 controller-0.localdomain proxy-server[2256]: 172.17.3.15 172.17.3.15 27/Oct/2016/15/08/18 HEAD /v1/AUTH_1075f161175442f99fad2e1efc031d26 HTTP/1.0 204 - python-sw
    t-3.0.0 e8db84cc646c4b7e... - - - txd6d67a1f17444699b4305-0058121862 - 0.0079 - - 1477580898.147253036 1477580898.155152082 -
    Oct 27 15:08:18 controller-0.localdomain proxy-server[2256]: 172.17.3.15 172.17.3.15 27/Oct/2016/15/08/18 HEAD /v1/AUTH_900da63cc23d4700ad38384e4dc052b1 HTTP/1.0 204 - python-sw
    t-3.0.0 e8db84cc646c4b7e... - - - tx9b1741fe46ee4089922cb-0058121862 - 0.0074 - - 1477580898.164133072 1477580898.171533108 -


It tries to reach swift on the other controllers, but there the swift services are indeed not listening (stopped):

    [heat-admin@controller-1 ~]$ sudo -i
    [root@controller-1 ~]# netstat -pant | grep 600
    -> nothing

The same is true on controller-2.
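
A minimal sketch of a narrower check than `grep 600`, assuming the listener table comes from `netstat -lnt` or `ss -ltn`. The helper name `port_is_listening` is hypothetical; port 6002 is taken from the "Account server" errors in the proxy log above.

```shell
# Succeed (exit 0) if the listener table on stdin contains a LISTEN entry
# whose address ends in ":<port>"; fail (exit 1) otherwise.
port_is_listening() {
    local port="$1"
    awk -v port="$port" '
        /LISTEN/ {
            for (i = 1; i <= NF; i++) {
                n = split($i, parts, ":")
                # The last ":"-separated piece of an address field is the port.
                if (n > 1 && parts[n] == port) { found = 1 }
            }
        }
        END { exit found ? 0 : 1 }
    '
}
```

Usage on a controller: `netstat -lnt | port_is_listening 6002 || echo "nothing listening on 6002"`.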

Comment 12 Omri Hochman 2016-10-27 20:28:38 UTC
Reproduced this issue on my BM (but with SSL on the overcloud).
 
I think it is possible that the cause of this issue is the overcloud with SSL.

(Going to check on that.)

Comment 13 Sofer Athlan-Guyot 2016-11-02 17:18:44 UTC
This parameter should be added for the SSL upgrade:


  PublicVirtualFixedIPs: [{'ip_address':'192.168.200.180'}]

Don't use 192.168.200.180 itself; use the IP of the public and admin endpoint.  This should be part of your local network YAML file.
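
A minimal sketch of generating such a snippet, assuming the parameter sits under `parameter_defaults` as in a regular Heat environment file. The function name `write_public_vip_env` and the output path are hypothetical; the IP passed in must be the real public endpoint address of the deployment.

```shell
# Write a small Heat environment file that pins the public virtual IP.
#   $1: public VIP (placeholder, supplied by the caller)
#   $2: output file path (placeholder, supplied by the caller)
write_public_vip_env() {
    local vip="$1" outfile="$2"
    cat > "$outfile" <<EOF
parameter_defaults:
  PublicVirtualFixedIPs: [{'ip_address':'${vip}'}]
EOF
}
```

Usage: `write_public_vip_env <public-vip> /home/stack/public-vip.yaml`, then merge the result into the local network YAML file or pass it with `-e`.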

Comment 16 Omri Hochman 2016-11-03 20:59:17 UTC
According to Ben Nemec, we should also add: -e /home/stack/ssl-heat-templates/environments/tls-endpoints-public-ip.yaml


Add this on top of the deployment command that is being used for the upgrade.
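
A minimal sketch of what "on top of the deployment command" means, assuming the extra `-e` is simply appended after the arguments already in use. The helper `with_extra_env` is hypothetical and only prints the resulting command line; it does not run anything.

```shell
# Print an existing command line with one more "-e <environment file>" appended.
#   $1: environment file to add
#   remaining arguments: the deploy command already used for the upgrade
with_extra_env() {
    env_file="$1"; shift
    printf '%s\n' "$* -e $env_file"
}
```

Usage: `with_extra_env /home/stack/ssl-heat-templates/environments/tls-endpoints-public-ip.yaml openstack overcloud deploy --templates`, followed by whatever other arguments the original upgrade command carries.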

Comment 18 Marios Andreou 2016-11-08 14:58:31 UTC
@omri I think you were going to verify whether this was environmental, as in comment #16. Do we still need this BZ, or can we close it?

Comment 20 mlammon 2016-11-08 23:28:38 UTC
Latest test on 08 NOV 2016 has failed to upgrade.  
 
[stack@undercloud-0 ~]$ heat resource-list -n5 overcloud | grep -v COMPLETE
WARNING (shell) "heat resource-list" is deprecated, please use "openstack stack resource list" instead
+--------------------------------------------+---------------------------------------------------------------------------------+---------------------------------------------------------------------------------------------------------------------+-----------------+----------------------+----------------------------------------------------------------------------------------------------------------------+
| resource_name                              | physical_resource_id                                                            | resource_type                                                                                                       | resource_status | updated_time         | stack_name                                                                                                           |
+--------------------------------------------+---------------------------------------------------------------------------------+---------------------------------------------------------------------------------------------------------------------+-----------------+----------------------+----------------------------------------------------------------------------------------------------------------------+
| UpdateWorkflow                             | 74a716e2-23b1-4463-b286-958ced24b0f8                                            | OS::TripleO::Tasks::UpdateWorkflow                                                                                  | UPDATE_FAILED   | 2016-11-08T21:13:40Z | overcloud                                                                                                            |
| 0                                          | ad8123ed-b137-4132-bc17-de5f63594268                                            | OS::Heat::SoftwareDeployment                                                                                        | CREATE_FAILED   | 2016-11-08T21:21:56Z | overcloud-UpdateWorkflow-r53etdnzrvoc-ControllerPacemakerUpgradeDeployment_Step1-j72cf6jpqnfr                        |
| 1                                          | 3b652369-9c83-498e-9d47-cf1c6456458f                                            | OS::Heat::SoftwareDeployment                                                                                        | CREATE_FAILED   | 2016-11-08T21:21:56Z | overcloud-UpdateWorkflow-r53etdnzrvoc-ControllerPacemakerUpgradeDeployment_Step1-j72cf6jpqnfr                        |
| ControllerPacemakerUpgradeDeployment_Step1 | 7b904d09-3e0e-4099-9e3d-3185c0ca49d1                                            | OS::Heat::SoftwareDeploymentGroup                                                                                   | CREATE_FAILED   | 2016-11-08T21:21:56Z | overcloud-UpdateWorkflow-r53etdnzrvoc                                                                                |
| 2                                          | e5e0e6d6-eb1e-4aed-9a98-729233da648d                                            | OS::Heat::SoftwareDeployment                                                                                        | CREATE_FAILED   | 2016-11-08T21:21:57Z | overcloud-UpdateWorkflow-r53etdnzrvoc-ControllerPacemakerUpgradeDeployment_Step1-j72cf6jpqnfr                        |
+--------------------------------------------+---------------------------------------------------------------------------------+---------------------------------------------------------------------------------------------------------------------+-----------------+----------------------+----------------------------------------------------------------------------------------------------------------------+
 
 
[stack@undercloud-0 ~]$ heat deployment-show e5e0e6d6-eb1e-4aed-9a98-729233da648d
WARNING (shell) "heat deployment-show" is deprecated, please use "openstack software deployment show" instead
{
  "status": "IN_PROGRESS",
  "server_id": "7caf37c8-3daf-4b78-a256-768a82679876",
  "config_id": "cc3c4825-4758-49e1-975a-de6d134cb3c6",
  "output_values": null,
  "creation_time": "2016-11-08T21:21:59Z",
  "input_values": {
    "update_identifier": "",
    "deploy_identifier": "1478639139"
  },
  "action": "CREATE",
  "status_reason": "Deploy data available",
  "id": "e5e0e6d6-eb1e-4aed-9a98-729233da648d"
}
 
 
[root@controller-2 ~]# pcs status
Cluster name: tripleo_cluster
Stack: corosync
Current DC: controller-2 (version 1.1.15-11.el7-e174ec8) - partition with quorum
Last updated: Tue Nov  8 21:47:12 2016          Last change: Tue Nov  8 20:43:48 2016 by root via cibadmin on controller-0
 
3 nodes and 124 resources configured
 
Online: [ controller-0 controller-1 controller-2 ]
 
Full list of resources:
 
 ip-fd00.fd00.fd00.4000..10     (ocf::heartbeat:IPaddr2):       Started controller-0
 ip-192.0.2.6   (ocf::heartbeat:IPaddr2):       Started controller-1
 Clone Set: haproxy-clone [haproxy]
     Started: [ controller-0 controller-1 controller-2 ]
 Master/Slave Set: galera-master [galera]
     Masters: [ controller-0 controller-1 controller-2 ]
 Clone Set: memcached-clone [memcached]
     Started: [ controller-0 controller-1 controller-2 ]
 ip-2620.52.0.13b8.5054.ff.fe3e.1       (ocf::heartbeat:IPaddr2):       Started controller-2
 Clone Set: rabbitmq-clone [rabbitmq]
     Started: [ controller-0 controller-1 controller-2 ]
 Clone Set: openstack-core-clone [openstack-core]
     Started: [ controller-0 controller-1 controller-2 ]
 Master/Slave Set: redis-master [redis]
     Masters: [ controller-0 ]
     Slaves: [ controller-1 controller-2 ]
 ip-fd00.fd00.fd00.3000..10     (ocf::heartbeat:IPaddr2):       Started controller-0
 ip-fd00.fd00.fd00.2000..10     (ocf::heartbeat:IPaddr2):       Started controller-1
 ip-fd00.fd00.fd00.2000..11     (ocf::heartbeat:IPaddr2):       Started controller-2
 Clone Set: mongod-clone [mongod]
     Started: [ controller-0 controller-1 controller-2 ]
 Clone Set: openstack-aodh-evaluator-clone [openstack-aodh-evaluator]
     Started: [ controller-0 controller-1 controller-2 ]
 Clone Set: openstack-nova-scheduler-clone [openstack-nova-scheduler]
     Started: [ controller-0 controller-1 controller-2 ]
 Clone Set: neutron-l3-agent-clone [neutron-l3-agent]
     Started: [ controller-0 controller-1 controller-2 ]
 Clone Set: neutron-netns-cleanup-clone [neutron-netns-cleanup]
     Started: [ controller-0 controller-1 controller-2 ]
 Clone Set: neutron-ovs-cleanup-clone [neutron-ovs-cleanup]
     Started: [ controller-0 controller-1 controller-2 ]
 openstack-cinder-volume        (systemd:openstack-cinder-volume):      Started controller-0
 Clone Set: openstack-heat-engine-clone [openstack-heat-engine]
     Started: [ controller-0 controller-1 controller-2 ]
 Clone Set: openstack-aodh-listener-clone [openstack-aodh-listener]
     Started: [ controller-0 controller-1 controller-2 ]
 Clone Set: neutron-metadata-agent-clone [neutron-metadata-agent]
     Started: [ controller-0 controller-1 controller-2 ]
 Clone Set: openstack-gnocchi-metricd-clone [openstack-gnocchi-metricd]
     Started: [ controller-0 controller-1 controller-2 ]
 Clone Set: openstack-aodh-notifier-clone [openstack-aodh-notifier]
     Started: [ controller-0 controller-1 controller-2 ]
 Clone Set: openstack-heat-api-clone [openstack-heat-api]
     Started: [ controller-0 controller-1 controller-2 ]
 Clone Set: openstack-ceilometer-collector-clone [openstack-ceilometer-collector]
     Started: [ controller-0 controller-1 controller-2 ]
 Clone Set: openstack-glance-api-clone [openstack-glance-api]
     Started: [ controller-0 controller-1 controller-2 ]
 Clone Set: openstack-cinder-scheduler-clone [openstack-cinder-scheduler]
     Started: [ controller-0 controller-1 controller-2 ]
 Clone Set: openstack-nova-api-clone [openstack-nova-api]
     Started: [ controller-0 controller-1 controller-2 ]
 Clone Set: openstack-nova-consoleauth-clone [openstack-nova-consoleauth]
     Started: [ controller-0 controller-1 controller-2 ]
 Clone Set: openstack-sahara-api-clone [openstack-sahara-api]
     Started: [ controller-0 controller-1 controller-2 ]
 Clone Set: openstack-heat-api-cloudwatch-clone [openstack-heat-api-cloudwatch]
     Started: [ controller-0 controller-1 controller-2 ]
 Clone Set: openstack-sahara-engine-clone [openstack-sahara-engine]
     Started: [ controller-0 controller-1 controller-2 ]
 Clone Set: openstack-glance-registry-clone [openstack-glance-registry]
     Started: [ controller-0 controller-1 controller-2 ]
 Clone Set: openstack-gnocchi-statsd-clone [openstack-gnocchi-statsd]
     Started: [ controller-0 controller-1 controller-2 ]
 Clone Set: openstack-ceilometer-notification-clone [openstack-ceilometer-notification]
     Started: [ controller-0 controller-1 controller-2 ]
 Clone Set: openstack-cinder-api-clone [openstack-cinder-api]
     Started: [ controller-0 controller-1 controller-2 ]
 Clone Set: neutron-dhcp-agent-clone [neutron-dhcp-agent]
     Started: [ controller-0 controller-1 controller-2 ]
 Clone Set: neutron-openvswitch-agent-clone [neutron-openvswitch-agent]
     Started: [ controller-0 controller-1 controller-2 ]
 Clone Set: openstack-nova-novncproxy-clone [openstack-nova-novncproxy]
     Started: [ controller-0 controller-1 controller-2 ]
 Clone Set: delay-clone [delay]
     Started: [ controller-0 controller-1 controller-2 ]
 Clone Set: neutron-server-clone [neutron-server]
     Started: [ controller-0 controller-1 controller-2 ]
 Clone Set: openstack-ceilometer-central-clone [openstack-ceilometer-central]
     Started: [ controller-0 controller-1 controller-2 ]
 Clone Set: httpd-clone [httpd]
     Started: [ controller-0 controller-1 controller-2 ]
 Clone Set: openstack-heat-api-cfn-clone [openstack-heat-api-cfn]
     Started: [ controller-0 controller-1 controller-2 ]
 Clone Set: openstack-nova-conductor-clone [openstack-nova-conductor]
     Started: [ controller-0 controller-1 controller-2 ]
 
Failed Actions:
* memcached_monitor_60000 on controller-1 'not running' (7): call=33, status=complete, exitreason='none',
    last-rc-change='Tue Nov  8 21:24:10 2016', queued=0ms, exec=0ms
* mongod_monitor_60000 on controller-1 'not running' (7): call=78, status=complete, exitreason='none',
    last-rc-change='Tue Nov  8 21:24:04 2016', queued=0ms, exec=0ms
* openstack-aodh-evaluator_monitor_60000 on controller-1 'not running' (7): call=341, status=complete, exitreason='none',
    last-rc-change='Tue Nov  8 21:25:09 2016', queued=0ms, exec=0ms
* openstack-aodh-listener_monitor_60000 on controller-1 'not running' (7): call=344, status=complete, exitreason='none',
    last-rc-change='Tue Nov  8 21:25:11 2016', queued=0ms, exec=0ms
* openstack-aodh-notifier_monitor_60000 on controller-1 'not running' (7): call=345, status=complete, exitreason='none',
    last-rc-change='Tue Nov  8 21:25:12 2016', queued=0ms, exec=0ms
* openstack-nova-api_monitor_60000 on controller-1 'not running' (7): call=200, status=complete, exitreason='none',
    last-rc-change='Tue Nov  8 21:24:54 2016', queued=0ms, exec=0ms
* openstack-nova-consoleauth_monitor_60000 on controller-1 'OCF_PENDING' (196): call=207, status=complete, exitreason='none',
    last-rc-change='Tue Nov  8 21:25:01 2016', queued=0ms, exec=0ms
* neutron-dhcp-agent_monitor_60000 on controller-1 'OCF_PENDING' (196): call=263, status=complete, exitreason='none',
    last-rc-change='Tue Nov  8 21:23:27 2016', queued=0ms, exec=0ms
* neutron-openvswitch-agent_monitor_60000 on controller-1 'not running' (7): call=270, status=complete, exitreason='none',
    last-rc-change='Tue Nov  8 21:24:30 2016', queued=0ms, exec=0ms
* neutron-server_monitor_60000 on controller-1 'not running' (7): call=304, status=complete, exitreason='none',
    last-rc-change='Tue Nov  8 21:24:27 2016', queued=0ms, exec=0ms
* httpd_monitor_60000 on controller-1 'not running' (7): call=306, status=complete, exitreason='none',
    last-rc-change='Tue Nov  8 21:23:27 2016', queued=0ms, exec=0ms
* memcached_monitor_60000 on controller-2 'not running' (7): call=35, status=complete, exitreason='none',
    last-rc-change='Tue Nov  8 21:24:10 2016', queued=0ms, exec=0ms
* mongod_monitor_60000 on controller-2 'not running' (7): call=79, status=complete, exitreason='none',
    last-rc-change='Tue Nov  8 21:24:04 2016', queued=0ms, exec=0ms
* openstack-aodh-evaluator_monitor_60000 on controller-2 'not running' (7): call=344, status=complete, exitreason='none',
    last-rc-change='Tue Nov  8 21:25:09 2016', queued=0ms, exec=0ms
* openstack-aodh-listener_monitor_60000 on controller-2 'not running' (7): call=347, status=complete, exitreason='none',
    last-rc-change='Tue Nov  8 21:25:11 2016', queued=0ms, exec=0ms
* openstack-aodh-notifier_monitor_60000 on controller-2 'not running' (7): call=348, status=complete, exitreason='none',
    last-rc-change='Tue Nov  8 21:25:11 2016', queued=0ms, exec=0ms
* neutron-dhcp-agent_monitor_60000 on controller-2 'OCF_PENDING' (196): call=266, status=complete, exitreason='none',
    last-rc-change='Tue Nov  8 21:23:26 2016', queued=0ms, exec=0ms
* neutron-openvswitch-agent_monitor_60000 on controller-2 'not running' (7): call=273, status=complete, exitreason='none',
    last-rc-change='Tue Nov  8 21:24:30 2016', queued=0ms, exec=0ms
* neutron-server_monitor_60000 on controller-2 'not running' (7): call=307, status=complete, exitreason='none',
    last-rc-change='Tue Nov  8 21:24:27 2016', queued=0ms, exec=0ms
* httpd_monitor_60000 on controller-2 'not running' (7): call=309, status=complete, exitreason='none',
    last-rc-change='Tue Nov  8 21:23:27 2016', queued=0ms, exec=0ms
 
 
Daemon Status:
  corosync: active/enabled
  pacemaker: active/enabled
  pcsd: active/enabled
[root@controller-2 ~]# pcs status | grep -i stopped -B2
[root@controller-2 ~]# pcs status | grep -i unmanaged -B2


[stack@undercloud-0 ~]$ heat deployment-show e5e0e6d6-eb1e-4aed-9a98-729233da648d
WARNING (shell) "heat deployment-show" is deprecated, please use "openstack software deployment show" instead
{
  "status": "IN_PROGRESS",
  "server_id": "7caf37c8-3daf-4b78-a256-768a82679876",
  "config_id": "cc3c4825-4758-49e1-975a-de6d134cb3c6",
  "output_values": null,
  "creation_time": "2016-11-08T21:21:59Z",
  "input_values": {
    "update_identifier": "",
    "deploy_identifier": "1478639139"
  },
  "action": "CREATE",
  "status_reason": "Deploy data available",
  "id": "e5e0e6d6-eb1e-4aed-9a98-729233da648d"
}
 
 
Same command again later, roughly 15 minutes on:
 
[stack@undercloud-0 ~]$ heat deployment-show e5e0e6d6-eb1e-4aed-9a98-729233da648d
WARNING (shell) "heat deployment-show" is deprecated, please use "openstack software deployment show" instead
{
  "status": "FAILED",
  "server_id": "7caf37c8-3daf-4b78-a256-768a82679876",
  "config_id": "cc3c4825-4758-49e1-975a-de6d134cb3c6",
  "output_values": {
    "deploy_stdout": "mysql upgrade required: 0\nTue Nov  8 21:23:18 UTC 2016 cc3c4825-4758-49e1-975a-de6d134cb3c6 tripleo-upgrade controller-2 Going to systemctl stop httpd\nTue Nov  8 21:23:20 UTC 2016 cc3c4825-4758-49e1-975a-de6d134cb3c6 tripleo-upgrade controller-2 Going to systemctl stop memcached\nTue Nov  8 21:23:20 UTC 2016 cc3c4825-4758-49e1-975a-de6d134cb3c6 tripleo-upgrade controller-2 Going to systemctl stop mongod\nTue Nov  8 21:23:20 UTC 2016 cc3c4825-4758-49e1-975a-de6d134cb3c6 tripleo-upgrade controller-2 Going to systemctl stop neutron-dhcp-agent\nTue Nov  8 21:23:28 UTC 2016 cc3c4825-4758-49e1-975a-de6d134cb3c6 tripleo-upgrade controller-2 Going to systemctl stop neutron-l3-agent\nTue Nov  8 21:23:37 UTC 2016 cc3c4825-4758-49e1-975a-de6d134cb3c6 tripleo-upgrade controller-2 Going to systemctl stop neutron-metadata-agent\nTue Nov  8 21:23:38 UTC 2016 cc3c4825-4758-49e1-975a-de6d134cb3c6 tripleo-upgrade controller-2 Going to systemctl stop neutron-netns-cleanup\nTue Nov  8 21:23:38 UTC 2016 cc3c4825-4758-49e1-975a-de6d134cb3c6 tripleo-upgrade controller-2 Going to systemctl stop neutron-openvswitch-agent\nTue Nov  8 21:23:39 UTC 2016 cc3c4825-4758-49e1-975a-de6d134cb3c6 tripleo-upgrade controller-2 Going to systemctl stop neutron-ovs-cleanup\nTue Nov  8 21:23:39 UTC 2016 cc3c4825-4758-49e1-975a-de6d134cb3c6 tripleo-upgrade controller-2 Going to systemctl stop neutron-server\nTue Nov  8 21:24:12 UTC 2016 cc3c4825-4758-49e1-975a-de6d134cb3c6 tripleo-upgrade controller-2 Going to systemctl stop openstack-aodh-evaluator\nTue Nov  8 21:24:31 UTC 2016 cc3c4825-4758-49e1-975a-de6d134cb3c6 tripleo-upgrade controller-2 Going to systemctl stop openstack-aodh-listener\nTue Nov  8 21:24:32 UTC 2016 cc3c4825-4758-49e1-975a-de6d134cb3c6 tripleo-upgrade controller-2 Going to systemctl stop openstack-aodh-notifier\nTue Nov  8 21:24:33 UTC 2016 cc3c4825-4758-49e1-975a-de6d134cb3c6 tripleo-upgrade controller-2 Going to systemctl stop openstack-ceilometer-central\nTue 
Nov  8 21:24:33 UTC 2016 cc3c4825-4758-49e1-975a-de6d134cb3c6 tripleo-upgrade controller-2 Going to systemctl stop openstack-ceilometer-collector\nTue Nov  8 21:24:34 UTC 2016 cc3c4825-4758-49e1-975a-de6d134cb3c6 tripleo-upgrade controller-2 Going to systemctl stop openstack-ceilometer-notification\nTue Nov  8 21:24:47 UTC 2016 cc3c4825-4758-49e1-975a-de6d134cb3c6 tripleo-upgrade controller-2 Going to systemctl stop openstack-cinder-api\nTue Nov  8 21:24:47 UTC 2016 cc3c4825-4758-49e1-975a-de6d134cb3c6 tripleo-upgrade controller-2 Going to systemctl stop openstack-cinder-scheduler\nTue Nov  8 21:25:04 UTC 2016 cc3c4825-4758-49e1-975a-de6d134cb3c6 tripleo-upgrade controller-2 Going to systemctl stop openstack-glance-api\nTue Nov  8 21:25:04 UTC 2016 cc3c4825-4758-49e1-975a-de6d134cb3c6 tripleo-upgrade controller-2 Going to systemctl stop openstack-glance-registry\nTue Nov  8 21:25:04 UTC 2016 cc3c4825-4758-49e1-975a-de6d134cb3c6 tripleo-upgrade controller-2 Going to systemctl stop openstack-gnocchi-metricd\nTue Nov  8 21:25:05 UTC 2016 cc3c4825-4758-49e1-975a-de6d134cb3c6 tripleo-upgrade controller-2 Going to systemctl stop openstack-gnocchi-statsd\nTue Nov  8 21:25:05 UTC 2016 cc3c4825-4758-49e1-975a-de6d134cb3c6 tripleo-upgrade controller-2 Going to systemctl stop openstack-heat-api-cfn\nTue Nov  8 21:25:06 UTC 2016 cc3c4825-4758-49e1-975a-de6d134cb3c6 tripleo-upgrade controller-2 Going to systemctl stop openstack-heat-api\nTue Nov  8 21:25:06 UTC 2016 cc3c4825-4758-49e1-975a-de6d134cb3c6 tripleo-upgrade controller-2 Going to systemctl stop openstack-heat-api-cloudwatch\nTue Nov  8 21:25:06 UTC 2016 cc3c4825-4758-49e1-975a-de6d134cb3c6 tripleo-upgrade controller-2 Going to systemctl stop openstack-heat-engine\nTue Nov  8 21:25:06 UTC 2016 cc3c4825-4758-49e1-975a-de6d134cb3c6 tripleo-upgrade controller-2 Going to systemctl stop openstack-nova-api\nTue Nov  8 21:25:06 UTC 2016 cc3c4825-4758-49e1-975a-de6d134cb3c6 tripleo-upgrade controller-2 Going to systemctl stop 
openstack-nova-conductor\nTue Nov  8 21:25:07 UTC 2016 cc3c4825-4758-49e1-975a-de6d134cb3c6 tripleo-upgrade controller-2 Going to systemctl stop openstack-nova-consoleauth\nTue Nov  8 21:25:15 UTC 2016 cc3c4825-4758-49e1-975a-de6d134cb3c6 tripleo-upgrade controller-2 Going to systemctl stop openstack-nova-novncproxy\nTue Nov  8 21:25:15 UTC 2016 cc3c4825-4758-49e1-975a-de6d134cb3c6 tripleo-upgrade controller-2 Going to systemctl stop openstack-nova-scheduler\nTue Nov  8 21:25:15 UTC 2016 cc3c4825-4758-49e1-975a-de6d134cb3c6 tripleo-upgrade controller-2 Going to systemctl stop openstack-sahara-api\nTue Nov  8 21:25:15 UTC 2016 cc3c4825-4758-49e1-975a-de6d134cb3c6 tripleo-upgrade controller-2 Going to systemctl stop openstack-sahara-engine\nTue Nov  8 21:25:16 UTC 2016 cc3c4825-4758-49e1-975a-de6d134cb3c6 tripleo-upgrade controller-2 Going to systemctl stop openstack-swift-account-auditor.service\nTue Nov  8 21:25:16 UTC 2016 cc3c4825-4758-49e1-975a-de6d134cb3c6 tripleo-upgrade controller-2 Going to systemctl stop openstack-swift-account-reaper.service\nTue Nov  8 21:25:16 UTC 2016 cc3c4825-4758-49e1-975a-de6d134cb3c6 tripleo-upgrade controller-2 Going to systemctl stop openstack-swift-account-replicator.service\nTue Nov  8 21:25:16 UTC 2016 cc3c4825-4758-49e1-975a-de6d134cb3c6 tripleo-upgrade controller-2 Going to systemctl stop openstack-swift-account.service\nTue Nov  8 21:25:16 UTC 2016 cc3c4825-4758-49e1-975a-de6d134cb3c6 tripleo-upgrade controller-2 Going to systemctl stop openstack-swift-container-auditor.service\nTue Nov  8 21:25:16 UTC 2016 cc3c4825-4758-49e1-975a-de6d134cb3c6 tripleo-upgrade controller-2 Going to systemctl stop openstack-swift-container-replicator.service\nTue Nov  8 21:25:17 UTC 2016 cc3c4825-4758-49e1-975a-de6d134cb3c6 tripleo-upgrade controller-2 Going to systemctl stop openstack-swift-container-updater.service\nTue Nov  8 21:25:17 UTC 2016 cc3c4825-4758-49e1-975a-de6d134cb3c6 tripleo-upgrade controller-2 Going to systemctl stop 
openstack-swift-container.service\nTue Nov  8 21:25:17 UTC 2016 cc3c4825-4758-49e1-975a-de6d134cb3c6 tripleo-upgrade controller-2 Going to systemctl stop openstack-swift-object-auditor.service\nTue Nov  8 21:25:17 UTC 2016 cc3c4825-4758-49e1-975a-de6d134cb3c6 tripleo-upgrade controller-2 Going to systemctl stop openstack-swift-object-replicator.service\nTue Nov  8 21:25:17 UTC 2016 cc3c4825-4758-49e1-975a-de6d134cb3c6 tripleo-upgrade controller-2 Going to systemctl stop openstack-swift-object-updater.service\nTue Nov  8 21:25:17 UTC 2016 cc3c4825-4758-49e1-975a-de6d134cb3c6 tripleo-upgrade controller-2 Going to systemctl stop openstack-swift-object.service\nTue Nov  8 21:25:17 UTC 2016 cc3c4825-4758-49e1-975a-de6d134cb3c6 tripleo-upgrade controller-2 Going to systemctl stop openstack-swift-proxy.service\nTue Nov  8 21:25:18 UTC 2016 cc3c4825-4758-49e1-975a-de6d134cb3c6 tripleo-upgrade controller-2 Going to systemctl stop openstack-swift-account-reaper\nTue Nov  8 21:25:18 UTC 2016 cc3c4825-4758-49e1-975a-de6d134cb3c6 tripleo-upgrade controller-2 Going to systemctl stop openstack-swift-account-replicator\nTue Nov  8 21:25:18 UTC 2016 cc3c4825-4758-49e1-975a-de6d134cb3c6 tripleo-upgrade controller-2 Going to systemctl stop openstack-swift-account\nTue Nov  8 21:25:18 UTC 2016 cc3c4825-4758-49e1-975a-de6d134cb3c6 tripleo-upgrade controller-2 Going to systemctl stop openstack-swift-container-auditor\nTue Nov  8 21:25:18 UTC 2016 cc3c4825-4758-49e1-975a-de6d134cb3c6 tripleo-upgrade controller-2 Going to systemctl stop openstack-swift-container-replicator\nTue Nov  8 21:25:18 UTC 2016 cc3c4825-4758-49e1-975a-de6d134cb3c6 tripleo-upgrade controller-2 Going to systemctl stop openstack-swift-container-updater\nTue Nov  8 21:25:19 UTC 2016 cc3c4825-4758-49e1-975a-de6d134cb3c6 tripleo-upgrade controller-2 Going to systemctl stop openstack-swift-container\nTue Nov  8 21:25:19 UTC 2016 cc3c4825-4758-49e1-975a-de6d134cb3c6 tripleo-upgrade controller-2 Going to systemctl stop 
openstack-swift-object-auditor\nTue Nov  8 21:25:19 UTC 2016 cc3c4825-4758-49e1-975a-de6d134cb3c6 tripleo-upgrade controller-2 Going to systemctl stop openstack-swift-object-replicator\nTue Nov  8 21:25:19 UTC 2016 cc3c4825-4758-49e1-975a-de6d134cb3c6 tripleo-upgrade controller-2 Going to systemctl stop openstack-swift-object-updater\nTue Nov  8 21:25:19 UTC 2016 cc3c4825-4758-49e1-975a-de6d134cb3c6 tripleo-upgrade controller-2 Going to systemctl stop openstack-swift-object\nTue Nov  8 21:25:19 UTC 2016 cc3c4825-4758-49e1-975a-de6d134cb3c6 tripleo-upgrade controller-2 Going to systemctl stop openstack-swift-proxy\nactive\nactive\nactive\nactive\nactive\nactive\nactive\nactive\nactive\nactive\nactive\nactive\nactive\nactive\nactive\nactive\nactive\nactive\nactive\nactive\nactive\nactive\nactive\nactive\nactive\nactive\nactive\nactive\nactive\nactive\nactive\nactive\nactive\nactive\nactive\nactive\nactive\nactive\nactive\nactive\nactive\nactive\nactive\nactive\nactive\nactive\nactive\nactive\nactive\nactive\nactive\nactive\nactive\nactive\nactive\nactive\nactive\nactive\nactive\nactive\nactive\nactive\nactive\nactive\nactive\nactive\nactive\nactive\nactive\nactive\nactive\nactive\nactive\nactive\nactive\nactive\nactive\nactive\nactive\nactive\nactive\nactive\nactive\nactive\nactive\nactive\nactive\nactive\nactive\nactive\nactive\nactive\nactive\nactive\nactive\nactive\nactive\nactive\nactive\nactive\nactive\nactive\nactive\nactive\nactive\nactive\nactive\nactive\nactive\nactive\nactive\nactive\nactive\nactive\nactive\nactive\nactive\nactive\nactive\nactive\nactive\nactive\nactive\nactive\nactive\nactive\nactive\nactive\nactive\nactive\nactive\nactive\nactive\nactive\nactive\nactive\nactive\nactive\nactive\nactive\nactive\nactive\nactive\nactive\nactive\nactive\nactive\nactive\nactive\nactive\nactive\nactive\nactive\nactive\nactive\nactive\nactive\nactive\nactive\nactive\nactive\nactive\nactive\nactive\nactive\nactive\nactive\nactive\nactive\nactive\nactive\nactive\nac
tive\nactive\nactive\nactive\nactive\nactive\nactive\nactive\nactive\nactive\nactive\nactive\nactive\nactive\nactive\nactive\nactive\nactive\nactive\nactive\nactive\nactive\nactive\nactive\nactive\nactive\nactive\nactive\nactive\nactive\nactive\nactive\nactive\nactive\nactive\nactive\nactive\nactive\nactive\nactive\nactive\nactive\nactive\nactive\nactive\nactive\nactive\nactive\nactive\nactive\nactive\nactive\nactive\nactive\nactive\nactive\nactive\nactive\nactive\nactive\nactive\nactive\nactive\nactive\nactive\nactive\nactive\nactive\nactive\nactive\nactive\nactive\nactive\nactive\nactive\nactive\nactive\nactive\nactive\nactive\nactive\nactive\nactive\nactive\nactive\nactive\nactive\nactive\nactive\nactive\nactive\nactive\nactive\nactive\nactive\nactive\nactive\nactive\nactive\nactive\nactive\nactive\nactive\nactive\nactive\nactive\nactive\nactive\nactive\nactive\nactive\nactive\nactive\nactive\nactive\nactive\nactive\nactive\nactive\nactive\nactive\nactive\nactive\nactive\nactive\nactive\nactive\nactive\nactive\nactive\nactive\nactive\nactive\nactive\nactive\nactive\nactive\nactive\nactive\nactive\nactive\nactive\nactive\nactive\nactive\nactive\nactive\nactive\nactive\nactive\nactive\nactive\nactive\nactive\nactive\nactive\nactive\nactive\nactive\nactive\nactive\nactive\nactive\nactive\nactive\nactive\nactive\nactive\nactive\nactive\nactive\nactive\nactive\nactive\nactive\nactive\nactive\nactive\nactive\nactive\nactive\nactive\nactive\nactive\nactive\nactive\nactive\nERROR: cluster shutdown timed out\n",
    "deploy_stderr": "",
    "deploy_status_code": 1
  },
  "creation_time": "2016-11-08T21:21:59Z",
  "updated_time": "2016-11-08T21:55:23Z",
  "input_values": {
    "update_identifier": "",
    "deploy_identifier": "1478639139"
  },
  "action": "CREATE",
  "status_reason": "deploy_status_code : Deployment exited with non-zero status code: 1",
  "id": "e5e0e6d6-eb1e-4aed-9a98-729233da648d"
}


Please see also SOS REPORT

Comment 21 mlammon 2016-11-08 23:30:02 UTC
SOS REPORT:
http://rhos-release.virt.bos.redhat.com/log/bz1389040

Comment 22 Michele Baldessari 2016-11-09 08:12:05 UTC
So here is the reason for the failure:
Currently when we call the major-upgrade step we do the following:
"""
...
if [[ -n $(is_bootstrap_node) ]]; then
    check_clean_cluster
fi
...
if [[ -n $(is_bootstrap_node) ]]; then
    migrate_full_to_ng_ha
fi
...
for service in $(services_to_migrate); do
    manage_systemd_service stop "${service%%-clone}"
    ...
done
"""

The problem with the above code is that it is open to the following race condition:
1. The code runs first on a non-bootstrap controller node, so we start stopping a bunch of services there.
2. Pacemaker notices that the services are down and marks them as stopped.
3. The code then runs on the bootstrap node (controller-0), where the check_clean_cluster function fails and the script exits.
4. Eventually the script on the non-bootstrap controller node also times out and exits because the cluster never shut down (the shutdown never actually started because we failed at step 3).
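
To make the race concrete, here is a minimal sketch of what the bootstrap node sees before and after another node has started stopping services. Note that cluster_is_clean is a hypothetical stand-in for check_clean_cluster (whose body is not quoted in this report), and the two status snippets are illustrative text, not captured output.

```shell
# Hypothetical stand-in for check_clean_cluster: succeeds only when the
# pcs-status-like text on stdin reports no stopped resources.
cluster_is_clean() {
    ! grep -q 'Stopped'
}

# What the bootstrap node would see before any service is touched.
status_before=' Clone Set: httpd-clone [httpd]
     Started: [ controller-0 controller-1 controller-2 ]'

# What it would see after non-bootstrap nodes ran "systemctl stop httpd"
# and Pacemaker marked the resource as stopped there (illustrative text).
status_after=' Clone Set: httpd-clone [httpd]
     Started: [ controller-0 ]
     Stopped: [ controller-1 controller-2 ]'

if printf '%s\n' "$status_before" | cluster_is_clean; then
    echo "before: clean, upgrade may proceed"
fi
if ! printf '%s\n' "$status_after" | cluster_is_clean; then
    echo "after: stopped resources found, bootstrap node aborts"
fi
```

Whether the check passes depends only on when the bootstrap node happens to run it relative to the other nodes, which is what makes the failure intermittent.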


I attached a review that is fairly simple in concept:
split major_upgrade_controller_pacemaker_1 in two so we can guarantee that no
systemd service is stopped before the bootstrap-node checks have run. This means we move to 6 steps instead of 5. The idea is simple, but it is an invasive change. Happy to discuss alternative approaches.
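
The effect of the split can be sketched with a small simulation. This is only an illustration under stated assumptions: the node names come from this report, while step_check, step_stop, run_combined, run_split and the STOPPED variable are hypothetical, and it assumes each step finishes on all nodes before the next step starts.

```shell
# Simulation only; none of these functions exist in the real templates.
NODES="controller-1 controller-2 controller-0"   # non-bootstrap nodes first
BOOTSTRAP="controller-0"
STOPPED=""    # nodes on which services have been stopped so far

# The cluster check, executed only by the bootstrap node.
step_check() {
    [ "$1" = "$BOOTSTRAP" ] || return 0
    if [ -n "$STOPPED" ]; then
        echo "ERROR: stopped resources seen by $1"
        return 1
    fi
    echo "$1: cluster is clean"
}

# Stopping systemd services, executed by every node.
step_stop() {
    STOPPED="$STOPPED $1"
    echo "$1: services stopped"
}

# Old layout: check and stop live in the same step, so a fast
# non-bootstrap node can stop services before the bootstrap node checks.
run_combined() {
    STOPPED=""
    for node in $NODES; do
        step_check "$node" || return 1
        step_stop "$node"
    done
}

# Split layout: every node finishes the check step before any node
# enters the stop step.
run_split() {
    STOPPED=""
    for node in $NODES; do step_check "$node" || return 1; done
    for node in $NODES; do step_stop "$node"; done
}

run_combined || echo "combined step failed"
run_split && echo "split steps succeeded"
```

With the unlucky ordering above, the combined layout fails at the bootstrap node's check, while the split layout passes regardless of the order in which nodes run.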

Comment 23 Michele Baldessari 2016-11-09 08:35:10 UTC
Mike,

I prepared the newton backport in https://review.openstack.org/#/c/395460/. Can you give it a test as soon as you can? Since the change is simple but invasive, I'd like to get as much feedback as possible.

Thanks,
Michele

Comment 28 Marios Andreou 2016-11-11 14:00:18 UTC
The fix landed on stable/newton with https://review.openstack.org/#/c/395460/; moving to POST.

Comment 33 mlammon 2016-11-21 20:35:09 UTC
Deployed RHOS 9 using the latest puddle, 2016-11-19.4/
Removed the patch from the upgrade to verify that the fix landed.
Upgraded 9->10.
The issue is no longer seen.

Comment 35 errata-xmlrpc 2016-12-14 16:25:56 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://rhn.redhat.com/errata/RHEA-2016-2948.html