Bug 1255931 - rhel-osp-director: unable to delete a heat stack deployed with "--rhel-reg --reg-method portal --reg-org <rel-org> --reg-activation-key '<key>'", following a failed attempt to update it with "openstack overcloud update stack --templates
Status: CLOSED ERRATA
Product: Red Hat OpenStack
Classification: Red Hat
Component: rhosp-director
Version: unspecified
Hardware: x86_64 Linux
Priority: high Severity: high
Target Milestone: y1
Target Release: 7.0 (Kilo)
Assigned To: Zane Bitter
QA Contact: Alexander Chuzhoy
Keywords: TestOnly, Triaged
Depends On: 1257717 1265010
Blocks:
Reported: 2015-08-21 20:45 EDT by Alexander Chuzhoy
Modified: 2015-10-08 08:17 EDT

See Also:
Fixed In Version: openstack-heat-2015.1.1-1.el7ost python-rdomanager-oscplugin-0.0.10-6.el7ost openstack-tripleo-common-0.0.1.dev6-3.git49b57eb.el7ost
Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2015-10-08 08:17:18 EDT
Type: Bug
Description Alexander Chuzhoy 2015-08-21 20:45:48 EDT
rhel-osp-director: unable to delete a heat stack deployed with "--rhel-reg --reg-method portal --reg-org <rel-org> --reg-activation-key '<key>'", following a failed attempt to update it with "openstack overcloud update stack --templates  -e <yaml> -i overcloud"

Environment:
openstack-heat-engine-2015.1.0-6.el7ost.noarch
openstack-tripleo-heat-templates-0.8.6-46.el7ost.noarch
instack-undercloud-2.1.2-23.el7ost.noarch



Steps to reproduce:
1. Deploy an overcloud with "openstack overcloud deploy --templates --control-scale <num> --compute-scale <num> --ceph-storage-scale <num> -e <yaml> --compute-flavor compute --control-flavor control --ceph-storage-flavor ceph --rhel-reg --reg-method portal --reg-org <rel-org> --reg-activation-key '<key>'"


2. Attempt to update the stack with: "openstack overcloud update stack --templates  -e <yaml> -i overcloud"

3. If the update fails (for example, because there are not enough active subscriptions), the stack can no longer be deleted. Run "heat stack-delete overcloud" to confirm.

Result:
The deletion gets stuck or fails; the stack cannot be deleted.

Expected result:
The stack should get deleted.
Comment 3 Alexander Chuzhoy 2015-08-25 11:15:34 EDT
Reproduced.


heat resource-list -n5 overcloud|grep -v COMPLETE

+---------------------------------------------+-----------------------------------------------+---------------------------------------------------+--------------------+----------------------+---------------------------------------------+
| resource_name                               | physical_resource_id                          | resource_type                                     | resource_status    | updated_time         | parent_resource                             |
+---------------------------------------------+-----------------------------------------------+---------------------------------------------------+--------------------+----------------------+---------------------------------------------+
| RHELUnregistrationDeployment                | 4b6cb843-cda1-4529-b787-24f2eaefbda5          | OS::Heat::StructuredDeployments                   | DELETE_IN_PROGRESS | 2015-08-24T20:27:55Z | ExtraConfig                                 |
| 0                                           | 15fefbb8-882d-416a-a8c6-9744a1e5a05d          | OS::Heat::StructuredDeployment                    | DELETE_IN_PROGRESS | 2015-08-24T20:27:58Z | RHELUnregistrationDeployment                |
| RHELUnregistrationDeployment                | 3ef23240-23a4-43f2-984a-54cc1c7f16c1          | OS::Heat::StructuredDeployments                   | DELETE_IN_PROGRESS | 2015-08-24T20:32:55Z | ExtraConfig                                 |
| 0                                           | f3c04c97-a619-42f3-8512-4ce757ee173e          | OS::Heat::StructuredDeployment                    | DELETE_IN_PROGRESS | 2015-08-24T20:32:57Z | RHELUnregistrationDeployment                |
| ControllerNodesPostDeployment               | 2476441b-5aca-4d8e-96e0-32124486087b          | OS::TripleO::ControllerPostDeployment             | DELETE_IN_PROGRESS | 2015-08-25T13:15:20Z |                                             |
| ComputeNodesPostDeployment                  | f2a690a7-fb59-4fa1-8ca2-cdbf9242ade2          | OS::TripleO::ComputePostDeployment                | DELETE_IN_PROGRESS | 2015-08-25T13:15:24Z |                                             |
| ExtraConfig                                 | 15cb8528-a9a6-4953-8c27-ff40747cdcd4          | OS::TripleO::NodeExtraConfigPost                  | DELETE_IN_PROGRESS | 2015-08-25T13:15:34Z | ComputeNodesPostDeployment                  |
| ExtraConfig                                 | 1cb87b3a-8746-492c-8341-781b933e6965          | OS::TripleO::NodeExtraConfigPost                  | DELETE_IN_PROGRESS | 2015-08-25T13:16:08Z | ControllerNodesPostDeployment               |
+---------------------------------------------+-----------------------------------------------+---------------------------------------------------+--------------------+----------------------+---------------------------------------------+
Comment 4 Alexander Chuzhoy 2015-08-25 14:08:28 EDT
Still reproducing it:

The deployment of overcloud against portal failed and I'm not able to delete the stack:
heat resource-list -n5 overcloud |grep -v COMPLE
+---------------------------------------------+-----------------------------------------------+---------------------------------------------------+--------------------+----------------------+---------------------------------------------+
| resource_name                               | physical_resource_id                          | resource_type                                     | resource_status    | updated_time         | parent_resource                             |
+---------------------------------------------+-----------------------------------------------+---------------------------------------------------+--------------------+----------------------+---------------------------------------------+
| ControllerNodesPostDeployment               | b0d2468b-4e43-46ce-b4a7-c2ad3df7447d          | OS::TripleO::ControllerPostDeployment             | DELETE_IN_PROGRESS | 2015-08-25T17:19:05Z |                                             |
| ExtraConfig                                 | 3512b75d-6331-4f15-9a2c-f9cd1d91c6a1          | OS::TripleO::NodeExtraConfigPost                  | DELETE_IN_PROGRESS | 2015-08-25T17:28:03Z | ControllerNodesPostDeployment               |
| RHELUnregistrationDeployment                | dc0b5196-fe11-4907-8416-21ca5876f12c          | OS::Heat::StructuredDeployments                   | DELETE_IN_PROGRESS | 2015-08-25T17:43:59Z | ExtraConfig                                 |
| 0                                           | 084c2ab9-d2f1-42ff-8669-a556836e5063          | OS::Heat::StructuredDeployment                    | DELETE_IN_PROGRESS | 2015-08-25T17:44:01Z | RHELUnregistrationDeployment                |
+---------------------------------------------+-----------------------------------------------+---------------------------------------------------+--------------------+----------------------+---------------------------------------------+



Note the RHELUnregistrationDeployment...
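
For long listings like the ones above, the stuck resources can also be pulled out of the table programmatically instead of with grep. The following is only a minimal sketch: the function name stuck_resources is hypothetical, and it assumes the ASCII table layout shown in this report (pipe-separated cells, a header row, +---+ border lines). The sample rows are copied from the output above with the column padding trimmed.

```python
# Rows taken from the resource-list output above (padding trimmed).
SAMPLE = """\
+-------------------------------+--------------------------------------+---------------------------------------+--------------------+----------------------+-------------------------------+
| resource_name | physical_resource_id | resource_type | resource_status | updated_time | parent_resource |
+-------------------------------+--------------------------------------+---------------------------------------+--------------------+----------------------+-------------------------------+
| ControllerNodesPostDeployment | b0d2468b-4e43-46ce-b4a7-c2ad3df7447d | OS::TripleO::ControllerPostDeployment | DELETE_IN_PROGRESS | 2015-08-25T17:19:05Z | |
| ExtraConfig | 3512b75d-6331-4f15-9a2c-f9cd1d91c6a1 | OS::TripleO::NodeExtraConfigPost | DELETE_IN_PROGRESS | 2015-08-25T17:28:03Z | ControllerNodesPostDeployment |
| RHELUnregistrationDeployment | dc0b5196-fe11-4907-8416-21ca5876f12c | OS::Heat::StructuredDeployments | DELETE_IN_PROGRESS | 2015-08-25T17:43:59Z | ExtraConfig |
+-------------------------------+--------------------------------------+---------------------------------------+--------------------+----------------------+-------------------------------+
"""


def stuck_resources(table_text, status="DELETE_IN_PROGRESS"):
    """Return (resource_name, parent_resource) pairs whose status matches.

    Hypothetical helper: it only understands the table layout shown in
    this report and does not talk to Heat itself.
    """
    header = None
    matches = []
    for line in table_text.splitlines():
        line = line.strip()
        if not line.startswith("|"):
            continue  # skip +---+ borders and blank lines
        # Drop the empty strings produced by the leading and trailing pipes.
        cells = [cell.strip() for cell in line.split("|")[1:-1]]
        if header is None:
            header = cells  # the first pipe-delimited row is the header
            continue
        row = dict(zip(header, cells))
        if row.get("resource_status") == status:
            matches.append((row["resource_name"], row.get("parent_resource", "")))
    return matches


if __name__ == "__main__":
    for name, parent in stuck_resources(SAMPLE):
        print(name, "<-", parent or "(top level)")
```

Following the parent_resource column upward from RHELUnregistrationDeployment shows which top-level resource is being held up by the stuck deployment.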
Comment 5 Zane Bitter 2015-09-03 12:35:00 EDT
The problem is that the deploy command is creating an extra environment file to pass on the fly, containing the registration details. Since the user never gets to see this file, there's no way for them to correctly pass this on a subsequent update, hence this inevitable failure.

We decided to fix this by making the environment 'sticky' on PATCH updates (in the same way that parameters are). So the fix for bug 1257717 should resolve this issue too.
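
To illustrate what a 'sticky' environment means here, a rough sketch follows. It is not Heat's actual implementation: the function names and the ExampleParam parameter are hypothetical, and the environment is modelled as a plain nested dict. The idea is that on a PATCH update the stored environment is kept and the newly supplied one is merged on top, whereas a full update uses only what the caller supplied.

```python
import copy


def merge_environment(stored, supplied):
    """Deep-merge `supplied` on top of `stored` without mutating either."""
    merged = copy.deepcopy(stored)
    for key, value in supplied.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = merge_environment(merged[key], value)
        else:
            merged[key] = copy.deepcopy(value)
    return merged


def environment_for_update(stored, supplied, patch):
    """Pick the environment a stack update would run with (illustrative only).

    patch=True  -> the stored environment is 'sticky'; supplied values win on conflict.
    patch=False -> only the supplied environment is used.
    """
    if patch:
        return merge_environment(stored, supplied)
    return copy.deepcopy(supplied)


if __name__ == "__main__":
    # Environment generated on the fly at deploy time (mapping from this report).
    stored = {
        "resource_registry": {
            "OS::TripleO::NodeExtraConfigPost": "rhel-registration.yaml",
        }
    }
    # What the user passes on update; ExampleParam is a made-up parameter.
    supplied = {"parameters": {"ExampleParam": "value"}}

    print(environment_for_update(stored, supplied, patch=True))
    print(environment_for_update(stored, supplied, patch=False))
```

With patch=True the registration mapping survives even though the user never passed it again; with patch=False it is lost, which is the situation described in this bug.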
Comment 6 Alexander Chuzhoy 2015-09-17 17:52:30 EDT
FailedQA

Environment:
openstack-heat-engine-2015.1.1-3.el7ost.noarch
openstack-tripleo-heat-templates-0.8.6-62.el7ost.noarch
openstack-heat-templates-0-0.6.20150605git.el7ost.noarch
openstack-heat-api-2015.1.1-3.el7ost.noarch
openstack-heat-common-2015.1.1-3.el7ost.noarch
python-heatclient-0.6.0-1.el7ost.noarch
openstack-heat-api-cfn-2015.1.1-3.el7ost.noarch
openstack-heat-api-cloudwatch-2015.1.1-3.el7ost.noarch
instack-undercloud-2.1.2-26.el7ost.noarch



Still unable to delete the stack.
Comment 7 Alexander Chuzhoy 2015-09-17 18:09:03 EDT
heat resource-list -n 5 overcloud|grep DELETE_IN_PROGRESS
| ExtraConfig                                 | 9fe21a57-c8f7-4432-be3b-a2af2246c02a          | OS::TripleO::NodeExtraConfigPost                  | DELETE_IN_PROGRESS | 2015-09-17T18:52:41Z | ComputeNodesPostDeployment                  |
| ExtraConfig                                 | 7b584351-2aba-4490-b92a-ab3f238e4906          | OS::TripleO::NodeExtraConfigPost                  | DELETE_IN_PROGRESS | 2015-09-17T18:53:06Z | ControllerNodesPostDeployment               |
| RHELUnregistrationDeployment                | d9e9be47-8341-4ff9-b28c-fb5bca23790d          | OS::Heat::StructuredDeployments                   | DELETE_IN_PROGRESS | 2015-09-17T19:07:56Z | ExtraConfig                                 |
| 0                                           | 4124106f-45c7-4760-afe9-ff4bcdba0e5b          | OS::Heat::StructuredDeployment                    | DELETE_IN_PROGRESS | 2015-09-17T19:08:01Z | RHELUnregistrationDeployment                |
| RHELUnregistrationDeployment                | 5985c027-c429-4332-b332-27e9e67d6145          | OS::Heat::StructuredDeployments                   | DELETE_IN_PROGRESS | 2015-09-17T19:13:53Z | ExtraConfig                                 |
| 1                                           | d8330f76-cebe-4b6b-8a39-74b5a1ca5de8          | OS::Heat::StructuredDeployment                    | DELETE_IN_PROGRESS | 2015-09-17T19:13:58Z | RHELUnregistrationDeployment                |
| 0                                           | 8bb3776e-a40a-4e21-acb8-7f53a631a52d          | OS::Heat::StructuredDeployment                    | DELETE_IN_PROGRESS | 2015-09-17T19:13:59Z | RHELUnregistrationDeployment                |
| 2                                           | 261bd039-e7d5-4d6f-8310-f45625548f81          | OS::Heat::StructuredDeployment                    | DELETE_IN_PROGRESS | 2015-09-17T19:13:59Z | RHELUnregistrationDeployment                |
| ComputeNodesPostDeployment                  | 259ca4b9-849e-468b-8882-307d7fc15ea2          | OS::TripleO::ComputePostDeployment                | DELETE_IN_PROGRESS | 2015-09-17T20:49:15Z |                                             |
| ControllerNodesPostDeployment               | e86fec51-2cb5-45a5-80d5-7eaf9499395e          | OS::TripleO::ControllerPostDeployment             | DELETE_IN_PROGRESS | 2015-09-17T20:49:21Z |                                             |
Comment 8 Zane Bitter 2015-09-17 18:16:36 EDT
I wonder if explicitly passing -e <yaml> to the overcloud update command is causing the other environment that does the registration (the one not being passed again) to be overwritten. Can you try without passing any environment files to the overcloud update command?

Another possibility is that the UnregistrationDeployment has a bug and will just not complete ever, and it's nothing to do with Heat at all.
Comment 9 Alexander Chuzhoy 2015-09-18 11:14:13 EDT
Tried without providing the YAML file; it failed right away:
openstack overcloud update stack --templates -i overcloud
starting package update on stack overcloud
IN_PROGRESS
IN_PROGRESS
FAILED
update finished with status FAILED




heat resource-list -n 5 overcloud|grep -v COMPLE
+---------------------------------------------+-----------------------------------------------+---------------------------------------------------+-----------------+----------------------+---------------------------------------------+
| resource_name                               | physical_resource_id                          | resource_type                                     | resource_status | updated_time         | parent_resource                             |
+---------------------------------------------+-----------------------------------------------+---------------------------------------------------+-----------------+----------------------+---------------------------------------------+
| ExternalSubnet                              | 5224619f-0b4d-4b7f-bd91-373ef54d6af1          | OS::Neutron::Subnet                               | DELETE_FAILED   | 2015-09-18T14:27:46Z | ExternalNetwork                             |
| TenantSubnet                                | 8ed62f9f-1c9c-4c53-a7a5-b43f2f1989fe          | OS::Neutron::Subnet                               | DELETE_FAILED   | 2015-09-18T14:27:47Z | TenantNetwork                               |
| StorageSubnet                               | 311fcb10-fde1-4ed4-aadc-6da1d0971f6f          | OS::Neutron::Subnet                               | DELETE_FAILED   | 2015-09-18T14:27:48Z | StorageNetwork                              |
| InternalApiSubnet                           | 59d79223-2df6-4263-8c7b-4884766e94ba          | OS::Neutron::Subnet                               | DELETE_FAILED   | 2015-09-18T14:27:49Z | InternalNetwork                             |
| StorageMgmtSubnet                           | 521b0677-7689-4649-b81e-2ef5818e78f7          | OS::Neutron::Subnet                               | DELETE_FAILED   | 2015-09-18T14:27:49Z | StorageMgmtNetwork                          |
| Networks                                    | 81780252-77df-4559-9778-651f2f7d3d30          | OS::TripleO::Network                              | UPDATE_FAILED   | 2015-09-18T15:11:19Z |                                             |
| ExternalNetwork                             | 88155ba8-0c6a-483d-a59a-c9633ee6a973          | OS::TripleO::Network::External                    | UPDATE_FAILED   | 2015-09-18T15:11:24Z | Networks                                    |
| StorageNetwork                              | c3cb6a9f-c2ff-49dd-a608-71cea5225236          | OS::TripleO::Network::Storage                     | UPDATE_FAILED   | 2015-09-18T15:11:25Z | Networks                                    |
| TenantNetwork                               | 525ddd74-addc-46af-b9ad-31420c6e049b          | OS::TripleO::Network::Tenant                      | UPDATE_FAILED   | 2015-09-18T15:11:26Z | Networks                                    |
| InternalNetwork                             | 862971cb-6510-439b-8475-2e9686cc238a          | OS::TripleO::Network::InternalApi                 | UPDATE_FAILED   | 2015-09-18T15:11:27Z | Networks                                    |
| StorageMgmtNetwork                          | 1bf7dc27-0a96-4773-97d0-82f952471e67          | OS::TripleO::Network::StorageMgmt                 | UPDATE_FAILED   | 2015-09-18T15:11:28Z | Networks                                    |
+---------------------------------------------+-----------------------------------------------+---------------------------------------------------+-----------------+----------------------+---------------------------------------------+
Comment 10 Steven Hardy 2015-09-21 06:13:47 EDT
So I observed this issue today, and it does appear to be a heat issue, because we see the signal in the os-collect-config logs, but then no corresponding signal event exists in the heat event-list output.

AFAICT the reason for this is heat can't find the resource, even though it's visible in both resource-list and deployment-show:

[stack@instack ~]$ heat resource-list overcloud-ControllerNodesPostDeployment-dqduzcl6unz3-ExtraConfig-gf4u4arkc75r-RHELUnregistrationDeployment-jdcj
+---------------+--------------------------------------+--------------------------------+--------------------+----------------------+
| resource_name | physical_resource_id                 | resource_type                  | resource_status    | updated_time         |
+---------------+--------------------------------------+--------------------------------+--------------------+----------------------+
| 0             | cb3f6080-aa17-427d-aec3-bbdf167922c4 | OS::Heat::StructuredDeployment | DELETE_IN_PROGRESS | 2015-09-21T08:30:22Z |
+---------------+--------------------------------------+--------------------------------+--------------------+----------------------+
        
[stack@instack ~]$ heat resource-show overcloud-ControllerNodesPostDeployment-dqduzcl6unz3-ExtraConfig-gf4u4arkc75r-RHELUnregistrationDeployment-jdcj
Stack or resource not found: overcloud-ControllerNodesPostDeployment-dqduzcl6unz3-ExtraConfig-gf4u4arkc75r-RHELUnregistrationDeployment-jdcj 0

[stack@instack ~]$ heat resource-signal overcloud-ControllerNodesPostDeployment-dqduzcl6unz3-ExtraConfig-gf4u4arkc75r-RHELUnregistrationDeployment-jd
Stack or resource not found: overcloud-ControllerNodesPostDeployment-dqduzcl6unz3-ExtraConfig-gf4u4arkc75r-RHELUnregistrationDeployment-jdcj 0

[stack@instack ~]$ heat deployment-show cb3f6080-aa17-427d-aec3-bbdf167922c4
{
  "status": "IN_PROGRESS", 
  "server_id": "1118af86-bfc7-42fc-8a48-355f7a9de338", 
  "config_id": "5d6e14d6-2e93-4a67-b790-14a98aba0d09", 
  "output_values": null, 
  "creation_time": "2015-09-21T08:39:57Z", 
  "updated_time": "2015-09-21T09:18:01Z", 
  "input_values": {}, 
  "action": "DELETE", 
  "status_reason": "Deploy data available", 
  "id": "cb3f6080-aa17-427d-aec3-bbdf167922c4"
}

Here, the resource-signal should have forced the IN_PROGRESS deployment to complete, but it can't because it's failing to find the resource - I assume the curl from the node is failing in a similar way.
Comment 11 Jan Provaznik 2015-09-21 09:10:23 EDT
It seems that in this case the problem is that the OS::TripleO::NodeExtraConfigPost resource is mapped in more than one place:

overcloud-resource-registry-puppet.yaml:
  OS::TripleO::NodeExtraConfigPost: extraconfig/post_deploy/default.yaml

extraconfig/post_deploy/rhel-registration/rhel-registration-resource-registry.yaml:
  OS::TripleO::NodeExtraConfigPost: rhel-registration.yaml

The overcloud-resource-registry.yaml env registry file is passed to heat only when creating the overcloud (the CLI includes it dynamically in https://github.com/openstack/python-tripleoclient/blob/master/tripleoclient/v1/overcloud_deploy.py#L372). It is not included by the package update command, which passes only the general overcloud-resource-registry-puppet.yaml. As a result, the mapping of OS::TripleO::NodeExtraConfigPost changes from rhel-registration.yaml to extraconfig/post_deploy/default.yaml during the stack update operation, which causes the RHEL registration resources to be replaced.

Thanks to Steven Hardy, who found this.
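
The effect of leaving one registry file out can be sketched in a few lines. This is a simplified model rather than Heat or tripleoclient code: resolve_registry and changed_mappings are hypothetical names, and the sketch assumes that environments are applied in order with later files overriding earlier ones. The two mappings are the ones quoted in this comment.

```python
def resolve_registry(*environments):
    """Combine resource_registry sections in order; later environments win."""
    registry = {}
    for env in environments:
        registry.update(env.get("resource_registry", {}))
    return registry


def changed_mappings(before, after):
    """Return {resource_type: (old_template, new_template)} for every mapping that differs."""
    return {
        rtype: (before.get(rtype), after.get(rtype))
        for rtype in sorted(set(before) | set(after))
        if before.get(rtype) != after.get(rtype)
    }


# General registry, passed on both deploy and package update.
PUPPET_REGISTRY = {
    "resource_registry": {
        "OS::TripleO::NodeExtraConfigPost": "extraconfig/post_deploy/default.yaml",
    }
}
# Registration registry, added dynamically by the CLI on deploy only.
RHEL_REG_REGISTRY = {
    "resource_registry": {
        "OS::TripleO::NodeExtraConfigPost": "rhel-registration.yaml",
    }
}

if __name__ == "__main__":
    at_deploy = resolve_registry(PUPPET_REGISTRY, RHEL_REG_REGISTRY)
    at_update = resolve_registry(PUPPET_REGISTRY)
    for rtype, (old, new) in changed_mappings(at_deploy, at_update).items():
        print(f"{rtype}: {old} -> {new}")
```

Any resource type reported by changed_mappings would be backed by a different template after the update, which is why Heat tries to replace the registration resources.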
Comment 13 Alexander Chuzhoy 2015-09-25 18:59:35 EDT
Verified:
openstack-heat-common-2015.1.1-5.el7ost.noarch
openstack-heat-api-cfn-2015.1.1-5.el7ost.noarch
openstack-heat-api-cloudwatch-2015.1.1-5.el7ost.noarch
openstack-heat-api-2015.1.1-5.el7ost.noarch
openstack-heat-templates-0-0.6.20150605git.el7ost.noarch
openstack-heat-engine-2015.1.1-5.el7ost.noarch
openstack-tripleo-heat-templates-0.8.6-69.el7ost.noarch


Was able to delete the overcloud after a failed attempt to update it.
Comment 15 errata-xmlrpc 2015-10-08 08:17:18 EDT
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2015:1862
