Bug 1284669 - [Docs] [Director] Deleting instances during overcloud scale down (reduction of computes) causes the instances to get stuck in deletion.
Summary: [Docs] [Director] Deleting instances during overcloud scale down (reduction o...
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: Red Hat OpenStack
Classification: Red Hat
Component: documentation
Version: 8.0 (Liberty)
Hardware: Unspecified
OS: Unspecified
Priority: high
Severity: medium
Target Milestone: ga
Target Release: 8.0 (Liberty)
Assignee: Dan Macpherson
QA Contact: RHOS Documentation Team
URL:
Whiteboard:
Depends On:
Blocks:
 
Reported: 2015-11-23 20:35 UTC by Alexander Chuzhoy
Modified: 2016-03-16 23:59 UTC (History)
6 users

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2016-03-16 23:59:32 UTC
Target Upstream Version:
Embargoed:



Description Alexander Chuzhoy 2015-11-23 20:35:50 UTC
rhel-osp-director: 8.0 - Deleting instances during overcloud scale down (reduction of computes) causes the instances to get stuck in deletion.



Environment:

openstack-heat-api-cloudwatch-5.0.0-1.el7ost.noarch  
openstack-heat-api-cfn-5.0.0-1.el7ost.noarch 
openstack-heat-common-5.0.0-1.el7ost.noarch
openstack-heat-templates-0-0.1.20151019.el7ost.noarch
openstack-heat-api-5.0.0-1.el7ost.noarch
openstack-heat-engine-5.0.0-1.el7ost.noarch
instack-undercloud-2.1.3-1.el7ost.noarch


Steps to reproduce:
1. Deploy overcloud with several computes.
2. Launch several instances on all computes.
3. While scaling down the overcloud (reducing the Compute count), attempt to delete instances on the Compute nodes being removed (example commands below).
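
For reference, steps 2 and 3 come down to standard Compute commands along these lines (the image, flavor, network and instance names here are placeholders, not the exact values used in this reproduction):

    # step 2: boot a few instances so that every Compute node hosts at least one
    nova boot --image <image> --flavor m1.small --nic net-id=<internal-net-uuid> new_instance-1
    # step 3: while the overcloud scale-down is running, delete instances hosted
    # on the Compute nodes being removed
    nova delete new_instance-2 new_instance-3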

Result:
The instances being deleted remain ACTIVE with a stuck "deleting" task state (nova list columns: ID, Name, Status, Task State, Power State, Networks):
| 934e05a7-f25e-42df-93f2-75039a84d600 | new_instance-2    | ACTIVE | deleting   | Running     | Internal=192.168.50.10               |
| cfd03305-a349-4167-83a8-458ba841c673 | new_instance-2    | ACTIVE | deleting   | Running     | Internal=192.168.50.14               |
| 1a8fa414-0ca7-4531-b3b3-bfac8b13483d | new_instance-3    | ACTIVE | deleting   | Running     | Internal=192.168.50.11               |
| 705e33ad-f50b-458b-af31-5359e4e4b6f6 | new_instance-3    | ACTIVE | deleting   | Running     | Internal=192.168.50.13               |



Expected result:
Either the scale-down should handle in-flight instance deletions, or instance deletion should be blocked while Compute nodes are being removed.

Comment 1 Jaromir Coufal 2015-12-07 18:55:10 UTC
Based on IRC discussion:

related docs: https://access.redhat.com/documentation/en-US/Red_Hat_Enterprise_Linux_OpenStack_Platform/7/html/Director_Installation_and_Usage/sect-Scaling_the_Overcloud.html

<sasha> jcoufal: the link you pasted - scroll to the previous section (7.6)
<sasha> scaling the overcloud
<sasha> jcoufal: so I scaled down with --compute-scale X
<sasha> where X is n-1
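
For context, the scale-down Sasha describes amounts to re-running the deploy command with a lowered Compute count, roughly as follows (assuming the same templates and environment files as the original deployment):

    # re-run the deployment with one fewer Compute node (n-1); all other options
    # should match the original `openstack overcloud deploy` invocation
    openstack overcloud deploy --templates --compute-scale <n-1> -e <environment-files>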

I think scaling down should specifically follow section 7.7:

https://access.redhat.com/documentation/en-US/Red_Hat_Enterprise_Linux_OpenStack_Platform/7/html/Director_Installation_and_Usage/sect-Removing_Nodes_from_the_Overcloud.html

Sasha, could you please verify whether you hit the same issue when you use the documented process mentioned above?
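
For reference, the section 7.7 procedure removes specific nodes by UUID instead of only lowering the scale count; it is roughly of the following shape (UUIDs and environment files are placeholders; the linked document is the authoritative procedure):

    # delete specific overcloud nodes from the stack; the UUIDs are the nova
    # instance IDs of the overcloud nodes themselves, not of tenant instances
    openstack overcloud node delete --stack overcloud --templates -e <environment-files> <node-uuid-1> <node-uuid-2>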

Comment 2 Andrew Dahms 2016-02-08 03:58:04 UTC
Assigning to Dan for review.

Comment 3 Dan Macpherson 2016-02-09 05:18:47 UTC
It seems like I should retitle Section 7.6 so that it specifically deals with Compute and Ceph nodes, and maybe also merge Section 7.7 into Section 7.6.

Jarda, what do you think about this plan?

Comment 5 Dan Macpherson 2016-03-16 03:33:21 UTC
Sasha, Jarda -- just following up on this BZ. Any further changes required to these sections?

Comment 6 Alexander Chuzhoy 2016-03-16 16:24:38 UTC
So I see this under Important:
"Before removing a Compute node from the Overcloud, migrate the workload from the node to other Compute nodes."

Following this guideline, the issue doesn't reproduce.
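
For completeness, one way to satisfy that guideline before scaling down looks roughly like this (hostnames and instance IDs are placeholders; the guide linked above remains the authoritative procedure):

    # keep new instances off the node that is about to be removed
    nova service-disable overcloud-compute-1.localdomain nova-compute
    # move the running workload to the remaining Compute nodes
    nova live-migration <instance-uuid>
    # confirm nothing is left on the node before running the scale-down
    nova list --host overcloud-compute-1.localdomain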

Comment 7 Dan Macpherson 2016-03-16 23:59:32 UTC
Cool. In that case, I'll close this BZ. If further changes are required, please reopen and let me know.

