Bug 1284669 - [Docs] [Director] Deleting instances during overcloud scale down (reduction of computes) causes the instances to get stuck in deletion.
Product: Red Hat OpenStack
Classification: Red Hat
Component: documentation
Version: 8.0 (Liberty)
Hardware: Unspecified
OS: Unspecified
Priority: high
Severity: medium
Target Milestone: ga
Target Release: 8.0 (Liberty)
Assigned To: Dan Macpherson
QA Contact: RHOS Documentation Team
Keywords: Documentation
Reported: 2015-11-23 15:35 EST by Alexander Chuzhoy
Modified: 2016-03-16 19:59 EDT
Doc Type: Bug Fix
Last Closed: 2016-03-16 19:59:32 EDT
Type: Bug

Description Alexander Chuzhoy 2015-11-23 15:35:50 EST
rhel-osp-director: 8.0 - Deleting instances during overcloud scale down (reduction of computes) causes the instances to get stuck in deletion.



Steps to reproduce:
1. Deploy overcloud with several computes.
2. Launch several instances on all computes.
3. While the overcloud is scaling down, attempt to delete instances on the computes being removed.

The deleted instances stay ACTIVE and remain stuck in the "deleting" task state:
| ID                                   | Name              | Status | Task State | Power State | Networks                 |
| 934e05a7-f25e-42df-93f2-75039a84d600 | new_instance-2    | ACTIVE | deleting   | Running     | Internal=               |
| cfd03305-a349-4167-83a8-458ba841c673 | new_instance-2    | ACTIVE | deleting   | Running     | Internal=               |
| 1a8fa414-0ca7-4531-b3b3-bfac8b13483d | new_instance-3    | ACTIVE | deleting   | Running     | Internal=               |
| 705e33ad-f50b-458b-af31-5359e4e4b6f6 | new_instance-3    | ACTIVE | deleting   | Running     | Internal=               |
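To spot this condition without reading the listing by eye, the rows can be parsed and filtered. This is only a minimal sketch: the column order (ID, name, status, task state, power state, networks) is assumed from the output above, and the function names parse_rows and stuck_in_deletion are hypothetical helpers, not part of any OpenStack client.

```python
# Assumed column order, taken from the pasted listing above.
COLUMNS = ["id", "name", "status", "task_state", "power_state", "networks"]

# Two of the rows from the report, copied verbatim.
SAMPLE = """
| 934e05a7-f25e-42df-93f2-75039a84d600 | new_instance-2    | ACTIVE | deleting   | Running     | Internal=               |
| 1a8fa414-0ca7-4531-b3b3-bfac8b13483d | new_instance-3    | ACTIVE | deleting   | Running     | Internal=               |
"""


def parse_rows(text):
    """Turn '|'-delimited table rows into dicts keyed by COLUMNS."""
    rows = []
    for line in text.splitlines():
        line = line.strip()
        if not line.startswith("|"):
            continue  # skip blank lines and '+---+' borders
        cells = [cell.strip() for cell in line.strip("|").split("|")]
        if len(cells) != len(COLUMNS):
            continue  # skip rows that do not match the assumed layout
        rows.append(dict(zip(COLUMNS, cells)))
    return rows


def stuck_in_deletion(rows):
    """Return IDs of instances that are still ACTIVE while 'deleting'."""
    return [
        row["id"]
        for row in rows
        if row["status"] == "ACTIVE" and row["task_state"] == "deleting"
    ]


if __name__ == "__main__":
    for instance_id in stuck_in_deletion(parse_rows(SAMPLE)):
        print(instance_id)
```

A single snapshot cannot distinguish a slow delete from a stuck one, so a check like this is only meaningful when the same IDs keep appearing across repeated listings.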

Expected result:
Director should either handle instance deletion cleanly during scale down, or not allow instances to be deleted while a scale down is in progress.
Comment 1 Jaromir Coufal 2015-12-07 13:55:10 EST
Based on IRC discussion:

related docs: https://access.redhat.com/documentation/en-US/Red_Hat_Enterprise_Linux_OpenStack_Platform/7/html/Director_Installation_and_Usage/sect-Scaling_the_Overcloud.html

<sasha>	jcoufal: the link you pasted - scroll to the previous section (7.6)
<sasha>	scaling the overcloud
<sasha>	jcoufal: so I scaled down with --compute-scale X
<sasha>	where X is n-1

I think scaling down should specifically follow Section 7.7:


Sasha, could you please verify whether you still hit this issue when you follow the documented process mentioned above?
Comment 2 Andrew Dahms 2016-02-07 22:58:04 EST
Assigning to Dan for review.
Comment 3 Dan Macpherson 2016-02-09 00:18:47 EST
It seems like I should retitle Section 7.6 so that it specifically deals with Compute and Ceph nodes. Maybe I should also merge Section 7.7 into Section 7.6.

Jarda, what do you think about this plan?
Comment 5 Dan Macpherson 2016-03-15 23:33:21 EDT
Sasha, Jarda -- just following up on this BZ. Any further changes required to these sections?
Comment 6 Alexander Chuzhoy 2016-03-16 12:24:38 EDT
So I see this under Important:
"Before removing a Compute node from the Overcloud, migrate the workload from the node to other Compute nodes."

Following this guideline, the issue doesn't reproduce.
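The guideline quoted above can be expressed as a pre-check that runs before a Compute node is removed. This is a sketch under stated assumptions: the instance-to-host mapping would have to come from the deployment's own inventory, the host names overcloud-compute-0 and overcloud-compute-1 are made up for illustration, and both function names are hypothetical.

```python
def instances_blocking_removal(instance_hosts, nodes_to_remove):
    """Return the instances still running on nodes scheduled for removal.

    instance_hosts: dict mapping instance name -> Compute host name.
    nodes_to_remove: iterable of Compute host names to be removed.
    """
    doomed = set(nodes_to_remove)
    return sorted(
        name for name, host in instance_hosts.items() if host in doomed
    )


def safe_to_scale_down(instance_hosts, nodes_to_remove):
    """True only when no workload remains on the nodes being removed."""
    return not instances_blocking_removal(instance_hosts, nodes_to_remove)


if __name__ == "__main__":
    # Hypothetical placement, for illustration only.
    placement = {
        "new_instance-2": "overcloud-compute-0",
        "new_instance-3": "overcloud-compute-1",
    }
    removing = ["overcloud-compute-1"]
    print(instances_blocking_removal(placement, removing))
    print(safe_to_scale_down(placement, removing))
```

With this placement, new_instance-3 would need to be migrated off overcloud-compute-1 before the node is removed; once the mapping no longer shows any instance on that host, the check passes.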
Comment 7 Dan Macpherson 2016-03-16 19:59:32 EDT
Cool. In that case, I'll close this BZ. If further changes are required, please reopen and let me know.
