Bug 1693838 - Scaling down the overcloud does not remove entries from the Compute Services list or the network agent list
Summary: Scaling down the overcloud does not remove entries from the Compute Services list or the network agent list
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat OpenStack
Classification: Red Hat
Component: openstack-tripleo-heat-templates
Version: 16.0 (Train)
Hardware: All
OS: Linux
Priority: medium
Severity: medium
Target Milestone: rc
Target Release: 16.0 (Train on RHEL 8.1)
Assignee: Emilien Macchi
QA Contact: Cédric Jeanneret
URL:
Whiteboard:
Depends On: 1698021
Blocks:
 
Reported: 2019-03-28 18:45 UTC by coldford@redhat.com
Modified: 2023-03-24 14:40 UTC
CC List: 15 users

Fixed In Version: openstack-tripleo-heat-templates-11.3.1-0.20191211120233.5ca908c.el8ost
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2020-02-06 14:39:53 UTC
Target Upstream Version:
Embargoed:


Attachments
Scaledown log (5.48 KB, text/plain), 2019-12-11 14:05 UTC, Cédric Jeanneret


Links
System ID Private Priority Status Summary Last Updated
Launchpad 1856062 0 None None None 2019-12-11 15:31:54 UTC
OpenStack gerrit 653893 0 None MERGED Scale-down tasks for nova-compute 2020-10-05 04:10:29 UTC
OpenStack gerrit 698515 0 None MERGED scale: fixes for compute scale down 2020-10-05 04:10:29 UTC
Red Hat Product Errata RHEA-2020:0283 0 None None None 2020-02-06 14:40:44 UTC

Description coldford@redhat.com 2019-03-28 18:45:44 UTC
Description of problem:
As per https://access.redhat.com/documentation/en-us/red_hat_openstack_platform/13/html/director_installation_and_usage/sect-scaling_the_overcloud#sect-Removing_Compute_Nodes

When compute nodes are removed, it is necessary to manually clean up their service entries (see the sketch below). Our client is requesting that this be automated.
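
For reference, the manual cleanup the linked documentation describes amounts to deleting the stale records through the overcloud APIs. A minimal sketch; the IDs are placeholders:

  source ~/overcloudrc
  # Find and delete the removed node's nova-compute service record
  openstack compute service list
  openstack compute service delete <service-id>
  # Likewise for any network agents left behind on that node
  openstack network agent list
  openstack network agent delete <agent-id>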

Version-Release number of selected component (if applicable):
OSP 10

How reproducible:
Every time


Steps to Reproduce:
1. Remove compute nodes from the overcloud.
2. You are forced to clean up the leftover ("cruft") service records manually.

Actual results:
- Service entries require manual cleanup.

Expected results:
- No manual cleanup is needed.

Comment 5 Alex Schultz 2019-05-14 17:37:30 UTC
We now have scale-down tasks, and the code that runs scale-down actions on nova-compute has landed: https://review.opendev.org/#/c/653893/ (a rough illustration follows).
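
Conceptually, when a node is scaled down these tasks perform the same cleanup an operator would otherwise do by hand. A rough shell equivalent, not the literal Ansible task code from the merged change; the host name is a placeholder:

  source ~/overcloudrc
  # Delete the nova-compute service record belonging to the removed host
  openstack compute service delete $(openstack compute service list \
      --host <removed-host> --service nova-compute -f value -c ID)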

Comment 13 Cédric Jeanneret 2019-12-11 14:04:14 UTC
Moving back to ON_DEV:
- overcloud with 10 computes: tried to drop compute-1, but there is an issue because it seems to pick up compute-10 in addition
- same overcloud, dropped compute-9: its services are still listed, "enabled" but "down" (a quick check follows)
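
A quick way to confirm the leftover record from the overcloudrc env (host names follow this reproducer's naming):

  source ~/overcloudrc
  openstack compute service list --service nova-compute
  # compute-9 is still listed with Status "enabled" and State "down"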

The run log looks suspicious (see the attachment that follows) and, in addition, the /var/lib/mistral/<stack-name> directory is removed at some point, which makes debugging complicated.

Comment 14 Cédric Jeanneret 2019-12-11 14:05:30 UTC
Created attachment 1643897 [details]
Scaledown log

This log was generated with the following command:
openstack overcloud node delete -y --stack overcloud-0 464817dd-e129-402a-93bf-dc2ed216471a 2>&1 | tee ~/scaledown-1.log

That's the only trace I can get for now since /var/lib/mistral/overcloud-0 is dropped :(.

Comment 15 Cédric Jeanneret 2019-12-11 14:38:40 UTC
Just created a new BZ for the ansible directory removal: https://bugzilla.redhat.com/show_bug.cgi?id=1782379

Comment 17 Cédric Jeanneret 2019-12-13 13:38:26 UTC
I just tried the new RPM, and it works fine!

Please note that only Nova services are cleaned up right now, because the Network DFG did not provide the relevant code for the auto-cleanup of network agents.

Verification method:
- deploy the Director and overcloud using the provided t-h-t package (and some dependencies); the overcloud has 1 controller + 2 computes
- remove one of the computes and check the output of `openstack compute service list` using the overcloudrc env (see the sketch below)
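
Concretely, the check boils down to the following; the stack name here follows comment 14, and the node UUID is a placeholder:

  openstack overcloud node delete -y --stack overcloud-0 <compute-node-uuid>
  source ~/overcloudrc
  openstack compute service list
  # the removed compute's nova-compute entry must no longer be listed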

Comment 19 errata-xmlrpc 2020-02-06 14:39:53 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHEA-2020:0283

