Bug 1346072 - Scaling down a compute node does not remove it from nova service-list or neutron agent-list
Keywords:
Status: CLOSED WONTFIX
Alias: None
Product: Red Hat OpenStack
Classification: Red Hat
Component: rhosp-director
Version: 8.0 (Liberty)
Hardware: x86_64
OS: Linux
Priority: low
Severity: high
Target Milestone: ---
Target Release: 12.0 (Pike)
Assignee: Angus Thomas
QA Contact: Omri Hochman
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2016-06-13 20:39 UTC by Dan Yasny
Modified: 2018-07-12 11:45 UTC (History)

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2018-03-12 14:50:12 UTC
Target Upstream Version:
Embargoed:



Description Dan Yasny 2016-06-13 20:39:30 UTC
Description of problem:
After scaling down a compute node, it remains registered. Note that overcloud-compute-1 is still present:

$ nova service-list
+----+------------------+------------------------------------+----------+---------+-------+----------------------------+-----------------+
| Id | Binary           | Host                               | Zone     | Status  | State | Updated_at                 | Disabled Reason |
+----+------------------+------------------------------------+----------+---------+-------+----------------------------+-----------------+
| 3  | nova-scheduler   | overcloud-controller-0.localdomain | internal | enabled | up    | 2016-06-13T20:30:24.000000 | -               |
| 6  | nova-scheduler   | overcloud-controller-2.localdomain | internal | enabled | up    | 2016-06-13T20:30:23.000000 | -               |
| 9  | nova-scheduler   | overcloud-controller-1.localdomain | internal | enabled | up    | 2016-06-13T20:30:24.000000 | -               |
| 12 | nova-consoleauth | overcloud-controller-0.localdomain | internal | enabled | up    | 2016-06-13T20:30:20.000000 | -               |
| 15 | nova-consoleauth | overcloud-controller-1.localdomain | internal | enabled | up    | 2016-06-13T20:30:25.000000 | -               |
| 18 | nova-consoleauth | overcloud-controller-2.localdomain | internal | enabled | up    | 2016-06-13T20:30:28.000000 | -               |
| 21 | nova-conductor   | overcloud-controller-1.localdomain | internal | enabled | up    | 2016-06-13T20:30:22.000000 | -               |
| 24 | nova-compute     | overcloud-compute-0.localdomain    | nova     | enabled | up    | 2016-06-13T20:30:29.000000 | -               |
| 27 | nova-compute     | overcloud-compute-1.localdomain    | nova     | enabled | down  | 2016-06-13T19:40:57.000000 | -               |
| 30 | nova-conductor   | overcloud-controller-2.localdomain | internal | enabled | up    | 2016-06-13T20:30:23.000000 | -               |
| 33 | nova-conductor   | overcloud-controller-0.localdomain | internal | enabled | up    | 2016-06-13T20:30:23.000000 | -               |
+----+------------------+------------------------------------+----------+---------+-------+----------------------------+-----------------+



$ neutron agent-list
+--------------------------------------+--------------------+------------------------------------+-------+----------------+---------------------------+
| id                                   | agent_type         | host                               | alive | admin_state_up | binary                    |
+--------------------------------------+--------------------+------------------------------------+-------+----------------+---------------------------+
| 0197d9a8-fcc4-43b9-865d-a9f3aad1b680 | Metadata agent     | overcloud-controller-0.localdomain | :-)   | True           | neutron-metadata-agent    |
| 17813287-da44-493d-809e-c8cfe552a92a | L3 agent           | overcloud-controller-2.localdomain | :-)   | True           | neutron-l3-agent          |
| 267609e9-4a7e-4ba1-87a7-1e1a8afb7208 | DHCP agent         | overcloud-controller-1.localdomain | :-)   | True           | neutron-dhcp-agent        |
| 3d3b17e3-1567-4285-9e07-2aecc23ed4b5 | Metadata agent     | overcloud-controller-2.localdomain | :-)   | True           | neutron-metadata-agent    |
| 491dfa76-a5a1-4019-b24f-163a68894302 | Open vSwitch agent | overcloud-controller-1.localdomain | :-)   | True           | neutron-openvswitch-agent |
| 4b7e3cfb-0cf6-470c-8be0-d369615cf516 | L3 agent           | overcloud-controller-0.localdomain | :-)   | True           | neutron-l3-agent          |
| 6024ac37-4cff-47fe-89ad-ddc1f1413bb7 | Open vSwitch agent | overcloud-compute-0.localdomain    | :-)   | True           | neutron-openvswitch-agent |
| 65d8b445-513b-4066-b965-22ccdf6f7288 | L3 agent           | overcloud-controller-1.localdomain | :-)   | True           | neutron-l3-agent          |
| 66c84104-0dd0-43cf-8800-9a56fc945840 | DHCP agent         | overcloud-controller-2.localdomain | :-)   | True           | neutron-dhcp-agent        |
| 78c63616-d57a-4753-a13e-4fb38daae37d | Metadata agent     | overcloud-controller-1.localdomain | :-)   | True           | neutron-metadata-agent    |
| 8b63b900-9d5a-440d-8349-570179f16116 | Open vSwitch agent | overcloud-controller-0.localdomain | :-)   | True           | neutron-openvswitch-agent |
| 8c6492c2-3d1b-4a49-842c-97110186e92c | Open vSwitch agent | overcloud-compute-1.localdomain    | xxx   | True           | neutron-openvswitch-agent |
| db432abc-b2e2-4470-8e72-85fb65c46e04 | DHCP agent         | overcloud-controller-0.localdomain | :-)   | True           | neutron-dhcp-agent        |
| fefb7824-29e2-4b49-8c6f-8afe40ae2a92 | Open vSwitch agent | overcloud-controller-2.localdomain | :-)   | True           | neutron-openvswitch-agent |
+--------------------------------------+--------------------+------------------------------------+-------+----------------+---------------------------+


Version-Release number of selected component (if applicable):
openstack-ceilometer-collector-5.0.2-2.el7ost.noarch
openstack-heat-templates-0-0.1.20151019.el7ost.noarch
python-openstackclient-1.7.2-1.el7ost.noarch
openstack-ironic-api-4.2.3-1.el7ost.noarch
openstack-heat-engine-5.0.1-6.el7ost.noarch
openstack-swift-container-2.5.0-2.el7ost.noarch
openstack-selinux-0.6.58-1.el7ost.noarch
openstack-neutron-common-7.0.4-5.el7ost.noarch
openstack-tripleo-heat-templates-kilo-0.8.14-13.el7ost.noarch
openstack-ceilometer-notification-5.0.2-2.el7ost.noarch
openstack-ceilometer-polling-5.0.2-2.el7ost.noarch
openstack-tripleo-0.0.7-1.el7ost.noarch
openstack-ceilometer-alarm-5.0.2-2.el7ost.noarch
openstack-keystone-8.0.1-1.el7ost.noarch
openstack-nova-cert-12.0.3-11.el7ost.noarch
openstack-tripleo-image-elements-0.9.9-2.el7ost.noarch
openstack-ceilometer-central-5.0.2-2.el7ost.noarch
openstack-tripleo-common-0.3.1-1.el7ost.noarch
openstack-ironic-conductor-4.2.3-1.el7ost.noarch
openstack-puppet-modules-7.0.19-1.el7ost.noarch
openstack-utils-2014.2-1.el7ost.noarch
openstack-neutron-7.0.4-5.el7ost.noarch
openstack-nova-conductor-12.0.3-11.el7ost.noarch
openstack-heat-api-cloudwatch-5.0.1-6.el7ost.noarch
openstack-swift-plugin-swift3-1.9-1.el7ost.noarch
openstack-tripleo-puppet-elements-0.0.5-1.el7ost.noarch
openstack-ironic-common-4.2.3-1.el7ost.noarch
openstack-nova-common-12.0.3-11.el7ost.noarch
openstack-nova-compute-12.0.3-11.el7ost.noarch
openstack-swift-2.5.0-2.el7ost.noarch
openstack-ironic-inspector-2.2.6-1.el7ost.noarch
openstack-neutron-openvswitch-7.0.4-5.el7ost.noarch
openstack-heat-common-5.0.1-6.el7ost.noarch
openstack-heat-api-5.0.1-6.el7ost.noarch
openstack-swift-account-2.5.0-2.el7ost.noarch
openstack-swift-proxy-2.5.0-2.el7ost.noarch
openstack-neutron-ml2-7.0.4-5.el7ost.noarch
openstack-nova-scheduler-12.0.3-11.el7ost.noarch
openstack-nova-api-12.0.3-11.el7ost.noarch
openstack-glance-11.0.1-4.el7ost.noarch
openstack-ceilometer-api-5.0.2-2.el7ost.noarch
openstack-heat-api-cfn-5.0.1-6.el7ost.noarch
openstack-swift-object-2.5.0-2.el7ost.noarch
openstack-tripleo-heat-templates-0.8.14-13.el7ost.noarch
openstack-ceilometer-common-5.0.2-2.el7ost.noarch
python-tripleoclient-0.3.4-5.el7ost.noarch


How reproducible:
always

Steps to Reproduce:
1. Install OSP with two or more compute nodes.
2. Rerun the deploy command with one fewer compute node (scale down the computes by one).
3. Run: source overcloudrc; nova service-list; neutron agent-list
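
The check in step 3 can be done mechanically rather than by eye. The following is a minimal sketch, assuming the table layout shown in the description above. The function name down_compute_services is made up for illustration and is not part of any OpenStack tool. It reads `nova service-list` output on stdin and prints the Id and Host of each nova-compute service whose State is "down".

```shell
# Print "Id Host" for every nova-compute service whose State column is "down".
# Input: the ASCII table printed by `nova service-list`, on stdin.
# down_compute_services is a hypothetical helper name, not an OpenStack command.
down_compute_services() {
    awk -F'|' '
        # Trim the padding around every cell; border lines contain no "|" and never match.
        { for (i = 1; i <= NF; i++) gsub(/^[ \t]+|[ \t]+$/, "", $i) }
        # Columns after splitting on "|": 2 = Id, 3 = Binary, 4 = Host, 7 = State.
        $3 == "nova-compute" && $7 == "down" { print $2, $4 }
    '
}
```

Typical use would be `nova service-list | down_compute_services`. An empty result would suggest that no stale compute service is left registered.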

Actual results:

as shown above

Expected results:

the downscaled node should be removed and unregistered

Additional info:

Comment 2 James Slagle 2016-06-21 13:06:33 UTC
This is not a bug IMO. Agents are not removed from the nova and neutron DB when nodes are deleted. Unless there is an API or command we can use to remove these entries from the DB tables, I don't think there is anything we can do from the TripleO side.

Comment 3 Dan Yasny 2016-06-21 13:11:57 UTC
(In reply to James Slagle from comment #2)
> This is not a bug IMO. Agents are not removed from the nova and neutron DB
> when nodes are deleted. Unless there is an API or command we can use to
> remove these entries from the DB tables, I don't think there is anything we
> can do from the TripleO side.

Scaling with TripleO leaves the cloud in an unclean state, so I do think it's a bug. If there is no API or command to handle this, then that is a separate bug, on which this one should depend.

Comment 4 Brad P. Crochet 2016-06-21 13:47:16 UTC
Please provide the full command line used to scale down.

Comment 5 Dan Yasny 2016-06-21 14:11:17 UTC
openstack overcloud deploy --templates --control-scale 3 --compute-scale 1   --neutron-network-type vxlan --neutron-tunnel-types vxlan  --ntp-server 10.5.26.10 --timeout 90 -e /usr/share/openstack-tripleo-heat-templates/environments/storage-environment.yaml -e /usr/share/openstack-tripleo-heat-templates/environments/network-isolation.yaml -e network-environment.yaml --ceph-storage-scale 1

The only difference between this and the original deployment command is that the original had --compute-scale 2

Comment 6 Brad P. Crochet 2016-06-23 13:38:58 UTC
First of all, the 'openstack overcloud node delete' command should be used when removing a node. Otherwise, the node removal would be random, which is not what an operator would desire.

Second, is it unreasonable that the operator clean up these resources? I don't think this belongs in the client.

neutron agent-delete and nova service-delete already exist. 

I agree with slagle. IMHO this is notabug.
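
As a rough illustration of the manual cleanup with the two commands named above: for the environment in the description, the stale compute service has Id 27, so the nova side would be `nova service-delete 27`. For the neutron side, the sketch below prints one `neutron agent-delete` line per agent on a given host that `neutron agent-list` reports as not alive ("xxx"). The function name stale_agent_delete_commands is hypothetical, and the table layout is assumed to match the output shown in the description. The commands are only printed, not run, so they can be reviewed first.

```shell
# Print one "neutron agent-delete <id>" line per dead agent on the given host.
# $1    = host name exactly as shown in the "host" column
# stdin = the ASCII table printed by `neutron agent-list`
# stale_agent_delete_commands is a hypothetical helper name, not an OpenStack command.
stale_agent_delete_commands() {
    awk -F'|' -v host="$1" '
        # Trim the padding around every cell.
        { for (i = 1; i <= NF; i++) gsub(/^[ \t]+|[ \t]+$/, "", $i) }
        # Columns after splitting on "|": 2 = id, 4 = host, 5 = alive.
        $4 == host && $5 == "xxx" { print "neutron agent-delete " $2 }
    '
}
```

For example, `neutron agent-list | stale_agent_delete_commands overcloud-compute-1.localdomain` would print the delete command for the Open vSwitch agent left behind on the removed node.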

Comment 7 Dan Yasny 2016-06-23 13:46:56 UTC
(In reply to Brad P. Crochet from comment #6)
> First of all, the 'openstack overcloud node delete' command should be used
> when removing a node. 

Ah! OK, going to add this to our automation stack... In my tests, which node gets removed doesn't really matter, but it makes sense to repeat what the customers would be doing.
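
For automation, a targeted scale-down step might be sketched as below. This is an assumption-laden illustration, not a tested procedure: node_delete_command is a hypothetical helper, and the `--stack` and `--templates` options are written from memory of the director CLI of this era. The environment files passed with `-e` at deploy time would presumably need to be repeated as well, so the director documentation for the release in use should be checked. The helper only prints the command.

```shell
# Print an `openstack overcloud node delete` command line for review.
# $1        = overcloud stack name (for example "overcloud")
# remaining = IDs of the nodes to remove, as listed on the undercloud
# node_delete_command is a hypothetical helper name; the options are assumptions.
node_delete_command() {
    stack="$1"
    shift
    printf 'openstack overcloud node delete --stack %s --templates' "$stack"
    for node in "$@"; do
        printf ' %s' "$node"
    done
    printf '\n'
}
```

Calling `node_delete_command overcloud NODE_ID` prints the command with NODE_ID as a placeholder for the real node identifier.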

> Otherwise, the node removal would be random, which is
> not what an operator would desire.
> 
> Second, is it unreasonable that the operator clean up these resources? I
> don't think this belongs in the client.

IMO a node removal should be complete and clean for best user experience, but that's up to the PM

> 
> neutron agent-delete and nova service-delete already exist. 
> 
> I agree with slagle. IMHO this is notabug.

Comment 8 Dan Yasny 2016-06-23 13:53:54 UTC
Opened https://bugzilla.redhat.com/show_bug.cgi?id=1349467 for docs

Comment 9 Brad P. Crochet 2016-06-23 15:24:27 UTC
(In reply to Dan Yasny from comment #7)
> 
> IMO a node removal should be complete and clean for best user experience,
> but that's up to the PM

The thing is, we don't create those resources. They are created "magically" by OpenStack, which is why they are not cleaned up via Heat. They are therefore rather outside the purview of TripleO.

Comment 11 Jaromir Coufal 2016-10-10 02:14:34 UTC
Outside the scope of OSP10. Clean removal would be desirable, but unfortunately I don't see any capacity for this work in the near future (meaning the following release). Removing the release target and changing the priority. Needs to be revisited in time.

Comment 13 Alex Schultz 2018-03-12 14:50:12 UTC
We don't clean up when scaling down. How to clean up these items when scaling down is currently documented. Feel free to open an RFE to improve this.

