Bug 2077670 - "openstack overcloud node delete" command throws an exception
Summary: "openstack overcloud node delete" command throws an exception
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat OpenStack
Classification: Red Hat
Component: python-tripleoclient
Version: 17.0 (Wallaby)
Hardware: Unspecified
OS: Unspecified
Priority: medium
Severity: high
Target Milestone: ---
Target Release: ---
Assignee: Rabi Mishra
QA Contact: David Rosenfeld
URL:
Whiteboard:
Depends On:
Blocks:
 
Reported: 2022-04-21 20:55 UTC by Marian Krcmarik
Modified: 2022-09-21 12:21 UTC
CC List: 4 users

Fixed In Version: python-tripleoclient-16.4.1-0.20220506221701.559cc8c.el9ost
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2022-09-21 12:20:43 UTC
Target Upstream Version:
Embargoed:


Links
System ID Private Priority Status Summary Last Updated
OpenStack gerrit 838980 0 None NEW Fix node delete for unprovision confirmation 2022-04-22 04:49:26 UTC
Red Hat Issue Tracker OSP-14822 0 None None None 2022-04-21 21:01:54 UTC
Red Hat Product Errata RHEA-2022:6543 0 None None None 2022-09-21 12:21:12 UTC

Description Marian Krcmarik 2022-04-21 20:55:58 UTC
Description of problem:
The command "openstack overcloud node delete" throws an exception:
Exception occured while running the command
Traceback (most recent call last):
  File "/usr/lib/python3.9/site-packages/tripleoclient/command.py", line 34, in run
    super(Command, self).run(parsed_args)
  File "/usr/lib/python3.9/site-packages/osc_lib/command/command.py", line 39, in run
    return super(Command, self).run(parsed_args)
  File "/usr/lib/python3.9/site-packages/cliff/command.py", line 186, in run
    return_code = self.take_action(parsed_args) or 0
  File "/usr/lib/python3.9/site-packages/tripleoclient/v1/overcloud_node.py", line 140, in take_action
    nodes_text, nodes = self._nodes_to_delete(parsed_args, roles)
  File "/usr/lib/python3.9/site-packages/tripleoclient/v1/overcloud_node.py", line 107, in _nodes_to_delete
    nodes_data = [(i.get('hostname', ''),
  File "/usr/lib/python3.9/site-packages/tripleoclient/v1/overcloud_node.py", line 107, in <listcomp>
    nodes_data = [(i.get('hostname', ''),
AttributeError: 'str' object has no attribute 'get'
'str' object has no attribute 'get'
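
For context, here is a minimal sketch of the failure mode; the data shapes below are assumptions inferred from the traceback, not the actual tripleoclient structures. The comprehension expects instance dicts, but on the affected compose the list apparently contains plain hostname strings, and str has no get():

# Assumed shapes, for illustration only:
instances_ok = [{'hostname': 'dcn2-compute2-1'}]   # what the code expects
instances_bad = ['dcn2-compute2-1']                # what it apparently receives

# Mirrors the comprehension at overcloud_node.py line 107:
nodes_data = [(i.get('hostname', ''),) for i in instances_ok]   # works
nodes_data = [(i.get('hostname', ''),) for i in instances_bad]  # AttributeError: 'str' object has no attribute 'get'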

The exact command line is the following:
openstack overcloud node delete --stack dcn2 --baremetal-deployment /home/stack/dcn2/network/baremetal_deployment.yaml --yes

where baremetal_deployment.yaml looks like:
- name: DistributedCompute
  count: 1
  instances:
  - hostname: dcn2-compute2-1
    provisioned: false

and the stack dcn2 has the following nodes:
| 642f628d-d3a1-4fa3-951d-d07a7b31d797 | dcn2-compute-0    | de8d68fe-2900-469a-809f-d0ef66036042 | dcn2-compute2-1      | ACTIVE | ctlplane=192.168.44.33 |
| 93ee1cad-347a-4240-a58e-0329b52b664d | dcn2-compute-1    | 15d71069-a8ab-4872-aaf7-22019373d834 | dcn2-compute2-0      | ACTIVE | ctlplane=192.168.44.51 |

The exact same command works for me on OSP17 based on RHEL8, and it failed once I switched to the compose based on RHEL9.

Version-Release number of selected component (if applicable):
ansible-tripleo-ipsec-11.0.1-0.20210910011424.b5559c8.el9ost.noarch
ansible-role-tripleo-modify-image-1.3.1-0.20220216001439.30d23d5.el9ost.noarch
ansible-tripleo-ipa-0.2.3-0.20220301190449.6b0ed82.el9ost.noarch
puppet-tripleo-14.2.3-0.20220407012437.87240e8.el9ost.noarch
python3-tripleo-common-15.4.1-0.20220328184445.0c754c6.el9ost.noarch
tripleo-ansible-3.3.1-0.20220407091528.0bc2994.el9ost.noarch
openstack-tripleo-validations-14.2.2-0.20220408101530.6614654.el9ost.noarch
openstack-tripleo-common-containers-15.4.1-0.20220328184445.0c754c6.el9ost.noarch
openstack-tripleo-common-15.4.1-0.20220328184445.0c754c6.el9ost.noarch
openstack-tripleo-heat-templates-14.3.1-0.20220404155604.75fd885.el9ost.noarch
python3-tripleoclient-16.4.1-0.20220407001042.0021766.el9ost.noarch

How reproducible:
Always

Steps to Reproduce:
1. Try to scale down a deployed overcloud stack by one node with the following command: openstack overcloud node delete

Actual results:
The command throws an exception

Expected results:
The specified stack should be scaled down by the deleted node.


Additional info:

Comment 1 Rabi Mishra 2022-04-22 04:49:26 UTC
Yeah, there is a regression from https://github.com/openstack/tripleo-ansible/commit/f5444e1fd35b93b0e2e79dc7db3bbb560e04d5a3. You should probably use the 'overcloud node unprovision' command instead with OSP17.
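
For illustration only, a hedged sketch of a defensive variant of the comprehension that tolerates both shapes; the actual fix is the one tracked in gerrit change 838980, which may well take a different approach:

# Hypothetical guard, assuming list items may be either instance
# dicts or bare hostname strings:
nodes_data = [
    (i.get('hostname', ''),) if isinstance(i, dict) else (i,)
    for i in instances
]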

Comment 2 Marian Krcmarik 2022-04-22 11:35:50 UTC
(In reply to Rabi Mishra from comment #1)
> Yeah, there is a regression from
> https://github.com/openstack/tripleo-ansible/commit/
> f5444e1fd35b93b0e2e79dc7db3bbb560e04d5a3. You should probably use the
> 'overcloud node unprovision' command instead with OSP17.

I followed this upstream docs:
https://docs.openstack.org/project-deploy-guide/tripleo-docs/latest/provisioning/baremetal_provision.html#scaling-down

What would be the right 'overcloud node unprovision' command? Something like:
openstack overcloud node unprovision --stack dcn2 -y dcn2/baremetal_deployment_scaledown.yaml

I tried that, and the thing is that such a command does only a partial job: it unprovisions the instance but does not clean up some overcloud services. For example, in my case "openstack overcloud node delete" disables/removes the nova-compute and neutron agents services and cleans up their containers, while "openstack overcloud node unprovision" does not do that.
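
For reference, a hedged sketch of the leftover cleanup after 'unprovision', using standard openstack CLI commands run against the overcloud (the hostname is the one from this report; the rc file name and the IDs are placeholders):

source ~/overcloudrc   # placeholder name for the overcloud credentials file
openstack compute service list --host dcn2-compute2-1
openstack compute service delete <nova-compute-service-id>
openstack network agent list --host dcn2-compute2-1
openstack network agent delete <neutron-agent-id>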

Comment 5 David Rosenfeld 2022-06-17 13:09:47 UTC
Scale down automation is in progress. The scale down works without a traceback. Abbreviated logs from the automation are:

Command being executed:

"cmd": "source ~/stackrc\nset -o pipefail\nopenstack overcloud node delete -y --stack overcloud --baremetal-deployment \"/home/stack/virt/network/baremetal_deployment.yaml\" | tee -a /home/stack/overcloud_scaledown.log\n",

output:

2022-06-15 15:21:37.099 | compute-1                  : ok=18   changed=5    unreachable=0    failed=0    skipped=2    rescued=0    ignored=0   
2022-06-15 15:21:37.101 | 2022-06-15 15:21:18.442460 | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ Summary Information ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
2022-06-15 15:21:37.104 | 2022-06-15 15:21:18.443062 | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ Total Tasks: 20         ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
2022-06-15 15:21:37.106 | 2022-06-15 15:21:18.443576 | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ Elapsed Time: 0:00:39.011924 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
2022-06-15 15:21:37.109 | 2022-06-15 15:21:18.444123 |                                 UUID |       Info |       Host |   Task Name |   Run Time
2022-06-15 15:21:37.111 | 2022-06-15 15:21:18.444574 | 5254000c-6e9c-09bc-6aef-0000000000a5 |    SUMMARY |  compute-1 | Delete neutron agents | 9.54s
2022-06-15 15:21:37.113 | 2022-06-15 15:21:18.445032 | 5254000c-6e9c-09bc-6aef-0000000000a4 |    SUMMARY |  compute-1 | Stop OVN containers | 3.98s
2022-06-15 15:21:37.115 | 2022-06-15 15:21:18.445465 | 5254000c-6e9c-09bc-6aef-0000000000ae |    SUMMARY |  compute-1 | Stop nova-compute container | 3.39s
2022-06-15 15:21:37.118 | 2022-06-15 15:21:18.445916 | 5254000c-6e9c-09bc-6aef-00000000003e |    SUMMARY |  compute-1 | Gathering Facts | 1.24s
2022-06-15 15:21:37.120 | 2022-06-15 15:21:18.446348 | 5254000c-6e9c-09bc-6aef-000000000024 |    SUMMARY |  compute-1 | Set all_nodes data as group_vars for overcloud | 0.04s
2022-06-15 15:21:37.122 | 2022-06-15 15:21:18.446877 | 5254000c-6e9c-09bc-6aef-0000000000a9 |    SUMMARY |  compute-1 | is additional Cell? | 0.04s
2022-06-15 15:21:37.124 | 2022-06-15 15:21:18.447300 | 5254000c-6e9c-09bc-6aef-00000000001a |    SUMMARY |  compute-1 | ansible.builtin.include_vars | 0.04s
2022-06-15 15:21:37.127 | 2022-06-15 15:21:18.447783 | 5254000c-6e9c-09bc-6aef-000000000027 |    SUMMARY |  compute-1 | include_tasks | 0.03s
2022-06-15 15:21:37.129 | 2022-06-15 15:21:18.448309 | 5254000c-6e9c-09bc-6aef-000000000017 |    SUMMARY |  compute-1 | Set legacy facts | 0.03s
2022-06-15 15:21:37.131 | 2022-06-15 15:21:18.448770 | 5254000c-6e9c-09bc-6aef-000000000029 |    SUMMARY |  compute-1 | fail | 0.03s
2022-06-15 15:21:37.134 | 2022-06-15 15:21:18.449192 | 5254000c-6e9c-09bc-6aef-00000000001e |    SUMMARY |  compute-1 | Include OVN bridge MAC address variables | 0.03s
2022-06-15 15:21:37.136 | 2022-06-15 15:21:18.449614 | 5254000c-6e9c-09bc-6aef-0000000000ab |    SUMMARY |  compute-1 | Check search output | 0.03s
2022-06-15 15:21:37.139 | 2022-06-15 15:21:18.450010 | 5254000c-6e9c-09bc-6aef-00000000001d |    SUMMARY |  compute-1 | Include Service VIP vars | 0.03s
2022-06-15 15:21:37.141 | 2022-06-15 15:21:18.450429 | 5254000c-6e9c-09bc-6aef-000000000021 |    SUMMARY |  compute-1 | Render all_nodes data as group_vars for overcloud | 0.00s
2022-06-15 15:21:37.143 | 2022-06-15 15:21:18.451051 | 5254000c-6e9c-09bc-6aef-0000000000a1 |    SUMMARY |  compute-1 | Get neutron agents ID | 0.00s
2022-06-15 15:21:37.146 | 2022-06-15 15:21:18.451563 | 5254000c-6e9c-09bc-6aef-0000000000a2 |    SUMMARY |  compute-1 | Filter only current host | 0.00s
2022-06-15 15:21:37.149 | 2022-06-15 15:21:18.452132 | 5254000c-6e9c-09bc-6aef-0000000000a8 |    SUMMARY |  compute-1 | Get nova-compute service ID | 0.00s
2022-06-15 15:21:37.151 | 2022-06-15 15:21:18.452558 | 5254000c-6e9c-09bc-6aef-0000000000aa |    SUMMARY |  compute-1 | Set fact for nova_compute services | 0.00s
2022-06-15 15:21:37.154 | 2022-06-15 15:21:18.453096 | 5254000c-6e9c-09bc-6aef-0000000000ad |    SUMMARY |  compute-1 | Disable nova-compute service | 0.00s
2022-06-15 15:21:37.156 | 2022-06-15 15:21:18.453523 | 5254000c-6e9c-09bc-6aef-0000000000af |    SUMMARY |  compute-1 | Delete nova-compute service | 0.00s
2022-06-15 15:21:37.158 | 2022-06-15 15:21:18.454143 | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ End Summary Information ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Comment 9 errata-xmlrpc 2022-09-21 12:20:43 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Release of components for Red Hat OpenStack Platform 17.0 (Wallaby)), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHEA-2022:6543

