1544088 – Run the correct command to delete the overcloud node, but the wrong overcloud node is deleted by heat command

Keywords:
Status:	CLOSED ERRATA
Alias:	None
Product:	Red Hat OpenStack
Classification:	Red Hat
Component:	openstack-tripleo-common
Sub Component:
Version:	10.0 (Newton)
Hardware:	x86_64
OS:	Linux
Priority:	high
Severity:	high
Target Milestone:	z8
Target Release:	10.0 (Newton)
Assignee:	Rabi Mishra
QA Contact:	Gurenko Alex
Docs Contact:
URL:
Whiteboard:
Depends On:
Blocks:

Reported:	2018-02-10 06:42 UTC by liuwei
Modified:	2022-08-16 11:14 UTC (History)
CC List:	21 users (show)
Fixed In Version:	openstack-tripleo-common-5.4.7-3.el7ost
Doc Type:	If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed:	2018-05-17 15:48:40 UTC
Target Upstream Version:
Embargoed:

Links
System	ID	Priority	Status	Summary	Last Updated
OpenStack gerrit	544781	None	MERGED	Fix overcloud node delete after an upgrade	2020-07-30 15:55:05 UTC
Red Hat Issue Tracker	OSP-4874	None	None	None	2022-08-16 11:14:18 UTC
Red Hat Product Errata	RHBA-2018:1597	None	None	None	2018-05-17 15:49:43 UTC

Comment 4 Robin Cernin 2018-02-12 04:16:27 UTC

Version:

OSP10 (Newton)

heat-cfntools-1.3.0-2.el7ost.noarch
openstack-heat-api-7.0.6-1.el7ost.noarch
openstack-heat-api-cfn-7.0.6-1.el7ost.noarch
openstack-heat-api-cloudwatch-7.0.6-1.el7ost.noarch
openstack-heat-common-7.0.6-1.el7ost.noarch
openstack-heat-engine-7.0.6-1.el7ost.noarch
openstack-heat-templates-0-0.14.1e6015dgit.el7ost.noarch
openstack-tripleo-heat-templates-5.3.3-1.el7ost.noarch
openstack-tripleo-heat-templates-compat-2.0.0-58.el7ost.noarch
puppet-heat-9.5.0-2.el7ost.noarch
python-heat-agent-0-0.14.1e6015dgit.el7ost.noarch
python-heat-tests-7.0.6-1.el7ost.noarch
python-heatclient-1.5.2-1.el7ost.noarch

How to reproduce:

0) Scale out the overcloud with an additional node (for example, index 10)
1) The node with index 10 is created successfully in the Heat database
2) The node with index 10 is assigned an instance_uuid in Nova

A problem then occurs, for example a hardware issue that prevents the node from booting from disk.

3) The node ends up in ERROR state in Nova
4) Try to remove the node with 'overcloud node delete ... [instance_uuid]'
5) Heat removes the last node instead

Actual results:

 Heat removes the last node instead

Expected results:

 Heat should remove the node specified and perform update

We guess this behavior is because the stack is already in FAILED state.
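
To illustrate the symptom in steps 4 and 5, here is a minimal Python sketch of how a resource group might choose which members survive a scale-down. It is a simplified model, not Heat's or tripleo-common's actual code: the function name surviving_members and the idea that the blacklist ended up empty are assumptions made only for illustration. The sketch assumes members are named by index and that blacklisted names are skipped when the group is rebuilt at the new size.

```python
from itertools import count, islice


def surviving_members(size, blacklist=frozenset()):
    """Return the member names kept after an update to `size` members.

    Hypothetical model: members are named "0", "1", "2", ... and any
    name listed in `blacklist` is skipped when picking survivors.
    """
    candidates = (str(i) for i in count() if str(i) not in blacklist)
    return list(islice(candidates, size))


# Five members ("0".."4"), scaling down to four.
# With the intended member blacklisted, that member is the one dropped.
print(surviving_members(4, blacklist={"2"}))   # ['0', '1', '3', '4']

# If the blacklist is empty (for example, because the instance UUID
# could not be mapped to a member name), the highest index is dropped.
print(surviving_members(4))                    # ['0', '1', '2', '3']
```

In this model, an empty or mismatched blacklist silently turns a targeted delete into "remove the last node", which matches the actual result reported above. Whether this is the real root cause is not confirmed by this comment.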

Comment 5 Rabi Mishra 2018-02-12 04:50:29 UTC

> We guess this behavior is because the stack is already in FAILED state.

Yes, by default Heat tries to _replace_ all FAILED resources/nodes on a stack update.

Assuming that a node is in FAILED state, I would expect the update to do both: remove the blacklisted node (03c28a44-979b-4ed2-9463-04661df11570) and try to replace the node in FAILED state.
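
The expectation above can be sketched as a small Python model. This is only an illustration under stated assumptions, not Heat's implementation: the function plan_update, the member names, and the simplified status strings ("COMPLETE", "FAILED") are all hypothetical.

```python
def plan_update(statuses, blacklist):
    """Return a per-member action for a stack update.

    Hypothetical model of the behavior described above:
    - a blacklisted member is deleted,
    - any other member in FAILED state is replaced,
    - everything else is kept.
    """
    plan = {}
    for name, status in statuses.items():
        if name in blacklist:
            plan[name] = "delete"
        elif status == "FAILED":
            plan[name] = "replace"
        else:
            plan[name] = "keep"
    return plan


# Member "1" is FAILED and member "2" is blacklisted for removal,
# so a single update both replaces "1" and deletes "2".
statuses = {"0": "COMPLETE", "1": "FAILED", "2": "COMPLETE"}
print(plan_update(statuses, blacklist={"2"}))
```

In this sketch both effects happen in the same update, so an operator who asks to delete one node may also see a different, FAILED node being replaced.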

Comment 33 Gurenko Alex 2018-05-10 12:54:00 UTC

Verified on puddle 2018-05-09.2

Comment 38 errata-xmlrpc 2018-05-17 15:48:40 UTC

Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2018:1597
