1174422 – Evacuate Fails 'Invalid state of instance files' using Ceph Ephemeral RBD

Summary: Evacuate Fails 'Invalid state of instance files' using Ceph Ephemeral RBD

Keywords:
Status:	CLOSED ERRATA
Alias:	None
Product:	Red Hat OpenStack
Classification:	Red Hat
Component:	openstack-nova
Sub Component:
Version:	5.0 (RHEL 7)
Hardware:	All
OS:	Linux
Priority:	high
Severity:	high
Target Milestone:	z4
Target Release:	5.0 (RHEL 7)
Assignee:	Eoghan Glynn
QA Contact:	Yogev Rabl
Docs Contact:
URL:
Whiteboard:
Depends On:	1148193 1174424
Blocks:	743661 1038706 rhelosp_ceph_integration

Reported:	2014-12-15 20:01 UTC by Scott Lewis
Modified:	2022-07-09 07:55 UTC
CC List:	18 users
Fixed In Version:	openstack-nova-2014.1.3-10.el7ost
Doc Type:	Bug Fix
Doc Text:	Previously, the evacuate function did not treat RBD storage as shared, so evacuating RBD-backed instances failed. With this fix, RBD storage is marked as shared and the evacuate function honors the shared storage attribute, so evacuation now works for RBD-backed instances.
Clone Of:	1148193
Environment:
Last Closed:	2015-04-16 14:35:04 UTC
Target Upstream Version:
Embargoed:
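
To illustrate the Doc Text above, here is a simplified model of the failure, not Nova's actual code. It assumes the error appears when the caller's shared-storage claim disagrees with what the compute driver detects. All names (`check_evacuate`, the backend sets, the exception class) are hypothetical.

```python
class InvalidSharedStorage(Exception):
    """Stand-in for the error reported as 'Invalid state of instance files'."""


# Hypothetical tables of image backends treated as shared storage.
# Assumption: before the fix RBD was not in the set; after the fix it is.
SHARED_BACKENDS_BEFORE_FIX = set()
SHARED_BACKENDS_AFTER_FIX = {"rbd"}


def check_evacuate(image_backend, caller_says_shared, shared_backends):
    """Return how the disk would be handled, or raise on a mismatch."""
    detected_shared = image_backend in shared_backends
    if caller_says_shared != detected_shared:
        # The caller's claim and the detected state disagree.
        raise InvalidSharedStorage("Invalid state of instance files")
    if detected_shared:
        return "reuse existing disk"
    return "rebuild disk from image"


# Before the fix: an RBD-backed instance evacuated as shared storage fails.
try:
    check_evacuate("rbd", True, SHARED_BACKENDS_BEFORE_FIX)
except InvalidSharedStorage as exc:
    print("before fix:", exc)

# After the fix: the same request is accepted and the disk is reused.
print("after fix:", check_evacuate("rbd", True, SHARED_BACKENDS_AFTER_FIX))
```

The only difference between the two calls is whether `rbd` counts as shared, which mirrors the change described in the Doc Text.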

Links
System	ID	Priority	Status	Summary	Last Updated
Launchpad	1340411	None	None	None	Never
OpenStack gerrit	130905	None	MERGED	Fix nova-compute start issue after evacuate	2021-01-21 10:34:05 UTC
OpenStack gerrit	131629	None	MERGED	Fix nova evacuate issues for RBD	2021-01-21 10:34:05 UTC
Red Hat Issue Tracker	OSP-16673	None	None	None	2022-07-09 07:55:16 UTC
Red Hat Product Errata	RHSA-2015:0843	normal	SHIPPED_LIVE	Important: openstack-nova security, bug fix, and enhancement update	2015-04-16 18:27:45 UTC

Comment 5 Yogev Rabl 2015-04-12 09:01:57 UTC

I'm not sure how the system defines a failed compute node.
The scenario I tested is:
1. Stopped either the openstack-nova-compute service or libvirtd
2. Tried to evacuate an instance
In both cases the system's response to the evacuation was:
ERROR: Compute service of <host name> is still in use. 

Eoghan, how can I change the status of the Compute service?

Comment 6 Pádraig Brady 2015-04-13 11:38:27 UTC

Please ensure the openstack-nova-api service is stopped

Comment 7 Eoghan Glynn 2015-04-13 12:25:51 UTC

After discussing on IRC, the conclusion is:

 * the nova-api service should not be shut down, as the POST /v2/{tenant_id}/servers/{server_id}/evacuate call must be mediated by the service for each VM on the old node

 * shutting down the nova-compute service should suffice, but in a realistic example reproducing this issue, the entire compute node was powered down
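
As a rough sketch of the evacuate call mentioned above, the snippet below only builds a request and sends nothing. It assumes the action-style form of the Nova v2 API, where the request is posted to the server's `action` resource with an `evacuate` body. Check the path and field names against the API reference for your release. The tenant, server, and host values are placeholders.

```python
import json


def build_evacuate_request(tenant_id, server_id, target_host, on_shared_storage=True):
    """Return (path, body) for an evacuate request; nothing is sent."""
    # Assumed action-style path; verify against the API reference for your release.
    path = "/v2/{0}/servers/{1}/action".format(tenant_id, server_id)
    body = {
        "evacuate": {
            # Compute host that should take over the instance.
            "host": target_host,
            # True claims the instance disks are on shared storage (e.g. RBD).
            "onSharedStorage": on_shared_storage,
        }
    }
    return path, json.dumps(body, sort_keys=True)


# Placeholder identifiers, for illustration only.
path, body = build_evacuate_request("demo-tenant", "vm-1234", "compute-2")
print("POST", path)
print(body)
```

One such request is needed per VM on the failed node, which is why the nova-api service has to stay up.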

Comment 8 Yogev Rabl 2015-04-14 12:47:43 UTC

Verified on RHEL 7 with the following Nova packages:

openstack-nova-common-2014.1.4-3.el7ost.noarch
openstack-nova-novncproxy-2014.1.4-3.el7ost.noarch
python-novaclient-2.17.0-4.el7ost.noarch
openstack-nova-console-2014.1.4-3.el7ost.noarch
openstack-nova-conductor-2014.1.4-3.el7ost.noarch
openstack-nova-cert-2014.1.4-3.el7ost.noarch
python-nova-2014.1.4-3.el7ost.noarch
openstack-nova-compute-2014.1.4-3.el7ost.noarch
openstack-nova-api-2014.1.4-3.el7ost.noarch
openstack-nova-scheduler-2014.1.4-3.el7ost.noarch

Comment 10 errata-xmlrpc 2015-04-16 14:35:04 UTC

Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://rhn.redhat.com/errata/RHSA-2015-0843.html
