Bug 1399227 - VM in error state after evacuation
Summary: VM in error state after evacuation
Keywords:
Status: CLOSED WONTFIX
Alias: None
Product: Red Hat OpenStack
Classification: Red Hat
Component: openstack-nova
Version: 6.0 (Juno)
Hardware: Unspecified
OS: Unspecified
Priority: high
Severity: high
Target Milestone: async
Target Release: 6.0 (Juno)
Assignee: Sahid Ferdjaoui
QA Contact: Prasanth Anbalagan
URL:
Whiteboard:
Depends On:
Blocks: 1406345
Reported: 2016-11-28 15:14 UTC by Jeremy
Modified: 2020-01-17 16:15 UTC (History)

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Clones: 1406345
Environment:
Last Closed: 2017-01-24 11:08:51 UTC
Target Upstream Version:




Links
System ID Private Priority Status Summary Last Updated
Launchpad 1438331 0 None None None 2016-11-30 13:54:22 UTC
OpenStack gerrit 203052 0 None MERGED Fixes _cleanup_rbd code to capture ImageBusy exception 2020-05-20 13:29:09 UTC

Description Jeremy 2016-11-28 15:14:24 UTC
Description of problem:
When a compute node fails, we initiate evacuation of the virtual machines that are configured to be evacuated.
In some cases, when Ceph is used as shared storage, a VM ends up in error state after an evacuation triggered by pulling out a compute blade. The nova-compute logs show that the failure is caused by an ImageBusy exception, as seen below:


2016-11-23 10:18:27.916 6039 TRACE nova.compute.manager [instance: e31d20a3-f807-48bd-a0dc-f134af5cec3f] ImageBusy: error removing image
2016-11-23 10:18:27.916 6039 TRACE nova.compute.manager [instance: e31d20a3-f807-48bd-a0dc-f134af5cec3f]
2016-11-23 10:18:28.276 6039 ERROR oslo_messaging.rpc.dispatcher [req-90df906c-1b53-436d-9ce2-482df6ffa8e0 faba8d87dd324fdeb8a2d593d177c8a0 7b2adad362354277a8bbbcdbc7ef9809 - - -] Exception during message handling: error removing image
2016-11-23 10:18:28.276 6039 TRACE oslo_messaging.rpc.dispatcher Traceback (most recent call last):
2016-11-23 10:18:28.276 6039 TRACE oslo_messaging.rpc.dispatcher   File "/usr/lib/python2.7/site-packages/oslo_messaging/rpc/dispatcher.py", line 142, in _dispatch_and_reply
2016-11-23 10:18:28.276 6039 TRACE oslo_messaging.rpc.dispatcher     executor_callback))
2016-11-23 10:18:28.276 6039 TRACE oslo_messaging.rpc.dispatcher   File "/usr/lib/python2.7/site-packages/oslo_messaging/rpc/dispatcher.py", line 186, in _dispatch
2016-11-23 10:18:28.276 6039 TRACE oslo_messaging.rpc.dispatcher     executor_callback)
2016-11-23 10:18:28.276 6039 TRACE oslo_messaging.rpc.dispatcher   File "/usr/lib/python2.7/site-packages/oslo_messaging/rpc/dispatcher.py", line 130, in _do_dispatch
2016-11-23 10:18:28.276 6039 TRACE oslo_messaging.rpc.dispatcher     result = func(ctxt, **new_args)
2016-11-23 10:18:28.276 6039 TRACE oslo_messaging.rpc.dispatcher   File "/usr/lib/python2.7/site-packages/nova/compute/manager.py", line 6911, in rebuild_instance
2016-11-23 10:18:28.276 6039 TRACE oslo_messaging.rpc.dispatcher     preserve_ephemeral=preserve_ephemeral)
2016-11-23 10:18:28.276 6039 TRACE oslo_messaging.rpc.dispatcher   File "/usr/lib/python2.7/site-packages/nova/compute/manager.py", line 461, in decorated_function
2016-11-23 10:18:28.276 6039 TRACE oslo_messaging.rpc.dispatcher     return function(self, context, *args, **kwargs)
2016-11-23 10:18:28.276 6039 TRACE oslo_messaging.rpc.dispatcher   File "/usr/lib/python2.7/site-packages/oslo_messaging/rpc/server.py", line 142, in inner
2016-11-23 10:18:28.276 6039 TRACE oslo_messaging.rpc.dispatcher     return func(*args, **kwargs)
2016-11-23 10:18:28.276 6039 TRACE oslo_messaging.rpc.dispatcher   File "/usr/lib/python2.7/site-packages/nova/exception.py", line 88, in wrapped
2016-11-23 10:18:28.276 6039 TRACE oslo_messaging.rpc.dispatcher     payload)
2016-11-23 10:18:28.276 6039 TRACE oslo_messaging.rpc.dispatcher   File "/usr/lib/python2.7/site-packages/oslo_utils/excutils.py", line 85, in __exit__
2016-11-23 10:18:28.276 6039 TRACE oslo_messaging.rpc.dispatcher     six.reraise(self.type_, self.value, self.tb)
2016-11-23 10:18:28.276 6039 TRACE oslo_messaging.rpc.dispatcher   File "/usr/lib/python2.7/site-packages/nova/exception.py", line 71, in wrapped
2016-11-23 10:18:28.276 6039 TRACE oslo_messaging.rpc.dispatcher     return f(self, context, *args, **kw)
2016-11-23 10:18:28.276 6039 TRACE oslo_messaging.rpc.dispatcher   File "/usr/lib/python2.7/site-packages/nova/compute/manager.py", line 341, in decorated_function
2016-11-23 10:18:28.276 6039 TRACE oslo_messaging.rpc.dispatcher     LOG.warning(msg, e, instance_uuid=instance_uuid)
2016-11-23 10:18:28.276 6039 TRACE oslo_messaging.rpc.dispatcher   File "/usr/lib/python2.7/site-packages/oslo_utils/excutils.py", line 85, in __exit__
2016-11-23 10:18:28.276 6039 TRACE oslo_messaging.rpc.dispatcher     six.reraise(self.type_, self.value, self.tb)
2016-11-23 10:18:28.276 6039 TRACE oslo_messaging.rpc.dispatcher   File "/usr/lib/python2.7/site-packages/nova/compute/manager.py", line 312, in decorated_function
2016-11-23 10:18:28.276 6039 TRACE oslo_messaging.rpc.dispatcher     return function(self, context, *args, **kwargs)
2016-11-23 10:18:28.276 6039 TRACE oslo_messaging.rpc.dispatcher   File "/usr/lib/python2.7/site-packages/nova/compute/manager.py", line 391, in decorated_function
2016-11-23 10:18:28.276 6039 TRACE oslo_messaging.rpc.dispatcher     return function(self, context, *args, **kwargs)
2016-11-23 10:18:28.276 6039 TRACE oslo_messaging.rpc.dispatcher   File "/usr/lib/python2.7/site-packages/nova/compute/manager.py", line 369, in decorated_function
2016-11-23 10:18:28.276 6039 TRACE oslo_messaging.rpc.dispatcher     kwargs['instance'], e, sys.exc_info())
2016-11-23 10:18:28.276 6039 TRACE oslo_messaging.rpc.dispatcher   File "/usr/lib/python2.7/site-packages/oslo_utils/excutils.py", line 85, in __exit__
2016-11-23 10:18:28.276 6039 TRACE oslo_messaging.rpc.dispatcher     six.reraise(self.type_, self.value, self.tb)
2016-11-23 10:18:28.276 6039 TRACE oslo_messaging.rpc.dispatcher   File "/usr/lib/python2.7/site-packages/nova/compute/manager.py", line 357, in decorated_function
2016-11-23 10:18:28.276 6039 TRACE oslo_messaging.rpc.dispatcher     return function(self, context, *args, **kwargs)
2016-11-23 10:18:28.276 6039 TRACE oslo_messaging.rpc.dispatcher   File "/usr/lib/python2.7/site-packages/nova/compute/manager.py", line 3150, in rebuild_instance
2016-11-23 10:18:28.276 6039 TRACE oslo_messaging.rpc.dispatcher     self._rebuild_default_impl(**kwargs)
2016-11-23 10:18:28.276 6039 TRACE oslo_messaging.rpc.dispatcher   File "/usr/lib/python2.7/site-packages/nova/compute/manager.py", line 2995, in _rebuild_default_impl
2016-11-23 10:18:28.276 6039 TRACE oslo_messaging.rpc.dispatcher     block_device_info=new_block_device_info)
2016-11-23 10:18:28.276 6039 TRACE oslo_messaging.rpc.dispatcher   File "/usr/lib/python2.7/site-packages/nova/virt/libvirt/driver.py", line 2585, in spawn
2016-11-23 10:18:28.276 6039 TRACE oslo_messaging.rpc.dispatcher     admin_pass=admin_password)
2016-11-23 10:18:28.276 6039 TRACE oslo_messaging.rpc.dispatcher   File "/usr/lib/python2.7/site-packages/nova/virt/libvirt/driver.py", line 2991, in _create_image
2016-11-23 10:18:28.276 6039 TRACE oslo_messaging.rpc.dispatcher     instance, configdrive_path, 'disk.config' + suffix)
2016-11-23 10:18:28.276 6039 TRACE oslo_messaging.rpc.dispatcher   File "/usr/lib/python2.7/site-packages/nova/virt/libvirt/imagebackend.py", line 816, in import_file
2016-11-23 10:18:28.276 6039 TRACE oslo_messaging.rpc.dispatcher     self.driver.remove_image(name)
2016-11-23 10:18:28.276 6039 TRACE oslo_messaging.rpc.dispatcher   File "/usr/lib/python2.7/site-packages/nova/virt/libvirt/rbd_utils.py", line 276, in remove_image
2016-11-23 10:18:28.276 6039 TRACE oslo_messaging.rpc.dispatcher     rbd.RBD().remove(client.ioctx, name)
2016-11-23 10:18:28.276 6039 TRACE oslo_messaging.rpc.dispatcher   File "/usr/lib/python2.7/site-packages/rbd.py", line 303, in remove
2016-11-23 10:18:28.276 6039 TRACE oslo_messaging.rpc.dispatcher     raise make_ex(ret, 'error removing image')
2016-11-23 10:18:28.276 6039 TRACE oslo_messaging.rpc.dispatcher ImageBusy: error removing image
2016-11-23 10:18:28.276 6039 TRACE oslo_messaging.rpc.dispatcher
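
The traceback ends in rbd_utils.remove_image(), where rbd.RBD().remove() raises ImageBusy. One common way to tolerate a transient ImageBusy is to retry the removal a bounded number of times before giving up. The following is only a minimal sketch of that idea, not the upstream fix and not nova's actual code: the ImageBusy class is a local stand-in for rbd.ImageBusy so the example runs without the Ceph bindings, and remove_image_with_retry, its parameters, and the image name are hypothetical.

```python
import time


class ImageBusy(Exception):
    """Local stand-in for rbd.ImageBusy (assumption: lets the sketch run without Ceph)."""


def remove_image_with_retry(remove, name, attempts=10, delay=1.0, sleep=time.sleep):
    """Call remove(name), retrying while it raises ImageBusy.

    `remove` is any callable that deletes the named image, for example a
    wrapper around rbd.RBD().remove(). Returns the number of attempts used.
    Re-raises ImageBusy if the image is still busy on the last attempt.
    """
    for attempt in range(1, attempts + 1):
        try:
            remove(name)
            return attempt
        except ImageBusy:
            if attempt == attempts:
                raise  # still busy after the final attempt: surface the error
            sleep(delay)  # wait before trying again


if __name__ == "__main__":
    state = {"calls": 0}

    def fake_remove(name):
        # Simulate an image that is busy for the first two removal attempts.
        state["calls"] += 1
        if state["calls"] < 3:
            raise ImageBusy("error removing image")

    used = remove_image_with_retry(fake_remove, "example_disk.config", delay=0)
    print("removed after %d attempts" % used)
```

Whether retrying is sufficient depends on why the image is busy; if the image stays busy for longer than attempts * delay, the exception still propagates and the instance would still go to error state.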


Version-Release number of selected component (if applicable):


How reproducible:
100%

Steps to Reproduce:
1. Evacuate an instance from the compute node.

Actual results:
The instance is in error state.

Expected results:
The instance is not in error state.

Additional info:
Looks similar to the following bugs, which were reported against different versions:
https://bugzilla.redhat.com/show_bug.cgi?id=1241613
https://bugzilla.redhat.com/show_bug.cgi?id=1287696

Comment 4 Eoghan Glynn 2016-11-30 13:55:52 UTC
Requires backport of https://review.openstack.org/169446 which is in 2014.2.4 upstream but not 2014.2.3 (on which latest OSP7 is based).

Comment 7 Jeremy 2016-11-30 14:48:26 UTC
Hello,
The customer is on OSP6.

Comment 8 GE Scott Knauss 2016-11-30 14:54:38 UTC
Right. OSP6. They also tested in OSP7 and saw the same problem. I believe some of the sosreports they uploaded were from the OSP7 environment, which caused the confusion. Their primary concern at this time, though, is OSP6.

-Scott

