Bug 1150856 - Live Migration Hanging with RBD Storage
Summary: Live Migration Hanging with RBD Storage
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat OpenStack
Classification: Red Hat
Component: openstack-nova
Version: 5.0 (RHEL 7)
Hardware: Unspecified
OS: Unspecified
Priority: high
Severity: high
Target Milestone: z3
Target Release: 5.0 (RHEL 7)
Assignee: Pádraig Brady
QA Contact: Yogev Rabl
URL:
Whiteboard:
Duplicates: 1146576
Depends On:
Blocks: rhelosp_ceph_integration 1151150 1165722
Reported: 2014-10-09 05:03 UTC by Ken Schroeder
Modified: 2019-09-09 13:47 UTC

Fixed In Version: openstack-nova-2014.1.3-4.el7ost
Doc Type: Bug Fix
Doc Text:
The libvirt driver previously handled shared file systems incorrectly (for example, using NFS or Ceph), and live migration would fail or hang with exceptions. With this update, raw or qcow images are now treated appropriately on shared file systems, and live migration now works reliably and efficiently.
Clone Of:
Clones: 1165722 1196350
Environment:
Last Closed: 2014-12-09 16:47:28 UTC
Target Upstream Version:
Embargoed:


Links
System ID Private Priority Status Summary Last Updated
OpenStack gerrit 111082 0 None None None Never

Description Ken Schroeder 2014-10-09 05:03:18 UTC
Trying to validate support for live migration of instances with OSP5 and RHEL7.  Nova configuration is using images_type rbd and both source and dest are able to create rbd backed volumes properly.

nova.conf has:
live_migration_flag=VIR_MIGRATE_UNDEFINE_SOURCE,VIR_MIGRATE_PEER2PEER,VIR_MIGRATE_LIVE

/etc/libvirt/libvirtd.conf contains: listen_tls=0, auth_tcp="none"
/etc/sysconfig/libvirtd contains: LIBVIRTD_ARGS="--listen"

Currently, instances get stuck in the migrating state and never error out or complete.  I retrieved the stack trace below from nova-compute.log after enabling debug.

2014-10-09 04:45:59.342 6282 ERROR nova.compute.manager [req-dcef86a2-81b4-4870-becd-ade01f200489 3671e07e02c440198b4acdd0e12c747e 7edb1fe949614b30ab04dc256a5667fe] [instance: ccfc08c6-2068-408d-9f6f-4959b2f87afa] Pre live migration failed at svl6-csl-b-nova1-003
2014-10-09 04:45:59.342 6282 TRACE nova.compute.manager [instance: ccfc08c6-2068-408d-9f6f-4959b2f87afa] Traceback (most recent call last):
2014-10-09 04:45:59.342 6282 TRACE nova.compute.manager [instance: ccfc08c6-2068-408d-9f6f-4959b2f87afa]   File "/usr/lib/python2.7/site-packages/nova/compute/manager.py", line 4543, in live_migration
2014-10-09 04:45:59.342 6282 TRACE nova.compute.manager [instance: ccfc08c6-2068-408d-9f6f-4959b2f87afa]     block_migration, disk, dest, migrate_data)
2014-10-09 04:45:59.342 6282 TRACE nova.compute.manager [instance: ccfc08c6-2068-408d-9f6f-4959b2f87afa]   File "/usr/lib/python2.7/site-packages/nova/compute/rpcapi.py", line 595, in pre_live_migration
2014-10-09 04:45:59.342 6282 TRACE nova.compute.manager [instance: ccfc08c6-2068-408d-9f6f-4959b2f87afa]     disk=disk, migrate_data=migrate_data)
2014-10-09 04:45:59.342 6282 TRACE nova.compute.manager [instance: ccfc08c6-2068-408d-9f6f-4959b2f87afa]   File "/usr/lib/python2.7/site-packages/oslo/messaging/rpc/client.py", line 150, in call
2014-10-09 04:45:59.342 6282 TRACE nova.compute.manager [instance: ccfc08c6-2068-408d-9f6f-4959b2f87afa]     wait_for_reply=True, timeout=timeout)
2014-10-09 04:45:59.342 6282 TRACE nova.compute.manager [instance: ccfc08c6-2068-408d-9f6f-4959b2f87afa]   File "/usr/lib/python2.7/site-packages/oslo/messaging/transport.py", line 90, in _send
2014-10-09 04:45:59.342 6282 TRACE nova.compute.manager [instance: ccfc08c6-2068-408d-9f6f-4959b2f87afa]     timeout=timeout)
2014-10-09 04:45:59.342 6282 TRACE nova.compute.manager [instance: ccfc08c6-2068-408d-9f6f-4959b2f87afa]   File "/usr/lib/python2.7/site-packages/oslo/messaging/_drivers/amqpdriver.py", line 412, in send
2014-10-09 04:45:59.342 6282 TRACE nova.compute.manager [instance: ccfc08c6-2068-408d-9f6f-4959b2f87afa]     return self._send(target, ctxt, message, wait_for_reply, timeout)
2014-10-09 04:45:59.342 6282 TRACE nova.compute.manager [instance: ccfc08c6-2068-408d-9f6f-4959b2f87afa]   File "/usr/lib/python2.7/site-packages/oslo/messaging/_drivers/amqpdriver.py", line 405, in _send
2014-10-09 04:45:59.342 6282 TRACE nova.compute.manager [instance: ccfc08c6-2068-408d-9f6f-4959b2f87afa]     raise result
2014-10-09 04:45:59.342 6282 TRACE nova.compute.manager [instance: ccfc08c6-2068-408d-9f6f-4959b2f87afa] DestinationDiskExists_Remote: The supplied disk path (/var/lib/nova/instances/ccfc08c6-2068-408d-9f6f-4959b2f87afa) already exists, it is expected not to exist.
2014-10-09 04:45:59.342 6282 TRACE nova.compute.manager [instance: ccfc08c6-2068-408d-9f6f-4959b2f87afa] Traceback (most recent call last):
2014-10-09 04:45:59.342 6282 TRACE nova.compute.manager [instance: ccfc08c6-2068-408d-9f6f-4959b2f87afa] 
2014-10-09 04:45:59.342 6282 TRACE nova.compute.manager [instance: ccfc08c6-2068-408d-9f6f-4959b2f87afa]   File "/usr/lib/python2.7/site-packages/oslo/messaging/rpc/dispatcher.py", line 133, in _dispatch_and_reply
2014-10-09 04:45:59.342 6282 TRACE nova.compute.manager [instance: ccfc08c6-2068-408d-9f6f-4959b2f87afa]     incoming.message))
2014-10-09 04:45:59.342 6282 TRACE nova.compute.manager [instance: ccfc08c6-2068-408d-9f6f-4959b2f87afa] 
2014-10-09 04:45:59.342 6282 TRACE nova.compute.manager [instance: ccfc08c6-2068-408d-9f6f-4959b2f87afa]   File "/usr/lib/python2.7/site-packages/oslo/messaging/rpc/dispatcher.py", line 176, in _dispatch
2014-10-09 04:45:59.342 6282 TRACE nova.compute.manager [instance: ccfc08c6-2068-408d-9f6f-4959b2f87afa]     return self._do_dispatch(endpoint, method, ctxt, args)
2014-10-09 04:45:59.342 6282 TRACE nova.compute.manager [instance: ccfc08c6-2068-408d-9f6f-4959b2f87afa] 
2014-10-09 04:45:59.342 6282 TRACE nova.compute.manager [instance: ccfc08c6-2068-408d-9f6f-4959b2f87afa]   File "/usr/lib/python2.7/site-packages/oslo/messaging/rpc/dispatcher.py", line 122, in _do_dispatch
2014-10-09 04:45:59.342 6282 TRACE nova.compute.manager [instance: ccfc08c6-2068-408d-9f6f-4959b2f87afa]     result = getattr(endpoint, method)(ctxt, **new_args)
2014-10-09 04:45:59.342 6282 TRACE nova.compute.manager [instance: ccfc08c6-2068-408d-9f6f-4959b2f87afa] 
2014-10-09 04:45:59.342 6282 TRACE nova.compute.manager [instance: ccfc08c6-2068-408d-9f6f-4959b2f87afa]   File "/usr/lib/python2.7/site-packages/nova/compute/manager.py", line 393, in decorated_function
2014-10-09 04:45:59.342 6282 TRACE nova.compute.manager [instance: ccfc08c6-2068-408d-9f6f-4959b2f87afa]     return function(self, context, *args, **kwargs)
2014-10-09 04:45:59.342 6282 TRACE nova.compute.manager [instance: ccfc08c6-2068-408d-9f6f-4959b2f87afa] 
2014-10-09 04:45:59.342 6282 TRACE nova.compute.manager [instance: ccfc08c6-2068-408d-9f6f-4959b2f87afa]   File "/usr/lib/python2.7/site-packages/nova/exception.py", line 88, in wrapped
2014-10-09 04:45:59.342 6282 TRACE nova.compute.manager [instance: ccfc08c6-2068-408d-9f6f-4959b2f87afa]     payload)
2014-10-09 04:45:59.342 6282 TRACE nova.compute.manager [instance: ccfc08c6-2068-408d-9f6f-4959b2f87afa] 
2014-10-09 04:45:59.342 6282 TRACE nova.compute.manager [instance: ccfc08c6-2068-408d-9f6f-4959b2f87afa]   File "/usr/lib/python2.7/site-packages/nova/openstack/common/excutils.py", line 68, in __exit__
2014-10-09 04:45:59.342 6282 TRACE nova.compute.manager [instance: ccfc08c6-2068-408d-9f6f-4959b2f87afa]     six.reraise(self.type_, self.value, self.tb)
2014-10-09 04:45:59.342 6282 TRACE nova.compute.manager [instance: ccfc08c6-2068-408d-9f6f-4959b2f87afa] 
2014-10-09 04:45:59.342 6282 TRACE nova.compute.manager [instance: ccfc08c6-2068-408d-9f6f-4959b2f87afa]   File "/usr/lib/python2.7/site-packages/nova/exception.py", line 71, in wrapped
2014-10-09 04:45:59.342 6282 TRACE nova.compute.manager [instance: ccfc08c6-2068-408d-9f6f-4959b2f87afa]     return f(self, context, *args, **kw)
2014-10-09 04:45:59.342 6282 TRACE nova.compute.manager [instance: ccfc08c6-2068-408d-9f6f-4959b2f87afa] 
2014-10-09 04:45:59.342 6282 TRACE nova.compute.manager [instance: ccfc08c6-2068-408d-9f6f-4959b2f87afa]   File "/usr/lib/python2.7/site-packages/nova/compute/manager.py", line 303, in decorated_function
2014-10-09 04:45:59.342 6282 TRACE nova.compute.manager [instance: ccfc08c6-2068-408d-9f6f-4959b2f87afa]     e, sys.exc_info())
2014-10-09 04:45:59.342 6282 TRACE nova.compute.manager [instance: ccfc08c6-2068-408d-9f6f-4959b2f87afa] 
2014-10-09 04:45:59.342 6282 TRACE nova.compute.manager [instance: ccfc08c6-2068-408d-9f6f-4959b2f87afa]   File "/usr/lib/python2.7/site-packages/nova/openstack/common/excutils.py", line 68, in __exit__
2014-10-09 04:45:59.342 6282 TRACE nova.compute.manager [instance: ccfc08c6-2068-408d-9f6f-4959b2f87afa]     six.reraise(self.type_, self.value, self.tb)
2014-10-09 04:45:59.342 6282 TRACE nova.compute.manager [instance: ccfc08c6-2068-408d-9f6f-4959b2f87afa] 
2014-10-09 04:45:59.342 6282 TRACE nova.compute.manager [instance: ccfc08c6-2068-408d-9f6f-4959b2f87afa]   File "/usr/lib/python2.7/site-packages/nova/compute/manager.py", line 290, in decorated_function
2014-10-09 04:45:59.342 6282 TRACE nova.compute.manager [instance: ccfc08c6-2068-408d-9f6f-4959b2f87afa]     return function(self, context, *args, **kwargs)
2014-10-09 04:45:59.342 6282 TRACE nova.compute.manager [instance: ccfc08c6-2068-408d-9f6f-4959b2f87afa] 
2014-10-09 04:45:59.342 6282 TRACE nova.compute.manager [instance: ccfc08c6-2068-408d-9f6f-4959b2f87afa]   File "/usr/lib/python2.7/site-packages/nova/compute/manager.py", line 4499, in pre_live_migration
2014-10-09 04:45:59.342 6282 TRACE nova.compute.manager [instance: ccfc08c6-2068-408d-9f6f-4959b2f87afa]     migrate_data)
2014-10-09 04:45:59.342 6282 TRACE nova.compute.manager [instance: ccfc08c6-2068-408d-9f6f-4959b2f87afa] 
2014-10-09 04:45:59.342 6282 TRACE nova.compute.manager [instance: ccfc08c6-2068-408d-9f6f-4959b2f87afa]   File "/usr/lib/python2.7/site-packages/nova/virt/libvirt/driver.py", line 4616, in pre_live_migration
2014-10-09 04:45:59.342 6282 TRACE nova.compute.manager [instance: ccfc08c6-2068-408d-9f6f-4959b2f87afa]     raise exception.DestinationDiskExists(path=instance_dir)
2014-10-09 04:45:59.342 6282 TRACE nova.compute.manager [instance: ccfc08c6-2068-408d-9f6f-4959b2f87afa] 
2014-10-09 04:45:59.342 6282 TRACE nova.compute.manager [instance: ccfc08c6-2068-408d-9f6f-4959b2f87afa] DestinationDiskExists: The supplied disk path (/var/lib/nova/instances/ccfc08c6-2068-408d-9f6f-4959b2f87afa) already exists, it is expected not to exist.
2014-10-09 04:45:59.342 6282 TRACE nova.compute.manager [instance: ccfc08c6-2068-408d-9f6f-4959b2f87afa] 
2014-10-09 04:45:59.342 6282 TRACE nova.compute.manager [instance: ccfc08c6-2068-408d-9f6f-4959b2f87afa] 
2014-10-09 04:45:59.418 6282 DEBUG nova.openstack.common.lockutils [req-dcef86a2-81b4-4870-becd-ade01f200489 3671e07e02c440198b4acdd0e12c747e 7edb1fe949614b30ab04dc256a5667fe] Got semaphore "compute_resources" lock /usr/lib/python2.7/site-packages/nova/openstack/common/lockutils.py:168
2014-10-09 04:45:59.418 6282 DEBUG nova.openstack.common.lockutils [req-dcef86a2-81b4-4870-becd-ade01f200489 3671e07e02c440198b4acdd0e12c747e 7edb1fe949614b30ab04dc256a5667fe] Got semaphore / lock "update_usage" inner /usr/lib/python2.7/site-packages/nova/openstack/common/lockutils.py:248
2014-10-09 04:45:59.485 6282 DEBUG nova.openstack.common.lockutils [req-dcef86a2-81b4-4870-becd-ade01f200489 3671e07e02c440198b4acdd0e12c747e 7edb1fe949614b30ab04dc256a5667fe] Semaphore / lock released "update_usage" inner /usr/lib/python2.7/site-packages/nova/openstack/common/lockutils.py:252
2014-10-09 04:45:59.577 6282 ERROR oslo.messaging.rpc.dispatcher [-] Exception during message handling: The supplied disk path (/var/lib/nova/instances/ccfc08c6-2068-408d-9f6f-4959b2f87afa) already exists, it is expected not to exist.
Traceback (most recent call last):

  File "/usr/lib/python2.7/site-packages/oslo/messaging/rpc/dispatcher.py", line 133, in _dispatch_and_reply
    incoming.message))

  File "/usr/lib/python2.7/site-packages/oslo/messaging/rpc/dispatcher.py", line 176, in _dispatch
    return self._do_dispatch(endpoint, method, ctxt, args)

  File "/usr/lib/python2.7/site-packages/oslo/messaging/rpc/dispatcher.py", line 122, in _do_dispatch
    result = getattr(endpoint, method)(ctxt, **new_args)

  File "/usr/lib/python2.7/site-packages/nova/compute/manager.py", line 393, in decorated_function
    return function(self, context, *args, **kwargs)

  File "/usr/lib/python2.7/site-packages/nova/exception.py", line 88, in wrapped
    payload)

  File "/usr/lib/python2.7/site-packages/nova/openstack/common/excutils.py", line 68, in __exit__
    six.reraise(self.type_, self.value, self.tb)

  File "/usr/lib/python2.7/site-packages/nova/exception.py", line 71, in wrapped
    return f(self, context, *args, **kw)

  File "/usr/lib/python2.7/site-packages/nova/compute/manager.py", line 303, in decorated_function
    e, sys.exc_info())

  File "/usr/lib/python2.7/site-packages/nova/openstack/common/excutils.py", line 68, in __exit__
    six.reraise(self.type_, self.value, self.tb)

  File "/usr/lib/python2.7/site-packages/nova/compute/manager.py", line 290, in decorated_function
    return function(self, context, *args, **kwargs)

  File "/usr/lib/python2.7/site-packages/nova/compute/manager.py", line 4499, in pre_live_migration
    migrate_data)

  File "/usr/lib/python2.7/site-packages/nova/virt/libvirt/driver.py", line 4616, in pre_live_migration
    raise exception.DestinationDiskExists(path=instance_dir)

DestinationDiskExists: The supplied disk path (/var/lib/nova/instances/ccfc08c6-2068-408d-9f6f-4959b2f87afa) already exists, it is expected not to exist.
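
For anyone triaging similar logs, the sketch below pulls the traceback text out of the prefixed TRACE lines and returns the final exception line. It is only an illustration: the regular expression assumes the line prefix format seen in the log above (timestamp, PID, TRACE, logger name, optional instance tag), and the function names strip_trace_prefix and final_exception are made up for this example.

```python
import re

# Assumed prefix format, taken from the log lines in this report, e.g.
# "2014-10-09 04:45:59.342 6282 TRACE nova.compute.manager [instance: <uuid>] "
TRACE_PREFIX = re.compile(
    r"^\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\.\d+ \d+ TRACE \S+ "
    r"(?:\[instance: [0-9a-f-]+\] ?)?"
)


def strip_trace_prefix(lines):
    """Return the TRACE line bodies with the log prefix removed.

    Lines that are not TRACE lines, and TRACE lines with an empty body,
    are dropped.
    """
    bodies = []
    for line in lines:
        match = TRACE_PREFIX.match(line)
        if match is None:
            continue
        body = line[match.end():].rstrip()
        if body:
            bodies.append(body)
    return bodies


def final_exception(lines):
    """Return the last non-empty TRACE body, or None if there is none."""
    bodies = strip_trace_prefix(lines)
    return bodies[-1] if bodies else None
```

Run against the nova-compute.log excerpt above, final_exception should return the DestinationDiskExists line rather than the intermediate oslo.messaging frames.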

Comment 2 Pádraig Brady 2014-10-09 11:17:23 UTC
The exception suggests that is_shared_instance_path is not set correctly in pre_live_migration(). It's likely that https://review.openstack.org/#/c/111082/ will need backporting to fix those checks for RBD.
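
To illustrate the failure mode, here is a simplified sketch of the kind of destination-side check involved. It is not the nova source: the exception class is a local stand-in for nova.exception.DestinationDiskExists, and prepare_instance_dir is a hypothetical helper. The only parts taken from this report are the raise seen in the traceback and the is_shared_instance_path flag named above.

```python
import os


class DestinationDiskExists(Exception):
    """Local stand-in for nova.exception.DestinationDiskExists."""

    def __init__(self, path):
        super().__init__(
            "The supplied disk path (%s) already exists, "
            "it is expected not to exist." % path
        )
        self.path = path


def prepare_instance_dir(instance_dir, is_shared_instance_path):
    """Hypothetical helper mirroring the check on the destination host.

    If the instance directory is shared between hosts, it already exists
    and is left alone. Otherwise it must be created here, and finding it
    already present is treated as an error.

    Returns True if the directory was created, False if nothing was done.
    """
    if is_shared_instance_path:
        return False
    if os.path.exists(instance_dir):
        raise DestinationDiskExists(path=instance_dir)
    os.mkdir(instance_dir)
    return True
```

In this sketch, passing is_shared_instance_path=False for a directory that already exists on the destination reproduces the DestinationDiskExists error from the description, which is consistent with the flag being computed incorrectly for this setup.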

Comment 3 Ken Schroeder 2014-10-10 17:15:42 UTC
We're using RBD storage with devices configured through libvirt.  Access to the Ceph system is properly configured and working across all source/dest live-migration targets.

Comment 7 Yogev Rabl 2014-10-27 12:52:50 UTC
The verification failed. While trying to migrate the instances from one host to another, the logs show the same error messages.

Comment 10 Solly Ross 2014-11-12 17:55:03 UTC
*** Bug 1146576 has been marked as a duplicate of this bug. ***

Comment 15 Yogev Rabl 2014-12-04 14:56:50 UTC
verified in version: 
openstack-nova-cert-2014.1.3-9.el7ost.noarch
openstack-nova-novncproxy-2014.1.3-9.el7ost.noarch
python-nova-2014.1.3-9.el7ost.noarch
openstack-nova-console-2014.1.3-9.el7ost.noarch
openstack-nova-network-2014.1.3-9.el7ost.noarch
openstack-nova-common-2014.1.3-9.el7ost.noarch
openstack-nova-compute-2014.1.3-9.el7ost.noarch
openstack-nova-conductor-2014.1.3-9.el7ost.noarch
openstack-nova-scheduler-2014.1.3-9.el7ost.noarch
openstack-nova-api-2014.1.3-9.el7ost.noarch
python-novaclient-2.17.0-2.el7ost.noarch

Comment 20 Scott Lewis 2014-12-09 16:47:28 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://rhn.redhat.com/errata/RHBA-2014-1933.html

