Bug 997840 - live block migration stopped working, claiming DestinationDiskExists
Status: CLOSED ERRATA
Product: Red Hat OpenStack
Classification: Red Hat
Component: openstack-nova
Version: 3.0
Hardware: x86_64 Linux
Priority: urgent Severity: high
Target Milestone: z2
Target Release: 3.0
Assigned To: Xavier Queralt
QA Contact: Jaroslav Henner
Whiteboard: storage
Keywords: Regression, ZStream
Depends On:
Blocks: 993100
Reported: 2013-08-16 05:35 EDT by Jaroslav Henner
Modified: 2014-06-18 03:00 EDT

See Also:
Fixed In Version: openstack-nova-2013.1.3-2.el6ost
Doc Type: Bug Fix
Last Closed: 2013-09-03 16:21:48 EDT
Type: Bug


Attachments
log (9.54 KB, text/plain)
2013-08-16 05:39 EDT, Jaroslav Henner


External Trackers
Tracker ID Priority Status Summary Last Updated
Launchpad 1193359 None None None Never
OpenStack gerrit 42588 None None None Never
Red Hat Product Errata RHSA-2013:1199 normal SHIPPED_LIVE Moderate: openstack-nova security and bug fix update 2013-09-03 20:16:56 EDT

Description Jaroslav Henner 2013-08-16 05:35:01 EDT
Description of problem:
Live block migration fails because nova claims it found a disk that shouldn't be there.

Version-Release number of selected component (if applicable):
openstack-nova-common-2013.1.3-1.el6ost.noarch

How reproducible:
always

Steps to Reproduce:
1. nova live-migration --block-migrate $VM
2. check the logs

Actual results:
no migration, errors in logs

Expected results:
migrated, no error

Additional info:
Comment 1 Jaroslav Henner 2013-08-16 05:39:16 EDT
Created attachment 787211
log
Comment 2 Jaroslav Henner 2013-08-16 11:58:33 EDT
I tried to reproduce on a fresh deployment of 2013-08-05.1. The migration passed.
I updated to 2013-08-15.1. Migration passed. So I wonder why it started failing on my production deployment.
Comment 3 Jaroslav Henner 2013-08-16 12:27:07 EDT
I am not sure whether I made an error, but now, with puddle 2013-08-15.1, I can reproduce it:


for node in node-01.lithium node-02.lithium; do echo $node; ssh $node ls /var/lib/nova/instances; echo; done
node-01.lithium                                                                    
Warning: Permanently added 'node-01.lithium' (RSA) to the list of known hosts.
a64c03f1-3d58-4f75-b38a-a526730ca431
_base
locks

node-02.lithium
Warning: Permanently added 'node-02.lithium' (RSA) to the list of known hosts.
a64c03f1-3d58-4f75-b38a-a526730ca431

+-------------------------------------+----------------------------------------------------------+
| Property                            | Value                                                    |
+-------------------------------------+----------------------------------------------------------+
| status                              | BUILD                                                    |
| updated                             | 2013-08-16T16:23:41Z                                     |
| OS-EXT-STS:task_state               | block_device_mapping                                     |
| OS-EXT-SRV-ATTR:host                | node-01.lithium.rhev.lab.eng.brq.redhat.com              |
| key_name                            | None                                                     |
| image                               | cirros1 (ab24ccbd-4c89-4444-b0b2-a06a79c44306)           |
| hostId                              | 911886e953f179550c30da8760ac9d00bd0f5aa76dfcb5c328d7c1e3 |
| OS-EXT-STS:vm_state                 | building                                                 |
| OS-EXT-SRV-ATTR:instance_name       | instance-0000000c                                        |
| OS-EXT-SRV-ATTR:hypervisor_hostname | node-01.lithium.rhev.lab.eng.brq.redhat.com              |
| flavor                              | m1.tiny (1)                                              |
| id                                  | a64c03f1-3d58-4f75-b38a-a526730ca431                     |
...
| config_drive                        |                                                          |
+-------------------------------------+----------------------------------------------------------+
[root@folsom-rhel6 ~(keystone_admin)]# nova live-migration --block-migrate  foo
[root@folsom-rhel6 ~(keystone_admin)]# nova show foo
+-------------------------------------+----------------------------------------------------------+
| Property                            | Value                                                    |
+-------------------------------------+----------------------------------------------------------+
| status                              | ACTIVE                                                   |
| updated                             | 2013-08-16T16:23:58Z                                     |
| OS-EXT-STS:task_state               | None                                                     |
| OS-EXT-SRV-ATTR:host                | node-01.lithium.rhev.lab.eng.brq.redhat.com              |
| key_name                            | None                                                     |
| image                               | cirros1 (ab24ccbd-4c89-4444-b0b2-a06a79c44306)           |
| hostId                              | 911886e953f179550c30da8760ac9d00bd0f5aa76dfcb5c328d7c1e3 |
| OS-EXT-STS:vm_state                 | active                                                   |
| OS-EXT-SRV-ATTR:instance_name       | instance-0000000c                                        |
| OS-EXT-SRV-ATTR:hypervisor_hostname | node-01.lithium.rhev.lab.eng.brq.redhat.com              |
...
| config_drive                        |                                                          |
+-------------------------------------+----------------------------------------------------------+

I saw that the directory appeared on the destination host and then disappeared again. I will retest it. I still think it is a regression, because I was moving VMs a lot in Grizzly OpenStack.
Comment 4 Jaroslav Henner 2013-08-16 18:26:12 EDT
Reproduced. It really doesn't happen in 2013-08-05.1, but it does happen in 2013-08-15.1. I must have been too quick. Checking for the MIGRATING status of the VM is not enough.
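A minimal sketch of a stricter check than looking for MIGRATING alone. The function name and the outcome labels are hypothetical, not part of nova or novaclient; the inputs are the host recorded before the migration plus the host, status and task_state fields that "nova show" prints. It treats an instance that settles back to ACTIVE on its original host (as in comment 3) as not migrated.

```shell
# Hypothetical helper, not part of nova or novaclient.
# Arguments: source host, current host, status, task_state
# (the last three as printed by "nova show").
migration_outcome() {
    src_host=$1
    cur_host=$2
    status=$3
    task_state=$4

    # A pending task or a MIGRATING status means nova is still working.
    if [ "$task_state" != "None" ] || [ "$status" = "MIGRATING" ]; then
        echo "in-progress"
    elif [ "$status" = "ACTIVE" ] && [ "$cur_host" != "$src_host" ]; then
        echo "migrated"
    elif [ "$status" = "ACTIVE" ]; then
        # Settled on the original host: the instance did not move.
        echo "not-migrated"
    else
        echo "unknown"
    fi
}
```

Polling this until the result is no longer "in-progress" avoids declaring success or failure while the migration is still running.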
Comment 5 Xavier Queralt 2013-08-19 04:57:13 EDT
Proposed backport to the stable release
Comment 9 Yogev Rabl 2013-08-25 08:42:50 EDT
We need more info in order to verify this bug:
1. What is the setup of the RHOS components?
2. What is the storage setup?
3. Can you please add the Cinder logs?
4. Please elaborate on which logs we should check (step 2).

Please return the bug to me to verify.

Thanks.
Comment 10 Xavier Queralt 2013-08-26 04:24:21 EDT
(In reply to Yogev Rabl from comment #9)
> We need more info in order to verify this bug:
> 1. What is the setup of the RHOS components?
> 2. What is the storage setup?
> 3. Can you please add the Cinder logs?
> 4. Please elaborate on which logs we should check (step 2).
>
> Please return the bug to me to verify.
>
> Thanks.

1. A plain RHOS setup with at least two compute nodes
2. Without shared storage (i.e. instance disks are local)
3. Cinder has nothing to do with this bug. Live block migration moves an instance's disk from one host to another. Don't confuse it with a Cinder volume: it is the local disk created from an image when the instance is created.
4. Consequently, the logs you have to check are the compute logs. Look for the DestinationDiskExists exception, which shouldn't be there after the fix.
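The log check in point 4 could be scripted roughly like this. It is only a sketch: the function name is made up, and the log path in the usage comment is an assumption about the default location on a compute node, so adjust it to your deployment.

```shell
# Hypothetical helper: succeeds (exit status 0) if the given log file
# mentions the DestinationDiskExists exception, fails otherwise.
log_has_destination_disk_exists() {
    grep -q 'DestinationDiskExists' "$1"
}

# Usage on each compute node (the path is an assumption):
#   if log_has_destination_disk_exists /var/log/nova/compute.log; then
#       echo "exception still present"
#   fi
```

After the fix, the check should fail on both the source and the destination host.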


Steps to reproduce:
1. Create an instance from an image (no volumes).
2. Run "nova show <instance name>" and check which host the instance is running on (see the property OS-EXT-SRV-ATTR:host).
3. Run "nova live-migration --block-migrate <instance name>".
4. Check the compute log file on both hosts; you shouldn't find the DestinationDiskExists exception.
5. Run "nova show <instance name>" again: the host for this instance must have changed and the instance must be in status ACTIVE.
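Steps 2 and 5 compare fields from the "nova show" table, which can be pulled out with a small filter. This is a sketch under the assumption that the output keeps the two-column "| Property | Value |" layout shown in the comments above; nova_field is a made-up name, not a novaclient command.

```shell
# Hypothetical helper: reads "nova show" table output on stdin and
# prints the value of the property named in $1 (nothing if absent).
nova_field() {
    awk -F'|' -v key="$1" '
        NF >= 3 {
            name = $2
            value = $3
            # Trim the padding around both table cells.
            gsub(/^[ \t]+|[ \t]+$/, "", name)
            gsub(/^[ \t]+|[ \t]+$/, "", value)
            if (name == key) { print value; exit }
        }'
}

# Usage:
#   before=$(nova show "$VM" | nova_field OS-EXT-SRV-ATTR:host)
#   nova live-migration --block-migrate "$VM"
#   ...wait for the migration to settle...
#   after=$(nova show "$VM" | nova_field OS-EXT-SRV-ATTR:host)
#   [ "$before" != "$after" ] && echo "host changed"
```

Border rows and the "..." elision lines contain no "|" separators, so the filter skips them.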
Comment 11 Jaroslav Henner 2013-08-26 09:24:49 EDT
After upgrade, it works again:
[root@controller ~(keystone_admin)]$ nova show f5c80071-8a4f-4805-8aaa-1487fafca6af | grep host
| OS-EXT-SRV-ATTR:host                | master-01...                          |
| hostId                              | 8c875ab353cd54d8cb39ba4169f51a66c5999a185d598f9754a2e974            |
| OS-EXT-SRV-ATTR:hypervisor_hostname | master-01...                          |
[root@controller ~(keystone_admin)]$ nova live-migration  f5c80071-8a4f-4805-8aaa-1487fafca6af --block-migrate 
[root@controller ~(keystone_admin)]$ nova show f5c80071-8a4f-4805-8aaa-1487fafca6af | grep host
| OS-EXT-SRV-ATTR:host                | master-02...                          |
| hostId                              | e063be730b5e973391d5353e5ce89f1965bedaa2acde75ee08624079            |
| OS-EXT-SRV-ATTR:hypervisor_hostname | master-02...                          |
[root@controller ~(keystone_admin)]$ nova live-migration  f5c80071-8a4f-4805-8aaa-1487fafca6af --block-migrate 
[root@controller ~(keystone_admin)]$ nova show f5c80071-8a4f-4805-8aaa-1487fafca6af | grep host
| OS-EXT-SRV-ATTR:host                | master-01...                          |
| hostId                              | 8c875ab353cd54d8cb39ba4169f51a66c5999a185d598f9754a2e974            |
| OS-EXT-SRV-ATTR:hypervisor_hostname | master-01...                          |
Comment 13 errata-xmlrpc 2013-09-03 16:21:48 EDT
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

http://rhn.redhat.com/errata/RHSA-2013-1199.html
