Bug 1161261 - VmReplicateDiskFinishVDSCommand is not executed when Live Storage Migration (LSM) is initiated, leaving unfinished job
Status: CLOSED CURRENTRELEASE
Product: Red Hat Enterprise Virtualization Manager
Classification: Red Hat
Component: ovirt-engine
Version: 3.4.3
Hardware/OS: All / Linux
Priority/Severity: high / high
Target Release: 3.5.0
Assigned To: Daniel Erez
QA Contact: lkuchlan
Whiteboard: storage
Reported: 2014-11-06 14:12 EST by Bimal Chollera
Modified: 2016-03-16 14:57 EDT
CC: 13 users

Doc Type: Bug Fix
Last Closed: 2015-02-16 14:09:01 EST
Type: Bug
oVirt Team: Storage
Attachments:
images (345.44 KB, application/x-gzip), attached 2014-12-01 10:20 EST by lkuchlan


External Trackers:
Red Hat Knowledge Base (Solution) 1298893, last updated 2016-03-16 14:57 EDT

Description Bimal Chollera 2014-11-06 14:12:36 EST
Description of problem:

After a Live Storage Migration (LSM), a disk remains in "locked" status and the LSM sequence doesn't complete on the engine side. VmReplicateDiskFinishVDSCommand and DeleteImageGroupVDSCommand are never executed.

The associated job table entry in the RHEV database remains "STARTED".

 correlation_id |                job_id                |   action_type   |                     description                     | status  
----------------+--------------------------------------+-----------------+-----------------------------------------------------+---------
 edf0edf        | de4e97cf-e747-4657-bd07-e60aa278c0f1 | LiveMigrateDisk | Migrating Disk lsm-vm_Disk1 from LSM_GFW to NFS_GFW | STARTED
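A stuck job like the one above can be located with a query along these lines against the engine database (a sketch assuming the standard engine `job` table shown in the output above; verify the schema on your engine version):

```sql
-- Sketch: list LiveMigrateDisk jobs left in STARTED after an LSM hang.
SELECT correlation_id, job_id, action_type, description, status
FROM job
WHERE action_type = 'LiveMigrateDisk'
  AND status = 'STARTED';
```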

In the Admin Portal, the Disks menu will report the disk attached to the VM in "locked" status.

The engine sequence below completes but VmReplicateDiskFinishVDSCommand and DeleteImageGroupVDSCommand are never executed after the SyncImageGroupDataVDSCommand.

CloneImageGroupStructureVDSCommand
VmReplicateDiskStartVDSCommand
SyncImageGroupDataVDSCommand
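The expected flow can be sketched as a fixed command sequence whose job status depends on how far execution got. This is an illustrative Python model, not actual ovirt-engine code; the command names come from the log excerpts above, everything else is hypothetical:

```python
# Illustrative model of the engine-side LSM command sequence.
# The class names are real VDS commands from the logs; the
# status logic is a simplified sketch, not ovirt-engine code.
EXPECTED_SEQUENCE = [
    "CloneImageGroupStructureVDSCommand",
    "VmReplicateDiskStartVDSCommand",
    "SyncImageGroupDataVDSCommand",
    "VmReplicateDiskFinishVDSCommand",  # never executed in the failing case
    "DeleteImageGroupVDSCommand",       # never executed in the failing case
]

def job_status(executed):
    """Map the list of commands actually executed to a job status."""
    if executed == EXPECTED_SEQUENCE:
        return "FINISHED"
    # A strict prefix means the flow stalled mid-sequence, which is
    # exactly the STARTED row left behind in the engine job table.
    if executed == EXPECTED_SEQUENCE[:len(executed)]:
        return "STARTED"
    return "FAILED"
```

In the failing case described here, only the first three commands run, so the model yields "STARTED", matching the job table row above.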

I have been able to recreate this several times by live migrating 20 (arbitrary number) disks at once.

Version-Release number of selected component (if applicable):

Test environment:

   - RHEV-M 3.4.3 
   - Single host with (essentially) vdsm-4.14.13-2

How reproducible:

I have encountered this problem every time I live migrate 20 disks at once.

Steps to Reproduce:

1. In my specific case, I created a pool of 20 VMs based on a template in an NFS data domain.
2. I started all 20 VMs.
3. I copied the template to a second NFS domain.
4. I live migrated all 20 disks to the second NFS domain.

Actual results:

One of the 20 failed as described above.

Expected results:

All of the LSMs should complete, and the associated jobs in the database should be marked as "FINISHED".
Comment 8 Daniel Erez 2014-11-18 09:44:31 EST
Should be fixed in 3.5 build.
Comment 9 lkuchlan 2014-12-01 10:20:29 EST
Created attachment 963328 [details]
images

Tested using RHEVM 3.5 vt11
All of the LSMs completed and the associated jobs in the database were marked as "FINISHED".
Comment 10 Allon Mureinik 2015-02-16 14:09:01 EST
RHEV-M 3.5.0 has been released, closing this bug.
