Bug 1313744 - VM is inoperative after power off during Live storage migration
Status: CLOSED CURRENTRELEASE
Product: ovirt-engine
Classification: oVirt
Component: BLL.Storage
Version: 3.6.3.3
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: high
Target Milestone: ovirt-3.6.5
Target Release: 3.6.5
Assigned To: Daniel Erez
QA Contact: Elad
Keywords: Automation, Regression, Reopened
Depends On:
Blocks:
 
Reported: 2016-03-02 04:57 EST by Eyal Shenitzky
Modified: 2016-07-25 01:18 EDT

See Also:
Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2016-04-21 10:37:54 EDT
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: Storage
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
amureini: ovirt‑3.6.z?
rule-engine: blocker?
rule-engine: planning_ack?
tnisan: devel_ack+
rule-engine: testing_ack+


Attachments
engine logs (10.19 MB, text/plain)
2016-03-02 04:57 EST, Eyal Shenitzky
vdsm log (2.65 MB, text/plain)
2016-03-02 04:59 EST, Eyal Shenitzky
new engine log (330.82 KB, application/x-bzip)
2016-03-08 08:37 EST, Eyal Shenitzky


External Trackers
Tracker | ID | Priority | Status | Summary | Last Updated
Red Hat Knowledge Base (Solution) | 2461971 | None | None | None | 2016-07-25 01:18 EDT
oVirt gerrit | 54730 | master | MERGED | core: LiveMigrateDiskCommand - shared locks are unneeded | 2016-03-15 13:03 EDT
oVirt gerrit | 54776 | ovirt-engine-3.6 | MERGED | core: LiveMigrateDiskCommand - shared locks are unneeded | 2016-03-16 06:38 EDT

Description Eyal Shenitzky 2016-03-02 04:57:18 EST
Created attachment 1132215
engine logs

Description of problem:

Powering off a VM during live storage migration of a file-based disk, after the snapshot has been created, causes the VM to become inoperative: it can be neither run nor removed.

Engine action message:
Cannot <run\remove> VM. Disk <disk name> is being moved or copied

The snapshot cannot be removed either, because of the same message.

Version-Release number of selected component (if applicable):
Engine - 3.6.3.4-0.1.el6
VDSM - 4.17.23-0.el7ev.noarch

How reproducible:
100%

Steps to Reproduce:
1. Create VM with file based disk
2. Run the VM
3. Live migrate the disk
4. Power off the VM after the Live Storage Migration snapshot is created

Actual results:
The VM does power off but becomes inoperative, as described above.

Expected results:
The VM should power off cleanly and be able to run again.

Additional info:
VDSM and Engine logs attached.
Comment 1 Eyal Shenitzky 2016-03-02 04:59 EST
Created attachment 1132229
vdsm log
Comment 2 Allon Mureinik 2016-03-02 06:24:55 EST
Eyal - why is this marked as a regression? Can you attach the logs from a clean run?
Comment 3 Red Hat Bugzilla Rules Engine 2016-03-02 06:24:59 EST
This bug report has Keywords: Regression or TestBlocker.
Since no regressions or test blockers are allowed between releases, it is also being identified as a blocker for this release. Please resolve ASAP.
Comment 4 Raz Tamir 2016-03-03 08:15:42 EST
Hi Allon,
This is a regression according to bug 1128582.
It used to work, but with an exception.
Comment 6 Yaniv Lavi (Dary) 2016-03-06 10:11:11 EST
Any status update on this one?
Comment 7 Daniel Erez 2016-03-07 07:48:01 EST
Hi Eyal,

Are you referring to a validation message when running the VM (i.e. "Cannot run VM. Disk is being moved or copied.")? If so, there's a workaround for the issue by restarting engine service.
Comment 8 Daniel Erez 2016-03-07 08:19:43 EST
(In reply to Daniel Erez from comment #7)
> Hi Eyal,
> 
> Are you referring to a validation message when running the VM (i.e. "Cannot
> run VM. Disk is being moved or copied.")? If so, there's a workaround for
> the issue by restarting engine service.

Just reproduced the issue. There is indeed a gap between powering off the VM during live migration and being able to run it again, but this is expected. Since the disk is still being migrated when the VM is powered down, running the VM again is blocked until the operation is completely finished (i.e. until the migration has failed and the in-memory lock is freed). If the disk is small enough, it should finish in a couple of minutes, after which you should be able to run the VM again. Closing the bug since this is the expected behavior; please reopen if the operation hangs indefinitely.
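
The blocking behaviour described in this comment can be illustrated with a minimal sketch. It is a simplified model, not ovirt-engine code: `InMemoryLockManager`, `validate_run_vm`, and the disk id `disk-1` are hypothetical names. The only assumptions are that a per-disk lock is held in memory for the duration of the migration and that the run-VM validation checks it.

```python
class InMemoryLockManager:
    """Hypothetical stand-in for a lock table that lives only in memory."""

    def __init__(self):
        self._locked_disks = set()

    def acquire(self, disk_id):
        # Returns False if the disk is already locked by another operation.
        if disk_id in self._locked_disks:
            return False
        self._locked_disks.add(disk_id)
        return True

    def release(self, disk_id):
        self._locked_disks.discard(disk_id)

    def is_locked(self, disk_id):
        return disk_id in self._locked_disks


def validate_run_vm(locks, disk_ids):
    """Return None if the VM may run, otherwise a validation message."""
    for disk_id in disk_ids:
        if locks.is_locked(disk_id):
            return "Cannot run VM. Disk %s is being moved or copied." % disk_id
    return None


locks = InMemoryLockManager()
locks.acquire("disk-1")                       # live storage migration starts
blocked = validate_run_vm(locks, ["disk-1"])  # VM powered off, run attempted
locks.release("disk-1")                       # migration ends (here: fails), lock freed
allowed = validate_run_vm(locks, ["disk-1"])  # run is permitted again
```

In this model, a lock that is never released keeps `validate_run_vm` failing until the process holding the set is restarted, which would be consistent with an engine service restart clearing the condition.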
Comment 9 Eyal Shenitzky 2016-03-07 08:50:31 EST
Every time I try this scenario, the operation hangs indefinitely, no matter what the disk size is.
Powering off a VM during Live Storage Migration should roll back the operation and then power off the VM; it isn't supposed to wait until the operation is finished.
Comment 10 Daniel Erez 2016-03-07 10:13:37 EST
(In reply to Eyal Shenitzky from comment #9)
> Every time I try this scenario, the operation hangs indefinitely, no matter
> what the disk size is.
> Powering off a VM during Live Storage Migration should roll back the
> operation and then power off the VM; it isn't supposed to wait until the
> operation is finished.

We can't roll back the operation immediately since we don't cancel ongoing tasks. An operation rollback can be performed only when a failure is detected by vdsm.
* Can you please check whether an engine restart resolves the issue, so we can tell if we're referring to the same problem?
* Can you please attach a list of running vdsm tasks after reproducing the scenario ('vdsClient -s 0 getAllTasks')?
* While at it, please attach 'clean' engine logs, i.e. a log containing only the relevant period of time while executing the scenario.

Thanks!
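
The "no cancel, roll back only on a detected failure" flow can be sketched as follows. This is an illustrative model under stated assumptions, not the engine's or vdsm's implementation: `TaskStatus`, `handle_migration`, and the three callables are hypothetical, and real code would wait between polls.

```python
import enum


class TaskStatus(enum.Enum):
    RUNNING = "running"
    FINISHED = "finished"
    FAILED = "failed"


def handle_migration(poll_status, rollback, release_lock, max_polls=100):
    """Poll a task until it ends; roll back only after a failure is reported.

    poll_status, rollback and release_lock are hypothetical callables supplied
    by the caller. There is deliberately no 'cancel' step in this model.
    """
    for _ in range(max_polls):
        status = poll_status()
        if status is TaskStatus.RUNNING:
            continue  # nothing to do but wait; the task is not cancelled
        try:
            if status is TaskStatus.FAILED:
                rollback()
        finally:
            # Free the lock on every terminal status, including failure.
            release_lock()
        return status
    return TaskStatus.RUNNING  # still running after max_polls


events = []
statuses = iter([TaskStatus.RUNNING, TaskStatus.RUNNING, TaskStatus.FAILED])
result = handle_migration(
    poll_status=lambda: next(statuses),
    rollback=lambda: events.append("rollback"),
    release_lock=lambda: events.append("release_lock"),
)
```

The `finally` clause reflects one possible design choice: the lock is released on every terminal status, so a failed migration cannot leave the disk marked as busy.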
Comment 11 Eyal Shenitzky 2016-03-08 08:34:13 EST
Engine restart does resolve the problem.

There are no running tasks in VDSM after reproduction.

I attached a new Engine log; please look at the log messages around 8/3/16 15:31.
Comment 12 Eyal Shenitzky 2016-03-08 08:37 EST
Created attachment 1134171
new engine log
Comment 13 Eyal Shenitzky 2016-03-08 09:04:03 EST
Please note that the migration does fail when the VM is powered off.
Comment 14 Elad 2016-03-31 08:32:36 EDT
Steps:
1. Create VM with file based disk
2. Run the VM
3. Live migrate the disk
4. Power off the VM after the Live Storage Migration snapshot is created

VM is operative after live storage migration failure

Verified using:
rhevm-3.6.5-0.1.el6.noarch
vdsm-4.17.25-0.el7ev.noarch
