Bug 1421417 - Disk remains in 'locked' state when blocking the connection from host to storage-domain during live storage migration
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: ovirt-engine
Classification: oVirt
Component: BLL.Storage
Version: 4.1.0.4
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: high
Target Milestone: ovirt-4.1.3
Target Release: 4.1.3.5
Assignee: Fred Rolland
QA Contact: Eyal Shenitzky
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2017-02-12 07:55 UTC by Eyal Shenitzky
Modified: 2017-07-06 13:40 UTC (History)

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2017-07-06 13:40:48 UTC
oVirt Team: Storage
Embargoed:
rule-engine: ovirt-4.1+
rule-engine: blocker+


Attachments
engine and vdsm logs (873.92 KB, application/x-gzip)
2017-02-12 10:06 UTC, Eyal Shenitzky


Links
System ID Private Priority Status Summary Last Updated
oVirt gerrit 76444 0 master MERGED core: LSM unlock disks on failure 2017-06-20 15:58:45 UTC
oVirt gerrit 78355 0 master MERGED core: endWithFailure in LiveMigrateVmDisksCommand 2017-06-21 07:34:57 UTC
oVirt gerrit 78359 0 ovirt-engine-4.1 MERGED core: LSM unlock disks on failure 2017-06-21 09:37:33 UTC
oVirt gerrit 78360 0 ovirt-engine-4.1 ABANDONED core: endWithFailure in LiveMigrateVmDisksCommand 2017-06-21 09:17:17 UTC

Description Eyal Shenitzky 2017-02-12 07:55:06 UTC
Description of problem:

When performing live storage migration of a VM's disk and using iptables to block the connection from the VM's host to the disk's storage domain during the migration, the disk remains in the 'locked' state and the migration fails with:

Failed to complete snapshot 'Auto-generated for Live Storage Migration' creation for VM 'xxx'.
 

Version-Release number of selected component (if applicable):
Engine - 4.1.0.4-0.1.el7
VDSM - 4.19.4-1.el7ev.x86_64

How reproducible:
100%

Steps to Reproduce:
1. Create a VM with disk
2. Start the VM
3. Move the disk to another storage-domain
4. Block the connection from the VM's host to the disk's storage-domain using iptables
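
Step 4 can be sketched roughly as follows. This is a minimal illustration, not the commands used in the original test: the helper name `build_block_rules` and the example address are hypothetical, and only the standard `iptables` options `-A`, `-D`, `-d` and `-j DROP` are used. The helper only builds the argument lists; running them on the VM's host (as root) is left to the caller.

```python
from typing import List


def build_block_rules(storage_address: str, undo: bool = False) -> List[List[str]]:
    """Build iptables commands that drop host traffic to one storage address.

    storage_address is an assumption for illustration: the IP address of the
    storage server backing the disk's storage domain.
    With undo=True the same rule is deleted (-D) instead of appended (-A).
    """
    action = "-D" if undo else "-A"
    return [
        # Drop every outgoing packet from this host to the storage server.
        ["iptables", action, "OUTPUT", "-d", storage_address, "-j", "DROP"],
    ]


if __name__ == "__main__":
    # 192.0.2.10 is a documentation-only example address.
    for command in build_block_rules("192.0.2.10"):
        print(" ".join(command))
```

Each returned list can be passed to `subprocess.run(command, check=True)` on the host; calling the helper again with `undo=True` yields the command that restores connectivity after the test.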

Actual results:
Live storage migration fails and the disk remains in the 'locked' state

Expected results:
Live storage migration should fail gracefully and the disk should return to the 'active' state
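
The expected behaviour amounts to "always release the lock, even when the migration step fails". The sketch below only illustrates that pattern; it is not the ovirt-engine code. `Disk`, `DiskStatus`, `live_migrate` and `copy_step` are hypothetical names, and the status values mirror the wording used in this report.

```python
from enum import Enum
from typing import Callable


class DiskStatus(Enum):
    ACTIVE = "active"
    LOCKED = "locked"


class Disk:
    """Hypothetical stand-in for a disk entity with a status field."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.status = DiskStatus.ACTIVE


def live_migrate(disk: Disk, copy_step: Callable[[Disk], None]) -> None:
    """Lock the disk, run the migration step, and always release the lock."""
    disk.status = DiskStatus.LOCKED
    try:
        copy_step(disk)
    finally:
        # Runs on success and on failure, so the disk never stays locked.
        disk.status = DiskStatus.ACTIVE
```

If `copy_step` raises (for example because the storage domain is unreachable), the exception still propagates to the caller, but the disk is back in the 'active' state by the time it does.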

Additional info:
Engine and VDSM logs are attached

Comment 1 Eyal Shenitzky 2017-02-12 09:09:29 UTC
The bug also happens when performing live storage migration on an HA VM and restarting the SPM host during the migration.

Steps to Reproduce:
1. Create a VM with disk
2. Update the VM to be highly available
3. Start the VM
4. Move the disk to another storage-domain
5. Restart the SPM host during the migration

Comment 2 Eyal Shenitzky 2017-02-12 10:06:26 UTC
Created attachment 1249482
engine and vdsm logs

Comment 3 Yaniv Kaul 2017-02-22 11:13:06 UTC
Is this a regression?

Comment 4 Eyal Shenitzky 2017-03-07 12:01:58 UTC
We didn't this kind of test for a long time so I don't have any reference to know if it is a regression or not

Comment 5 Eyal Shenitzky 2017-03-07 12:02:45 UTC
We didn't run* this kind...

Comment 6 Yaniv Kaul 2017-03-09 09:26:01 UTC
(In reply to Eyal Shenitzky from comment #5)
> We didn't run* this kind...

Well, you have 4.0.7 now. Can you check? (And btw, it makes sense to automate this.)

Comment 7 Eyal Shenitzky 2017-03-15 08:10:44 UTC
I will check it on 4.0 and add a flag if it is a regression.
This bug was found when I automated the above scenario.
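
An automated version of this scenario needs a check that the disk eventually leaves the 'locked' state. A minimal sketch of such a check is shown below; it is not the actual test code. `wait_for_status` is a hypothetical helper, and `get_status` is an assumed caller-supplied function that returns the disk's current status as a string (for example by querying the engine).

```python
import time
from typing import Callable


def wait_for_status(get_status: Callable[[], str], expected: str,
                    timeout: float = 600.0, interval: float = 5.0) -> bool:
    """Poll get_status() until it returns expected or the timeout expires.

    Returns True if the expected status was seen, False on timeout.
    The status is always checked at least once, even with timeout=0.
    """
    deadline = time.monotonic() + timeout
    while True:
        if get_status() == expected:
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval)
```

A test would block the storage connection, wait for the migration to fail, and then assert that `wait_for_status(get_status, "active")` returns True.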

Comment 8 Red Hat Bugzilla Rules Engine 2017-03-21 05:52:58 UTC
This bug report has Keywords: Regression or TestBlocker.
Since no regressions or test blockers are allowed between releases, it is also being identified as a blocker for this release. Please resolve ASAP.

Comment 9 Eyal Shenitzky 2017-05-07 09:39:58 UTC
Removing the regression tag: the bug occurs in 4.0 as well as in 4.1.

Comment 10 Allon Mureinik 2017-06-15 11:08:54 UTC
Patch isn't ready, and we have an unlocker utility. Pushing out to 4.1.4.

Comment 11 Eyal Shenitzky 2017-06-25 12:12:41 UTC
 Verified with the following code:
----------------------------------

VDSM - 4.19.20-1.el7ev.x86_64
RHEVM - 4.1.3.5-0.1.el7
 
Steps to reproduce:
------------------------
1. Create a VM with disk
2. Start the VM
3. Move the disk to another storage-domain
4. Block the connection from the VM's host to the disk's storage-domain using 

Moving to VERIFIED

