Bug 1367721 - Live storage migration of multiple disks fails with "VmReplicateDiskFinishVDS: Resource unavailable"
Keywords:
Status: CLOSED DUPLICATE of bug 1270220
Alias: None
Product: vdsm
Classification: oVirt
Component: Core
Version: 4.17.28
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: high
Target Milestone: ovirt-4.0.4
Target Release: ---
Assignee: Daniel Erez
QA Contact: Aharon Canan
URL:
Whiteboard:
Depends On:
Blocks:
 
Reported: 2016-08-17 10:13 UTC by Markus Stockhausen
Modified: 2016-08-28 06:01 UTC

Fixed In Version:
Clone Of:
Environment:
Last Closed: 2016-08-28 06:01:18 UTC
oVirt Team: Storage
Embargoed:
tnisan: ovirt-4.0.z?
rule-engine: planning_ack?
rule-engine: devel_ack?
rule-engine: testing_ack?


Attachments
VDSM Log (486.65 KB, application/x-gzip)
2016-08-17 10:16 UTC, Markus Stockhausen
Engine Log (21.22 KB, application/x-gzip)
2016-08-17 10:17 UTC, Markus Stockhausen

Description Markus Stockhausen 2016-08-17 10:13:26 UTC
Description of problem:

Parallel live migration of multiple disks fails on one disk with the following errors:

VDSM command failed: Could not remove all image's volumes: No such file or directory',)
VDSM command failed: Resource unavailable

Version-Release number of selected component (if applicable):

VDSM 4.17.28
oVirt Engine 3.6.7

How reproducible:

Don't know

Steps to Reproduce:
1. Create a VM with multiple disks (more than 5)
2. Start the VM
3. Generate some disk load inside the VM on all disks
4. Migrate all disks to another storage system
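
Step 3 does not say how the load was generated. As a minimal sketch only, assuming each VM disk is mounted at its own directory inside the guest, a script along these lines could produce concurrent write load on all disks. The function names, the scratch file name `lsm_load.tmp`, and the mount points passed on the command line are hypothetical and not taken from this report.

```python
import os
import sys
import threading
import time


def write_load(directory, duration_s, block_size=1024 * 1024,
               max_bytes=64 * 1024 * 1024):
    """Write and fsync random blocks to a scratch file in `directory`.

    The scratch file is overwritten from the start once it reaches
    `max_bytes`, and removed at the end. Returns the bytes written.
    """
    path = os.path.join(directory, "lsm_load.tmp")  # hypothetical name
    block = os.urandom(block_size)
    deadline = time.monotonic() + duration_s
    written = 0
    with open(path, "wb") as f:
        while time.monotonic() < deadline:
            if f.tell() + block_size > max_bytes:
                f.seek(0)  # wrap around to bound disk usage
            f.write(block)
            f.flush()
            os.fsync(f.fileno())  # force the write down to the disk
            written += block_size
    os.remove(path)
    return written


def generate_load(directories, duration_s, **kwargs):
    """Run write_load on every directory in parallel, one thread each."""
    results = {}

    def worker(directory):
        results[directory] = write_load(directory, duration_s, **kwargs)

    threads = [threading.Thread(target=worker, args=(d,))
               for d in directories]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results


if __name__ == "__main__":
    # Usage (assumed): python load.py <seconds> <mountpoint> [<mountpoint> ...]
    seconds = float(sys.argv[1])
    for directory, nbytes in generate_load(sys.argv[2:], seconds).items():
        print(directory, nbytes)
```

The load would need to keep running while step 4 is started from the engine; whether this particular write pattern is enough to trigger the failure is not known.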

Actual results:

One disk migration job fails.

Expected results:

All migration jobs complete successfully.

Additional info:

See attached logs

Comment 1 Markus Stockhausen 2016-08-17 10:16:49 UTC
Created attachment 1191562
VDSM Log

Comment 2 Markus Stockhausen 2016-08-17 10:17:38 UTC
Created attachment 1191563
Engine Log

Comment 3 Daniel Erez 2016-08-21 09:35:14 UTC
Hi Markus,

A few questions for further investigation:

* According to the logs, the migration seems to have failed during the cleanup stage, i.e. only the deletion of the source disk failed. Is there any issue with the disk in the target domain?
* What is the status of the failed disk?
* Is it reproducible when migrating a single disk?

Thanks!

Comment 4 Markus Stockhausen 2016-08-21 18:29:39 UTC
From reading our logs, the storage system hung for a few minutes during the disk deletion. XFS had to clean up 100,000s of extents. This is a known problem that we are already trying to mitigate.

So I guess we are hitting BZ1270220 (fixed in 4.0.0)

Comment 5 Daniel Erez 2016-08-22 11:18:01 UTC
(In reply to Markus Stockhausen from comment #4)
> From reading our logs, the storage system hung for a few minutes during the
> disk deletion. XFS had to clean up 100,000s of extents. This is a known
> problem that we are already trying to mitigate.
> 
> So I guess we are hitting BZ1270220 (fixed in 4.0.0)

Indeed. Could you please check whether it reproduces on the latest version?

Comment 6 Markus Stockhausen 2016-08-25 12:33:35 UTC
In preparation for our 4.0.2 update I just read BZ1367281. That is a showstopper, so for now I will not update and cannot verify whether this is really fixed.

Comment 7 Daniel Erez 2016-08-28 06:01:18 UTC
(In reply to Markus Stockhausen from comment #6)
> In preparation for our 4.0.2 update I just read BZ1367281. That is a
> showstopper, so for now I will not update and cannot verify whether this is
> really fixed.

OK, thanks Markus!
Closing as a duplicate of bug 1270220. Please reopen if it reproduces.

*** This bug has been marked as a duplicate of bug 1270220 ***

