Bug 891609 - 3.2 - engine: live snapshot fails due to race on multiple move of disks (live storage migration)
Keywords:
Status: CLOSED DUPLICATE of bug 891610
Alias: None
Product: Red Hat Enterprise Linux 6
Classification: Red Hat
Component: vdsm
Version: 6.4
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: unspecified
Target Milestone: rc
Target Release: 6.4
Assignee: Eduardo Warszawski
QA Contact: Haim
URL:
Whiteboard: storage
Depends On:
Blocks:
 
Reported: 2013-01-03 12:05 UTC by Yeela Kaplan
Modified: 2016-02-02 22:40 UTC (History)
7 users

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of: 876558
Environment:
Last Closed: 2013-01-03 13:55:25 UTC
Target Upstream Version:


Attachments

Description Yeela Kaplan 2013-01-03 12:05:41 UTC
Description of problem:

createVolume has not finished on the SPM when we send prepareVolume, which fails because the volume does not exist yet.
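The race described above can be illustrated with a minimal, self-contained sketch (the names create_volume/prepare_volume and the in-memory volume set are hypothetical stand-ins for the SPM and HSM verbs; the real vdsm code paths differ):

```python
import threading
import time

volumes = set()            # stands in for storage-domain state
lock = threading.Lock()

def create_volume(vol_id, delay=0.05):
    """SPM side: volume creation takes time to complete (simulated delay)."""
    time.sleep(delay)
    with lock:
        volumes.add(vol_id)

def prepare_volume(vol_id):
    """HSM side: fails if the volume does not exist yet."""
    with lock:
        if vol_id not in volumes:
            raise RuntimeError("volume does not exist yet: %s" % vol_id)
    return "prepared"

# Race: the engine fires createVolume asynchronously and immediately
# sends prepareVolume without waiting for the SPM task to finish.
t = threading.Thread(target=create_volume, args=("710f7022",))
t.start()
try:
    prepare_volume("710f7022")   # raises: createVolume has not finished
except RuntimeError as e:
    print(e)
t.join()
```

Once the creation thread has finished (after t.join()), the same prepare_volume call succeeds, which is exactly the ordering the engine needs to enforce.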

Version-Release number of selected component (if applicable):

si24.1

How reproducible:

100%

Steps to Reproduce:
1. create 3 iSCSI storage domains
2. create a template with a 15GB thin-provisioned disk and an OS installed, then create 20 pool VMs on one domain and 2 more VMs as clones on a second domain
3. run the VMs on 2 hosts and have them write to disk (opening Explorer in the VMs is enough; no need for heavy writing)
4. from the VMs tab -> disks -> move each of the VMs' disks to the third domain
  
Actual results:

some of the VMs fail to create the live snapshot because prepareVolume was sent to the HSM before createVolume finished creating the volume.

Expected results:

we should not send prepareVolume before confirming that createVolume has completed successfully.
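A minimal sketch of the suggested guard, assuming a hypothetical get_task_status callable that stands in for polling the SPM task status (the real engine/vdsm task-tracking APIs differ):

```python
import time

class TaskNotFinished(Exception):
    """Raised when the SPM task does not finish within the timeout."""

def wait_for_spm_task(get_task_status, task_id, timeout=300.0, interval=2.0):
    """Poll until the SPM task reports 'finished', then return.

    get_task_status is a hypothetical callable mapping a task ID to a
    status string; in the real flow the engine would query the SPM host.
    """
    deadline = time.monotonic() + timeout
    while True:
        if get_task_status(task_id) == "finished":
            return
        if time.monotonic() >= deadline:
            raise TaskNotFinished(task_id)
        time.sleep(interval)

# Only after this wait succeeds should prepareVolume be sent to the HSM.
```

The key design point is that the prepareVolume call is gated on the createVolume task's terminal state rather than being fired immediately after the task is submitted.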

Additional info: logs

Since there are a lot of tasks running at the same time and we have already debugged this, here is the relevant info:

in the SPM log, createVolume runs on Thread-4416; the task ID is 11b6930d-5f8c-435e-9de4-46c6cc34684f

prepareVolume for the same action is on Thread-4275 in the HSM log

this is the error for the volume that failed:

VolumeMetadataReadError: Error while processing volume meta data: ('missing offset tag on volume 710f7022-29ae-40cd-9b41-3a8a22fd8cc1',)

engine log: 

this is the SnapshotVDSCommand for the VM:

2012-11-14 12:15:36,360 INFO  [org.ovirt.engine.core.vdsbroker.vdsbroker.SnapshotVDSCommand] (pool-4-thread-23) START, SnapshotVDSCommand(HostName = gold-vdsd, HostId = 2d81a26a-2c20-11e2-aeab-001a4a169741, vmId=3d393cd1-666e-4283-a2b6-ce99e74656f4), log id: 57fa069c


and here is the failure in engine log:

2012-11-14 12:15:47,105 INFO  [org.ovirt.engine.core.vdsbroker.vdsbroker.SnapshotVDSCommand] (pool-4-thread-23) FINISH, SnapshotVDSCommand, log id: 57fa069c
2012-11-14 12:15:47,105 ERROR [org.ovirt.engine.core.bll.CreateAllSnapshotsFromVmCommand] (pool-4-thread-23) Wasnt able to live snpashot due to error: VdcBLLException: VdcBLLException: org.ovirt.engine.core.vdsbroker.vdsbroker.VDSErrorException: VDSGenericException: VDSErrorException: Failed to SnapshotVDS, error = Snapshot failed, rolling back.

Comment 2 Yeela Kaplan 2013-01-03 13:55:25 UTC

*** This bug has been marked as a duplicate of bug 891610 ***

