Bug 1214342

Summary:

After a failed snapshot, Live Snapshot Merge operation fails

Product:

[oVirt] ovirt-engine

Reporter:

Kevin Alon Goldblatt <kgoldbla>

Component:

General

Assignee:

Ala Hino <ahino>

Status:

CLOSED WONTFIX

QA Contact:

Aharon Canan <acanan>

Severity:

high

Docs Contact:

Priority:

medium

Version:

3.5.1

CC:

acanan, ahino, alitke, amureini, bugs, gklein, gpadgett, lpeer, lsurette, nsoffer, rbalakri, Rhev-m-bugs, srevivo, tnisan, v.astafiev, ykaul, ylavi

Target Milestone:

---

Target Release:

---

Hardware:

x86_64

OS:

Unspecified

Whiteboard:

Fixed In Version:

Doc Type:

Bug Fix

Doc Text:

Story Points:

---

Clone Of:

Environment:

Last Closed:

2016-06-05 11:47:10 UTC

Type:

Bug

Regression:

---

Mount Type:

---

Documentation:

---

CRM:

Verified Versions:

Category:

---

oVirt Team:

Storage

RHEL 7.3 requirements from Atomic Host:

Cloudforms Team:

---

Target Upstream Version:

Embargoed:

Attachments:

Description	Flags
engine, server, vdsm and libvirt logs	none

Description Kevin Alon Goldblatt 2015-04-22 13:38:52 UTC

Description of problem:
VM disks become illegal after failed Live Merge

Version-Release number of selected component (if applicable):
ver 3.5.1 vt14.3
rhevm-3.5.1-0.4.el6ev.noarch
vdsm-4.16.13.1-1.el7ev.x86_64

How reproducible:
Ran this once

Steps to Reproduce:
1. Create a VM with several disks including 2 block preallocated and 1 thin and nfs 
2. Started the VM 
3. Created 3 snapshots snsa1, snsa2, snsa3
4. Ran a Live Storage Migration operation by selecting all the disks of the vm and moving them to new storage domains. While the disks are locked BLOCKK the storage from the host using IPTABLES >>>>> The Live storage migration fails as expected.
5. Unblock the storage on the host using IPTABLES
6. Started a Deleted snapshot operation  >>>>The LIve Storage Delete / Merge operations fails
7. Running a Live Snapshot Delete Merge operation after a successful Live Storage Migration operation works fine.

Actual results:
After a Failed Live Storage Migration operation a Live Snapshot Delete Merge operation fails

Expected results:
After a failed Live Storage Migration operation the system should rollback and all further operations on the storage should work

Additional info:

THERE IS A 3 HOUR TIME DIFFERENCE BETWEEN THE ENGINE AND THE HOST

ENGINE log

LIVE MIGRATION STARTED HERE-----------------------------------------------------------------------------------------------

2015-04-21 17:30:12,800 INFO  [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector] (org.ovirt.thread.pool-7-thread-29) Correlation ID: 76723cae, Job ID: 0638ac8a-3423-4373-af15-6649513080f7, Ca
ll Stack: null, Custom Event ID: -1, Message: Snapshot 'Auto-generated for Live Storage Migration' creation for VM 'vm16' was initiated by admin@internal.

ERRORS REPORTED WHEN STORAGE IS BLOCKED ON HOST USING IP TABLES------------------------------------

2015-04-21 17:32:54,585 ERROR [org.ovirt.engine.core.vdsbroker.vdsbroker.HSMGetAllTasksStatusesVDSCommand] (DefaultQuartzScheduler_Worker-6) [5a07c591] Failed in HSMGetAllTasksStatusesVDS method
2015-04-21 17:32:54,586 ERROR [org.ovirt.engine.core.vdsbroker.vdsbroker.HSMGetAllTasksStatusesVDSCommand] (DefaultQuartzScheduler_Worker-6) [5a07c591] Failed in HSMGetAllTasksStatusesVDS method
2015-04-21 17:32:54,591 INFO  [org.ovirt.engine.core.bll.tasks.SPMAsyncTask] (DefaultQuartzScheduler_Worker-6) [5a07c591] BaseAsyncTask::onTaskEndSuccess: Task 6a46d1cf-a49f-4d8e-9e58-f781b05043cd (Parent Comman
d LiveMigrateVmDisks, Parameters Type org.ovirt.engine.core.common.asynctasks.AsyncTaskParameters) ended successfully.
2015-04-21 17:32:54,591 INFO  [org.ovirt.engine.core.bll.CommandMultiAsyncTasks] (DefaultQuartzScheduler_Worker-6) [5a07c591] Task with DB Task ID 3b6dad14-fed8-40b3-9b62-9c11dbd1ecc7 and VDSM Task ID 7e5db0a8-4
886-4dad-bddb-57f04950240f is in state Polling. End action for command ecc2e7c1-b82a-4a53-9042-8e624e5452f6 will proceed when all the entitys tasks are completed.
2015-04-21 17:32:54,591 INFO  [org.ovirt.engine.core.bll.tasks.SPMAsyncTask] (DefaultQuartzScheduler_Worker-6) [5a07c591] SPMAsyncTask::PollTask: Polling task 5ace41e8-feb4-43e0-8f47-5a5281001322 (Parent Command
 LiveMigrateVmDisks, Parameters Type org.ovirt.engine.core.common.asynctasks.AsyncTaskParameters) returned status finished, result 'cleanFailure'.
2015-04-21 17:32:54,596 ERROR [org.ovirt.engine.core.bll.tasks.SPMAsyncTask] (DefaultQuartzScheduler_Worker-6) [5a07c591] BaseAsyncTask::logEndTaskFailure: Task 5ace41e8-feb4-43e0-8f47-5a5281001322 (Parent Comma
nd LiveMigrateVmDisks, Parameters Type org.ovirt.engine.core.common.asynctasks.AsyncTaskParameters) ended with failure:
-- Result: cleanFailure
-- Message: VDSGenericException: VDSErrorException: Failed to HSMGetAllTasksStatusesVDS, error = Replace Logical Volume tag error, code = 572,
-- Exception: VDSGenericException: VDSErrorException: Failed to HSMGetAllTasksStatusesVDS, error = Replace Logical Volume tag error, code = 572
2015-04-21 17:32:54,596 INFO  [org.ovirt.engine.core.bll.CommandMultiAsyncTasks] (DefaultQuartzScheduler_Worker-6) [5a07c591] Task with DB Task ID 3b6dad14-fed8-40b3-9b62-9c11dbd1ecc7 and VDSM Task ID 7e5db0a8-4
886-4dad-bddb-57f04950240f is in state Polling. End action for command ecc2e7c1-b82a-4a53-9042-8e624e5452f6 will proceed when all the entitys tasks are completed.
2015-04-21 17:32:54,597 INFO  [org.ovirt.engine.core.bll.tasks.SPMAsyncTask] (DefaultQuartzScheduler_Worker-6) [5a07c591] SPMAsyncTask::PollTask: Polling task 7e5db0a8-4886-4dad-bddb-57f04950240f (Parent Command
 LiveMigrateVmDisks, Parameters Type org.ovirt.engine.core.common.asynctasks.AsyncTaskParameters) returned status finished, result 'cleanFailure'.
2015-04-21 17:32:54,601 ERROR [org.ovirt.engine.core.bll.tasks.SPMAsyncTask] (DefaultQuartzScheduler_Worker-6) [5a07c591] BaseAsyncTask::logEndTaskFailure: Task 7e5db0a8-4886-4dad-bddb-57f04950240f (Parent Comma
nd LiveMigrateVmDisks, Parameters Type org.ovirt.engine.core.common.asynctasks.AsyncTaskParameters) ended with failure:
-- Result: cleanFailure
-- Message: VDSGenericException: VDSErrorException: Failed to HSMGetAllTasksStatusesVDS, error = Replace Logical Volume tag error, code = 572,
-- Exception: VDSGenericException: VDSErrorException: Failed to HSMGetAllTasksStatusesVDS, error = Replace Logical Volume tag error, code = 572
2015-04-21 17:32:54,601 INFO  [org.ovirt.engine.core.bll.tasks.CommandAsyncTask] (DefaultQuartzScheduler_Worker-6) [5a07c591] CommandAsyncTask::endActionIfNecessary: All tasks of command ecc2e7c1-b82a-4a53-9042-
8e624e5452f6 has ended -> executing endAction
2015-04-21 17:32:54,602 INFO  [org.ovirt.engine.core.bll.tasks.CommandAsyncTask] (DefaultQuartzScheduler_Worker-6) [5a07c591] CommandAsyncTask::endAction: Ending action for 3 tasks (command ID: ecc2e7c1-b82a-4a5
3-9042-8e624e5452f6): calling endAction .
2015-04-21 17:32:54,602 INFO  [org.ovirt.engine.core.bll.tasks.CommandAsyncTask] (org.ovirt.thread.pool-7-thread-8) [5a07c591] CommandAsyncTask::endCommandAction [within thread] context: Attempting to endAction 
LiveMigrateVmDisks, executionIndex: 0
2015-04-21 17:32:54,722 ERROR [org.ovirt.engine.core.bll.lsm.LiveMigrateVmDisksCommand] (org.ovirt.thread.pool-7-thread-8) [5a07c591] Ending command with failure: org.ovirt.engine.core.bll.lsm.LiveMigrateVmDisks
Command
2015-04-21 17:32:54,743 ERROR [org.ovirt.engine.core.bll.CreateAllSnapshotsFromVmCommand] (org.ovirt.thread.pool-7-thread-8) Ending command with failure: org.ovirt.engine.core.bll.CreateAllSnapshotsFromVmCommand
2015-04-21 17:32:54,756 ERROR [org.ovirt.engine.core.bll.CreateSnapshotCommand] (org.ovirt.thread.pool-7-thread-8) [20a1cafc] Ending command with failure: org.ovirt.engine.core.bll.CreateSnapshotCommand
2015-04-21 17:32:54,906 ERROR [org.ovirt.engine.core.bll.CreateSnapshotCommand] (org.ovirt.thread.pool-7-thread-8) [345b289d] Ending command with failure: org.ovirt.engine.core.bll.CreateSnapshotCommand
2015-04-21 17:32:54,908 INFO  [org.ovirt.engine.core.vdsbroker.irsbroker.SPMRevertTaskVDSCommand] (org.ovirt.thread.pool-7-thread-8) [345b289d] START, SPMRevertTaskVDSCommand( storagePoolId = 6d96f52d-d791-4f66-
83bd-2553ca0f3012, ignoreFailoverLimit = false, taskId = 6a46d1cf-a49f-4d8e-9e58-f781b05043cd), log id: 1ea88db9
2015-04-21 17:32:54,910 INFO  [org.ovirt.engine.core.vdsbroker.vdsbroker.HSMRevertTaskVDSCommand] (org.ovirt.thread.pool-7-thread-8) [345b289d] START, HSMRevertTaskVDSCommand(HostName = nott-vds1, HostId = eaba7
988-e524-4c19-bd7b-8f9ba0bbc769, taskId=6a46d1cf-a49f-4d8e-9e58-f781b05043cd), log id: 39908cf2
2015-04-21 17:32:55,097 INFO  [org.ovirt.engine.core.vdsbroker.vdsbroker.HSMRevertTaskVDSCommand] (org.ovirt.thread.pool-7-thread-8) [345b289d] FINISH, HSMRevertTaskVDSCommand, log id: 39908cf2
2015-04-21 17:32:55,097 INFO  [org.ovirt.engine.core.vdsbroker.irsbroker.SPMRevertTaskVDSCommand] (org.ovirt.thread.pool-7-thread-8) [345b289d] FINISH, SPMRevertTaskVDSCommand, log id: 1ea88db9
2015-04-21 17:32:55,223 ERROR [org.ovirt.engine.core.bll.CreateSnapshotCommand] (org.ovirt.thread.pool-7-thread-8) [69f56a1b] Ending command with failure: org.ovirt.engine.core.bll.CreateSnapshotCommand
2015-04-21 17:32:55,352 ERROR [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector] (org.ovirt.thread.pool-7-thread-8) Correlation ID: null, Call Stack: null, Custom Event ID: -1, Message: Failed to complete snapshot 'Auto-generated for Live Storage Migration' creation for VM 'vm16'.
2015-04-21 17:32:55,357 INFO  [org.ovirt.engine.core.bll.lsm.LiveMigrateVmDisksCommand] (org.ovirt.thread.pool-7-thread-8) Lock freed to object EngineLock [exclusiveLocks= key: 1c9b5c2e-23ea-480c-b67b-50b969faf021 value: DISK
key: 1746d080-f1e5-4cfe-aeec-0eb5ffff92c3 value: DISK
key: 1ef1327e-1e1d-489d-9db2-2f0c2537604b value: DISK
, sharedLocks= key: 64b9a769-753a-4a51-9741-0f93f3b4f060 value: VM
]
2015-04-21 17:32:55,359 INFO  [org.ovirt.engine.core.bll.tasks.CommandAsyncTask] (org.ovirt.thread.pool-7-thread-8) CommandAsyncTask::HandleEndActionResult [within thread]: endAction for action type LiveMigrateVmDisks completed, handling the result.
2015-04-21 17:32:55,359 INFO  [org.ovirt.engine.core.bll.tasks.CommandAsyncTask] (org.ovirt.thread.pool-7-thread-8) CommandAsyncTask::HandleEndActionResult [within thread]: endAction for action type LiveMigrateVmDisks succeeded, clearing tasks.


STARTED LIVE SNAPSHOT A FEW MINUTES DELETE AFTER  UNBLOCKING THE IPTABLES--------------------------------------------

2015-04-21 17:36:45,189 INFO  [org.ovirt.engine.core.bll.RemoveMemoryVolumesCommand] (pool-6-thread-6) [48240de4] Running command: RemoveMemoryVolumesCommand internal: true.
2015-04-21 17:36:45,214 INFO  [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector] (ajp-/127.0.0.1:8702-5) Correlation ID: 3d2dda02, Job ID: 8ec1ea49-4695-4d05-af0a-6c453dc5ac10, Call Stack: null, Custom Event ID: -1, Message: Snapshot 'snapp2' deletion for VM 'vm16' was initiated by admin@internal.
2015-04-21 17:36:45,220 INFO  [org.ovirt.engine.core.vdsbroker.irsbroker.DeleteImageGroupVDSCommand] (pool-6-thread-6) [48240de4] START, DeleteImageGroupVDSCommand( storagePoolId = 6d96f52d-d791-4f66-83bd-2553ca0f3012, ignoreFailoverLimit = false, storageDomainId = e2b109f3-d36b-4696-9ee9-99e4fb7d1fd5, imageGroupId = 4692515d-f8f6-4218-a237-b7d90dc96e71, postZeros = false, forceDelete = false), log id: 48ce0bee
2015-04-21 17:36:48,189 INFO  [org.ovirt.engine.core.vdsbroker.irsbroker.DeleteImageGroupVDSCommand] (pool-6-thread-6) [48240de4] FINISH, DeleteImageGroupVDSCommand, log id: 48ce0bee
2015-04-21 17:36:48,195 INFO  [org.ovirt.engine.core.bll.tasks.CommandAsyncTask] (pool-6-thread-6) [48240de4] CommandAsyncTask::Adding CommandMultiAsyncTasks object for command 143bd8c8-ac95-4b4c-9078-fa34292eaf18
2015-04-21 17:36:48,195 INFO  [org.ovirt.engine.core.bll.CommandMultiAsyncTasks] (pool-6-thread-6) [48240de4] CommandMultiAsyncTasks::AttachTask: Attaching task 9458d2bd-f5ac-4c2a-a0c2-52ab3fb2f65d to command 143bd8c8-ac95-4b4c-9078-fa34292eaf18.


FAILURE OF LIVE SNAPSHOT DELETE-------------------------------------------------------------------------------------------------------------

2015-04-21 17:38:13,874 ERROR [org.ovirt.engine.core.bll.RemoveSnapshotSingleDiskLiveCommand] (DefaultQuartzScheduler_Worker-71) [9f60b4a] Snapshot 6b43e440-589f-4294-8732-08fb58a9f988 images e4e1d5af-65ef-4ed8-
9aba-95705976c367..1b115362-9a30-43ec-b06f-b25fa0052508 merged, but volume removal failed. Some or all of the following volumes may be orphaned: [e4e1d5af-65ef-4ed8-9aba-95705976c367]. Please retry Live Merge on
 the snapshot to complete the operation.
2015-04-21 17:38:13,880 ERROR [org.ovirt.engine.core.bll.DestroyImageCommand] (DefaultQuartzScheduler_Worker-71) [15504788] Ending command with failure: org.ovirt.engine.core.bll.DestroyImageCommand
2015-04-21 17:38:14,780 ERROR [org.ovirt.engine.core.vdsbroker.vdsbroker.HSMGetAllTasksStatusesVDSCommand] (DefaultQuartzScheduler_Worker-69) Failed in HSMGetAllTasksStatusesVDS method
2015-04-21 17:38:14,781 ERROR [org.ovirt.engine.core.vdsbroker.vdsbroker.HSMGetAllTasksStatusesVDSCommand] (DefaultQuartzScheduler_Worker-69) Failed in HSMGetAllTasksStatusesVDS method
2015-04-21 17:38:14,781 ERROR [org.ovirt.engine.core.bll.tasks.SPMAsyncTask] (DefaultQuartzScheduler_Worker-69) BaseAsyncTask::logEndTaskFailure: Task 4f6ea81e-e40e-4927-97b6-0a6a65a590a9 (Parent Command RemoveS
napshotSingleDiskLive, Parameters Type org.ovirt.engine.core.common.asynctasks.AsyncTaskParameters) ended with failure:
-- Result: cleanSuccess
-- Message: VDSGenericException: VDSErrorException: Failed to HSMGetAllTasksStatusesVDS, error = Could not acquire resource. Probably resource factory threw an exception.: (), code = 100,
-- Exception: VDSGenericException: VDSErrorException: Failed to HSMGetAllTasksStatusesVDS, error = Could not acquire resource. Probably resource factory threw an exception.: (), code = 100
2015-04-21 17:38:14,781 INFO  [org.ovirt.engine.core.bll.tasks.CommandAsyncTask] (DefaultQuartzScheduler_Worker-69) CommandAsyncTask::endActionIfNecessary: All tasks of command 34e07226-09d0-4b67-892b-d904f3815e
fa has ended -> executing endAction
2015-04-21 17:38:14,781 INFO  [org.ovirt.engine.core.bll.tasks.CommandAsyncTask] (DefaultQuartzScheduler_Worker-69) CommandAsyncTask::endAction: Ending action for 1 tasks (command ID: 34e07226-09d0-4b67-892b-d90
4f3815efa): calling endAction .
2015-04-21 17:38:14,781 INFO  [org.ovirt.engine.core.bll.tasks.CommandAsyncTask] (org.ovirt.thread.pool-7-thread-3) CommandAsyncTask::endCommandAction [within thread] context: Attempting to endAction RemoveSnaps
hotSingleDiskLive, executionIndex: 0
2015-04-21 17:38:14,785 ERROR [org.ovirt.engine.core.bll.tasks.SPMAsyncTask] (DefaultQuartzScheduler_Worker-69) BaseAsyncTask::logEndTaskFailure: Task 69cafb68-360c-4ee3-9c1d-7a118d228c8e (Parent Command RemoveS
napshotSingleDiskLive, Parameters Type org.ovirt.engine.core.common.asynctasks.AsyncTaskParameters) ended with failure:
-- Result: cleanSuccess
-- Message: VDSGenericException: VDSErrorException: Failed to HSMGetAllTasksStatusesVDS, error = Could not acquire resource. Probably resource factory threw an exception.: (), code = 100,
-- Exception: VDSGenericException: VDSErrorException: Failed to HSMGetAllTasksStatusesVDS, error = Could not acquire resource. Probably resource factory threw an exception.: (), code = 100
2015-04-21 17:38:14,786 INFO  [org.ovirt.engine.core.bll.tasks.CommandAsyncTask] (DefaultQuartzScheduler_Worker-69) CommandAsyncTask::endActionIfNecessary: All tasks of command 17b08dd0-24f0-493e-bb53-88db9a5f78
17 has ended -> executing endAction
2015-04-21 17:38:14,786 INFO  [org.ovirt.engine.core.bll.tasks.CommandAsyncTask] (DefaultQuartzScheduler_Worker-69) CommandAsyncTask::endAction: Ending action for 1 tasks (command ID: 17b08dd0-24f0-493e-bb53-88d
b9a5f7817): calling endAction .
2015-04-21 17:38:14,786 INFO  [org.ovirt.engine.core.bll.tasks.CommandAsyncTask] (org.ovirt.thread.pool-7-thread-27) CommandAsyncTask::endCommandAction [within thread] context: Attempting to endAction RemoveSnap
shotSingleDiskLive, executionIndex: 0
2015-04-21 17:38:14,804 ERROR [org.ovirt.engine.core.bll.RemoveSnapshotSingleDiskLiveCommand] (org.ovirt.thread.pool-7-thread-3) [9f60b4a] Ending command with failure: org.ovirt.engine.core.bll.RemoveSnapshotSin
gleDiskLiveCommand
2015-04-21 17:38:14,814 ERROR [org.ovirt.engine.core.bll.DestroyImageCommand] (org.ovirt.thread.pool-7-thread-3) [15504788] Ending command with failure: org.ovirt.engine.core.bll.DestroyImageCommand
2015-04-21 17:38:14,815 INFO  [org.ovirt.engine.core.bll.tasks.CommandAsyncTask] (org.ovirt.thread.pool-7-thread-3) CommandAsyncTask::HandleEndActionResult [within thread]: endAction for action type RemoveSnapsh
otSingleDiskLive completed, handling the result.
2015-04-21 17:38:14,816 INFO  [org.ovirt.engine.core.bll.tasks.CommandAsyncTask] (org.ovirt.thread.pool-7-thread-3) CommandAsyncTask::HandleEndActionResult [within thread]: endAction for action type RemoveSnapsh
otSingleDiskLive hasnt succeeded, not clearing tasks, will attempt again next polling.
2015-04-21 17:38:14,819 ERROR [org.ovirt.engine.core.bll.RemoveSnapshotSingleDiskLiveCommand] (org.ovirt.thread.pool-7-thread-27) [7bb45238] Ending command with failure: org.ovirt.engine.core.bll.RemoveSnapshotS
ingleDiskLiveCommand
2015-04-21 17:38:14,823 ERROR [org.ovirt.engine.core.bll.DestroyImageCommand] (org.ovirt.thread.pool-7-thread-27) [2f7596cf] Ending command with failure: org.ovirt.engine.core.bll.DestroyImageCommand
2015-04-21 17:38:14,824 INFO  [org.ovirt.engine.core.bll.tasks.CommandAsyncTask] (org.ovirt.thread.pool-7-thread-27) CommandAsyncTask::HandleEndActionResult [within thread]: endAction for action type RemoveSnaps
hotSingleDiskLive completed, handling the result.
2015-04-21 17:38:14,824 INFO  [org.ovirt.engine.core.bll.tasks.CommandAsyncTask] (org.ovirt.thread.pool-7-thread-27) CommandAsyncTask::HandleEndActionResult [within thread]: endAction for action type RemoveSnaps
hotSingleDiskLive hasnt succeeded, not clearing tasks, will attempt again next polling.
2015-04-21 17:38:23,891 ERROR [org.ovirt.engine.core.bll.RemoveSnapshotCommand] (DefaultQuartzScheduler_Worker-87) [3d2dda02] Ending command with failure: org.ovirt.engine.core.bll.RemoveSnapshotCommand
2015-04-21 17:38:23,911 ERROR [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector] (DefaultQuartzScheduler_Worker-87) [3d2dda02] Correlation ID: 3d2dda02, Call Stack: null, Custom Event ID: -1,
 Message: Failed to delete snapshot 'snapp2' for VM 'vm16'.
2015-04-21 17:38:24,797 ERROR [org.ovirt.engine.core.vdsbroker.vdsbroker.HSMGetAllTasksStatusesVDSCommand] (DefaultQuartzScheduler_Worker-9) [232cf324] Failed in HSMGetAllTasksStatusesVDS method
2015-04-21 17:38:24,797 ERROR [org.ovirt.engine.core.vdsbroker.vdsbroker.HSMGetAllTasksStatusesVDSCommand] (DefaultQuartzScheduler_Worker-9) [232cf324] Failed in HSMGetAllTasksStatusesVDS method


NOTE THERE IS A 3 HOUR DIFFERENCE BETWEEN THE ENGINE AND HOST LOGS!!!!!!!! ===============================================================
==================================================================================================================


VDSM log

Thread-118033::DEBUG::2015-04-21 20:30:34,925::task::595::Storage.TaskManager.Task::(_updateState) Task=`5ace41e8-feb4-43e0-8f47-5a5281001322`::moving from state init -> state preparing

Thread-116487::ERROR::2015-04-21 20:33:05,069::misc::210::Storage.Misc::(readspeed) Unable to read file '/dev/fd8aa916-0b31-4d2e-99ef-f218967e93e1/metadata'
Thread-116487::ERROR::2015-04-21 20:33:05,070::domainMonitor::256::Storage.DomainMonitorThread::(_monitorDomain) Error while collecting domain fd8aa916-0b31-4d2e-99ef-f218967e93e1 monitoring information
Traceback (most recent call last):
.
.
Thread-40::ERROR::2015-04-21 20:33:07,916::lvm::404::Storage.LVM::(_reloadvgs) vg 88eb14b4-a7fc-40ab-be6b-e2ab7bc31dbc has pv_count 3 but pv_names ('/dev/mapper/360060160f4a030002c5dc169973de411', '/dev/mapper/36006016012823000387933e01ab5e311')
.
.
Thread-40::ERROR::2015-04-21 20:33:11,271::domainMonitor::256::Storage.DomainMonitorThread::(_monitorDomain) Error while collecting domain 88eb14b4-a7fc-40ab-be6b-e2ab7bc31dbc monitoring information
Traceback (most recent call last):
  File "/usr/share/vdsm/storage/domainMonitor.py", line 232, in _monitorDomain
    self.domain.selftest()
  File "/usr/share/vdsm/storage/blockSD.py", line 831, in selftest
    lvm.chkVG(self.sdUUID)
  File "/usr/share/vdsm/storage/lvm.py", line 985, in chkVG
    raise se.StorageDomainAccessError("%s: %s" % (vgName, err))
.
.
5ace41e8-feb4-43e0-8f47-5a5281001322::DEBUG::2015-04-21 20:33:11,705::lvm::514::Storage.OperationMutex::(_invalidatelvs) Operation 'lvm invalidate operation' got the operation mutex
5ace41e8-feb4-43e0-8f47-5a5281001322::DEBUG::2015-04-21 20:33:11,705::lvm::526::Storage.OperationMutex::(_invalidatelvs) Operation 'lvm invalidate operation' released the operation mutex
5ace41e8-feb4-43e0-8f47-5a5281001322::ERROR::2015-04-21 20:33:11,705::volume::473::Storage.Volume::(create) Unexpected error
Traceback (most recent call last):
  File "/usr/share/vdsm/storage/volume.py", line 435, in create
    srcVolUUID, volPath)
  File "/usr/share/vdsm/storage/blockVolume.py", line 159, in _create
    addTags=mdTags)
  File "/usr/share/vdsm/storage/lvm.py", line 1232, in changeLVTags
    (lvname, ", ".join(addTags), ", ".join(delTags), err[-1]))
LogicalVolumeReplaceTagError: Replace Logical Volume tag error: (u'lv: `fd8aa916-0b31-4d2e-99ef-f218967e93e1/63335b3b-6d82-446c-bb16-4ed9a6b20207` add: `PU_c2a1254b-58dd-4d4c-ac0d-275093c99d05, MD_10, IU_1c9b5c2e-23ea-480c-b67b-50b969faf021` del: `OVIRT_VOL_INITIALIZING` (  Cannot process volume group fd8aa916-0b31-4d2e-99ef-f218967e93e1)',)
.
.
5ace41e8-feb4-43e0-8f47-5a5281001322::DEBUG::2015-04-21 20:33:11,725::resourceManager::649::Storage.ResourceManager::(releaseResource) No one is waiting for resource 'fd8aa916-0b31-4d2e-99ef-f218967e93e1_imageNS
.1c9b5c2e-23ea-480c-b67b-50b969faf021', Clearing records.
5ace41e8-feb4-43e0-8f47-5a5281001322::ERROR::2015-04-21 20:33:11,725::task::866::Storage.TaskManager.Task::(_setError) Task=`5ace41e8-feb4-43e0-8f47-5a5281001322`::Unexpected error
Traceback (most recent call last):
  File "/usr/share/vdsm/storage/task.py", line 873, in _run
    return fn(*args, **kargs)
  File "/usr/share/vdsm/storage/task.py", line 334, in run
    return self.cmd(*self.argslist, **self.argsdict)
  File "/usr/share/vdsm/storage/securable.py", line 77, in wrapper
    return method(self, *args, **kwargs)
  File "/usr/share/vdsm/storage/sp.py", line 1821, in createVolume
    desc=desc, srcImgUUID=srcImgUUID, srcVolUUID=srcVolUUID)
  File "/usr/share/vdsm/storage/sd.py", line 430, in createVolume
    preallocate, diskType, volUUID, desc, srcImgUUID, srcVolUUID)
  File "/usr/share/vdsm/storage/volume.py", line 435, in create
    srcVolUUID, volPath)
  File "/usr/share/vdsm/storage/blockVolume.py", line 159, in _create
    addTags=mdTags)
  File "/usr/share/vdsm/storage/lvm.py", line 1232, in changeLVTags
    (lvname, ", ".join(addTags), ", ".join(delTags), err[-1]))
.
.
ace41e8-feb4-43e0-8f47-5a5281001322::DEBUG::2015-04-21 20:33:11,725::resourceManager::649::Storage.ResourceManager::(releaseResource) No one is waiting for resource 'fd8aa916-0b31-4d2e-99ef-f218967e93e1_imageNS
.1c9b5c2e-23ea-480c-b67b-50b969faf021', Clearing records.
5ace41e8-feb4-43e0-8f47-5a5281001322::ERROR::2015-04-21 20:33:11,725::task::866::Storage.TaskManager.Task::(_setError) Task=`5ace41e8-feb4-43e0-8f47-5a5281001322`::Unexpected error
Traceback (most recent call last):
  File "/usr/share/vdsm/storage/task.py", line 873, in _run
    return fn(*args, **kargs)
  File "/usr/share/vdsm/storage/task.py", line 334, in run
    return self.cmd(*self.argslist, **self.argsdict)
  File "/usr/share/vdsm/storage/securable.py", line 77, in wrapper
    return method(self, *args, **kwargs)
  File "/usr/share/vdsm/storage/sp.py", line 1821, in createVolume
    desc=desc, srcImgUUID=srcImgUUID, srcVolUUID=srcVolUUID)
  File "/usr/share/vdsm/storage/sd.py", line 430, in createVolume
    preallocate, diskType, volUUID, desc, srcImgUUID, srcVolUUID)
  File "/usr/share/vdsm/storage/volume.py", line 435, in create
    srcVolUUID, volPath)
  File "/usr/share/vdsm/storage/blockVolume.py", line 159, in _create
    addTags=mdTags)
  File "/usr/share/vdsm/storage/lvm.py", line 1232, in changeLVTags
    (lvname, ", ".join(addTags), ", ".join(delTags), err[-1]))
.
.
5ace41e8-feb4-43e0-8f47-5a5281001322::DEBUG::2015-04-21 20:33:11,811::blockVolume::594::Storage.Misc.excCmd::(getMetadata) FAILED: <err> = "/usr/bin/dd: error reading '/dev/fd8aa916-0b31-4d2e-99ef-f218967e93e1/m
etadata': Input/output error\n0+0 records in\n0+0 records out\n0 bytes (0 B) copied, 0.000234782 s, 0.0 kB/s\n"; <rc> = 1
5ace41e8-feb4-43e0-8f47-5a5281001322::ERROR::2015-04-21 20:33:11,811::blockVolume::605::Storage.Volume::(getMetadata) Internal block device read failure: 'name=/dev/fd8aa916-0b31-4d2e-99ef-f218967e93e1/metadata, offset=6144, size=512'
Traceback (most recent call last):
  File "/usr/share/vdsm/storage/blockVolume.py", line 594, in getMetadata
    offs * VOLUME_METASIZE, VOLUME_METASIZE)
  File "/usr/share/vdsm/storage/misc.py", line 251, in readblock
    raise se.MiscBlockReadException(name, offset, size)
MiscBlockReadException: Internal block device read failure: 'name=/dev/fd8aa916-0b31-4d2e-99ef-f218967e93e1/metadata, offset=6144, size=512'
5ace41e8-feb4-43e0-8f47-5a5281001322::ERROR::2015-04-21 20:33:11,812::task::866::Storage.TaskManager.Task::(_setError) Task=`5ace41e8-feb4-43e0-8f47-5a5281001322`::Unexpected error
Traceback (most recent call last):
.
.
7e5db0a8-4886-4dad-bddb-57f04950240f::ERROR::2015-04-21 20:33:13,881::volume::473::Storage.Volume::(create) Unexpected error
Traceback (most recent call last):
  File "/usr/share/vdsm/storage/volume.py", line 435, in create
    srcVolUUID, volPath)
  File "/usr/share/vdsm/storage/blockVolume.py", line 159, in _create
    addTags=mdTags)
  File "/usr/share/vdsm/storage/lvm.py", line 1232, in changeLVTags
    (lvname, ", ".join(addTags), ", ".join(delTags), err[-1]))
`.
.
7e5db0a8-4886-4dad-bddb-57f04950240f::ERROR::2015-04-21 20:33:13,956::blockVolume::605::Storage.Volume::(getMetadata) Internal block device read failure: 'name=/dev/fd8aa916-0b31-4d2e-99ef-f218967e93e1/metadata, offset=6656, size=512'
Traceback (most recent call last):
  File "/usr/share/vdsm/storage/blockVolume.py", line 594, in getMetadata
    offs * VOLUME_METASIZE, VOLUME_METASIZE)
  File "/usr/share/vdsm/storage/misc.py", line 251, in readblock
    raise se.MiscBlockReadException(name, offset, size)
MiscBlockReadException: Internal block device read failure: 'name=/dev/fd8aa916-0b31-4d2e-99ef-f218967e93e1/metadata, offset=6656, size=512'
7e5db0a8-4886-4dad-bddb-57f04950240f::ERROR::2015-04-21 20:33:13,957::task::866::Storage.TaskManager.Task::(_setError) Task=`7e5db0a8-4886-4dad-bddb-57f04950240f`::Unexpected error
Traceback (most recent call last):
.
.
Thread-116487::ERROR::2015-04-21 20:33:15,088::misc::210::Storage.Misc::(readspeed) Unable to read file '/dev/fd8aa916-0b31-4d2e-99ef-f218967e93e1/metadata'
Thread-116487::ERROR::2015-04-21 20:33:15,089::domainMonitor::256::Storage.DomainMonitorThread::(_monitorDomain) Error while collecting domain fd8aa916-0b31-4d2e-99ef-f218967e93e1 monitoring information
Traceback (most recent call last):
.
.
Thread-40::ERROR::2015-04-21 20:33:22,998::lvm::404::Storage.LVM::(_reloadvgs) vg 88eb14b4-a7fc-40ab-be6b-e2ab7bc31dbc has pv_count 3 but pv_names ('/dev/mapper/360060160f4a030002c5dc169973de411', '/dev/mapper/36006016012823000387933e01ab5e311')
Thread-40::DEBUG::2015-04-21 20:33:22,998::lvm::416::Storage.OperationMutex::(_reloadvgs) Operation 'lvm reload operation' released the operation mutex
Thread-40::ERROR::2015-04-21 20:33:22,999::domainMonitor::256::Storage.DomainMonitorThread::(_monitorDomain) Error while collecting domain 88eb14b4-a7fc-40ab-be6b-e2ab7bc31dbc monitoring information
Traceback (most recent call last):
  File "/usr/share/vdsm/storage/domainMonitor.py", line 232, in _monitorDomain
    self.domain.selftest()
  File "/usr/share/vdsm/storage/blockSD.py", line 833, in selftest
    raise se.StorageDomainAccessError(self.sdUUID)
StorageDomainAccessError: Domain is either partially accessible or entirely inaccessible: ('88eb14b4-a7fc-40ab-be6b-e2ab7bc31dbc',)
Dummy-102::DEBUG::2015-04-21 20:33:23,831::storage_mailbox::731::Storage.Misc.excCmd::(_checkForMail) dd if=/rhev/data-center/6d96f52d-d791-4f66-83bd-2553ca0f3012/mastersd/dom_md/inbox iflag=direct,fullblock count=1 bs=1024000 (cwd None)
Dummy-102::DEBUG::2015-04-21 20:33:23,855::storage_mailbox::731::Storage.Misc.excCmd::(_checkForMail) SUCCESS: <err> = '1+0 records in\n1+0 records out\n1024000 bytes (1.0 MB) copied, 0.013281 s, 77.1 MB/s\n'; <rc> = 0
JsonRpc (StompReactor)::DEBUG::2015-04-21 20:33:24,240::stompReactor::98::Broker.StompAdapter::(handle_frame) Handling message <StompFrame command='SEND'>
JsonRpcServer::DEBUG::2015-04-21 20:33:24,241::__init__::506::jsonrpc.JsonRpcServer::(serve_requests) Waiting for request
Thread-118153::DEBUG::2015-04-21 20:33:24,244::stompReactor::163::yajsonrpc.StompServer::(send) Sending response
Thread-116487::DEBUG::2015-04-21 20:33:25,092::blockSD::596::Storage.Misc.excCmd::(getReadDelay) /usr/bin/dd if=/dev/fd8aa916-0b31-4d2e-99ef-f218967e93e1/metadata iflag=direct of=/dev/null bs=4096 count=1 (cwd None)
Thread-116487::DEBUG::2015-04-21 20:33:25,104::blockSD::596::Storage.Misc.excCmd::(getReadDelay) FAILED: <err> = "/usr/bin/dd: error reading '/dev/fd8aa916-0b31-4d2e-99ef-f218967e93e1/metadata': Input/output error\n0+0 records in\n0+0 records out\n0 bytes (0 B) copied, 0.000246275 s, 0.0 kB/s\n"; <rc> = 1
Thread-116487::ERROR::2015-04-21 20:33:25,105::misc::210::Storage.Misc::(readspeed) Unable to read file '/dev/fd8aa916-0b31-4d2e-99ef-f218967e93e1/metadata'
Thread-116487::ERROR::2015-04-21 20:33:25,105::domainMonitor::256::Storage.DomainMonitorThread::(_monitorDomain) Error while collecting domain fd8aa916-0b31-4d2e-99ef-f218967e93e1 monitoring information
Traceback (most recent call last):






Thread-118150::DEBUG::2015-04-21 20:33:18,052::taskManager::171::Storage.TaskManager::(clearTask) Entry. taskID: 5ace41e8-feb4-43e0-8f47-5a5281001322

Comment 1 Kevin Alon Goldblatt 2015-04-22 13:43:13 UTC

Created attachment 1017453 [details]
engine, server, vdsm and libvirt logs

Adding logs
NOTE THAT THERE IS A 3 HOUR TIME DIFFERENCE BETWEEN THE ENGINE AND HOST LOGS

Comment 2 Allon Mureinik 2015-04-26 09:13:38 UTC

Adam, can you take a look please?

Comment 3 Adam Litke 2015-08-26 19:54:03 UTC

Ok, I think I finally understand what's happening in this bug:

During live storage migration, we create an automatic snapshot to reduce the amount of data that the qemu process must synchronize.

Thread-118033::DEBUG::2015-04-21 20:30:34,918::__init__::469::jsonrpc.JsonRpcServer::(_serveRequest) Calling 'Volume.create' in bridge with {u'preallocate': 2, u'volFormat': 4, u'srcImgUUID': u'1c9b5c2e-23ea-480c-b67b-50b969faf021', u'volumeID': u'63335b3b-6d82-446c-bb16-4ed9a6b20207', u'imageID': u'1c9b5c2e-23ea-480c-b67b-50b969faf021', u'storagepoolID': u'6d96f52d-d791-4f66-83bd-2553ca0f3012', u'storagedomainID': u'fd8aa916-0b31-4d2e-99ef-f218967e93e1', u'desc': u'', u'diskType': 2, u'srcVolUUID': u'c2a1254b-58dd-4d4c-ac0d-275093c99d05', u'size': u'3221225472'}

Part of creating the snapshot volume is to mark the old leaf as internal:

  See blockVolume.BlockVolume._create()
    volParent.clone(volPath, volFormat)
      self.setInternal()

Some time later we experience a failure to access storage

5ace41e8-feb4-43e0-8f47-5a5281001322::ERROR::2015-04-21 20:33:11,725::task::866::Storage.TaskManager.Task::(_setError) Task=`5ace41e8-feb4-43e0-8f47-5a5281001322`::Unexpected error
Traceback (most recent call last):
  File "/usr/share/vdsm/storage/task.py", line 873, in _run
    return fn(*args, **kargs)
  File "/usr/share/vdsm/storage/task.py", line 334, in run
    return self.cmd(*self.argslist, **self.argsdict)
  File "/usr/share/vdsm/storage/securable.py", line 77, in wrapper
    return method(self, *args, **kwargs)
  File "/usr/share/vdsm/storage/sp.py", line 1821, in createVolume
    desc=desc, srcImgUUID=srcImgUUID, srcVolUUID=srcVolUUID)
  File "/usr/share/vdsm/storage/sd.py", line 430, in createVolume
    preallocate, diskType, volUUID, desc, srcImgUUID, srcVolUUID)
  File "/usr/share/vdsm/storage/volume.py", line 435, in create
    srcVolUUID, volPath)
  File "/usr/share/vdsm/storage/blockVolume.py", line 159, in _create
    addTags=mdTags)
  File "/usr/share/vdsm/storage/lvm.py", line 1232, in changeLVTags
    (lvname, ", ".join(addTags), ", ".join(delTags), err[-1]))
LogicalVolumeReplaceTagError: Replace Logical Volume tag error: (u'lv: `fd8aa916-0b31-4d2e-99ef-f218967e93e1/63335b3b-6d82-446c-bb16-4ed9a6b20207` add: `PU_c2a1254b-58dd-4d4c-ac0d-275093c99d05, MD_10, IU_1c9b5c2e-23ea-480c-b67b-50b969faf021` del: `OVIRT_VOL_INITIALIZING` (  Cannot process volume group fd8aa916-0b31-4d2e-99ef-f218967e93e1)',)

This causes the createVolume task to fail and start recovery

5ace41e8-feb4-43e0-8f47-5a5281001322::DEBUG::2015-04-21 20:33:11,725::task::885::Storage.TaskManager.Task::(_run) Task=`5ace41e8-feb4-43e0-8f47-5a5281001322`::Task._run: 5ace41e8-feb4-43e0-8f47-5a5281001322 () {} failed - stopping task

Part of recovery is supposed to fix the parent volume (making it a leaf again)

  See parentVolumeRollback()

But since storage cannot be accessed, recovery fails

5ace41e8-feb4-43e0-8f47-5a5281001322::ERROR::2015-04-21 20:33:11,812::task::866::Storage.TaskManager.Task::(_setError) Task=`5ace41e8-feb4-43e0-8f47-5a5281001322`::Unexpected error
Traceback (most recent call last):
  File "/usr/share/vdsm/storage/task.py", line 873, in _run
    return fn(*args, **kargs)
  File "/usr/share/vdsm/storage/task.py", line 403, in run
    return function(self.task, *argslist)
  File "/usr/share/vdsm/storage/volume.py", line 317, in parentVolumeRollback
    pvol.prepare()
  File "/usr/share/vdsm/storage/volume.py", line 764, in prepare
    if not self.isLegal():
  File "/usr/share/vdsm/storage/volume.py", line 700, in isLegal
    legality = self.getMetaParam(LEGALITY)
  File "/usr/share/vdsm/storage/volume.py", line 918, in getMetaParam
    meta = self.getMetadata()
  File "/usr/share/vdsm/storage/blockVolume.py", line 606, in getMetadata
    raise se.VolumeMetadataReadError("%s: %s" % (metaId, e))
VolumeMetadataReadError: Error while processing volume meta data: ("('fd8aa916-0b31-4d2e-99ef-f218967e93e1', 12): Internal block device read failure: 'name=/dev/fd8aa916-0b31-4d2e-99ef-f218967e93e1/metadata, offset=6144, size=512'",)

The image is now left without a leaf volume in the chain.  Later we attempt a live merge.  

Thread-118301::DEBUG::2015-04-21 20:37:13,728::__init__::469::jsonrpc.JsonRpcServer::(_serveRequest) Calling 'VM.merge' in bridge with {u'topVolUUID': u'cf807726-3843-47e1-bf5b-ff0e81d51519', u'vmID': u'64b9a769-753a-4a51-9741-0f93f3b4f060', u'drive': {u'domainID': u'fd8aa916-0b31-4d2e-99ef-f218967e93e1', u'volumeID': u'c2a1254b-58dd-4d4c-ac0d-275093c99d05', u'poolID': u'6d96f52d-d791-4f66-83bd-2553ca0f3012', u'imageID': u'1c9b5c2e-23ea-480c-b67b-50b969faf021'}, u'bandwidth': u'0', u'jobUUID': u'76c42850-9bad-4bfc-8867-e33b921c9995', u'baseVolUUID': u'2c6abcc8-4a7f-4f7c-bdfa-c1d1430f7a56'}

The merge starts and libvirt finishes all of it's work and it's time to synchronize the new chain to storage.

  See LiveMergeCleanupThread and imageSyncVolumeChain

But this fails because we cannot lock the image:

Thread-118305::ERROR::2015-04-21 20:37:15,505::task::866::Storage.TaskManager.Task::(_setError) Task=`ade72a05-082e-462b-b3d8-d27127d57604`::Unexpected error
Traceback (most recent call last):
  File "/usr/share/vdsm/storage/task.py", line 873, in _run
    return fn(*args, **kargs)
  File "/usr/share/vdsm/logUtils.py", line 45, in wrapper
    res = f(*args, **kwargs)
  File "/usr/share/vdsm/storage/hsm.py", line 1839, in imageSyncVolumeChain
    rm.LockType.shared):
  File "/usr/share/vdsm/storage/resourceManager.py", line 523, in acquireResource
    raise se.ResourceAcqusitionFailed()
ResourceAcqusitionFailed: Could not acquire resource. Probably resource factory threw an exception.: ()

Which happens because Image.getChain() cannot find the leaf volume.

Thread-118305::ERROR::2015-04-21 20:37:15,476::image::201::Storage.Image::(getChain) There is no leaf in the image 1c9b5c2e-23ea-480c-b67b-50b969faf021
Thread-118305::WARNING::2015-04-21 20:37:15,478::resourceManager::591::Storage.ResourceManager::(registerResource) Resource factory failed to create resource 'fd8aa916-0b31-4d2e-99ef-f218967e93e1_imageNS.1c9b5c2e-23ea-480c-b67b-50b969faf021'. Canceling request.
Traceback (most recent call last):
  File "/usr/share/vdsm/storage/resourceManager.py", line 587, in registerResource
    obj = namespaceObj.factory.createResource(name, lockType)
  File "/usr/share/vdsm/storage/resourceFactories.py", line 193, in createResource
    lockType)
  File "/usr/share/vdsm/storage/resourceFactories.py", line 122, in __getResourceCandidatesList
    imgUUID=resourceName)
  File "/usr/share/vdsm/storage/image.py", line 202, in getChain
    raise se.ImageIsNotLegalChain(imgUUID)
ImageIsNotLegalChain: Image is not a legal chain: (u'1c9b5c2e-23ea-480c-b67b-50b969faf021',)


The merge job ends without updating anything and engine treats this as a live merge failure.  Sadly, the merge actually did complete according to libvirt and qemu but we are now out of sync.

Comment 4 Adam Litke 2015-08-26 20:00:15 UTC

I'd like to note that using a garbage collection approach instead of rollbacks would resolve this problem since the bad state would be cleaned up as soon as storage comes back online.  We will have this in place as part of my SPM removal work :)

I'm not sure if there's a good way to fix this problem in 3.5.x.  It would be far too invasive to rerun recoveries when they fail with exceptions.

Comment 5 Adam Litke 2015-08-26 20:07:58 UTC

The patch at http://gerrit.ovirt.org/45387 makes sure we handle errors when running imageSyncVolumeChain.  This will correct some of the problematic behavior by not allowing the live merge job to resolve until imageSyncVolumeChain completes successfully.  It will not prevent the image from being corrupted by the interrupted createVolume operation though.

Comment 6 Allon Mureinik 2015-09-06 09:24:11 UTC

As noted in comment 3, the issue here isn't LSM, it's the failure in taking a snapshot. While the reported scenario indeed involves LSM, it may happen with any failed attempt to create a snapshot (regardless of its "liveliness"), followed by a live merge.

Setting priority to medium as this is an edge case of an edge case.

Adam - could you please provide a manual procedure to recover from such a corruption?

Comment 7 Adam Litke 2015-09-08 20:18:43 UTC

There are several recovery actions which may or may not have been executed to the only real way to recover from this is to examine logs and the actual storage.

For this very specific case the procedure is:

1. Identify the current leaf volume.  It will be the same volume that was the leaf prior to creating the failed snapshot.

2. Modify the volume metadata on storage changing VOLTYPE=INTERNAL to VOLTYPE=LEAF

For attached and connected file-based domains, the metadata is located in /rhev/data-center/<POOL-ID>/<DOMAIN-ID>/images/<IMAGE-ID>/<VOLUME-ID>.meta

For block domains, I don't believe we have a tool to assist with modifying the metadata at this time.

Comment 8 Sandro Bonazzola 2015-10-26 12:38:38 UTC

this is an automated message. oVirt 3.6.0 RC3 has been released and GA is targeted to next week, Nov 4th 2015.
Please review this bug and if not a blocker, please postpone to a later release.
All bugs not postponed on GA release will be automatically re-targeted to

- 3.6.1 if severity >= high
- 4.0 if severity < high

Comment 9 Red Hat Bugzilla Rules Engine 2015-12-02 00:03:47 UTC

Bug tickets must have version flags set prior to targeting them to a release. Please ask maintainer to set the correct version flags and only then set the target milestone.

Comment 10 Red Hat Bugzilla Rules Engine 2015-12-02 00:03:47 UTC

Target release should be placed once a package build is known to fix a issue. Since this bug is not modified, the target version has been reset. Please use target milestone to plan a fix for a oVirt release.

Comment 11 Sandro Bonazzola 2016-05-02 09:56:17 UTC

Moving from 4.0 alpha to 4.0 beta since 4.0 alpha has been already released and bug is not ON_QA.

Comment 12 Yaniv Lavi 2016-05-23 13:17:39 UTC

oVirt 4.0 beta has been released, moving to RC milestone.

Comment 13 Yaniv Lavi 2016-05-23 13:24:10 UTC

oVirt 4.0 beta has been released, moving to RC milestone.

Comment 14 Yaniv Lavi 2016-05-30 11:28:58 UTC

Should this be on MODIFIED or ON_QA?

Comment 15 Ala Hino 2016-05-30 11:57:56 UTC

Neither; I still didn't investigate

Comment 16 Allon Mureinik 2016-06-05 11:47:10 UTC

As explained in comments 3 and 7, this is a double (if not triple) negative flow, and we don't really have a good solutions for it on top of the patches already provided.