Bug 996945 - Broken snapshot can not be deleted.
Summary: Broken snapshot can not be deleted.
Keywords:
Status: CLOSED DUPLICATE of bug 1056935
Alias: None
Product: Red Hat Enterprise Virtualization Manager
Classification: Red Hat
Component: vdsm
Version: 3.3.0
Hardware: Unspecified
OS: Unspecified
high
high
Target Milestone: ---
: 3.4.0
Assignee: Daniel Erez
QA Contact: Aharon Canan
URL:
Whiteboard: storage
Depends On:
Blocks:
TreeView+ depends on / blocked
 
Reported: 2013-08-14 10:40 UTC by Leonid Natapov
Modified: 2016-02-10 16:38 UTC (History)
9 users (show)

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2014-03-27 09:57:21 UTC
oVirt Team: Storage
Target Upstream Version:
Embargoed:


Attachments (Terms of Use)
logs (3.79 MB, application/x-gzip)
2013-08-14 10:43 UTC, Leonid Natapov
no flags Details
logs (1016.00 KB, application/x-tar)
2013-08-25 14:31 UTC, Leonid Natapov
no flags Details


Links
System ID Private Priority Status Summary Last Updated
oVirt gerrit 27611 0 master MERGED core: avoid broken status on snapshot removal Never

Description Leonid Natapov 2013-08-14 10:40:17 UTC
Description of problem:

I was in the process of deleting snapshot and at the same time executed ovirt-engine restart. As a result snapshot was not deleted and was marked as BROKEN.
After engine restarted I tried to delete this broken snapshot and failed. 
Got the following error message from the backend: Failed to delete snapshot 

vdsm log shows the following error:

85e', Clearing records.
69288a27-a44d-4036-bec9-31ac210b092e::DEBUG::2013-08-14 13:32:42,735::resourceManager::648::ResourceManager::(releaseResource) No one is waiting for resource 'aff59543-f54c-4580-91f9-722a0b834df2_imageNS.fc3c8116-4f51-48b4-99a0-a095c22caf08', Clearing records.
69288a27-a44d-4036-bec9-31ac210b092e::ERROR::2013-08-14 13:32:42,735::task::850::TaskManager.Task::(_setError) Task=`69288a27-a44d-4036-bec9-31ac210b092e`::Unexpected error
Traceback (most recent call last):
  File "/usr/share/vdsm/storage/task.py", line 857, in _run
    return fn(*args, **kargs)
  File "/usr/share/vdsm/storage/task.py", line 318, in run
    return self.cmd(*self.argslist, **self.argsdict)
  File "/usr/share/vdsm/storage/securable.py", line 68, in wrapper
    return f(self, *args, **kwargs)
  File "/usr/share/vdsm/storage/sp.py", line 1937, in mergeSnapshots
    sdUUID, vmUUID, imgUUID, ancestor, successor, postZero)
  File "/usr/share/vdsm/storage/image.py", line 1091, in merge
    dstVol = vols[ancestor]
KeyError: '7c945ef7-3925-46b9-bc8a-b233187edc4b'


Version-Release number of selected component (if applicable):


How reproducible:
100%

Steps to Reproduce:
1.Create VM with thin disk.
2.Create snapshot.
3.Delet snapshot and at the same time do restart to the engine.
4.After engine restarted try to delete BROKEN snapshot.

Actual results:
deletion failed.

Expected results:
Deletion succeeded.

Additional info:
vdsm-4.12.0-52.gitce029ba.el6ev.x86_64
engine - is9.
vdsm log and angine log attached.

Comment 1 Leonid Natapov 2013-08-14 10:43:57 UTC
Created attachment 786494 [details]
logs

Comment 4 Sergey Gotliv 2013-08-24 19:43:27 UTC
I can't open engine log. Do you have a way to reproduce this bug and attach appropriate logs?

Comment 5 Leonid Natapov 2013-08-25 14:02:12 UTC
I reproduced it on is11. 
Here is the engine.log output while trying to delete broken snapshot.
Come to me for more info.

2013-08-25 16:57:50,492 INFO  [org.ovirt.engine.core.bll.RemoveSnapshotCommand] (ajp-/127.0.0.1:8702-3) Lock Acquired to object EngineLock [exclusiveLocks= key: a6065a5c-80f1-4df1-b327-895e3516ae0c value: VM
, sharedLocks= ]
2013-08-25 16:57:50,684 INFO  [org.ovirt.engine.core.bll.RemoveSnapshotCommand] (ajp-/127.0.0.1:8702-3) Running command: RemoveSnapshotCommand internal: false. Entities affected :  ID: a6065a5c-80f1-4df1-b327-895e3516ae0c Type: VM
2013-08-25 16:57:50,712 INFO  [org.ovirt.engine.core.bll.RemoveSnapshotCommand] (ajp-/127.0.0.1:8702-3) Lock freed to object EngineLock [exclusiveLocks= key: a6065a5c-80f1-4df1-b327-895e3516ae0c value: VM
, sharedLocks= ]
2013-08-25 16:57:50,764 INFO  [org.ovirt.engine.core.bll.RemoveSnapshotSingleDiskCommand] (ajp-/127.0.0.1:8702-3) Running command: RemoveSnapshotSingleDiskCommand internal: true. Entities affected :  ID: 00000000-0000-0000-0000-000000000000 Type: Storage
2013-08-25 16:57:50,804 INFO  [org.ovirt.engine.core.vdsbroker.irsbroker.MergeSnapshotsVDSCommand] (ajp-/127.0.0.1:8702-3) START, MergeSnapshotsVDSCommand( storagePoolId = 5849b030-626e-47cb-ad90-3ce782d831b3, ignoreFailoverLimit = false, storageDomainId = df72dcde-837d-4a82-8427-a7e77d2e239e, imageGroupId = 3854ef9f-c542-45e0-b712-4dd43cc7b50e, imageId = f4444ad9-a7ac-43ca-9c54-21c595e74e34, imageId2 = afbe499b-ea35-4d9c-b47c-1bb31cd170e7, vmId = a6065a5c-80f1-4df1-b327-895e3516ae0c, postZero = false), log id: 390681a0
2013-08-25 16:57:51,512 INFO  [org.ovirt.engine.core.vdsbroker.irsbroker.MergeSnapshotsVDSCommand] (ajp-/127.0.0.1:8702-3) FINISH, MergeSnapshotsVDSCommand, log id: 390681a0
2013-08-25 16:57:51,523 INFO  [org.ovirt.engine.core.bll.CommandAsyncTask] (ajp-/127.0.0.1:8702-3) CommandAsyncTask::Adding CommandMultiAsyncTasks object for command 5ca1ec86-3cfc-4250-bc6c-485397e28855
2013-08-25 16:57:51,524 INFO  [org.ovirt.engine.core.bll.CommandMultiAsyncTasks] (ajp-/127.0.0.1:8702-3) CommandMultiAsyncTasks::AttachTask: Attaching task d16685eb-ee34-4555-9068-52ae8441895a to command 5ca1ec86-3cfc-4250-bc6c-485397e28855.
2013-08-25 16:57:51,542 INFO  [org.ovirt.engine.core.bll.AsyncTaskManager] (ajp-/127.0.0.1:8702-3) Adding task d16685eb-ee34-4555-9068-52ae8441895a (Parent Command RemoveSnapshot, Parameters Type org.ovirt.engine.core.common.asynctasks.AsyncTaskParameters), polling hasn't started yet..
2013-08-25 16:57:51,564 INFO  [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector] (ajp-/127.0.0.1:8702-3) Correlation ID: 8e28177, Job ID: ecfd1159-bdee-4eab-b380-2cd784abf40e, Call Stack: null, Custom Event ID: -1, Message: Snapshot '2' deletion for VM 'lesik1' was initiated by admin@internal.
2013-08-25 16:57:51,565 INFO  [org.ovirt.engine.core.bll.SPMAsyncTask] (ajp-/127.0.0.1:8702-3) BaseAsyncTask::StartPollingTask: Starting to poll task d16685eb-ee34-4555-9068-52ae8441895a.
2013-08-25 16:57:52,250 INFO  [org.ovirt.engine.core.bll.AsyncTaskManager] (DefaultQuartzScheduler_Worker-31) Polling and updating Async Tasks: 2 tasks, 1 tasks to poll now
2013-08-25 16:57:52,322 INFO  [org.ovirt.engine.core.bll.SPMAsyncTask] (DefaultQuartzScheduler_Worker-31) SPMAsyncTask::PollTask: Polling task d16685eb-ee34-4555-9068-52ae8441895a (Parent Command RemoveSnapshot, Parameters Type org.ovirt.engine.core.common.asynctasks.AsyncTaskParameters) returned status running.
2013-08-25 16:57:52,322 INFO  [org.ovirt.engine.core.bll.AsyncTaskManager] (DefaultQuartzScheduler_Worker-31) Finished polling Tasks, will poll again in 10 seconds.
2013-08-25 16:58:02,034 INFO  [org.ovirt.engine.core.bll.AsyncTaskManager] (DefaultQuartzScheduler_Worker-45) Setting new tasks map. The map contains now 1 tasks
2013-08-25 16:58:02,323 INFO  [org.ovirt.engine.core.bll.AsyncTaskManager] (DefaultQuartzScheduler_Worker-44) Polling and updating Async Tasks: 1 tasks, 1 tasks to poll now
2013-08-25 16:58:02,334 ERROR [org.ovirt.engine.core.vdsbroker.vdsbroker.HSMGetAllTasksStatusesVDSCommand] (DefaultQuartzScheduler_Worker-44) Failed in HSMGetAllTasksStatusesVDS method
2013-08-25 16:58:02,334 ERROR [org.ovirt.engine.core.vdsbroker.vdsbroker.HSMGetAllTasksStatusesVDSCommand] (DefaultQuartzScheduler_Worker-44) Error code GeneralException and error message VDSGenericException: VDSErrorException: Failed to HSMGetAllTasksStatusesVDS, error = 'f4444ad9-a7ac-43ca-9c54-21c595e74e34'
2013-08-25 16:58:02,334 INFO  [org.ovirt.engine.core.bll.SPMAsyncTask] (DefaultQuartzScheduler_Worker-44) SPMAsyncTask::PollTask: Polling task d16685eb-ee34-4555-9068-52ae8441895a (Parent Command RemoveSnapshot, Parameters Type org.ovirt.engine.core.common.asynctasks.AsyncTaskParameters) returned status finished, result 'cleanSuccess'.
2013-08-25 16:58:02,340 ERROR [org.ovirt.engine.core.bll.SPMAsyncTask] (DefaultQuartzScheduler_Worker-44) BaseAsyncTask::LogEndTaskFailure: Task d16685eb-ee34-4555-9068-52ae8441895a (Parent Command RemoveSnapshot, Parameters Type org.ovirt.engine.core.common.asynctasks.AsyncTaskParameters) ended with failure:
-- Result: cleanSuccess
-- Message: VDSGenericException: VDSErrorException: Failed to HSMGetAllTasksStatusesVDS, error = 'f4444ad9-a7ac-43ca-9c54-21c595e74e34',
-- Exception: VDSGenericException: VDSErrorException: Failed to HSMGetAllTasksStatusesVDS, error = 'f4444ad9-a7ac-43ca-9c54-21c595e74e34'
2013-08-25 16:58:02,341 INFO  [org.ovirt.engine.core.bll.CommandAsyncTask] (DefaultQuartzScheduler_Worker-44) CommandAsyncTask::EndActionIfNecessary: All tasks of command 5ca1ec86-3cfc-4250-bc6c-485397e28855 has ended -> executing EndAction
2013-08-25 16:58:02,342 INFO  [org.ovirt.engine.core.bll.CommandAsyncTask] (DefaultQuartzScheduler_Worker-44) CommandAsyncTask::EndAction: Ending action for 1 tasks (command ID: 5ca1ec86-3cfc-4250-bc6c-485397e28855): calling EndAction .
2013-08-25 16:58:02,342 INFO  [org.ovirt.engine.core.bll.CommandAsyncTask] (pool-5-thread-49) CommandAsyncTask::EndCommandAction [within thread] context: Attempting to EndAction RemoveSnapshot, executionIndex: 0
2013-08-25 16:58:02,354 ERROR [org.ovirt.engine.core.bll.RemoveSnapshotCommand] (pool-5-thread-49) [8e28177] Ending command with failure: org.ovirt.engine.core.bll.RemoveSnapshotCommand
2013-08-25 16:58:02,399 ERROR [org.ovirt.engine.core.bll.RemoveSnapshotSingleDiskCommand] (pool-5-thread-49) [7a52ebe0] Ending command with failure: org.ovirt.engine.core.bll.RemoveSnapshotSingleDiskCommand
2013-08-25 16:58:02,429 INFO  [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector] (pool-5-thread-49) Correlation ID: 8e28177, Job ID: ecfd1159-bdee-4eab-b380-2cd784abf40e, Call Stack: null, Custom Event ID: -1, Message: Failed to delete snapshot '2' for VM 'lesik1'.
2013-08-25 16:58:02,429 INFO  [org.ovirt.engine.core.bll.CommandAsyncTask] (pool-5-thread-49) CommandAsyncTask::HandleEndActionResult [within thread]: EndAction for action type RemoveSnapshot completed, handling the result.
2013-08-25 16:58:02,430 INFO  [org.ovirt.engine.core.bll.CommandAsyncTask] (pool-5-thread-49) CommandAsyncTask::HandleEndActionResult [within thread]: EndAction for action type RemoveSnapshot succeeded, clearing tasks.
2013-08-25 16:58:02,440 INFO  [org.ovirt.engine.core.bll.SPMAsyncTask] (pool-5-thread-49) SPMAsyncTask::ClearAsyncTask: Attempting to clear task d16685eb-ee34-4555-9068-52ae8441895a
2013-08-25 16:58:02,441 INFO  [org.ovirt.engine.core.vdsbroker.irsbroker.SPMClearTaskVDSCommand] (pool-5-thread-49) START, SPMClearTaskVDSCommand( storagePoolId = 5849b030-626e-47cb-ad90-3ce782d831b3, ignoreFailoverLimit = false, taskId = d16685eb-ee34-4555-9068-52ae8441895a), log id: 1f9f75fa
2013-08-25 16:58:02,443 INFO  [org.ovirt.engine.core.vdsbroker.vdsbroker.HSMClearTaskVDSCommand] (pool-5-thread-49) START, HSMClearTaskVDSCommand(HostName = purple-vds3, HostId = a45793c0-8c8b-49a1-9039-3f841c87c342, taskId=d16685eb-ee34-4555-9068-52ae8441895a), log id: 1dfa804e
2013-08-25 16:58:02,469 INFO  [org.ovirt.engine.core.vdsbroker.vdsbroker.HSMClearTaskVDSCommand] (pool-5-thread-49) FINISH, HSMClearTaskVDSCommand, log id: 1dfa804e
2013-08-25 16:58:02,469 INFO  [org.ovirt.engine.core.vdsbroker.irsbroker.SPMClearTaskVDSCommand] (pool-5-thread-49) FINISH, SPMClearTaskVDSCommand, log id: 1f9f75fa
2013-08-25 16:58:02,471 INFO  [org.ovirt.engine.core.bll.SPMAsyncTask] (pool-5-thread-49) BaseAsyncTask::RemoveTaskFromDB: Removed task d16685eb-ee34-4555-9068-52ae8441895a from DataBase
2013-08-25 16:58:02,471 INFO  [org.ovirt.engine.core.bll.CommandAsyncTask] (pool-5-thread-49) CommandAsyncTask::HandleEndActionResult [within thread]: Removing CommandMultiAsyncTasks object for entity 5ca1ec86-3cfc-4250-bc6c-485397e2885

Comment 6 Leonid Natapov 2013-08-25 14:31:45 UTC
Created attachment 790094 [details]
logs

new logs

Comment 9 Ayal Baron 2013-12-18 09:59:08 UTC
Daniel, would this bug even be relevant in light of single disk snapshots

Comment 12 Daniel Erez 2014-03-27 09:57:21 UTC

*** This bug has been marked as a duplicate of bug 1056935 ***


Note You need to log in before you can comment on or make changes to this bug.