Bug 1160336 - Live Merge of snapshots with nfs disks fails when Restarting vdsm during the delete operation
Keywords:
Status: CLOSED DUPLICATE of bug 1155583
Alias: None
Product: Red Hat Enterprise Virtualization Manager
Classification: Red Hat
Component: ovirt-engine
Version: 3.5.0
Hardware: x86_64
OS: Unspecified
Priority: unspecified
Severity: urgent
Target Milestone: ---
Target Release: 3.5.1
Assignee: Adam Litke
QA Contact:
URL:
Whiteboard: storage
Depends On:
Blocks:
 
Reported: 2014-11-04 15:07 UTC by Kevin Alon Goldblatt
Modified: 2016-02-10 16:51 UTC
CC: 12 users

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2014-11-04 18:36:09 UTC
oVirt Team: Storage
Target Upstream Version:
Embargoed:


Attachments (Terms of Use)
engine vdsm and server logs (1.10 MB, application/x-gzip)
2014-11-04 15:07 UTC, Kevin Alon Goldblatt

Description Kevin Alon Goldblatt 2014-11-04 15:07:15 UTC
Created attachment 953651 [details]
engine vdsm and server logs

Description of problem:
Restarting the vdsm service during a live merge operation causes the operation to fail.

On a live VM with 4 snapshots, the 2nd and 3rd snapshots were deleted successfully, but deleting the first snapshot created failed.

Version-Release number of selected component (if applicable):
rhevm-3.5.0-0.17.beta.el6ev.noarch
vdsm-4.16.7.2-1.el7.x86_64

How reproducible:
100%

Steps to Reproduce:
1. Create a VM with 4 disks (2 thin-provisioned and 2 preallocated NFS disks)
2. Create snapshots 1, 2, 3, and 4
3. Delete snapshot 2; during the delete and live merge operation, restart vdsm >>> the operation fails

Actual results:
Failed to delete snapshot

Expected results:
The delete snapshot operation should resume and complete after vdsm restarts

Additional info:
From engine.log:
-------------------------
2014-11-03 18:29:29,570 INFO  [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector] (ajp-/127.0.0.1:8702-9) Correlation ID: 4480deef, Job ID: 3440f909-c147-4ae4-abc1-7504e42ed28f, Call Stack: null, Custom Event ID: -1, Message: Snapshot 'snap2' deletion for VM 'nfs4' was initiated by admin.
2014-11-03 18:29:29,582 INFO  [org.ovirt.engine.core.vdsbroker.irsbroker.DeleteImageGroupVDSCommand] (pool-6-thread-3) [2dc9f734] START, DeleteImageGroupVDSCommand( storagePoolId = b2da04d8-13d3-4a4c-9e91-e6ee493c4fdd, ignoreFailoverLimit = false, storageDomainId = b473691a-e6f6-4c44-81cb-6bee354608fd, imageGroupId = f959c800-81e5-4578-ab34-ecca4e34b59b, postZeros = false, forceDelete = false), log id: 75df3a82
2014-11-03 18:29:30,490 INFO  [org.ovirt.engine.core.vdsbroker.irsbroker.DeleteImageGroupVDSCommand] (pool-6-thread-3) [2dc9f734] FINISH, DeleteImageGroupVDSCommand, log id: 75df3a82
2014-11-03 18:29:30,492 INFO  [org.ovirt.engine.core.bll.tasks.CommandAsyncTask] (pool-6-thread-3) [2dc9f734] CommandAsyncTask::Adding CommandMultiAsyncTasks object for command 45a52e0f-6662-4a77-8bad-ef8975fbf6c9
2014-11-03 18:29:30,492 INFO  [org.ovirt.engine.core.bll.CommandMultiAsyncTasks] (pool-6-thread-3) [2dc9f734] CommandMultiAsyncTasks::AttachTask: Attaching task 8cdc508e-4382-4d27-8a83-8cf219c2234e to command 45a52e0f-6662-4a77-8bad-ef8975fbf6c9.
2014-11-03 18:29:30,497 INFO  [org.ovirt.engine.core.bll.tasks.AsyncTaskManager] (pool-6-thread-3) [2dc9f734] Adding task 8cdc508e-4382-4d27-8a83-8cf219c2234e (Parent Command RemoveMemoryVolumes, Parameters Type org.ovirt.engine.core.common.asynctasks.AsyncTaskParameters), polling hasn't started yet..
2014-11-03 18:29:30,503 INFO  [org.ovirt.engine.core.vdsbroker.irsbroker.DeleteImageGroupVDSCommand] (pool-6-thread-3) [2dc9f734] START, DeleteImageGroupVDSCommand( storagePoolId = b2da04d8-13d3-4a4c-9e91-e6ee493c4fdd, ignoreFailoverLimit = false, storageDomainId = b473691a-e6f6-4c44-81cb-6bee354608fd, imageGroupId = 41b8b33f-219f-4f10-bd57-79252dfa061a, postZeros = false, forceDelete = false), log id: 7e1de327
2014-11-03 18:29:31,337 INFO  [org.ovirt.engine.core.vdsbroker.irsbroker.DeleteImageGroupVDSCommand] (pool-6-thread-3) [2dc9f734] FINISH, DeleteImageGroupVDSCommand, log id: 7e1de327
2014-11-03 18:29:31,338 INFO  [org.ovirt.engine.core.bll.CommandMultiAsyncTasks] (pool-6-thread-3) [2dc9f734] CommandMultiAsyncTasks::AttachTask: Attaching task 7cb80223-e46c-4060-b008-2b106e53e375 to command 45a52e0f-6662-4a77-8bad-ef8975fbf6c9.
.
.
2014-11-03 18:32:42,163 INFO  [org.ovirt.engine.core.bll.RemoveSnapshotCommandCallback] (DefaultQuartzScheduler_Worker-19) [4480deef] All Live Merge child commands have completed, status FAILED
.
.
2014-11-03 18:32:52,233 ERROR [org.ovirt.engine.core.bll.RemoveSnapshotCommand] (DefaultQuartzScheduler_Worker-30) [4480deef] Ending command with failure: org.ovirt.engine.core.bll.RemoveSnapshotCommand
2014-11-03 18:32:52,260 ERROR [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector] (DefaultQuartzScheduler_Worker-30) [4480deef] Correlation ID: 4480deef, Call Stack: null, Custom Event ID: -1, Message: Failed to delete snapshot 'snap2' for VM 'nfs4'


From vdsm.log
--------------------
Thread-6112::INFO::2014-11-03 18:29:52,824::vm::5741::vm.Vm::(merge) vmId=`1b74d641-debf-4fe7-9288-ef31225141df`::Starting merge with jobUUID='a0e0ae74-ec74-4a15-8554-04fd07f70eb9'

See attached logs.

Comment 1 Adam Litke 2014-11-04 18:36:09 UTC
Looking at the logs, I can see that some of the disks were on block devices again.

Thread-6113::DEBUG::2014-11-03 18:29:52,497::vm::5790::vm.Vm::(_diskXMLGetVolumeChainInfo) vmId=`1b74d641-debf-4fe7-9288-ef31225141df`::<allocation/> missing from backing chain for block device vdb
Thread-6113::ERROR::2014-11-03 18:29:52,500::vm::5690::vm.Vm::(merge) vmId=`1b74d641-debf-4fe7-9288-ef31225141df`::merge: libvirt does not support volume chain monitoring.  Unable to perform live merge.


I would recommend retrying your test using only NFS disks until there is a resolution for bug 1155583.

*** This bug has been marked as a duplicate of bug 1155583 ***

