Created attachment 953651 [details] engine vdsm and server logs Description of problem: Restaring the vdsm service during a Live merge operation causes the operation to fail Description of problem: On a LIVE VM with 4 snapshots, successfully deleted 2nd and 3rd snapshots but failed to delete the first snapshot created Version-Release number of selected component (if applicable): rhevm-3.5.0-0.17.beta.el6ev.noarch vdsm-4.16.7.2-1.el7.x86_64 How reproducible: 100% Steps to Reproduce: 1. Create a vm with 4 disks (2 thin provisioned and 2 preallocated nfs disks) 2. Created snapshots 1, 2, 3 and 4 3. Select to delete snapshot 2 and during the delete and live merge operation I restarted the vdsm >>> Operation fails Actual results: Failed to delete snapshot Expected results: Delete snapshot operation should resume and complete after vdsm restarts Additional info: From engine.log: ------------------------- 2014-11-03 18:29:29,570 INFO [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector] (ajp-/127.0.0.1:8702-9) Correlation ID: 4480deef, Job ID: 3440f909-c147-4ae4-abc1-7504e42ed28f, Call Stack: null, Custom Event ID: -1, Message: Snapshot 'snap2' deletion for VM 'nfs4' was initiated by admin. 2014-11-03 18:29:29,582 INFO [org.ovirt.engine.core.vdsbroker.irsbroker.DeleteImageGroupVDSCommand] (pool-6-thread-3) [2dc9f734] START, DeleteImageGroupVDSCommand( storagePoolId = b2da04d8-13d3-4a4c-9e91-e6ee493c4fdd, ignoreFailoverLimit = false, storageDomainId = b473691a-e6f6-4c44-81cb-6bee354608fd, imageGroupId = f959c800-81e5-4578-ab34-ecca4e34b59b, postZeros = false, forceDelete = false), log id: 75df3a82 2014-11-03 18:29:30,490 INFO [org.ovirt.engine.core.vdsbroker.irsbroker.DeleteImageGroupVDSCommand] (pool-6-thread-3) [2dc9f734] FINISH, DeleteImageGroupVDSCommand, log id: 75df3a82 2014-11-03 18:29:30,492 INFO [org.ovirt.engine.core.bll.tasks.CommandAsyncTask] (pool-6-thread-3) [2dc9f734] CommandAsyncTask::Adding CommandMultiAsyncTasks object for command 45a52e0f-6662-4a77-8bad-ef8975fbf6c9 2014-11-03 18:29:30,492 INFO [org.ovirt.engine.core.bll.CommandMultiAsyncTasks] (pool-6-thread-3) [2dc9f734] CommandMultiAsyncTasks::AttachTask: Attaching task 8cdc508e-4382-4d27-8a83-8cf219c2234e to command 45a52e0f-6662-4a77-8bad-ef8975fbf6c9. 2014-11-03 18:29:30,497 INFO [org.ovirt.engine.core.bll.tasks.AsyncTaskManager] (pool-6-thread-3) [2dc9f734] Adding task 8cdc508e-4382-4d27-8a83-8cf219c2234e (Parent Command RemoveMemoryVolumes, Parameters Type org.ovirt.engine.core.common.asynctasks.AsyncTaskParameters), polling hasn't started yet.. 2014-11-03 18:29:30,503 INFO [org.ovirt.engine.core.vdsbroker.irsbroker.DeleteImageGroupVDSCommand] (pool-6-thread-3) [2dc9f734] START, DeleteImageGroupVDSCommand( storagePoolId = b2da04d8-13d3-4a4c-9e91-e6ee493c4fdd, ignoreFailoverLimit = false, storageDomainId = b473691a-e6f6-4c44-81cb-6bee354608fd, imageGroupId = 41b8b33f-219f-4f10-bd57-79252dfa061a, postZeros = false, forceDelete = false), log id: 7e1de327 2014-11-03 18:29:31,337 INFO [org.ovirt.engine.core.vdsbroker.irsbroker.DeleteImageGroupVDSCommand] (pool-6-thread-3) [2dc9f734] FINISH, DeleteImageGroupVDSCommand, log id: 7e1de327 2014-11-03 18:29:31,338 INFO [org.ovirt.engine.core.bll.CommandMultiAsyncTasks] (pool-6-thread-3) [2dc9f734] CommandMultiAsyncTasks::AttachTask: Attaching task 7cb80223-e46c-4060-b008-2b106e53e375 to command 45a52e0f-6662-4a77-8bad-ef8975fbf6c9. . . 2014-11-03 18:32:42,163 INFO [org.ovirt.engine.core.bll.RemoveSnapshotCommandCallback] (DefaultQuartzScheduler_Worker-19) [4480deef] All Live Merge child commands have completed, status FAILED . . 2014-11-03 18:32:52,233 ERROR [org.ovirt.engine.core.bll.RemoveSnapshotCommand] (DefaultQuartzScheduler_Worker-30) [4480deef] Ending command with failure: org.ovirt.engine.core.bll.RemoveSnapshotCommand 2014-11-03 18:32:52,260 ERROR [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector] (DefaultQuartzScheduler_Worker-30) [4480deef] Correlation ID: 4480deef, Call Stack: null, Custom Event ID: -1, Message: Failed to delete snapshot 'snap2' for VM 'nfs4' From vdsm.log -------------------- Thread-6112::INFO::2014-11-03 18:29:52,824::vm::5741::vm.Vm::(merge) vmId=`1b74d641-debf-4fe7-9288-ef31225141df`::Starting merge with jobUUID='a0e0ae74-ec74-4a15-8554-04fd07f70eb9' See logs
Looking at the logs I can see that some of the disks were on block devices again. Thread-6113::DEBUG::2014-11-03 18:29:52,497::vm::5790::vm.Vm::(_diskXMLGetVolumeChainInfo) vmId=`1b74d641-debf-4fe7-9288-ef31225141df`::<allocation/> missing from backing chain for block device vdb Thread-6113::ERROR::2014-11-03 18:29:52,500::vm::5690::vm.Vm::(merge) vmId=`1b74d641-debf-4fe7-9288-ef31225141df`::merge: libvirt does not support volume chain monitoring. Unable to perform live merge. I would recommend retrying your test using only nfs disks until there is a resolution for 1155583. *** This bug has been marked as a duplicate of bug 1155583 ***