| Summary: | [vdsm] Snapshot merge fails on "SourceImageActionError: Error during source image manipulation" |
|---|---|
| Product: | [oVirt] vdsm |
| Component: | General |
| Status: | CLOSED CURRENTRELEASE |
| Severity: | urgent |
| Priority: | unspecified |
| Version: | 4.18.0 |
| Target Milestone: | ovirt-4.0.0-rc |
| Target Release: | 4.18.9 |
| Hardware: | x86_64 |
| OS: | Unspecified |
| Whiteboard: | |
| Fixed In Version: | |
| Doc Type: | Bug Fix |
| Doc Text: | |
| Story Points: | --- |
| Clone Of: | |
| Environment: | |
| Last Closed: | 2016-07-05 07:45:23 UTC |
| Type: | Bug |
| Regression: | --- |
| Mount Type: | --- |
| Documentation: | --- |
| CRM: | |
| Verified Versions: | |
| Category: | --- |
| oVirt Team: | Storage |
| RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- |
| Target Upstream Version: | |
| Reporter: | Elad <ebenahar> |
| Assignee: | Ala Hino <ahino> |
| QA Contact: | Kevin Alon Goldblatt <kgoldbla> |
| Docs Contact: | |
| CC: | ahino, amureini, bugs, dev-unix-virtualization, mst, stirabos, tnisan |
| Keywords: | Regression |
| Flags: | rule-engine: ovirt-4.0.0+, rule-engine: blocker+, rule-engine: planning_ack+, rule-engine: devel_ack+, rule-engine: testing_ack+ |
| Attachments: | |
ovirt-engine-4.0.0-0.0.master.20160310051350.git3151c66.el7.centos.noarch

Adam, as this week's QE contact please have a look.

This bug report has Keywords: Regression or TestBlocker. Since no regressions or test blockers are allowed between releases, it is also being identified as a blocker for this release. Please resolve ASAP.

Created attachment 1139978 [details]
vdsm, libvirt, messages and engine logs
(In reply to Elad from comment #0)
> Description of problem:
> Snapshot merge (cold) fails every time, on file and block storage, with the
> following error message in vdsm:
>
> d97ca3a1-7b65-47eb-b8df-a3664f4e43a6::ERROR::2016-03-22
> 10:27:49,527::image::1338::Storage.Image::(merge) can't set attribute
> Traceback (most recent call last):
>   File "/usr/share/vdsm/storage/image.py", line 1321, in merge
>     [vols[vName] for vName in chain])
>   File "/usr/share/vdsm/storage/image.py", line 1146, in _baseRawVolumeMerge
>     srcVol.rename(tmpUUID)
>   File "/usr/share/vdsm/storage/fileVolume.py", line 589, in rename
>     self._manifest.volumePath = volPath
> AttributeError: can't set attribute

The _manifest was only introduced post-3.6; re-targeting accordingly.

We have faced the same issue, but deletion of the 2nd snapshot of a powered-off VM happens successfully. We have not seen this issue with a live VM. We are using ovirt-engine-3.6.2.6-1.el7.centos.noarch and vdsm-4.17.999-900.giteba4b25.el7.centos.noarch.

Moving from 4.0 alpha to 4.0 beta since 4.0 alpha has already been released and the bug is not ON_QA.

This was fixed as part of the work done for 3.6.6 - moving to ON_QA.

Ala, is there anything we need to document here, or is it documented elsewhere? Please either provide the doc text, or comment with the BZ tracking the doc text and set requires-doctext-.

BZ 1323629 documents the behavior.

Tested with the following code:
------------------------------------------
rhevm-4.0.0.2-0.1.el7ev.noarch
vdsm-4.18.1-11.gita92976e.el7ev.x86_64
Tested with the following scenario:
-------------------------------------
1. Created VM with 3 disks: 1 iSCSI - preallocated, 1 iSCSI - thin and 1 NFS - thin
2. Created snapshots sbu1, sbu2 and sbu3 while the VM was down
3. Started the VM and created su1, su2 and su3
4. Selected sbu2 and tried deleting it >>>>> The operation failed with "Failed to delete snapshot 'sbu2' for VM 'vm1'."
vdsm log displays error "Volume does not exist"
Engine.log:
-----------------
2016-06-05 14:11:54,499 INFO [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector] (default task-5) [] Correlation ID: 69d26cca, Job ID: d1762475-b0e4-44c6-9d15-4d36f2b1a4e0, Call Stack: null, Custom Event ID: -1, Message: Snapshot 'sbu2' deletion for VM 'vm1' was initiated by admin@internal-authz.
.
.
.
2016-06-05 14:14:09,898 INFO [org.ovirt.engine.core.vdsbroker.irsbroker.GetImageInfoVDSCommand] (DefaultQuartzScheduler2) [6067c86] START, GetImageInfoVDSCommand( GetImageInfoVDSCommandParameters:{runAsync='true', storagePoolId='3f76ceea-84c5-4893-8c7d-3bf2d8eea69e', ignoreFailoverLimit='false', storageDomainId='7590d3ff-797a-4042-90f8-4094e6d01665', imageGroupId='181ba273-4150-4ad0-a209-95184cd1fbd4', imageId='3c27fae4-6ac7-436e-8302-e165632e5108'}), log id: 135a9ad8
2016-06-05 14:14:10,928 INFO [org.ovirt.engine.core.vdsbroker.irsbroker.GetImageInfoVDSCommand] (DefaultQuartzScheduler2) [6067c86] FINISH, GetImageInfoVDSCommand, return: org.ovirt.engine.core.common.businessentities.storage.DiskImage@7786d8dc, log id: 135a9ad8
2016-06-05 14:14:11,206 INFO [org.ovirt.engine.core.utils.transaction.TransactionSupport] (DefaultQuartzScheduler2) [6067c86] transaction rolled back
2016-06-05 14:14:11,206 INFO [org.ovirt.engine.core.utils.transaction.TransactionSupport] (DefaultQuartzScheduler2) [6067c86] transaction rolled back
2016-06-05 14:14:11,206 ERROR [org.ovirt.engine.core.bll.tasks.CommandCallbacksPoller] (DefaultQuartzScheduler2) [6067c86] Error invoking callback method 'onFailed' for 'FAILED' command 'c2b114ea-5b53-4686-86c1-175c97ad19a8'
2016-06-05 14:14:11,206 ERROR [org.ovirt.engine.core.bll.tasks.CommandCallbacksPoller] (DefaultQuartzScheduler2) [6067c86] Exception: java.lang.NullPointerException
at org.ovirt.engine.core.utils.ovf.OvfWriter.buildDisk(OvfWriter.java:204) [utils.jar:]
at org.ovirt.engine.core.utils.ovf.OvfManager.buildOvf(OvfManager.java:118) [utils.jar:]
at org.ovirt.engine.core.utils.ovf.OvfManager.exportVm(OvfManager.java:29) [utils.jar:]
at org.ovirt.engine.core.bll.storage.disk.image.ImagesHandler.prepareSnapshotConfigWithAlternateImage(ImagesHandler.java:829) [bll.jar:]
at org.ovirt.engine.core.bll.snapshots.RemoveSnapshotSingleDiskLiveCommand.lambda$updateVmConfigurationForImageChange$1(RemoveSnapshotSingleDiskLiveCommand.java:394) [bll.jar:]
at org.ovirt.engine.core.utils.transaction.TransactionSupport.executeInNewTransaction(TransactionSupport.java:204) [utils.jar:]
at org.ovirt.engine.core.bll.snapshots.RemoveSnapshotSingleDiskLiveCommand.updateVmConfigurationForImageChange(RemoveSnapshotSingleDiskLiveCommand.java:392) [bll.jar:]
at org.ovirt.engine.core.bll.snapshots.RemoveSnapshotSingleDiskLiveCommand.handleBackwardLiveMerge(RemoveSnapshotSingleDiskLiveCommand.java:382) [bll.jar:]
at org.ovirt.engine.core.bll.snapshots.RemoveSnapshotSingleDiskLiveCommand.lambda$syncDbRecords$0(RemoveSnapshotSingleDiskLiveCommand.java:303) [bll.jar:]
at org.ovirt.engine.core.utils.transaction.TransactionSupport.executeInNewTransaction(TransactionSupport.java:204) [utils.jar:]
at org.ovirt.engine.core.bll.snapshots.RemoveSnapshotSingleDiskLiveCommand.syncDbRecords(RemoveSnapshotSingleDiskLiveCommand.java:291) [bll.jar:]
at org.ovirt.engine.core.bll.snapshots.RemoveSnapshotSingleDiskLiveCommand.onFailed(RemoveSnapshotSingleDiskLiveCommand.java:447) [bll.jar:]
at org.ovirt.engine.core.bll.snapshots.RemoveSnapshotSingleDiskLiveCommandCallback.onFailed(RemoveSnapshotSingleDiskLiveCommandCallback.java:18) [bll.jar:]
at org.ovirt.engine.core.bll.tasks.CommandCallbacksPoller.endCallback(CommandCallbacksPoller.java:56) [bll.jar:]
at org.ovirt.engine.core.bll.tasks.CommandCallbacksPoller.invokeCallbackMethods(CommandCallbacksPoller.java:109) [bll.jar:]
at sun.reflect.GeneratedMethodAccessor64.invoke(Unknown Source) [:1.8.0_91]
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) [rt.jar:1.8.0_91]
at java.lang.reflect.Method.invoke(Method.java:498) [rt.jar:1.8.0_91]
at org.ovirt.engine.core.utils.timer.JobWrapper.invokeMethod(JobWrapper.java:77) [scheduler.jar:]
at org.ovirt.engine.core.utils.timer.JobWrapper.execute(JobWrapper.java:51) [scheduler.jar:]
at org.quartz.core.JobRunShell.run(JobRunShell.java:213) [quartz.jar:]
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) [rt.jar:1.8.0_91]
at java.util.concurrent.FutureTask.run(FutureTask.java:266) [rt.jar:1.8.0_91]
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) [rt.jar:1.8.0_91]
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) [rt.jar:1.8.0_91]
at java.lang.Thread.run(Thread.java:745) [rt.jar:1.8.0_91]
2016-06-05 14:14:12,249 INFO [org.ovirt.engine.core.bll.ConcurrentChildCommandsExecutionCallback] (DefaultQuartzScheduler8) [6067c86] Command 'RemoveSnapshot' id: 'fec75b49-40a7-4439-8dde-d9db3bb221b2' child commands '[b77126ce-f696-4bdc-a0f6-b411841c5ad4, 24b9ea03-2be1-4f0a-a1ab-3ba54c209f58, c2b114ea-5b53-4686-86c1-175c97ad19a8]' executions were completed, status 'FAILED'
2016-06-05 14:14:13,520 ERROR [org.ovirt.engine.core.bll.snapshots.RemoveSnapshotCommand] (DefaultQuartzScheduler10) [6067c86] Ending command 'org.ovirt.engine.core.bll.snapshots.RemoveSnapshotCommand' with failure.
2016-06-05 14:14:13,724 ERROR [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector] (DefaultQuartzScheduler10) [6067c86] Correlation ID: 69d26cca, Job ID: d1762475-b0e4-44c6-9d15-4d36f2b1a4e0, Call Stack: null, Custom Event ID: -1, Message: Failed to delete snapshot 'sbu2' for VM 'vm1'.
VDSM log:
--------------------------------
Thread-430::DEBUG::2016-06-05 14:13:30,758::resourceManager::241::Storage.ResourceManager.Request::(grant) ResName=`Storage.7590d3ff-797a-4042-90f8-4094e6d01665`ReqID=`8e5f76d1-4683-46d5-b823-43ad815e8165`::Granted request
Thread-430::DEBUG::2016-06-05 14:13:30,758::task::829::Storage.TaskManager.Task::(resourceAcquired) Task=`18cc50f9-e374-46bb-9389-e48f192628c8`::_resourcesAcquired: Storage.7590d3ff-797a-4042-90f8-4094e6d01665 (shared)
Thread-430::DEBUG::2016-06-05 14:13:30,758::task::995::Storage.TaskManager.Task::(_decref) Task=`18cc50f9-e374-46bb-9389-e48f192628c8`::ref 1 aborting False
Thread-430::DEBUG::2016-06-05 14:13:30,759::lvm::288::Storage.Misc.excCmd::(cmd) /usr/bin/taskset --cpu-list 0-3 /usr/bin/sudo -n /usr/sbin/lvm lvs --config ' devices { preferred_names = ["^/dev/mapper/"] ignore_suspended_devices=1 write_cache_state=0 disable_after_error_count=3 filter = [ '\''a|/dev/mapper/3514f0c5a516000c8|'\'', '\''r|.*|'\'' ] } global { locking_type=1 prioritise_write_locks=1 wait_for_locks=1 use_lvmetad=0 } backup { retain_min = 50 retain_days = 0 } ' --noheadings --units b --nosuffix --separator '|' --ignoreskippedcluster -o uuid,name,vg_name,attr,size,seg_start_pe,devices,tags 7590d3ff-797a-4042-90f8-4094e6d01665 (cwd None)
Thread-430::DEBUG::2016-06-05 14:13:30,961::lvm::288::Storage.Misc.excCmd::(cmd) SUCCESS: <err> = ' WARNING: lvmetad is running but disabled. Restart lvmetad before enabling it!\n'; <rc> = 0
Thread-430::DEBUG::2016-06-05 14:13:30,963::lvm::463::Storage.LVM::(_reloadlvs) lvs reloaded
Thread-430::WARNING::2016-06-05 14:13:30,963::lvm::609::Storage.LVM::(getLv) lv: 9f680fdb-fe8a-462c-9977-7aec183ec6af not found in lvs vg: 7590d3ff-797a-4042-90f8-4094e6d01665 response
Thread-430::ERROR::2016-06-05 14:13:30,963::task::868::Storage.TaskManager.Task::(_setError) Task=`18cc50f9-e374-46bb-9389-e48f192628c8`::Unexpected error
Traceback (most recent call last):
File "/usr/share/vdsm/storage/task.py", line 875, in _run
return fn(*args, **kargs)
File "/usr/lib/python2.7/site-packages/vdsm/logUtils.py", line 50, in wrapper
res = f(*args, **kwargs)
File "/usr/share/vdsm/storage/hsm.py", line 3039, in getVolumeInfo
volUUID=volUUID).getInfo()
File "/usr/share/vdsm/storage/sd.py", line 619, in produceVolume
volUUID)
File "/usr/share/vdsm/storage/blockVolume.py", line 370, in __init__
manifest = self.manifestClass(repoPath, sdUUID, imgUUID, volUUID)
File "/usr/share/vdsm/storage/blockVolume.py", line 72, in __init__
volUUID)
File "/usr/share/vdsm/storage/volume.py", line 80, in __init__
self.validate()
File "/usr/share/vdsm/storage/blockVolume.py", line 156, in validate
raise se.VolumeDoesNotExist(self.volUUID)
VolumeDoesNotExist: Volume does not exist: ('9f680fdb-fe8a-462c-9977-7aec183ec6af',)
Thread-430::DEBUG::2016-06-05 14:13:30,989::task::887::Storage.TaskManager.Task::(_run) Task=`18cc50f9-e374-46bb-9389-e48f192628c8`::Task._run: 18cc50f9-e374-46bb-9389-e48f192628c8 ('7590d3ff-797a-4042-90f8-4094e6d01665', '3f76ceea-84c5-4893-8c7d-3bf2d8eea69e', '9c3f4477-4041-4c4f-8add-b547db25c839', '9f680fdb-fe8a-462c-9977-7aec183ec6af') {} failed - stopping task
Thread-430::DEBUG::2016-06-05 14:13:30,989::task::1248::Storage.TaskManager.Task::(stop) Task=`18cc50f9-e374-46bb-9389-e48f192628c8`::stopping in state preparing (force False)
Thread-430::DEBUG::2016-06-05 14:13:30,989::task::995::Storage.TaskManager.Task::(_decref) Task=`18cc50f9-e374-46bb-9389-e48f192628c8`::ref 1 aborting True
Thread-430::INFO::2016-06-05 14:13:30,989::task::1173::Storage.TaskManager.Task::(prepare) Task=`18cc50f9-e374-46bb-9389-e48f192628c8`::aborting: Task is aborted: 'Volume does not exist' - code 201
Thread-430::DEBUG::2016-06-05 14:13:30,990::task::1178::Storage.TaskManager.Task::(prepare) Task=`18cc50f9-e374-46bb-9389-e48f192628c8`::Prepare: aborted: Volume does not exist
Thread-430::DEBUG::2016-06-05 14:13:30,990::task::995::Storage.TaskManager.Task::(_decref) Task=`18cc50f9-e374-46bb-9389-e48f192628c8`::ref 0 aborting True
Thread-430::DEBUG::2016-06-05 14:13:30,990::task::930::Storage.TaskManager.Task::(_doAbort) Task=`18cc50f9-e374-46bb-9389-e48f192628c8`::Task._doAbort: force False
Thread-430::DEBUG::2016-06-05 14:13:30,990::resourceManager::989::Storage.ResourceManager.Owner::(cancelAll) Owner.cancelAll requests {}
Thread-430::DEBUG::2016-06-05 14:13:30,990::task::597::Storage.TaskManager.Task::(_updateState) Task=`18cc50f9-e374-46bb-9389-e48f192628c8`::moving from state preparing -> state aborting
Thread-430::DEBUG::2016-06-05 14:13:30,990::task::552::Storage.TaskManager.Task::(__state_aborting) Task=`18cc50f9-e374-46bb-9389-e48f192628c8`::_aborting: recover policy none
Thread-430::DEBUG::2016-06-05 14:13:30,990::task::597::Storage.TaskManager.Task::(_updateState) Task=`18cc50f9-e374-46bb-9389-e48f192628c8`::moving from state aborting -> state failed
Thread-430::DEBUG::2016-06-05 14:13:30,991::resourceManager::952::Storage.ResourceManager.Owner::(releaseAll) Owner.releaseAll requests {} resources {'Storage.7590d3ff-797a-4042-90f8-4094e6d01665': < ResourceRef 'Storage.7590d3ff-797a-4042-90f8-4094e6d01665', isValid: 'True' obj: 'None'>}
Thread-430::DEBUG::2016-06-05 14:13:30,991::resourceManager::989::Storage.ResourceManager.Owner::(cancelAll) Owner.cancelAll requests {}
Target release should be placed once a package build is known to fix an issue. Since this bug is not modified, the target version has been reset. Please use target milestone to plan a fix for an oVirt release.

Created attachment 1164915 [details]
vdsm server and engine logs
Added logs
Not related. This issue was introduced after commit 6e55d137d5ca and fixed in commit a4602dba0f6985fd812871742d996c4a787af7bf.

Tested with the following code:
------------------------------------------
rhevm-4.0.0.4-0.1.el7ev.noarch
vdsm-4.18.2-0.el7ev.x86_64

Tested with the following scenario:
-------------------------------------
1. Created VM with 3 disks: 1 iSCSI - preallocated, 1 iSCSI - thin and 1 NFS - thin
2. Created snapshots sd1, sd2, sd3 and sd4 while the VM was down
3. Deleted sd2 successfully - snapshot merge successful
4. Deleted sd3 successfully - snapshot merge successful

Moving to VERIFIED!

oVirt 4.0.0 has been released, closing current release.
Description of problem:
Snapshot merge (cold) fails every time, on file and block storage, with the following error message in vdsm:

d97ca3a1-7b65-47eb-b8df-a3664f4e43a6::ERROR::2016-03-22 10:27:49,527::image::1338::Storage.Image::(merge) can't set attribute
Traceback (most recent call last):
  File "/usr/share/vdsm/storage/image.py", line 1321, in merge
    [vols[vName] for vName in chain])
  File "/usr/share/vdsm/storage/image.py", line 1146, in _baseRawVolumeMerge
    srcVol.rename(tmpUUID)
  File "/usr/share/vdsm/storage/fileVolume.py", line 589, in rename
    self._manifest.volumePath = volPath
AttributeError: can't set attribute

d97ca3a1-7b65-47eb-b8df-a3664f4e43a6::ERROR::2016-03-22 10:27:49,532::task::868::Storage.TaskManager.Task::(_setError) Task=`d97ca3a1-7b65-47eb-b8df-a3664f4e43a6`::Unexpected error
Traceback (most recent call last):
  File "/usr/share/vdsm/storage/task.py", line 875, in _run
    return fn(*args, **kargs)
  File "/usr/share/vdsm/storage/task.py", line 334, in run
    return self.cmd(*self.argslist, **self.argsdict)
  File "/usr/lib/python2.7/site-packages/vdsm/storage/securable.py", line 78, in wrapper
    return method(self, *args, **kwargs)
  File "/usr/share/vdsm/storage/sp.py", line 1864, in mergeSnapshots
    sdUUID, vmUUID, imgUUID, ancestor, successor, postZero)
  File "/usr/share/vdsm/storage/image.py", line 1339, in merge
    raise se.SourceImageActionError(imgUUID, sdUUID, str(e))
SourceImageActionError: Error during source image manipulation: u"image=5c912ed8-7c34-476e-8a07-8e28c2a49e95, source domain=dc52ca50-9f03-45c9-be9e-1a4a1f84cdde: can't set attribute"

Version-Release number of selected component (if applicable):
vdsm-4.17.999-724.gitb8cb30a.el7.centos.noarch
libvirt-daemon-1.2.17-13.el7_2.4.x86_64
qemu-kvm-rhev-2.3.0-31.el7_2.9.x86_64

How reproducible:
Always

Steps to Reproduce:
1. Create a VM
2. Create a snapshot
3. Delete the snapshot
(Reproduced over file and block storage)

Actual results:
Snapshot merge fails

Expected results:
Snapshot merge succeeds

Additional info:
vdsm, libvirt, messages and engine logs
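The "AttributeError: can't set attribute" in the traceback is the standard Python error for assigning through a property that defines a getter but no setter. A minimal sketch of the failure mode (illustrative only; class names and paths are simplified, not vdsm's actual implementation):

```python
# Minimal sketch of the bug shape: rename() assigns through 'volumePath',
# but the manifest exposes it as a read-only property (getter, no setter),
# so the assignment raises AttributeError. CPython 2 and early 3.x word the
# error "can't set attribute"; newer 3.x versions phrase it differently.

class VolumeManifest:
    def __init__(self, vol_path):
        self._vol_path = vol_path

    @property
    def volumePath(self):        # getter only -- no @volumePath.setter defined
        return self._vol_path


class FileVolume:
    def __init__(self, vol_path):
        self._manifest = VolumeManifest(vol_path)

    def rename(self, new_path):
        # Same shape as the failing line in fileVolume.py:589
        self._manifest.volumePath = new_path


try:
    FileVolume("/rhev/data-center/old-path").rename("/rhev/data-center/new-path")
except AttributeError as e:
    print("rename failed:", e)
```

The fix direction is equally standard: either add an `@volumePath.setter` to the manifest class or route the rename through a method that updates the underlying attribute, which is consistent with the comment noting that `_manifest` was only introduced post-3.6 and the old rename path was never updated for it.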