Bug 1029069
| Summary: | Live storage migration snapshot removal fails, probably due to unexpected qemu-img output | | |
|---|---|---|---|
| Product: | [Retired] oVirt | Reporter: | Sander <bugzilla> |
| Component: | vdsm | Assignee: | Federico Simoncelli <fsimonce> |
| Status: | CLOSED CURRENTRELEASE | QA Contact: | Haim <hateya> |
| Severity: | medium | Docs Contact: | |
| Priority: | unspecified | Target Milestone: | --- |
| Version: | 3.3 | CC: | abaron, acathrow, amureini, bazulay, bugzilla, iheim, mgoldboi, yeylon |
| Target Release: | 3.3.2 | Hardware: | x86_64 |
| OS: | Linux | Whiteboard: | storage |
| Fixed In Version: | | Doc Type: | Bug Fix |
| Doc Text: | | Story Points: | --- |
| Clone Of: | | Environment: | |
| Last Closed: | 2013-11-27 15:00:00 UTC | Type: | Bug |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | | Category: | --- |
| oVirt Team: | Storage | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | | | |
| Attachments: | 20131112 snapshot only vdsm.log, 20131112 snapshot only libvirtd.log | | |
I've done some additional testing with just live snapshot creation and removal:

1. Create a VM with preallocated storage
2. Start the VM
3. Make a live snapshot
4. Shut down the VM
5. Try to remove the snapshot

The result is the same "Failed to delete snapshot" error and a "BROKEN" snapshot. I'll attach the full vdsm log (I forced a logrotate before the test).
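A spot-check that may help while reproducing: listing the LVs and their tags in the affected storage domain before and after step 5 shows whether the snapshot volume still exists or has been renamed. A minimal sketch in Python, assuming the storage-domain VG UUID taken from the errors below:

```python
import subprocess

# List the LVs and their tags in the storage-domain VG (the UUID is taken
# from the errors below) to see whether the snapshot volume is still there,
# or has been renamed with vdsm's "_remove_me_" deletion prefix.
VG_UUID = "9d4e8a43-4851-42ff-a684-f3d802527cf7"
print(subprocess.check_output(
    ["lvs", "-o", "lv_name,lv_tags", VG_UUID]).decode())
```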
In this log I do not see the qemu-img parse error, but I do see the following errors:
```
d09b8ba0-659b-4d33-906b-2ae84931c29a::ERROR::2013-11-12 10:08:04,661::task::850::TaskManager.Task::(_setError) Task=`d09b8ba0-659b-4d33-906b-2ae84931c29a`::Unexpected error
Traceback (most recent call last):
  File "/usr/share/vdsm/storage/task.py", line 857, in _run
    return fn(*args, **kargs)
  File "/usr/share/vdsm/storage/task.py", line 318, in run
    return self.cmd(*self.argslist, **self.argsdict)
  File "/usr/share/vdsm/storage/securable.py", line 68, in wrapper
    return f(self, *args, **kwargs)
  File "/usr/share/vdsm/storage/sp.py", line 1937, in mergeSnapshots
    sdUUID, vmUUID, imgUUID, ancestor, successor, postZero)
  File "/usr/share/vdsm/storage/image.py", line 1162, in merge
    srcVol.shrinkToOptimalSize()
  File "/usr/share/vdsm/storage/blockVolume.py", line 315, in shrinkToOptimalSize
    volParams = self.getVolumeParams()
  File "/usr/share/vdsm/storage/volume.py", line 1008, in getVolumeParams
    volParams['imgUUID'] = self.getImage()
  File "/usr/share/vdsm/storage/blockVolume.py", line 494, in getImage
    return self.getVolumeTag(TAG_PREFIX_IMAGE)
  File "/usr/share/vdsm/storage/blockVolume.py", line 464, in getVolumeTag
    return _getVolumeTag(self.sdUUID, self.volUUID, tagPrefix)
  File "/usr/share/vdsm/storage/blockVolume.py", line 662, in _getVolumeTag
    tags = lvm.getLV(sdUUID, volUUID).tags
  File "/usr/share/vdsm/storage/lvm.py", line 851, in getLV
    raise se.LogicalVolumeDoesNotExistError("%s/%s" % (vgName, lvName))
LogicalVolumeDoesNotExistError: Logical volume does not exist: ('9d4e8a43-4851-42ff-a684-f3d802527cf7/_remove_me_gxRZPitO_3d8b0fa7-03f0-4c54-a79b-15a72922b3f1',)
...
Traceback (most recent call last):
  File "/usr/share/vdsm/clientIF.py", line 331, in teardownVolumePath
    res = self.irs.teardownImage(drive['domainID'],
  File "/usr/share/vdsm/vm.py", line 1344, in __getitem__
    raise KeyError(key)
KeyError: 'domainID'
d09b8ba0-659b-4d33-906b-2ae84931c29a::WARNING::2013-11-12 10:07:43,261::lvm::594::Storage.LVM::(getLv) lv: 3d8b0fa7-03f0-4c54-a79b-15a72922b3f1_MERGE not found in lvs vg: 9d4e8a43-4851-42ff-a684-f3d802527cf7 response
```
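A note on the failing lookup above: block volumes in vdsm store their image association as LVM tags on the LV, and the "_remove_me_" prefix in the missing LV name suggests the volume had already been renamed for deletion by the time shrinkToOptimalSize() tried to read its tags. A minimal sketch of what that lookup amounts to (read_lv_tags() is a hypothetical stand-in for vdsm's lvm.getLV(), and the IU_ tag prefix is an assumption about the tagging scheme):

```python
import subprocess

TAG_PREFIX_IMAGE = "IU_"  # assumed image-UUID tag prefix on block volumes

def read_lv_tags(vg_name, lv_name):
    # Hypothetical stand-in for vdsm's lvm.getLV(): read an LV's tags with
    # the lvs command. Raises CalledProcessError if the LV does not exist,
    # analogous to the LogicalVolumeDoesNotExistError in the traceback.
    out = subprocess.check_output(
        ["lvs", "--noheadings", "-o", "lv_tags",
         "%s/%s" % (vg_name, lv_name)]).decode()
    return [t for t in out.strip().split(",") if t]

def get_image_uuid(vg_name, lv_name):
    # Mirrors what blockVolume.getImage() does in the traceback: find the
    # tag carrying the image UUID. If the LV was already renamed with the
    # "_remove_me_" prefix, read_lv_tags() fails first, before any tag can
    # be read.
    for tag in read_lv_tags(vg_name, lv_name):
        if tag.startswith(TAG_PREFIX_IMAGE):
            return tag[len(TAG_PREFIX_IMAGE):]
    raise LookupError("no %s tag on %s/%s" %
                      (TAG_PREFIX_IMAGE, vg_name, lv_name))
```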
Created attachment 822789: 20131112 snapshot only vdsm.log

Created attachment 822790: 20131112 snapshot only libvirtd.log
Might this be related to BZ 1015071 and BZ 1027464? The patch at http://gerrit.ovirt.org/#/c/19983/ from BZ 1015071 seems to fix this issue. This one appears to be fixed in 3.3.1: after upgrading from 3.3.0 to 3.3.1, snapshot removal works.
Description of problem:

When I move a disk (of a running VM) to a different storage domain, a live snapshot ('Auto-generated for Live Storage Migration') is created. This snapshot is not removed after the storage migration is completed, and it cannot be removed from a running VM (probably because live snapshot deletion is not implemented yet?). When I stop the VM and then remove the snapshot, this fails and the snapshot state changes to "BROKEN". I've done some additional testing and the problem only occurs when a preallocated disk is used.

One of the errors in vdsm.log is:

```
Thread-20::ERROR::2013-11-08 14:02:53,772::image::1164::Storage.Image::(merge) Auto shrink after merge failed
Traceback (most recent call last):
  File "/usr/share/vdsm/storage/image.py", line 1162, in merge
    srcVol.shrinkToOptimalSize()
  File "/usr/share/vdsm/storage/blockVolume.py", line 320, in shrinkToOptimalSize
    qemuImg.FORMAT.QCOW2)
  File "/usr/lib64/python2.6/site-packages/vdsm/qemuImg.py", line 109, in check
    raise QImgError(rc, out, err, "unable to parse qemu-img check output")
QImgError: ecode=0, stdout=['No errors were found on the image.'], stderr=[], message=unable to parse qemu-img check output
```

I've looked at the parsing error, and the problem seems to be that the qemuImg.check() function expects a qemu-img version whose output contains a string like "Image end offset: 262144". The only version I could find that does that is the one in the F19 virt-preview repository. Stock F19 qemu-img doesn't have this output, nor do the stock EL6 or qemu-img-rhev EL6 versions. The only function that calls qemuImg.check() is blockVolume.shrinkToOptimalSize(), which needs (and uses) the parsed offset to shrink the LV containing the qcow2 image.

Version-Release number of selected component (if applicable): vdsm-4.12.1-4.el6.x86_64

How reproducible: 100%

Steps to Reproduce:
1. Create a VM with a preallocated disk (in my case on FC storage).
2. Start the VM.
3. Do a live migration of the preallocated disk to a different storage domain.
4. Shut down the VM.
5. Remove the 'Auto-generated for Live Storage Migration' snapshot.

Actual results: Snapshot removal fails and the snapshot state changes to "BROKEN".

Expected results: Snapshot removal succeeds.

Additional info:
Mailing list thread: http://lists.ovirt.org/pipermail/users/2013-November/017695.html
oVirt 3.3 on EL6 (CentOS 6).
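To illustrate the parsing failure described above, here is a minimal sketch of the kind of check qemuImg.check() performs (check_image_end_offset() is a hypothetical simplification, not vdsm's actual function):

```python
import re
import subprocess

def check_image_end_offset(path, fmt="qcow2"):
    # Run "qemu-img check" and look for the "Image end offset: <N>" line
    # described above. Stock F19 and EL6 qemu-img builds print only "No
    # errors were found on the image.", so the lookup fails even though
    # the exit code is 0, matching QImgError(ecode=0, ...) in the log.
    out = subprocess.check_output(
        ["qemu-img", "check", "-f", fmt, path]).decode()
    m = re.search(r"Image end offset: (\d+)", out)
    if m is None:
        raise RuntimeError("unable to parse qemu-img check output: %r" % out)
    return int(m.group(1))
```

With a qemu-img build that prints the offset line (such as the F19 virt-preview one mentioned above), the parsed offset is what shrinkToOptimalSize() uses to compute the reduced LV size; with the stock builds the line is absent and the auto-shrink step after the merge fails as shown.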