Bug 1532133
Summary: | Preallocated volume converted to sparse volume after live storage migration to file based storage domain | |
---|---|---|---
Product: | Red Hat Enterprise Virtualization Manager | Reporter: | Ribu Tho <rabraham>
Component: | vdsm | Assignee: | Nir Soffer <nsoffer>
Status: | CLOSED ERRATA | QA Contact: | Yosi Ben Shimon <ybenshim>
Severity: | high | Docs Contact: |
Priority: | medium | |
Version: | 4.1.4 | CC: | appraprv, fgarciad, lsurette, mlipchuk, nsoffer, rabraham, ratamir, srevivo, tnisan, trichard, ybenshim, ycui, ykaul, ylavi
Target Milestone: | ovirt-4.2.2 | Flags: | lsvaty: testing_plan_complete-
Target Release: | --- | |
Hardware: | x86_64 | |
OS: | Linux | |
Whiteboard: | | |
Fixed In Version: | v4.20.15 | Doc Type: | Bug Fix
Doc Text: | Red Hat Virtualization uses the qemu-img tool, instead of dd, to copy disks during live storage migration. This tool converts unused space in the image to holes, making the destination disk sparse, so raw preallocated disks copied during live storage migration were converted to raw sparse disks. Now the qemu-img preallocation option is used when copying raw preallocated disks to file-based storage domains, so that the disks are kept preallocated after the migration. | |
Story Points: | --- | |
Clone Of: | | Environment: |
Last Closed: | 2018-05-15 17:54:02 UTC | Type: | Bug
Regression: | --- | Mount Type: | ---
Documentation: | --- | CRM: |
Verified Versions: | | Category: | ---
oVirt Team: | Storage | RHEL 7.3 requirements from Atomic Host: |
Cloudforms Team: | --- | Target Upstream Version: |
Embargoed: | | |
Bug Depends On: | 1550117 | |
Bug Blocks: | | |
Attachments: | | |
Description
Ribu Tho
2018-01-08 05:57:47 UTC
In vdsm log we see:

5.0g volume created:

2018-01-08 04:53:32,896+1100 INFO (jsonrpc/7) [dispatcher] Run and protect: createVolume(sdUUID=u'0694def1-b588-4a43-b71f-bd66df4fef24', spUUID=u'598c8196-032c-02fa-00f6-000000000230', imgUUID=u'4e849764-7876-4a46-bfaa-0b82dc283475', size=u'5368709120', volFormat=5, preallocate=1, diskType=2, volUUID=u'db5b9b50-98ec-43e7-8de3-4b40c1a54502', desc=u'{"DiskAlias":"test-iscsi_Disk 3","DiskDescription":"gluster-test"}', srcImgUUID=u'00000000-0000-0000-0000-000000000000', srcVolUUID=u'00000000-0000-0000-0000-000000000000', initialSize=None) (logUtils:51)

Vdsm preallocates 5.0G as asked:

2018-01-08 04:54:01,232+1100 DEBUG (tasks/3) [storage.Misc.excCmd] /usr/bin/taskset --cpu-list 0-1 /usr/bin/nice -n 19 /usr/bin/ionice -c 3 /usr/bin/dd if=/dev/zero of=/rhev/data-center/598c8196-032c-02fa-00f6-000000000230/0694def1-b588-4a43-b71f-bd66df4fef24/images/4e849764-7876-4a46-bfaa-0b82dc283475/db5b9b50-98ec-43e7-8de3-4b40c1a54502 bs=1048576 seek=0 skip=0 conv=notrunc count=5120 oflag=direct (cwd None) (commands:69)

The conv=fsync flag is missing in this dd command.

Can you try to run the same dd command with this storage - does it create a sparse file?

Does it change if we add conv=notrunc,fsync?

(In reply to Nir Soffer from comment #3)
> [...]
> Can you try to run the same dd command with this storage - does it create a
> sparse file?
>
> Does it change if we add conv=notrunc,fsync?

Nir,

I have checked the issue by creating a file with the dd command below.

# /usr/bin/dd if=/dev/zero of=file bs=1048576 seek=0 skip=0 conv=notrunc count=5120 oflag=direct

The output was a raw image whose virtual size and actual size both equal 5 GB. It was a RAW-RAW image, and there was no issue like the one I highlighted in my comments above.

Ribu

(In reply to Ribu Tho from comment #4)
> I have checked the issue by creating a file with the dd command below.
>
> # /usr/bin/dd if=/dev/zero of=file bs=1048576 seek=0 skip=0 conv=notrunc
> count=5120 oflag=direct

Ribu, where is the file located? We want to test writing to a gluster volume.

Please try:

/usr/bin/dd if=/dev/zero \
    of=/rhev/data-center/598c8196-032c-02fa-00f6-000000000230/0694def1-b588-4a43-b71f-bd66df4fef24/images/4e849764-7876-4a46-bfaa-0b82dc283475/db5b9b50-98ec-43e7-8de3-4b40c1a54502 \
    bs=1048576 conv=notrunc count=5120 oflag=direct

If this creates a sparse file, we need to move this bug to gluster.
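A quick way to tell whether a file written this way came out sparse is to compare its apparent size with the space actually allocated by the file system. The following is only an illustrative check; the path is a placeholder, not one of the volumes from the logs above:

$ FILE=/rhev/data-center/mnt/glusterSD/<server>:_<volume>/<sdUUID>/images/<imgUUID>/<volUUID>   # placeholder
$ stat --format 'apparent=%s bytes, allocated=%b blocks of %B bytes' "$FILE"
$ du -h "$FILE"                   # space actually allocated on disk
$ du -h --apparent-size "$FILE"   # size as seen by readers
$ qemu-img info "$FILE"           # compare "virtual size" with "disk size"

If the allocated size (or the "disk size" reported by qemu-img) is much smaller than the apparent/virtual size, the file is sparse.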
Nir,

Yes, this was tested writing to a gluster volume in my lab machine only. It completed successfully for me, creating a 5GB raw file.

##################################################
sh-4.2$ whoami
vdsm
sh-4.2$ df -T
gsslab-24-218.rhev.gsslab.bne.redhat.com:/labvol fuse.glusterfs 31434752 5315072 26119680 17% /rhev/data-center/mnt/glusterSD/gsslab-24-218.rhev.gsslab.bne.redhat.com:_labvol
sh-4.2$ qemu-img info file
image: file
file format: raw
virtual size: 5.0G (5368709120 bytes)
disk size: 5.0G
sh-4.2$ pwd
/rhev/data-center/mnt/glusterSD/gsslab-24-218.rhev.gsslab.bne.redhat.com:_labvol/b4e62649-95ac-4006-8584-2d26bb3c6712/images
##################################################

Ribu

Ribu, the customer case attached to this bug says:

    After live migration of storage on a VM successfully completed, the new storage does not accurately display the used space.

This is a very different flow from what you describe in the bug. I don't think that vdsm writing zeros to gluster can create a sparse file, but live storage migration may create a sparse file.

Please check the customer case and add here a detailed description of:
- The source storage
- The source image chain size and format before the migration
- The target storage
- The source image chain size and format after the migration

We had a bug in 4.0 or 4.1 about creating sparse files when copying images instead of preallocated files. I think this was fixed in 4.2.

I checked the code used during live storage migration, and I can confirm that we create a sparse file if the target file is on a file based domain (NFS, Gluster).

Here is a comment from the code:

    # To avoid prezeroing preallocated volumes on NFS domains
    # we create the target as a sparse volume (since it will be
    # soon filled with the data coming from the copy) and then
    # we change its metadata back to the original value.

This optimization was added for bug 910445 in 2013.

In the past, this optimization was harmless for raw files, because we copied raw files using dd - so the unused space was filled with zeros, and the result was a preallocated raw file.

However, a year later we switched to copying raw images using qemu-img convert for bug 1156115. With qemu-img convert, unallocated or zero data is not copied, creating sparse files.

This can be fixed by using the qemu-img convert preallocation=falloc option. If the file system does not support fallocate(), the copy can be much slower with preallocation, but if the user wants a preallocated image, this is the price.

We have patches for adding this option to qemu-img create: https://gerrit.ovirt.org/69848. Adding the option to qemu-img convert should be easy after these patches are merged.

Maor, don't you work on a similar bug? Is this a duplicate?
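For illustration, this is roughly what using the preallocation option during the copy looks like on the command line; the source and destination paths are placeholders, and preallocation=falloc relies on fallocate() support on the destination file system (without it the copy falls back to a much slower path, as noted above):

$ qemu-img convert -f raw -O raw -o preallocation=falloc \
      /path/to/source-volume /path/to/destination-volume   # placeholder paths
$ qemu-img info /path/to/destination-volume                # "disk size" should now match "virtual size"

A simple probe for fallocate() support on the target mount (placeholder mount point):

$ fallocate -l 1G /rhev/data-center/mnt/<target-domain>/fallocate-test && rm -f /rhev/data-center/mnt/<target-domain>/fallocate-test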
(In reply to Nir Soffer from comment #11)
> [...]
> Maor, don't you work on a similar bug? Is this a duplicate?

This is the bug you mention, and it does look like the same issue:
https://bugzilla.redhat.com/1429286 - RAW-Preallocated disk is converted to RAW-sparse while cloning a VM in file based storage domain

Created attachment 1401496 [details]
Engine_log
Created attachment 1401497 [details]
VDSM_log
Verification failed.

Tested using:
ovirt-engine-4.2.2.1-0.1.el7.noarch

Environment status at the time of failure:
- 1 VM with 2 disks:
  a) 20 GB NFS thin provision (bootable)
  b) 5 GB glusterFS preallocated
- OS installed.
- VM running.

Started live storage migration of disk b (5 GB).

After the failure, the disk:
- is thin provision
- has an actual size of 10 GB
- has a virtual size of 5 GB
- stays on the same SD (gluster)

The errors indicate a failure in snapshot creation and VM disk replication.

Attached engine and VDSM logs.

Moving to ASSIGNED.

I sincerely doubt this has anything to do with these patches, but it's troublesome:

2018-02-27 20:27:11,484+0200 ERROR (jsonrpc/5) [virt.vm] (vmId='72e757c1-b6d5-4872-b97e-67e24d75a926') Unable to take snapshot (vm:4484)
Traceback (most recent call last):
  File "/usr/lib/python2.7/site-packages/vdsm/virt/vm.py", line 4481, in snapshot
    self._dom.snapshotCreateXML(snapxml, snapFlags)
  File "/usr/lib/python2.7/site-packages/vdsm/virt/virdomain.py", line 98, in f
    ret = attr(*args, **kwargs)
  File "/usr/lib/python2.7/site-packages/vdsm/common/libvirtconnection.py", line 130, in wrapper
    ret = f(*args, **kwargs)
  File "/usr/lib/python2.7/site-packages/vdsm/common/function.py", line 92, in wrapper
    return func(inst, *args, **kwargs)
  File "/usr/lib64/python2.7/site-packages/libvirt.py", line 2585, in snapshotCreateXML
    if ret is None:raise libvirtError('virDomainSnapshotCreateXML() failed', dom=self)
libvirtError: internal error: unable to execute QEMU command 'transaction': Could not read L1 table: Input/output error

Yosi - can we please have a bug on this issue specifically (taking a snapshot on gluster fails), with all the relevant details?

Tested using:
ovirt-engine-4.2.2.2-0.1.el7.noarch
vdsm-4.20.20-1.el7ev.x86_64
qemu-img-rhev-2.10.0-21.el7.x86_64

Actual result:
The disk was successfully moved to another glusterFS SD. The disk allocation policy remained preallocated, as it was before the live storage migration. In the GUI, the actual size = virtual size = 5 GiB, as at the start. No snapshot remained as a result of a failure or timeout.

Moving to VERIFIED.

The bug is not blocked on anything, fixing title.

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA.

For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHEA-2018:1489
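For reference, the end state the verification describes can also be checked directly on the host by inspecting the migrated volume on the target file-based domain; the path below is a placeholder, not taken from these logs:

$ VOL=/rhev/data-center/mnt/glusterSD/<server>:_<volume>/<sdUUID>/images/<imgUUID>/<volUUID>   # placeholder
$ qemu-img info "$VOL"
# For a raw preallocated disk that survived live storage migration intact, "virtual size"
# and "disk size" should both report 5.0G; a much smaller "disk size" would mean the
# volume was converted to a sparse file again.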