Bug 1119342 - VM Import fails due to undersized target volume
Summary: VM Import fails due to undersized target volume
Keywords:
Status: CLOSED DUPLICATE of bug 1130246
Alias: None
Product: Red Hat Enterprise Virtualization Manager
Classification: Red Hat
Component: ovirt-engine
Version: 3.4.0
Hardware: All
OS: Linux
Priority: urgent
Severity: high
Target Milestone: ---
Target Release: 3.5.0
Assignee: Maor
QA Contact: Aharon Canan
URL:
Whiteboard: storage
Depends On:
Blocks:
Reported: 2014-07-14 14:47 UTC by Allie DeVolder
Modified: 2019-04-28 09:02 UTC (History)

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2014-11-27 16:46:57 UTC
oVirt Team: Storage
Target Upstream Version:
Embargoed:



Description Allie DeVolder 2014-07-14 14:47:40 UTC
Description of problem:
When importing a VM from a 2.x environment, the target volume that is created is too small, and the import fails with a low-level image copy error.

Version-Release number of selected component (if applicable):
3.4

How reproducible:
Unknown

Steps to Reproduce:
1. Import 4-disk VM from Export Domain
2. Import fails

Actual results:
Import fails with:
CopyImageError: low level Image copy failed: ('General Storage Exception: ("rc: 1, err: [\'error while writing sector 334995456: No space left on device\']",)',)

Expected results:
Successful import

Comment 11 Maor 2014-09-16 13:17:57 UTC
It should be verified with qcow as discussed with Nir

Comment 13 Maor 2014-09-23 13:42:02 UTC
I think we need to investigate how qcow and qemu-img convert behave with big files.
I'm not sure it is a bug, since the engine can't guarantee the exact free space required for the disk to be copied.
qcow also stores metadata, so what needs to be done for this bug is to investigate how qcow works, how much space the metadata requires, and how much space the image requires when handling big files.

In this bug there is a gap of several gigabytes while the image size is around 170GB.
If it is reasonable for the metadata to take 1GB, then it is not a bug IMO; if it is not, we need to check why this happened.

Comment 14 Allon Mureinik 2014-09-28 14:50:52 UTC
(In reply to Maor from comment #13)
> I think that we need to verify and investigate how qcow and qemu-img convert
> behaves with big files.
> I'm not sure it is a bug, since the engine can't guarantee the exact free
> space required for the disk to be copied.
You attempt to import a vm and it fails. 
This is absolutely a bug.

It seems we try to ensure the capacity:

   # Extend volume (for LV only) size to the actual size
   dstVol.extend((volParams['apparentsize'] + 511) / 512)

However, we usually guesstimate that the overhead of the qcow format is 10%. Perhaps we're missing something here, and for importing raw->qcow we need to over-allocate (and truncate later)?
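
For a sense of scale, the sector figures in this report can be converted to bytes. This is a minimal sketch, not VDSM code: it assumes 512-byte sectors (as the rounding in the extend call above suggests) and that the sector number in the error message is an offset into the destination volume. The helper names are hypothetical.

```python
SECTOR_SIZE = 512  # bytes; assumed, matching the (+ 511) / 512 rounding above
GIB = 1024 ** 3


def bytes_to_sectors(size_bytes):
    """Round a byte count up to whole sectors, like (size + 511) / 512."""
    return (size_bytes + SECTOR_SIZE - 1) // SECTOR_SIZE


def sector_to_bytes(sector):
    """Byte offset at which the given sector starts."""
    return sector * SECTOR_SIZE


# Sector taken from "error while writing sector 334995456".
failed_offset = sector_to_bytes(334995456)
print(failed_offset)        # 171517673472
print(failed_offset / GIB)  # just under 160 GiB (about 159.7)
```

Under these assumptions the write failed a little below the 160 GiB mark, which gives a lower bound on how much of the destination volume was usable when the copy stopped.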

Comment 15 Allon Mureinik 2014-09-28 14:51:23 UTC
Fede, do you have a take on this?

Comment 16 Federico Simoncelli 2014-10-24 16:17:21 UTC
Thread-11701::INFO::2014-07-10 20:32:53,327::logUtils::44::dispatcher::(wrapper) Run and protect: copyImage(sdUUID='f906ce1f-e196-4388-8620-23c0e3194ede', spUUID='6d1d600b-134b-4f2a-92fd-b686db80aa79', vmUUID='', srcImgUUID='7f84fe88-a3a7-491b-8b0b-f20769ed3ac7', srcVolUUID='e1792552-3f85-4d0b-b458-a2f3dc5c63a5', dstImgUUID='9cd68b90-a2fa-4c0c-951d-4a44da939937', dstVolUUID='1f397aae-8a3b-4c40-b615-a277ec825711', description='', dstSdUUID='2c1deccf-2a96-4a45-ae8c-1ad67ddb5a2d', volType=8, volFormat=4, preallocate=1, postZero='true', force='true')

copyImage is reserved for creating templates (squashing the entire chain).

In this case the engine should have called moveImage (with operation type COPY), which takes care of allocating the correct space.

Comment 17 Federico Simoncelli 2014-10-24 21:55:42 UTC
To complete comment 16:

As I said, copyImage is reserved for creating templates; therefore it was always used with target format RAW (not COW).

I am sure that fixing this on the vdsm side is trivial, but adding support for that is currently out of scope IMO.

So the solution is either to use copyImage with RAW target format or to use moveImage with operation type COPY (this one is probably preferred).

Let me know if there's a *new* really specific reason to use copyImage with COW format.

Comment 18 Tal Nisan 2014-10-26 16:03:32 UTC
Maor, can you take it from here?

Comment 19 Maor 2014-10-30 19:21:48 UTC
I've been trying to reproduce this issue. I created a VM with a 160GB disk and exported it; due to the large size, the export and import operations take some time.
I will update with the conclusions as soon as they complete.

Comment 20 Maor 2014-11-04 07:48:11 UTC
I've succeeded in importing the VM with the 160GB disk:

copyImage(sdUUID='4af3f4b9-d7fa-4801-97b3-e7fc58c87b86', spUUID='e034a89c-e1a7-4509-bcfe-350005664e15', vmUUID='', srcImgUUID='697014ec-6f3d-4aed-a621-f38570466e7b', srcVolUUID='aaa3fd5e-5f95-4566-acd3-44c3b89c8a19', dstImgUUID='697014ec-6f3d-4aed-a621-f38570466e7b', dstVolUUID='aaa3fd5e-5f95-4566-acd3-44c3b89c8a19', description='', dstSdUUID='966a8347-8342-48aa-ab61-4247cc8aeab6', volType=8, volFormat=4, preallocate=2, postZero='false', force='true')
....
Run and protect: getVolumeInfo, Return response: {'info': {'status': 'OK', 'domain': '966a8347-8342-48aa-ab61-4247cc8aeab6', 'voltype': 'LEAF', 'description': '', 'parent': '00000000-0000-0000-0000-000000000000', 'format': 'COW', 'image': '697014ec-6f3d-4aed-a621-f38570466e7b', 'ctime': '1415016259', 'disktype': '2', 'legality': 'LEGAL', 'mtime': '0', 'apparentsize': '188978561024', 'children': [], 'pool': '', 'capacity': '171798691840', 'uuid': 'aaa3fd5e-5f95-4566-acd3-44c3b89c8a19', 'truesize': '188978561024', 'type': 'SPARSE'}}


vs
Customer:
copyImage(sdUUID='f906ce1f-e196-4388-8620-23c0e3194ede', spUUID='6d1d600b-134b-4f2a-92fd-b686db80aa79', vmUUID='', srcImgUUID='7f84fe88-a3a7-491b-8b0b-f20769ed3ac7', srcVolUUID='e1792552-3f85-4d0b-b458-a2f3dc5c63a5', dstImgUUID='9cd68b90-a2fa-4c0c-951d-4a44da939937', dstVolUUID='1f397aae-8a3b-4c40-b615-a277ec825711', description='', dstSdUUID='2c1deccf-2a96-4a45-ae8c-1ad67ddb5a2d', volType=8, volFormat=4, preallocate=1, postZero='true', force='true')
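
Apart from the UUIDs, the two calls above differ in only a couple of arguments. The sketch below is one way to diff them; it is an ad hoc helper, not part of VDSM. The function names and the regular expression are assumptions, and the sample strings carry only the non-UUID arguments copied from the two calls.

```python
import re

# Matches key=value pairs such as volFormat=4 or postZero='true'.
ARG_RE = re.compile(r"(\w+)=('[^']*'|\d+)")


def parse_call_args(log_line):
    """Return the keyword arguments of a logged call as a dict of strings."""
    return {key: value.strip("'") for key, value in ARG_RE.findall(log_line)}


def diff_args(a, b, keys):
    """Return {key: (a[key], b[key])} for the listed keys whose values differ."""
    return {k: (a.get(k), b.get(k)) for k in keys if a.get(k) != b.get(k)}


# Non-UUID arguments only, copied from the two copyImage calls.
reproduction = parse_call_args(
    "copyImage(volType=8, volFormat=4, preallocate=2, "
    "postZero='false', force='true')")
customer = parse_call_args(
    "copyImage(volType=8, volFormat=4, preallocate=1, "
    "postZero='true', force='true')")

keys = ["volType", "volFormat", "preallocate", "postZero", "force"]
print(diff_args(reproduction, customer, keys))
# {'preallocate': ('2', '1'), 'postZero': ('false', 'true')}
```

The same parser can be pointed at the full log lines; the UUID arguments are then parsed as well and can be left out of the key list.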


I dug into the logs in the sosreport tar file, trying to follow and reproduce the scenario,
but the logs do not seem to contain the errors mentioned in the bug at https://bugzilla.redhat.com/show_bug.cgi?id=1119342#c1.
I looked at vdsm.log.44.xz (which has the same date as the error posted at https://bugzilla.redhat.com/show_bug.cgi?id=1119342#c1), but I didn't see the errors mentioned there.
I also looked at the engine log and didn't see the errors there either.

Allan, can you please point me to the full log file that contains this error?

Comment 22 Maor 2014-11-09 15:48:57 UTC
Thanks for the logs, Allan.
Based on the code in VDSM, we allocate 10% more for COW metadata,
which might not be enough if your disk is fully allocated.

Can you please share the output of the following command, so we know the true size of the images:

du -b /rhev/data-center/6d1d600b-134b-4f2a-92fd-b686db80aa79/f906ce1f-e196-4388-8620-23c0e3194ede/images/7f84fe88-a3a7-491b-8b0b-f20769ed3ac7/e1792552-3f85-4d0b-b458-a2f3dc5c63a5*

Can you also please share the metadata of this volume?
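
The 10% allowance appears consistent with the getVolumeInfo response in comment 20, where apparentsize is exactly 1.1 times capacity. A quick check using only the two values from that response (the variable names are ad hoc):

```python
GIB = 1024 ** 3

capacity = 171798691840      # 'capacity' from getVolumeInfo in comment 20
apparentsize = 188978561024  # 'apparentsize' from the same response

print(capacity // GIB)                     # 160
print(apparentsize // GIB)                 # 176
print(apparentsize * 10 == capacity * 11)  # True
```

Integer arithmetic is used for the ratio so that the comparison is exact and does not depend on floating-point rounding.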

Comment 23 Maor 2014-11-09 15:55:37 UTC
(In reply to Maor from comment #22)
...
> and also can you please share the meta data of this volume please.

Using the cat command on the metadata, like so:
cat /rhev/data-center/6d1d600b-134b-4f2a-92fd-b686db80aa79/f906ce1f-e196-4388-8620-23c0e3194ede/images/7f84fe88-a3a7-491b-8b0b-f20769ed3ac7/*.meta

and also ls -ls:
ls -ls /rhev/data-center/6d1d600b-134b-4f2a-92fd-b686db80aa79/f906ce1f-e196-4388-8620-23c0e3194ede/images/7f84fe88-a3a7-491b-8b0b-f20769ed3ac7/*


Thank you

Comment 24 Maor 2014-11-27 13:07:19 UTC
Additional info:
It looks like in the customer environment the copyImage is being done with a volume format of COW and a volume type of PREALLOCATE; this combination is no longer allowed in oVirt 3.5.

In oVirt 3.5, when we copy an image that has a COW volume format and SPARSE volume type with the copy-collapse option to a block domain, we do so with COW and SPARSE so that the high watermark can be used properly.

Comment 25 Maor 2014-11-27 14:39:09 UTC
I think this is a duplicate of https://bugzilla.redhat.com/1130246. Allan, can you please try to reproduce this on 3.4.2 (the version that includes the fix)?

Comment 27 Allon Mureinik 2014-11-27 16:42:32 UTC
(In reply to Maor from comment #24)
> Additional info:
> It looks that in the customer environment the copy Image is being done with
> volume format of COW and volume type of PREALLOCATE, this combination is not
> allowed any more with oVirt 3.5.
It was never allowed.
There was a short time in 3.4.0 when importing sparse images to block storage (iirc) produced such images, but that was a BUG, and it was fixed in 3.4.2.
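
The combination discussed here can be read directly off the copyImage arguments. The sketch below is hypothetical and is not VDSM's own validation: it uses only the numeric codes that this report itself ties to names (volFormat=4 came back as 'COW', preallocate=2 as 'SPARSE', and preallocate=1 is described as PREALLOCATE); the function names are invented for illustration.

```python
# Codes as they appear in this report; other values are treated as unknown.
VOL_FORMAT = {4: "COW"}
VOL_TYPE = {1: "PREALLOCATED", 2: "SPARSE"}


def describe(vol_format, preallocate):
    """Translate the numeric copyImage arguments into readable names."""
    fmt = VOL_FORMAT.get(vol_format, "UNKNOWN")
    typ = VOL_TYPE.get(preallocate, "UNKNOWN")
    return fmt, typ


def is_disallowed(vol_format, preallocate):
    """True for the COW + PREALLOCATED combination called out above."""
    return describe(vol_format, preallocate) == ("COW", "PREALLOCATED")


print(is_disallowed(4, 1))  # customer call: True
print(is_disallowed(4, 2))  # reproduction in comment 20: False
```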

