Bug 672346 - [RFE] use preallocated qcow metadata for file volumes
Summary: [RFE] use preallocated qcow metadata for file volumes
Keywords:
Status: CLOSED WONTFIX
Alias: None
Product: oVirt
Classification: Retired
Component: vdsm
Version: unspecified
Hardware: Unspecified
OS: Unspecified
Priority: high
Severity: medium
Target Milestone: ---
Target Release: 3.6.0
Assignee: Allon Mureinik
QA Contact: Aharon Canan
URL:
Whiteboard: storage
Duplicates: 741319
Depends On: 678050
Blocks:
 
Reported: 2011-01-24 21:49 UTC by Ayal Baron
Modified: 2016-02-10 19:43 UTC
CC: 14 users

Fixed In Version:
Doc Type: Enhancement
Doc Text:
Clone Of:
Environment:
Last Closed: 2015-09-13 07:53:52 UTC
oVirt Team: Storage
Embargoed:



Description Ayal Baron 2011-01-24 21:49:03 UTC
Description of problem:

When creating a qcow image, vdsm does not create it with preallocated space for the metadata:
# qemu-img create -f qcow2 -o ?
Supported options:
size             Virtual disk size
backing_file     File name of a base image
backing_fmt      Image format of the base image
encryption       Encrypt the image
cluster_size     qcow2 cluster size
preallocation    Preallocation mode (allowed values: off, metadata)

Such disks perform a lot worse than disks with the metadata preallocated.
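The requested change amounts to passing `preallocation=metadata` when vdsm builds the `qemu-img create` command. A minimal sketch of such a helper, in Python since vdsm is written in it; the function name, path, and size below are illustrative, not vdsm's actual API:

```python
# Sketch: build a qemu-img argument list that preallocates qcow2 metadata.
# Only constructs the command (e.g. for subprocess); it does not run qemu-img.

def qcow2_create_cmd(path, size_bytes, preallocate_metadata=True):
    """Return a qemu-img command line for creating a qcow2 image."""
    cmd = ["qemu-img", "create", "-f", "qcow2"]
    if preallocate_metadata:
        # 'preallocation=metadata' lays out the refcount and L1/L2 tables
        # up front, so first guest writes skip metadata allocation.
        cmd += ["-o", "preallocation=metadata"]
    cmd += [path, str(size_bytes)]
    return cmd

print(qcow2_create_cmd("/rhev/data-center/vol.qcow2", 10 * 1024**3))
```

The resulting list could be handed to `subprocess.check_call()`; the data remains sparse, only the metadata is written eagerly.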

Comment 1 Erez Shinan 2011-02-06 16:02:28 UTC
CreateVolume takes a 'prealloc' argument but ignores it. Should I depend on it to decide whether to preallocate metadata or not, or should I always preallocate it?

Comment 2 Ayal Baron 2011-02-07 09:56:22 UTC
(In reply to comment #1)
> CreateVolume takes a 'prealloc' argument but ignores it. Should I depend on it
> to decide whether to preallocate metadata or not, or should I always
> preallocate it?

Always preallocate it. It's 2MB or so that makes a world of difference.

Comment 3 Erez Shinan 2011-02-16 14:15:12 UTC
We have a problem: preallocation=metadata tries to touch the last byte of the image, which does not exist when we use a thinly-provisioned LV:

<kwolf> On file systems we must increase the file size (but can leave it sparse), but with block devices things could look different
<kwolf> During preallocation, qcow2 does a write to the very last cluster allocated
<erez> without this feature we don't have thin provisioning..
<erez> I see
<erez> Why does it do it? To validate that it exists?
<kwolf> Because otherwise reads would access space after the EOF
<kwolf> Which fails
<kwolf> On block devices, this doesn't matter, obviously
<kwolf> Or does it?
<kwolf> Probably it does. Hm.
<erez> (danken) that's unfortunate... we won't be able to use it on block devices, which is the only place where this matters
<erez> but why does it matter that all addresses exist in the block device?
<erez> if qemu accesses a nonexisting one, it will block on ENOSPC anyway, right?
<kwolf> The problem is not with writes, but with reads
<kwolf> I don't remember the details, though
<kwolf> Maybe we could fix the read function to return zeros (as it's supposed to work)
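The failure mode kwolf describes can be made concrete with a little arithmetic: the preallocation pass writes to the image's last cluster, whose offset is near the full virtual size, while a thin LV's physically allocated extent is still small. A sketch, assuming qcow2's default 64 KiB cluster size; all sizes are illustrative:

```python
# Why preallocation=metadata breaks on a thinly-provisioned LV: the
# preallocation pass writes into the image's final cluster, which lies
# far beyond the space the thin LV has physically allocated so far.

CLUSTER = 64 * 1024  # qcow2 default cluster size

def last_cluster_offset(virtual_size):
    """Host offset of the final cluster that preallocation touches."""
    return (virtual_size - 1) // CLUSTER * CLUSTER

virtual_size = 10 * 1024**3      # 10 GiB qcow2 image
thin_lv_allocated = 1 * 1024**3  # only 1 GiB physically backing the LV

offset = last_cluster_offset(virtual_size)
# The preallocation write lands past the LV's current physical end:
print(offset, offset >= thin_lv_allocated)
```

On a regular file the same write simply extends the (sparse) file, which is why this only bites on block devices.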

Comment 4 Erez Shinan 2011-02-16 14:20:43 UTC
<erez> should I open a bug on it?
<kwolf> You can open one, but don't expect me to get to it before 6.2

Comment 5 Kevin Wolf 2011-02-17 12:30:10 UTC
Also note that the performance characteristics on non-preallocated qcow2 images should be improved a lot with current 6.1 (qemu-kvm-0.12.1.2-2.145.el6).

Comment 6 Dor Laor 2011-02-20 21:54:05 UTC
(In reply to comment #5)
> Also note that the performance characteristics on non-preallocated qcow2 images
> should be improved a lot with current 6.1 (qemu-kvm-0.12.1.2-2.145.el6).

Yep, please retest with the latest kernel and qemu-kvm.

Comment 9 Dan Kenigsberg 2012-04-09 15:00:43 UTC
Kevin, can we now use preallocation-metadata on thinly-provisioned qcow2 within lv? If we can, should we?

Comment 10 Kevin Wolf 2012-04-10 09:28:40 UTC
No, metadata preallocation on block devices still requires the block device to be large enough from the very beginning.

Comment 11 Dan Kenigsberg 2012-04-10 14:39:16 UTC
On Tue, Apr 10, 2012 at 01:44:19PM +0200, Kevin Wolf wrote:
> 
> In fact I believe it's pretty pointless in the common case.
> 
> Metadata preallocation creates a mapping for all clusters, that is, it
> turns the image into something that is very close to a (sparse) raw
> image. If the first thing you do is to write to virtual offset 4 GB,
> this means that your LV has to grow to 4 GB, even if there was nothing
> else before it. When you create a file system on the image, you'll have
> writes close to the end of the image, expanding it to its full size
> immediately.
> 
> The only potential (non-)use case is when you partition the image and
> leave the partitions at the end completely unused, you would save the
> unused space then. I doubt that this is really an interesting case.


Too bad, but we cannot use metadata preallocation.
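Kevin's argument above can be modeled in a few lines: with metadata preallocation, every guest cluster already has a host mapping laid out in order, so the LV must extend at least to the host offset of the highest cluster ever written; without it, clusters are allocated on demand. A toy model, all numbers illustrative:

```python
# Toy model of comment 11: one guest write at virtual offset 4 GiB forces
# a preallocated-metadata image's backing LV out to roughly 4 GiB, while
# an on-demand image grows by only a single cluster.

CLUSTER = 64 * 1024  # qcow2 default cluster size

def host_size_after_write(write_offset, preallocated):
    """Approximate host space needed after one guest write at write_offset."""
    cluster_end = (write_offset // CLUSTER + 1) * CLUSTER
    if preallocated:
        # Clusters are mapped sequentially up front: the LV must reach
        # the end of the written cluster.
        return cluster_end
    # On-demand allocation: only the one touched cluster is added.
    return CLUSTER

offset_4g = 4 * 1024**3
print(host_size_after_write(offset_4g, preallocated=True))   # ~4 GiB
print(host_size_after_write(offset_4g, preallocated=False))  # one cluster
```

This is why preallocated metadata on an LV behaves "very close to a (sparse) raw image" and defeats thin provisioning.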

Comment 12 Ayal Baron 2012-04-11 09:23:12 UTC
(In reply to comment #11)
> On Tue, Apr 10, 2012 at 01:44:19PM +0200, Kevin Wolf wrote:
> > 
> > In fact I believe it's pretty pointless in the common case.
> > 
> > Metadata preallocation creates a mapping for all clusters, that is, it
> > turns the image into something that is very close to a (sparse) raw
> > image. If the first thing you do is to write to virtual offset 4 GB,
> > this means that your LV has to grow to 4 GB, even if there was nothing
> > else before it. When you create a file system on the image, you'll have
> > writes close to the end of the image, expanding it to its full size
> > immediately.
> > 
> > The only potential (non-)use case is when you partition the image and
> > leave the partitions at the end completely unused, you would save the
> > unused space then. I doubt that this is really an interesting case.
> 
> 
> Too bad, but we cannot use Metadata preallocation.

1. We can use it for files.
2. It would be worthwhile to test if there is any benefit to having just the tables preallocated (need to discuss with Kevin).

Comment 13 Ayal Baron 2013-09-04 22:55:50 UTC
*** Bug 741319 has been marked as a duplicate of this bug. ***

Comment 17 Mark Wagner 2014-01-17 18:44:54 UTC
Sanjay is DRI for RHEV now, deferring to him

Comment 18 Sanjay Rao 2014-01-17 19:23:29 UTC
From a performance standpoint, we all know that pre-allocation gives a significant performance advantage when writing to the device for the first time. Pre-allocation has other advantages, such as reduced file fragmentation, which also helps performance for subsequent writes.

As disk space is no longer at a premium, we always recommend that customers pre-allocate files for all application types.

Comment 19 Sanjay Rao 2014-01-17 20:03:53 UTC
Actually I ran some tests recently with qcow2 devices on a file system and here is what I saw. 

Just to be clear, I ran this on KVM on RHEL6.5 using virt-manager.

I created the devices to run with virtio drives and no-cache. I noted that when devices are created with qcow2, they are sparse files even when I used the pre-allocate flag in virt-manager.

The tests were done with 8K blocks, running sequential writes, sequential reads, random writes, and random reads.

                                 Seq Wr   Seq Rd   Rand Wr   Rand Rd
qcow2/virtio/pre-alloc/nocache   160.82   705.28     28.83     42.52   (not pre-allocated on host)
qcow2/virtio/sparse/nocache      168.43   741.59     29.36     43.00   (not pre-allocated on host)

Since the files were not pre-allocated even with the pre-allocate flag, I did not see any difference in performance between the two options (pre-allocate or sparse).

Comment 20 Sanjay Rao 2014-01-17 20:08:33 UTC
I do not have numbers with and without the metadata pre-allocated.

