Bug 891347

Summary: Use fallocate when copying disk images around in _base to improve copy performance and out of space errors
Product: Red Hat OpenStack
Component: openstack-nova
Version: 2.0 (Folsom)
Status: CLOSED ERRATA
Severity: unspecified
Priority: medium
Target Milestone: snapshot4
Target Release: 2.1
Hardware: Unspecified
OS: Unspecified
Reporter: Perry Myers <pmyers>
Assignee: Pádraig Brady <pbrady>
QA Contact: Kashyap Chamarthy <kchamart>
CC: afazekas, apevec, kchamart, ndipanov
Fixed In Version: openstack-nova-2012.2.3-2.el6ost
Doc Type: Bug Fix
Type: Bug
Last Closed: 2013-03-21 14:14:34 EDT
Attachments: debugging info (flags: none)

Description Perry Myers 2013-01-02 11:47:52 EST
Description of problem:
Currently Nova copies disk images in /var/lib/nova/instances/_base. If we fallocate the space first, we can improve the performance of the copies, and we will also know sooner whether the disk has enough space, so we can error out earlier.
Comment 1 Pádraig Brady 2013-01-02 11:58:37 EST
The performance of the copy itself is not improved, but that of subsequently running the VM is,
since it will not need to extend the image bit by bit.
Comment 2 Yaniv Kaul 2013-01-03 02:55:41 EST
(In reply to comment #1)
> The performance of the copy itself is not improved, but that of subsequently running the VM is,
> since it will not need to extend the image bit by bit.

Why would anyone extend the base image? Aren't the files in _base just the base templates on top of which we create a qcow2 snapshot to run a VM?

I'd try to copy to a raw sparse image.
Comment 4 Pádraig Brady 2013-02-27 04:14:50 EST
To test this, ensure that preallocate_images=space is set in nova.conf and,
after booting an instance, run du on the instances/$instance/disk file
to confirm that it is fully allocated.
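
The du check described here can also be scripted. The following is a minimal sketch, assuming a Linux host where st_blocks is counted in 512-byte units; the helper names and the 1% tolerance are illustrative and not part of nova:

```python
import os


def allocated_bytes(path):
    """Return the space the file occupies on disk, as `du` would report it."""
    # On Linux, st_blocks is counted in 512-byte units,
    # independent of the file system's own block size.
    return os.stat(path).st_blocks * 512


def is_fully_allocated(path, virtual_size, tolerance=0.01):
    """True if at least (1 - tolerance) of virtual_size bytes are allocated."""
    return allocated_bytes(path) >= virtual_size * (1 - tolerance)
```

For the overlay discussed in this bug, the call would look like is_fully_allocated("/var/lib/nova/instances/instance-00000019/disk", 10486808576). Comparing against the virtual size rather than the apparent file size matters because the apparent size of the file may stay small even when the blocks are allocated.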
Comment 5 Pádraig Brady 2013-02-27 04:39:06 EST
@Yaniv you're right, the merged upstream patch https://review.openstack.org/#/c/22054/ preallocates the $instance/disk as this is what will be extended at run time
Comment 7 Kashyap Chamarthy 2013-02-28 08:25:27 EST
Just a note here - the config directive 'preallocate_images' should be documented in nova.conf

#-----------------------#
[tuser1@interceptor export(keystone_admin)]$ grep -i preallocate -A4 /usr/lib/python2.6/site-packages/nova/virt/driver.py
    cfg.StrOpt('preallocate_images',
               default='none',
               help='VM image preallocation mode: '
                    '"none" => no storage provisioning is done up front, '
                    '"space" => storage is fully allocated at instance start'),
[tuser1@interceptor export(keystone_admin)]$ 
#-----------------------#

Search for the string in the files under /etc/nova (nothing is found):
#-----------------------#
[tuser1@interceptor export(keystone_admin)]$ sudo grep -i preallocate /etc/nova/*
[tuser1@interceptor export(keystone_admin)]$ 
#-----------------------#
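
Since the grep above finds nothing, the default declared in driver.py ('none') is what applies. A minimal sketch for reading the effective value, assuming the option is set in the [DEFAULT] section of nova.conf; the helper name is illustrative and this is not how nova itself loads its configuration:

```python
import configparser


def preallocate_mode(conf_path="/etc/nova/nova.conf"):
    """Return the configured preallocate_images value, or 'none' if unset."""
    # interpolation=None: values may contain literal '%' characters.
    # strict=False: tolerate duplicate options instead of raising.
    parser = configparser.ConfigParser(interpolation=None, strict=False)
    parser.read(conf_path)  # an unreadable or missing file is skipped
    # Assumption: the option lives in [DEFAULT]; the 'none' fallback
    # mirrors the default shown in nova/virt/driver.py.
    return parser.get("DEFAULT", "preallocate_images", fallback="none")
```

A return value of "space" means storage should be fully allocated at instance start; "none" means no up-front provisioning.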


Version Info:
#-----------------------#
[tuser1@interceptor export(keystone_admin)]$ rpm -q openstack-nova
openstack-nova-2012.2.3-3.el6ost.noarch
[tuser1@interceptor export(keystone_admin)]$ 
#-----------------------#
[tuser1@interceptor export(keystone_admin)]$ rpm -q openstack-nova --changelog | grep 891347
- Support preallocated VM images #891347
[tuser1@interceptor export(keystone_admin)]$ 
#-----------------------#
Comment 9 Kashyap Chamarthy 2013-03-01 03:24:12 EST
I don't see that the qcow2 overlay in use (for example, /var/lib/nova/instances/instance-00000019/disk) is preallocated. Am I missing anything trivial here?

Some details below:


# Add the directive 'preallocate_images=space' to nova.conf and restart all services (restarting just nova is actually sufficient)
#-----------------------#
[tuser1@interceptor export(keystone_admin)]$ sudo grep preallocate /etc/nova/nova.conf
# preallocate images -- 891347
preallocate_images=space
[tuser1@interceptor export(keystone_admin)]$ 
#-----------------------#
[root@interceptor ~(keystone_admin)]# for j in `for i in $(ls -1 /etc/init.d/openstack-*) ; do $i status | grep running ; done | awk '{print $1}'` ; do service $j restart ; done
#-----------------------#


# boot a new instance
[tuser1@interceptor ~(keystone_user1)]$ nova boot --flavor 1 --key_name oskey --image 1e6292f9-82bd-4cdb-969e-c863cb1c6692 fed-t1


# list, and check it's running
[tuser1@interceptor ~(keystone_user1)]$ nova list 
+--------------------------------------+-----------+---------+-------------------+
| ID                                   | Name      | Status  | Networks          |
+--------------------------------------+-----------+---------+-------------------+
| 54ed73b5-4628-4dae-87bf-5f59a39f094b | fed-t1    | ACTIVE  | net1=10.65.207.51 |
| bb494526-e993-4adc-a453-902a5a9f530d | fedora-t8 | SHUTOFF | net1=10.65.207.53 |
+--------------------------------------+-----------+---------+-------------------+
[tuser1@interceptor ~(keystone_user1)]$ 


# do a virsh list
[tuser1@interceptor ~(keystone_user1)]$ sudo virsh list
 Id    Name                           State
----------------------------------------------------
 29    instance-00000019              running

[tuser1@interceptor ~(keystone_user1)]$ 


# find the block device in use
[tuser1@interceptor ~(keystone_user1)]$ sudo virsh domblklist instance-00000019
Target     Source
------------------------------------------------
vda        /var/lib/nova/instances/instance-00000019/disk

[tuser1@interceptor ~(keystone_user1)]$


# find the disk image (the qcow2 overlay in use) info
[tuser1@interceptor ~(keystone_user1)]$ sudo qemu-img info /var/lib/nova/instances/instance-00000019/disk
image: /var/lib/nova/instances/instance-00000019/disk
file format: qcow2
virtual size: 9.8G (10486808576 bytes)
disk size: 8.8M
cluster_size: 65536
backing file: /var/lib/nova/instances/_base/06a057b9c7b0b27e3b496f53d1e88810a0d1d5d3
[tuser1@interceptor ~(keystone_user1)]$ 


# Also find the info of the backing file
[tuser1@interceptor ~(keystone_user1)]$ sudo qemu-img info /var/lib/nova/instances/_base/06a057b9c7b0b27e3b496f53d1e88810a0d1d5d3
image: /var/lib/nova/instances/_base/06a057b9c7b0b27e3b496f53d1e88810a0d1d5d3
file format: raw
virtual size: 9.8G (10486808576 bytes)
disk size: 738M
[tuser1@interceptor ~(keystone_user1)]$
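
The comparison between "virtual size" and "disk size" in the qemu-img output above can be automated. A minimal sketch, assuming the output format shown in this report (other qemu-img versions may print the sizes differently); the function names and the 5% tolerance are illustrative:

```python
import re

_UNITS = {"": 1, "K": 1024, "M": 1024 ** 2, "G": 1024 ** 3, "T": 1024 ** 4}


def parse_qemu_img_info(text):
    """Extract virtual and on-disk sizes (in bytes) from `qemu-img info` output."""
    virtual = re.search(r"^virtual size:.*\((\d+) bytes\)", text, re.M)
    disk = re.search(r"^disk size:\s*([\d.]+)\s*([KMGT]?)", text, re.M)
    if not virtual or not disk:
        raise ValueError("unrecognised qemu-img info output")
    return {
        "virtual_size_bytes": int(virtual.group(1)),
        # The human-readable figure is rounded, so this value is approximate.
        "disk_size_bytes": int(float(disk.group(1)) * _UNITS[disk.group(2)]),
    }


def looks_preallocated(text, tolerance=0.05):
    """True if the on-disk size is within `tolerance` of the virtual size."""
    info = parse_qemu_img_info(text)
    return info["disk_size_bytes"] >= info["virtual_size_bytes"] * (1 - tolerance)
```

Fed the overlay output above (9.8G virtual, 8.8M on disk), looks_preallocated returns False, which matches the observation that the overlay was not preallocated.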
Comment 10 Pádraig Brady 2013-03-01 05:23:19 EST
Yes "disk size:" should match the virtual size here.
Are there any warnings or errors from the fallocate calls in the logs?
If not can you enable debugging to see the fallocate calls being executed?
One possibility is that instances/ is on NFS or some other
file system that doesn't support fallocate?
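
The last possibility can be checked directly by probing the directory. A minimal sketch, assuming a 64-bit Linux host whose libc exports fallocate(2); the function name is illustrative, and the probe uses the keep-size mode that the fallocate command exposes as -n:

```python
import ctypes
import errno
import os
import tempfile

FALLOC_FL_KEEP_SIZE = 0x01  # value from <linux/falloc.h>


def supports_fallocate(directory):
    """Probe whether files under `directory` accept fallocate(2)."""
    libc = ctypes.CDLL(None, use_errno=True)
    if not hasattr(libc, "fallocate"):
        return False  # this libc does not export fallocate at all
    # Assumption: off_t is 64 bits wide (true on 64-bit Linux).
    libc.fallocate.argtypes = [ctypes.c_int, ctypes.c_int,
                               ctypes.c_int64, ctypes.c_int64]
    libc.fallocate.restype = ctypes.c_int
    with tempfile.NamedTemporaryFile(dir=directory) as probe:
        # Try to reserve 4096 bytes without changing the file size.
        if libc.fallocate(probe.fileno(), FALLOC_FL_KEEP_SIZE, 0, 4096) == 0:
            return True
        err = ctypes.get_errno()
    if err in (errno.EOPNOTSUPP, errno.ENOSYS):
        return False
    # Anything else (for example ENOSPC) is a real error, not lack of support.
    raise OSError(err, os.strerror(err))
```

Running supports_fallocate("/var/lib/nova/instances") as a user with write access to that directory would show whether the file system is the obstacle.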
Comment 11 Kashyap Chamarthy 2013-03-01 06:52:12 EST
Created attachment 704148
debugging info
Comment 12 Kashyap Chamarthy 2013-03-01 07:35:30 EST
(In reply to comment #11)
> Created attachment 704148
> debugging info

After some more investigation with Pádraig, it appears the 'fallocate' command is being prevented from running.


Manual test of fallocating an overlay:

- shut off the nova instance
- find the block device of the overlay
- fallocate the overlay to the virtual size (10486808576 bytes)
#-----------------------#
[tuser1@interceptor ~(keystone_user1)]$ sudo virsh domblklist instance-00000018
Target     Source
------------------------------------------------
vda        /var/lib/nova/instances/instance-00000018/disk
#-----------------------#
[tuser1@interceptor ~(keystone_user1)]$ ls -lash /var/lib/nova/instances/instance-00000018/disk
9.8M -rw-r--r--. 1 root root 9.9M Mar  1 13:05 /var/lib/nova/instances/instance-00000018/disk
#-----------------------#
[tuser1@interceptor ~(keystone_user1)]$ qemu-img info /var/lib/nova/instances/instance-00000018/disk
image: /var/lib/nova/instances/instance-00000018/disk
file format: qcow2
virtual size: 9.8G (10486808576 bytes)
disk size: 9.8M
cluster_size: 65536
backing file: /var/lib/nova/instances/_base/06a057b9c7b0b27e3b496f53d1e88810a0d1d5d3
#-----------------------#
[tuser1@interceptor ~(keystone_user1)]$ sudo fallocate -n -l 10486808576 /var/lib/nova/instances/instance-00000018/disk
fallocate: /var/lib/nova/instances/instance-00000018/disk: fallocate failed: No space left on device
[tuser1@interceptor ~(keystone_user1)]$
#-----------------------#
[tuser1@interceptor ~(keystone_user1)]$ qemu-img info /var/lib/nova/instances/instance-00000018/disk
image: /var/lib/nova/instances/instance-00000018/disk
file format: qcow2
virtual size: 9.8G (10486808576 bytes)
disk size: 9.1G
cluster_size: 65536
backing file: /var/lib/nova/instances/_base/06a057b9c7b0b27e3b496f53d1e88810a0d1d5d3
[tuser1@interceptor ~(keystone_user1)]$ 
#-----------------------#

It can now be seen that the disk has 9.1G of allocated blocks:
#-----------------------#
[tuser1@interceptor ~(keystone_user1)]$ sudo ls -lash /var/lib/nova/instances/instance-00000018/disk
9.1G -rw-r--r--. 1 root root 9.9M Mar  1 13:05 /var/lib/nova/instances/instance-00000018/disk
[tuser1@interceptor ~(keystone_user1)]$
#-----------------------#


To check the size inside the instance, start the nova instance, ssh into it, and run 'df -hT' to check the file system space:
#-----------------------#
[tuser1@interceptor ~(keystone_user1)]$ ssh -i oskey.priv root@10.65.207.53
reverse mapping checking getaddrinfo for dhcp207-53.lab.eng.pnq.redhat.com [10.65.207.53] failed - POSSIBLE BREAK-IN ATTEMPT!
[root@localhost ~]# 
[root@localhost ~]# df -hT
Filesystem     Type      Size  Used Avail Use% Mounted on
rootfs         rootfs    9.3G  881M  8.4G  10% /
devtmpfs       devtmpfs  239M     0  239M   0% /dev
tmpfs          tmpfs     247M     0  247M   0% /dev/shm
tmpfs          tmpfs     247M  784K  246M   1% /run
/dev/vda2      ext4      9.3G  881M  8.4G  10% /
tmpfs          tmpfs     247M     0  247M   0% /sys/fs/cgroup
tmpfs          tmpfs     247M     0  247M   0% /media
[root@localhost ~]# 
#-----------------------#
Comment 13 Alan Pevec 2013-03-05 04:54:33 EST
(In reply to comment #12)
> After some more investigation with Pádraig, it appears the 'fallocate' command is
> being prevented from running.
...
> [tuser1@interceptor ~(keystone_user1)]$ sudo fallocate -n -l 10486808576
> /var/lib/nova/instances/instance-00000018/disk
> fallocate: /var/lib/nova/instances/instance-00000018/disk: fallocate failed:
> No space left on device

What's the conclusion here, you don't have enough space on your machine or is it something else?
Comment 14 Kashyap Chamarthy 2013-03-05 07:13:43 EST
(In reply to comment #13)
> (In reply to comment #12)
> > After some more investigation with Pádraig, it appears the 'fallocate' command is
> > being prevented from running.
> ...
> > [tuser1@interceptor ~(keystone_user1)]$ sudo fallocate -n -l 10486808576
> > /var/lib/nova/instances/instance-00000018/disk
> > fallocate: /var/lib/nova/instances/instance-00000018/disk: fallocate failed:
> > No space left on device
> 
> What's the conclusion here, 

Sorry for not summarizing at the end. The conclusion so far is: the 'fallocate' command isn't being invoked by OpenStack. This needs further investigation.

> you don't have enough space on your machine or
> is it something else?

It's something else. That 'No space left on device' isn't accurate.

If you look at the command run after fallocate:

#-----------------------#
[tuser1@interceptor ~(keystone_user1)]$ sudo ls -lash /var/lib/nova/instances/instance-00000018/disk
9.1G -rw-r--r--. 1 root root 9.9M Mar  1 13:05 /var/lib/nova/instances/instance-00000018/disk
[tuser1@interceptor ~(keystone_user1)]$
#-----------------------#

The "9.9M" is supposed to be the file size, while the "9.1G" is the allocated space on disk. Essentially, we've allocated beyond the end of the file (as we invoked fallocate using -n)
Comment 15 Pádraig Brady 2013-03-05 11:39:57 EST
Worked through this with Kashyap today.
I'm fairly sure it's working as expected.
What was observed above is that fallocation is not done
when using a flavor with a 0 disk size.
Fallocation was seen to be in place when using a flavor with a disk size set.
Comment 16 Kashyap Chamarthy 2013-03-06 05:48:55 EST
Thanks Pádraig. Just summarizing it here.

Summary:
--------
'fallocate' is invoked successfully only for flavor 2 ('small') or higher, as those flavors have an elastic FS (meaning nova can resize the FS of the image). Flavor 1 ('tiny'), however, has 'Disk - 0', which leaves no scope for an elastic FS. This can be seen below:


Test info:
---------
[1] Boot flavor 2 ('small') instance

[tuser1@interceptor glance(keystone_user1)]$ nova boot --flavor 2 --key_name oskey --image 1e6292f9-82bd-4cdb-969e-c863cb1c6692 test9-fed

[2] Run nova list

[tuser1@interceptor glance(keystone_user1)]$ nova list
+--------------------------------------+------------+---------+-------------------+
| ID                                   | Name       | Status  | Networks          |
+--------------------------------------+------------+---------+-------------------+
| 8dc43c0d-519a-4238-84c7-29159f6d26ff | test9-fed  | ACTIVE  | net1=10.65.207.53 |
+--------------------------------------+------------+---------+-------------------+
[tuser1@interceptor glance(keystone_user1)]$ 


[3] Find out the disk image in use
[tuser1@interceptor glance(keystone_user1)]$ sudo virsh domblklist instance-0000002f
Target     Source
------------------------------------------------
vda        /export/nova/instances/instance-0000002f/disk


[4] Find qemu-img info of the disk (note that the disk size *and* the virtual size are the same):
[tuser1@interceptor glance(keystone_user1)]$ sudo qemu-img info /export/nova/instances/instance-0000002f/disk
image: /export/nova/instances/instance-0000002f/disk
file format: qcow2
virtual size: 20G (21474836480 bytes)
disk size: 20G
cluster_size: 65536
backing file: /export/nova/instances/_base/06a057b9c7b0b27e3b496f53d1e88810a0d1d5d3_20


[5] Search for 'fallocate' in nova compute log:
======
[root@interceptor ~]# grep fallocate /var/log/nova/compute.log
2013-03-06 14:47:30 WARNING nova.virt.libvirt.imagebackend [req-a3d2bbec-560f-49cf-9b3d-a7800daedfb8 320ce46de7e24a75a7ff8906d7355ff7 57ff99aae24b4035b52177a722c4091f] fallocate gate vars: size=21474836480, preallocate=True, can=True
======
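
The three gate variables in the log line above suggest the shape of the decision. The following is a minimal sketch of that logic as inferred from the log and from the behaviour observed in this report; it is not nova's actual code, and the function and parameter names are hypothetical:

```python
def should_fallocate(size, preallocate_mode, can_fallocate):
    """Decide whether to preallocate, mirroring the three logged gate variables."""
    # 'preallocate' in the log is taken to mean preallocate_images=space.
    preallocate = preallocate_mode == "space"
    # A flavor with a 0 disk size gives no target size (0 or None here),
    # so there is nothing to allocate.
    return bool(size) and preallocate and bool(can_fallocate)
```

With the values from the log (size=21474836480, mode "space", can=True) the sketch returns True; with a size of 0, as for the 'tiny' flavor, it returns False regardless of the other two inputs.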



Additional info:
----------------
Technical notes from discussion with Kevin Wolf (qemu dev) on the benefit of using 'fallocate' for qcow2 images:

"It won't improve the qcow2 internal allocations, but at least you get some kind of preallocation on the file system level.

"When you write to an unallocated block you get allocations on multiple levels.  qcow2 will allocate a cluster in the file format; it still has to do this after fallocate. Then it writes to the image file, so the file system will allocate a block for it; this is the one that you may save. Possibly LVM has to do another allocation, etc."
Comment 17 Kashyap Chamarthy 2013-03-06 06:13:31 EST
Turning to VERIFIED per Comment #16


Version info:
#-----------------------#
[tuser1@interceptor glance(keystone_user1)]$ rpm -q openstack-nova --changelog | grep 891347
- Support preallocated VM images #891347
[tuser1@interceptor glance(keystone_user1)]$ rpm -q openstack-nova 
openstack-nova-2012.2.3-4.el6ost.noarch
[tuser1@interceptor glance(keystone_user1)]$ cat /etc/redhat-release 
Red Hat Enterprise Linux Server release 6.4 (Santiago)
[tuser1@interceptor glance(keystone_user1)]$ 
#-----------------------#
Comment 19 errata-xmlrpc 2013-03-21 14:14:34 EDT
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

http://rhn.redhat.com/errata/RHSA-2013-0657.html