Bug 1195768

Summary: while filling a thin disk, the actual disk size increases above the provisioned size
Product: Red Hat Enterprise Virtualization Manager
Reporter: Raz Tamir <ratamir>
Component: ovirt-engine
Assignee: Nir Soffer <nsoffer>
Status: CLOSED ERRATA
QA Contact: Ori Gofen <ogofen>
Severity: medium
Docs Contact:
Priority: medium
Version: 3.5.0
CC: acanan, amureini, anande, lpeer, lsurette, nsoffer, rbalakri, Rhev-m-bugs, yeylon, ykaul, ylavi
Target Milestone: ovirt-3.6.0-rc
Target Release: 3.6.0
Hardware: Unspecified
OS: Unspecified
Whiteboard:
Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of:
: 1196056 (view as bug list)
Environment:
Last Closed: 2016-03-09 20:58:14 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: Storage
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:
Bug Depends On:    
Bug Blocks: 1196056    
Attachments:
    vdsm and engine logs (flags: none)
    query (flags: none)
    no-snapshots (flags: none)
    single-disk (flags: none)

Description Raz Tamir 2015-02-24 14:20:50 UTC
Created attachment 994728 [details]
vdsm and engine logs

Description of problem:
Writing to a 2GB thin disk with dd (bs=1M count=2048) causes the actual disk size to grow to 3GB.


Version-Release number of selected component (if applicable):
vt13.11

How reproducible:
100%

Steps to Reproduce:
1. Create a 2GB thin provisioned disk
2. Write to it: dd if=/dev/urandom of=/dev/<disk_logical_name> bs=1M count=2048

Actual results:
The actual disk size grows to 3GB, 1GB above the provisioned 2GB.

Expected results:


Additional info:

Comment 1 Allon Mureinik 2015-02-24 22:51:18 UTC
Nir, IIUC, https://gerrit.ovirt.org/#/c/38088/ should resolve this. Is this right?

Comment 2 Nir Soffer 2015-02-25 06:52:50 UTC
(In reply to Allon Mureinik from comment #1)
> Nir, IIUC, https://gerrit.ovirt.org/#/c/38088/ should resolve this. Is this
> right?

Right, moving these patches to this bug.

Comment 4 Nir Soffer 2015-02-26 07:59:28 UTC
This example in the description is not a bug but expected behavior of the
system.

When working with qcow2 format, we need *more* space than the virtual size
to allow the guest to use the entire virtual size of the disk.

For example, if we create a 2G disk, qemu may need up to 2.2G for storing
2G of data on the device. The actual amount of additional space is tricky
to compute, and vdsm uses an estimate of 10% when computing the size
of qcow2 volumes. Currently the vdsm extend chunk is 1G (or 2G during live
storage migration), so the disk is extended to 3G.
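
To make the arithmetic concrete, here is a rough sketch (illustration only,
not vdsm code), assuming the 10% estimate and the 1G extend chunk described
above:

    virtual_gb=2
    needed_gb=$(echo "$virtual_gb * 1.1" | bc)             # ~2.2G to store 2G of data
    allocated_gb=$(echo "($needed_gb + 0.999) / 1" | bc)   # round up to whole 1G chunks
    echo "${allocated_gb}G"                                # -> 3G, the size seen in this report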

If we limit the disk size to the virtual size (http://gerrit.ovirt.org/38088),
and a vm is trying to fill up the disk, the vm will pause without a way 
to resume it, since qemu cannot complete the write operation.

The current code allows such a write to complete by extending the disk when
the free space is below the configured watermark limit (default 512MB).
We cannot change this behavior.

For images under 10G, we can optimize the allocation and allocate less than
one chunk (1G), but this is a low-priority change.

The real bug here is that in 3.5, disk extend is *unlimited*. This is not
a problem under normal conditions, but it is a problem if the extend logic is
broken, as seen in bug 1176673.

Moving to ASSIGNED since the suggested patches are incorrect. We need first
to fix the limit in master before we can port the fix to 3.5.

Lowering severity as this is not an issue under normal conditions, and the
fix is only a nice-to-have.

Comment 5 Nir Soffer 2015-02-26 08:46:40 UTC
The new patches should fix this issue correctly:
1. https://gerrit.ovirt.org/38178 changes the limit to allow up to 10%
   extra allocation (rounded up to the next lv extent)
2. https://gerrit.ovirt.org/38179 avoids pointless extension requests,
   which is required if we limit the disk size

Comment 6 Nir Soffer 2015-02-26 09:25:06 UTC
Testing:

Successful write:
1. Add second 1G disk to vm
2. On the guest, run
   dd if=/dev/zero of=/dev/vdb bs=8M count=128

The operation must succeed.
The disk should be extended to about 1.12G; the vm should not pause.

Failed write:
1. Add second 1G disk to vm
2. On the guest, run
   dd if=/dev/zero of=/dev/vdb bs=8M count=129

The operation should fail in the guest with "No space left on device".
The disk should be extended to about 1.12G; the vm should not pause.

To check volume size, use lvm:
    pvscan --cache
    lvs vgname/lvname

You can repeat both tests with a bigger disk (e.g. 8G), writing
more data (count=1024). The volume will be extended up to about
9G.
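
For reference, the expected upper limit can be sketched as follows
(illustration only, not vdsm code), assuming the 10% estimate and
128MiB lvm extents:

    virtual_mb=1024                                       # the 1G test disk
    limit_mb=$(echo "$virtual_mb * 1.1" | bc)             # 1126.4 MiB
    extents=$(echo "($limit_mb + 127.999) / 128" | bc)    # round up to whole extents -> 9
    echo "$((extents * 128)) MiB"                         # 1152 MiB, i.e. about 1.12G

The same formula with virtual_mb=8192 gives 9088 MiB, matching the
"about 9G" above.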

Comment 7 Allon Mureinik 2015-03-08 17:57:00 UTC
Nir, I removed the abandoned 3.5 backports.
For 3.6, I see two patches on master that are merged. 
Is there anything else we're waiting for, or can this bug be moved to MODIFIED?

Comment 8 Nir Soffer 2015-03-16 17:36:58 UTC
(In reply to Allon Mureinik from comment #7)
> Nir, I removed the abandoned 3.5 backports.
> For 3.6, I see two patches on master that are merged. 
> Is there anything else we're waiting for, or can this bug be moved to
> MODIFIED?

I think we are done.

Comment 9 Ori Gofen 2015-07-05 13:32:23 UTC
Verified on oVirt 3.6.0.3: the qcow does not get extended beyond the configured limit, though its final size is bigger than the actual size reported.

Comment 10 Anand Nande 2015-07-06 05:51:32 UTC
(In reply to Nir Soffer from comment #4)
> This example in the description is not a bug but expected behavior of the
> system.
> 
> When working with qcow2 format, we need *more* space than the virtual size
> to allow the guest to use the entire virtual size of the disk.
> 
> For example, if we create a 2G disk, qemu may need up to 2.2G for storing
> 2G of data on the device. The actual amount of additional space is tricky
> to compute, and vdsm uses an estimate of 10% when computing the size
> of qcow2 volumes. Currently the vdsm extend chunk is 1G (or 2G during live
> storage migration), so the disk is extended to 3G.
> 
> If we limit the disk size to the virtual size
> (http://gerrit.ovirt.org/38088),
> and a vm is trying to fill up the disk, the vm will pause without a way 
> to resume it, since qemu cannot complete the write operation.
> 
> The current code allows such a write to complete by extending the disk when
> the free space is below the configured watermark limit (default 512MB).
> We cannot change this behavior.
> 
> For images under 10G, we can optimize the allocation and allocate less than
> one chunk (1G), but this is a low-priority change.
> 
> The real bug here is that in 3.5, disk extend is *unlimited*. This is not
> a problem under normal conditions, but it is a problem if the extend logic is
> broken, as seen in bug 1176673.
> 
> Moving to ASSIGNED since the suggested patches are incorrect. We need first
> to fix the limit in master before we can port the fix to 3.5.
> 
> Lowering severity as this is not an issue under normal conditions, and the
> fix is only a nice-to-have.

A 1GB extension is acceptable - but in my case (rhev-3.2) it's much larger.
A virtual machine that has been allocated an OS disk of 25GB (thin) shows an
"actual size" of 58GB, __without any snapshots__.

See the attached screenshots.

Comment 11 Anand Nande 2015-07-06 05:54:44 UTC
Created attachment 1048665 [details]
query

Comment 12 Anand Nande 2015-07-06 05:55:36 UTC
Created attachment 1048666 [details]
no-snapshots

Comment 13 Anand Nande 2015-07-06 05:56:08 UTC
Created attachment 1048667 [details]
single-disk

Comment 14 Nir Soffer 2015-07-06 06:57:45 UTC
(In reply to Anand Nande from comment #10)
> A 1GB extension is acceptable - but in my case (rhev-3.2)
> it's much larger.
> A virtual machine that has been allocated an OS disk of 25GB (thin)
> shows an "actual size" of 58GB, __without any snapshots__.

This is a known issue in versions before 3.6.

In 3.6, we limit the extend size to 1.1 * virtual size.
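
For the 25GB disk from comment 10, that limit works out to roughly
25GB * 1.1 = 27.5GB (plus lvm extent rounding), well below the 58GB
observed on 3.2.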

Comment 15 Anand Nande 2015-07-06 08:39:16 UTC
Customer's question: Is there a way to reclaim this space?

Comment 16 Nir Soffer 2015-07-06 10:18:18 UTC
(In reply to Anand Nande from comment #15)
> Customer's question: Is there a way to reclaim this space?

You should be able to reclaim this, since the vm cannot use more than the
virtual size.

You can use lvm commands to shrink the lv to virtual size * 1.1
(lvm will round the value using 128MiB chunks).

Assuming a virtual disk size of 25GiB, we should resize to 27.5GiB
(rounding up to the next GiB for simplicity):

    lvreduce -L28G /dev/vgname/lvname

Note that no other host should access this vg while you make this change.
The safest way to do this is to shutdown all hosts that can access this 
storage domain.

If you cannot allow downtime, you will have to stop the host running as
SPM and stop the engine (so it cannot elect a new SPM) before doing this change.

Please have a good backup before doing this.
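
A sketch of the procedure described above, for illustration (vgname/lvname
are placeholders; verify the numbers for your own disk before running
anything):

    # run only while no other host accesses the storage domain (see above)
    pvscan --cache                       # refresh lvm metadata on this host
    lvs --units g vgname/lvname          # confirm the current, over-extended size
    # target = virtual size * 1.1, e.g. 25GiB * 1.1 = 27.5GiB, rounded up to 28G
    lvreduce -L28G /dev/vgname/lvname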

Comment 18 errata-xmlrpc 2016-03-09 20:58:14 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://rhn.redhat.com/errata/RHEA-2016-0376.html