Bug 1179690

Summary: faulty storage allocation checks when adding a vm to a pool
Product: Red Hat Enterprise Virtualization Manager
Reporter: Vered Volansky <vered>
Component: ovirt-engine
Assignee: Vered Volansky <vered>
Status: CLOSED CURRENTRELEASE
QA Contact: lkuchlan <lkuchlan>
Severity: high
Docs Contact:
Priority: unspecified
Version: 3.5.0
CC: acanan, amureini, gklein, kgoldbla, lpeer, lsurette, rbalakri, Rhev-m-bugs, tnisan, yeylon, ykaul, ylavi
Target Milestone: ovirt-3.6.0-rc
Keywords: ZStream
Target Release: 3.6.0
Hardware: Unspecified
OS: Unspecified
Whiteboard:
Fixed In Version: ovirt-engine-3.6.0_qa1
Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of:
: 1185614 (view as bug list)
Environment:
Last Closed:
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: Storage
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:
Bug Depends On: 1184928
Bug Blocks: 960934, 1185614

Attachments:
engine vdsm and server logs (flags: none)

Description Vered Volansky 2015-01-07 11:00:44 UTC
AddVmAndAttachToPoolCommand - VMs are added to the pool with empty disks (thinly provisioned from the template). There are no memory volumes or snapshots.
Storage allocation checks in this case should take into account that the new disks will always be SPARSE/COW, i.e., 1G on block domains, 1M on file domains, according to the tracker bug tables (bz960934).

Verification should include a storage domain both with and without enough space.
In case of insufficient space, a relevant CDA message should be displayed.
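
For illustration only, a minimal sketch of the accounting described above, assuming hypothetical names (StorageType, requiredSpaceForPoolVms); this is not the actual ovirt-engine code:

// Minimal sketch only; all names here are hypothetical, not engine code.
public class PoolAllocationSketch {

    enum StorageType { BLOCK, FILE }

    // Initial size of a new SPARSE/COW layer per the tracker tables (bz960934):
    // about 1G on block domains, 1M on file domains.
    static final long COW_INITIAL_BLOCK = 1L << 30; // 1 GiB
    static final long COW_INITIAL_FILE  = 1L << 20; // 1 MiB

    // Space the new pool VMs should be charged for: one thin COW layer per
    // disk per VM, regardless of the template disk's virtual size.
    static long requiredSpaceForPoolVms(int numVms, int disksPerVm, StorageType type) {
        long perDisk = (type == StorageType.BLOCK) ? COW_INITIAL_BLOCK : COW_INITIAL_FILE;
        return (long) numVms * disksPerVm * perDisk;
    }

    public static void main(String[] args) {
        long available = 5L << 30; // 5 GiB free on the target domain
        long needed = requiredSpaceForPoolVms(10, 1, StorageType.BLOCK);
        System.out.println(available >= needed
                ? "enough space"
                : "fail validation with a low-disk-space CDA message");
    }
}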

Comment 1 Allon Mureinik 2015-01-11 12:21:31 UTC
Additional insight: in 3.5.0, before the patch referenced in the tracker, there was a faulty allocation check that took into account the full size of the template's disk, instead of just a thin QCOW layer on top of it per VM in the pool.

This means that if the template uses preallocated disks, you must have enough space for another preallocated disk per VM added to the pool, effectively killing the notion of over-committing storage.
Thus, marking as a REGRESSION.
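
To make the over-commit impact concrete, a hypothetical back-of-the-envelope comparison (the numbers are illustrative, not from a real setup):

// Illustrative arithmetic only; variable names are not from the engine code.
public class OvercommitExample {
    public static void main(String[] args) {
        long templateDisk = 20L << 30; // a 20 GiB preallocated template disk
        int vmsToAdd = 10;

        // Faulty 3.5.0 check: charge the full template disk size per pool VM.
        long faulty = vmsToAdd * templateDisk;  // 200 GiB demanded

        // Correct check: charge only the thin COW layer per pool VM (block domain).
        long correct = vmsToAdd * (1L << 30);   // 10 GiB demanded

        System.out.printf("faulty: %d GiB, correct: %d GiB%n", faulty >> 30, correct >> 30);
    }
}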

Comment 2 Kevin Alon Goldblatt 2015-01-21 17:24:12 UTC
With preallocated and thin file-based disks, when reaching low disk space, the message reported is: "Error while executing action:

nfs_vm_from_tp_pool:

    Cannot edit VM-Pool. One or more provided storage domains are in maintenance/non-operational status."


INSTEAD OF "LOW DISK SPACE"

-4525-8893-a1753fe4b581
2015-01-21 18:54:13,472 ERROR [org.ovirt.engine.core.bll.UpdateVmPoolWithVmsCommand] (ajp-/127.0.0.1:8702-13) [2e48497] Can not found any default active domain for one of the disks of template with id : b13cd9ea-5afc-44e0-8791-499cedae0384
2015-01-21 18:54:13,473 WARN  [org.ovirt.engine.core.bll.UpdateVmPoolWithVmsCommand] (ajp-/127.0.0.1:8702-13) [2e48497] CanDoAction of action UpdateVmPoolWithVms failed. Reasons:VAR__TYPE__DESKTOP_POOL,VAR__ACTION__UPDATE,ACTION_TYPE_FAILED_MISSED_STORAGES_FOR_SOME_DISKS
2015-01-21 18:55:14,713 INFO  [org.ovirt.engine.core.bll.tasks.AsyncTaskManager] (DefaultQuartzScheduler_Worker-24) Setting new tasks map. The map contains now 0 tasks
2015-01-21 18:55:14,713 INFO  [org.ovirt.engine.core.bll.tasks.AsyncTaskManager] (DefaultQuartzScheduler_Worker-24) Cleared all tasks of pool ebd56e62-24c3-48f6-a838-f1681f1fc5a3.
2015-01-21 18:57:45,511 ERROR [org.ovirt.engine.core.bll.UpdateVmPoolWithVmsCommand] (ajp-/127.0.0.1:8702-12) [5c3713a] Can not found any default active domain for one of the disks of template with id : b13cd9ea-5afc-44e0-8791-499cedae0384
2015-01-21 18:57:45,511 WARN  [org.ovirt.engine.core.bll.UpdateVmPoolWithVmsCommand] (ajp-/127.0.0.1:8702-12) [5c3713a] CanDoAction of action UpdateVmPoolWithVms failed. Reasons:VAR__TYPE__DESKTOP_POOL,VAR__ACTION__UPDATE,ACTION_TYPE_FAILED_MISSED_STORAGES_FOR_SOME_DISKS
2015-01-21 18:58:05,142 ERROR [org.ovirt.engine.core.bll.UpdateVmPoolWithVmsCommand] (ajp-/127.0.0.1:8702-7) [58fafbee] Can not found any default active domain for one of the disks of template with id : b13cd9ea-5afc-44e0-8791-499cedae0384
2015-01-21 18:58:05,142 WARN  [org.ovirt.engine.core.bll.UpdateVmPoolWithVmsCommand] (ajp-/127.0.0.1:8702-7) [58fafbee] CanDoAction of action UpdateVmPoolWithVms failed. Reasons:VAR__TYPE__DESKTOP_POOL,VAR__ACTION__UPDATE,ACTION_TYPE_FAILED_MISSED_STORAGES_FOR_SOME_DISKS
2015-01-21 18:59:32,536 ERROR [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector] (DefaultQuartzScheduler_Worker-71) [1aa64db0] Correlation ID: null, Call Stack: null, Custom Event ID: -1, Message: Critical, Low disk space. nfs2 domain has 4 GB of free space

Comment 3 Kevin Alon Goldblatt 2015-01-21 17:42:11 UTC
Created attachment 982472 [details]
engine vdsm and server logs

Comment 4 Vered Volansky 2015-01-22 07:06:18 UTC
This error is generated long before any allocation handling is done.
Kevin, was this flow ever tested before this bz's patch?
If not - please try it and report the results here.

This cannot be related to this bz.
Also please make sure the message is truly wrong for your env.

Thanks.

Comment 5 Vered Volansky 2015-01-22 08:36:13 UTC
Kevin, another question.
When you say "with file" - did it pass on block storage under the same conditions, or was it simply not checked?

Comment 6 Vered Volansky 2015-01-22 10:07:27 UTC
The issue here is in ImagesHandler.fillImagesMapBasedOnTemplate(), which is called from CommonVmPoolWithVmsCommand.ensureDestinationImageMap().

In there, there's a threshold check which doesn't, and shouldn't, pass, but it emits no error at that point, hence the early failure.

The threshold check shouldn't be there at all, since there are now storage allocation checks in the system. Let it fail there instead.
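
For illustration, a minimal sketch of this failure mode, with hypothetical names (Domain, candidateDomains, LOW_SPACE_THRESHOLD); the real filtering happens inside ImagesHandler.fillImagesMapBasedOnTemplate() and is not reproduced here:

// Hypothetical sketch of the silent threshold filter; not the engine code.
import java.util.List;
import java.util.stream.Collectors;

public class ThresholdFilterSketch {
    record Domain(String name, long freeBytes) {}

    static final long LOW_SPACE_THRESHOLD = 5L << 30; // e.g. 5 GiB

    // Domains under the threshold are silently dropped here, so a later
    // "no valid domain found" check fires with an unrelated CDA message
    // instead of a low-disk-space one.
    static List<Domain> candidateDomains(List<Domain> active) {
        return active.stream()
                .filter(d -> d.freeBytes() >= LOW_SPACE_THRESHOLD)
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<Domain> active = List.of(new Domain("nfs2", 4L << 30)); // 4 GiB free
        if (candidateDomains(active).isEmpty()) {
            // Misleading: the domain is up, just low on space.
            System.out.println("Cannot edit VM-Pool. One or more provided storage"
                    + " domains are in maintenance/non-operational status.");
        }
    }
}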

In any case, this is still a different bug. Please open another one, and maybe block this bug's verification on the new one, but this is definitely not a fail.

Comment 7 Kevin Alon Goldblatt 2015-01-22 13:28:41 UTC
The following scenarios were tested:

Adding VMs to an existing pool:

Available space:     Space required:     Pool based on Template:     Result:
------------------------------------------------------------------------------
5g                   2g                  VM with BLOCK disks         Passed
5g                   10g                 VM with BLOCK disks         Failed with low space on disk

5g                   20mb                VM with FILE disks          Passed

4g                   2g                  VM with BLOCK disks         Failed with
"Cannot edit VM-Pool. One or more provided storage domains are in maintenance/non-operational status."

4g                   20mb                VM with FILE disks          Failed with
"Cannot edit VM-Pool. One or more provided storage domains are in maintenance/non-operational status."



So what we have here is: when the available space on the storage domain is WITHIN THE THRESHOLD limit, the error displayed when trying to add more VMs to the pool

currently reads: "Cannot edit VM-Pool. One or more provided storage domains are in maintenance/non-operational status."

SHOULD read: "Cannot add vms to pool, Low disk space on storage device!"

Comment 8 Vered Volansky 2015-01-22 14:55:30 UTC
Moving to modified. The issue is in bz1184928, which this bz depends on.
Once bz1184928 is verified, this one can also be verified.

Comment 12 Kevin Alon Goldblatt 2015-05-04 10:58:54 UTC
Checked with version:
--------------------------
v3.6
ovirt-engine-3.6.0-0.0.master.20150412172306.git55ba764.el6.noarch
vdsm-4.17.0-632.git19a83a2.el7.x86_64


How reproducible:
100%


Tested with the following scenario:

Steps to Reproduce:
1. Create a VM with a disk 
2. Create a template
3. Create a Pool of VMs including 1 VM according to the matrix below:

Available space:     Space required:     Pool based on Template:     Result:
------------------------------------------------------------------------------------------------------------------
5g                   2g                  VM with BLOCK disks        BLOCKED *
5g                   10g                 VM with BLOCK disks        BLOCKED *

5g                   20mb                VM with FILE disks         Passed


4g                   2g                  VM with BLOCK disks        BLOCKED *

4g                   20mb                VM with FILE disks         Passed
 

BLOCKED * - Due to https://bugzilla.redhat.com/show_bug.cgi?id=1218165

Comment 13 lkuchlan 2015-05-21 08:41:58 UTC
Tested using:
ovirt-engine-3.6.0-0.0.master.20150412172306.git55ba764.el6.noarch
vdsm-4.17.0-632.git19a83a2.el7.x86_64

Verification instructions:
1. Create a VM with a disk 
2. Create a template
3. Create a Pool of VMs including 1 VM according to the matrix below:

Available space:     Space required:     Pool based on Template:     Result:
------------------------------------------------------------------------------
5g                   2g                  VM with BLOCK disks         is allowed - Passed
5g                   10g                 VM with BLOCK disks         is not allowed (low space) - Passed

5g                   20mb                VM with FILE disks          is allowed - Passed

4g                   2g                  VM with BLOCK disks         is not allowed (low space) - Passed

4g                   20mb                VM with FILE disks          is not allowed (low space) - Passed

Comment 14 Allon Mureinik 2016-03-10 10:49:57 UTC
RHEV 3.6.0 has been released, setting status to CLOSED CURRENTRELEASE

Comment 15 Allon Mureinik 2016-03-10 10:51:08 UTC
RHEV 3.6.0 has been released, setting status to CLOSED CURRENTRELEASE

Comment 16 Allon Mureinik 2016-03-10 12:07:31 UTC
RHEV 3.6.0 has been released, setting status to CLOSED CURRENTRELEASE