Bug 1179690
Summary:                  faulty storage allocation checks when adding a vm to a pool
Product:                  Red Hat Enterprise Virtualization Manager
Reporter:                 Vered Volansky <vered>
Component:                ovirt-engine
Assignee:                 Vered Volansky <vered>
Status:                   CLOSED CURRENTRELEASE
QA Contact:               lkuchlan <lkuchlan>
Severity:                 high
Docs Contact:
Priority:                 unspecified
Version:                  3.5.0
CC:                       acanan, amureini, gklein, kgoldbla, lpeer, lsurette, rbalakri, Rhev-m-bugs, tnisan, yeylon, ykaul, ylavi
Target Milestone:         ovirt-3.6.0-rc
Keywords:                 ZStream
Target Release:           3.6.0
Hardware:                 Unspecified
OS:                       Unspecified
Whiteboard:
Fixed In Version:         ovirt-engine-3.6.0_qa1
Doc Type:                 Bug Fix
Doc Text:
Story Points:             ---
Clone Of:
Clones:                   1185614 (view as bug list)
Environment:
Last Closed:
Type:                     Bug
Regression:               ---
Mount Type:               ---
Documentation:            ---
CRM:
Verified Versions:
Category:                 ---
oVirt Team:               Storage
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team:          ---
Target Upstream Version:
Embargoed:
Bug Depends On:           1184928
Bug Blocks:               960934, 1185614
Attachments:              engine vdsm and server logs (attachment 982472)
Description    Vered Volansky    2015-01-07 11:00:44 UTC
Additional insight: In 3.5.0, before the patch referenced in the tracker, there was a faulty allocation check that took into account the size of the template's disk, instead of just a thin QCOW layer on top of it per VM in the pool. This means that if the template uses preallocated disks, you must have enough space for another preallocated disk per VM added to the pool, effectively killing the notion of over-committing storage. Thus, marking as a REGRESSION. (An illustrative sketch of the intended check follows the attachment note below.)

With file (preallocated and thin) disks, when reaching low disk space, the message reported is:

"Error while executing action: nfs_vm_from_tp_pool: Cannot edit VM-Pool. One or more provided storage domains are in maintenance/non-operational status."

INSTEAD OF "LOW DISK SPACE"

-4525-8893-a1753fe4b581
2015-01-21 18:54:13,472 ERROR [org.ovirt.engine.core.bll.UpdateVmPoolWithVmsCommand] (ajp-/127.0.0.1:8702-13) [2e48497] Can not found any default active domain for one of the disks of template with id : b13cd9ea-5afc-44e0-8791-499cedae0384
2015-01-21 18:54:13,473 WARN  [org.ovirt.engine.core.bll.UpdateVmPoolWithVmsCommand] (ajp-/127.0.0.1:8702-13) [2e48497] CanDoAction of action UpdateVmPoolWithVms failed. Reasons:VAR__TYPE__DESKTOP_POOL,VAR__ACTION__UPDATE,ACTION_TYPE_FAILED_MISSED_STORAGES_FOR_SOME_DISKS
2015-01-21 18:55:14,713 INFO  [org.ovirt.engine.core.bll.tasks.AsyncTaskManager] (DefaultQuartzScheduler_Worker-24) Setting new tasks map. The map contains now 0 tasks
2015-01-21 18:55:14,713 INFO  [org.ovirt.engine.core.bll.tasks.AsyncTaskManager] (DefaultQuartzScheduler_Worker-24) Cleared all tasks of pool ebd56e62-24c3-48f6-a838-f1681f1fc5a3.
2015-01-21 18:57:45,511 ERROR [org.ovirt.engine.core.bll.UpdateVmPoolWithVmsCommand] (ajp-/127.0.0.1:8702-12) [5c3713a] Can not found any default active domain for one of the disks of template with id : b13cd9ea-5afc-44e0-8791-499cedae0384
2015-01-21 18:57:45,511 WARN  [org.ovirt.engine.core.bll.UpdateVmPoolWithVmsCommand] (ajp-/127.0.0.1:8702-12) [5c3713a] CanDoAction of action UpdateVmPoolWithVms failed. Reasons:VAR__TYPE__DESKTOP_POOL,VAR__ACTION__UPDATE,ACTION_TYPE_FAILED_MISSED_STORAGES_FOR_SOME_DISKS
2015-01-21 18:58:05,142 ERROR [org.ovirt.engine.core.bll.UpdateVmPoolWithVmsCommand] (ajp-/127.0.0.1:8702-7) [58fafbee] Can not found any default active domain for one of the disks of template with id : b13cd9ea-5afc-44e0-8791-499cedae0384
2015-01-21 18:58:05,142 WARN  [org.ovirt.engine.core.bll.UpdateVmPoolWithVmsCommand] (ajp-/127.0.0.1:8702-7) [58fafbee] CanDoAction of action UpdateVmPoolWithVms failed. Reasons:VAR__TYPE__DESKTOP_POOL,VAR__ACTION__UPDATE,ACTION_TYPE_FAILED_MISSED_STORAGES_FOR_SOME_DISKS
2015-01-21 18:59:32,536 ERROR [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector] (DefaultQuartzScheduler_Worker-71) [1aa64db0] Correlation ID: null, Call Stack: null, Custom Event ID: -1, Message: Critical, Low disk space. nfs2 domain has 4 GB of free space

Created attachment 982472 [details]
engine vdsm and server logs
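
The following self-contained Java sketch only illustrates the over-commit point made in the description. The class, constants, and the 1 GiB per-layer overhead are assumptions for the example and do not correspond to actual ovirt-engine code: the point is that a thin (QCOW2) pool VM should reserve a small fixed overhead on top of the template image, not the template's full disk size.

// Illustrative sketch only -- hypothetical names, not actual ovirt-engine code.
public class PoolSpaceCheckSketch {

    // Assumed fixed overhead reserved per thin (QCOW2) layer created on top of the template.
    private static final long THIN_LAYER_OVERHEAD_BYTES = 1L << 30; // 1 GiB, an assumption

    // Faulty behaviour described above: reserve the template's full disk size per pool VM.
    static long faultyRequiredBytes(long templateDiskSizeBytes, int vmsToAdd) {
        return templateDiskSizeBytes * vmsToAdd;
    }

    // Intended behaviour: each thin pool VM only needs a small QCOW2 layer over the template.
    static long intendedRequiredBytes(int vmsToAdd) {
        return THIN_LAYER_OVERHEAD_BYTES * (long) vmsToAdd;
    }

    public static void main(String[] args) {
        long templateDisk = 20L << 30; // 20 GiB preallocated template disk
        long freeSpace = 50L << 30;    // 50 GiB free on the storage domain
        int vmsToAdd = 10;

        // Faulty check: 10 VMs "need" 200 GiB, so the add fails and over-commit is impossible.
        System.out.println("faulty check fits:   "
                + (faultyRequiredBytes(templateDisk, vmsToAdd) <= freeSpace));
        // Intended check: 10 thin layers need ~10 GiB, which fits comfortably.
        System.out.println("intended check fits: "
                + (intendedRequiredBytes(vmsToAdd) <= freeSpace));
    }
}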
This error is generated long before any allocation handling is done. Kevin, was this flow ever tested before this bz's patch? If not - please try it and report the results here. This cannot be related to this bz. Also, please make sure the message is truly wrong for your env. Thanks.

Kevin, another question. When you say "with file" - did it pass on block under the same conditions, or just wasn't checked on it?

The issue here is in ImagesHandler.fillImagesMapBasedOnTemplate(), which is called from CommonVmPoolWithVmsCommand.ensureDestinationImageMap(). In there, there's a threshold check, which doesn't, and shouldn't, pass, but no error is raised at that point, hence the early failure. The threshold check shouldn't be there at all, since there are now storage allocation checks in the system. Let it fail then. In any case, this is still a different bug. Please open another one, and maybe block this bug's verification on the new one, but this is definitely not a fail. (An illustrative sketch of this point appears at the end of this report.)

The following scenarios were tested (adding VMs to an existing pool):

Available space:   Space required:   Pool based on Template:   Result:
------------------------------------------------------------------------------------------------------------------
5g                 2g                VM with BLOCK disks       Passed
5g                 10g               VM with BLOCK disks       Failed with Low space on Disk
5g                 20mb              VM with FILE disks        Passed
4g                 2g                VM with BLOCK disks       Failed with "Cannot edit VM-Pool. One or more provided storage domains are in maintenance/non-operational status."
4g                 20mb              VM with FILE disks        Failed with "Cannot edit VM-Pool. One or more provided storage domains are in maintenance/non-operational status."

So what we have here is that when the available space on the storage domain is WITHIN THE THRESHOLD limit, the error displayed when trying to add more VMs to the pool
currently reads: "Cannot edit VM-Pool. One or more provided storage domains are in maintenance/non-operational status."
SHOULD read: "Cannot add vms to pool, Low disk space on storage device!"

Moving to modified.

The issue is in bz1184928, which this bz depends on. Once bz1184928 is verified, this one can also be verified.

Checked with version:
--------------------------
v3.6
ovirt-engine-3.6.0-0.0.master.20150412172306.git55ba764.el6.noarch
vdsm-4.17.0-632.git19a83a2.el7.x86_64

How reproducible: 100%

Tested with the following scenario:

Steps to Reproduce:
1. Create a VM with a disk
2. Create a template
3. Create a pool of VMs including 1 VM according to the matrix below:

Available space:   Space required:   Pool based on Template:   Result:
------------------------------------------------------------------------------------------------------------------
5g                 2g                VM with BLOCK disks       BLOCKED *
5g                 10g               VM with BLOCK disks       BLOCKED *
5g                 20mb              VM with FILE disks        Passed
4g                 2g                VM with BLOCK disks       BLOCKED *
4g                 20mb              VM with FILE disks        Passed

BLOCKED * - Due to https://bugzilla.redhat.com/show_bug.cgi?id=1218165

BLOCKED DUE TO: https://bugzilla.redhat.com/show_bug.cgi?id=1218165

Tested using:
ovirt-engine-3.6.0-0.0.master.20150412172306.git55ba764.el6.noarch
vdsm-4.17.0-632.git19a83a2.el7.x86_64

Verification instructions:
1. Create a VM with a disk
2. Create a template
3. Create a pool of VMs including 1 VM according to the matrix below:

Available space:   Space required:   Pool based on Template:   Result:
------------------------------------------------------------------------------
5g                 2g                VM with BLOCK disks       is allowed - Passed
5g                 10g               VM with BLOCK disks       is not allowed (low space) - Passed
5g                 20mb              VM with FILE disks        is allowed - Passed
4g                 2g                VM with BLOCK disks       is not allowed (low space) - Passed
4g                 20mb              VM with FILE disks        is not allowed (low space) - Passed

RHEV 3.6.0 has been released, setting status to CLOSED CURRENTRELEASE
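
As a companion to the earlier comment about the threshold check in ImagesHandler.fillImagesMapBasedOnTemplate(), here is a minimal, self-contained Java sketch. The class, method, and message strings are hypothetical and are not the real ovirt-engine implementation; it only contrasts silently filtering out a low-space domain (which later surfaces as the misleading "maintenance/non-operational" error) with an explicit space validation that reports low disk space.

// Illustrative sketch only -- hypothetical names, not the real ovirt-engine code.
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class DomainSelectionSketch {

    static class Domain {
        final String name;
        final long freeBytes;
        final boolean active;
        Domain(String name, long freeBytes, boolean active) {
            this.name = name;
            this.freeBytes = freeBytes;
            this.active = active;
        }
    }

    // Problematic pattern: silently drop domains below a free-space threshold.
    // If every domain is dropped, the caller later finds no candidate at all and
    // reports a misleading "maintenance/non-operational" style error.
    static List<Domain> pickByThreshold(List<Domain> domains, long thresholdBytes) {
        List<Domain> result = new ArrayList<>();
        for (Domain d : domains) {
            if (d.active && d.freeBytes > thresholdBytes) {
                result.add(d);
            }
        }
        return result;
    }

    // Preferred pattern: keep all active domains as candidates and let a dedicated
    // space validation fail the operation with an explicit low-disk-space reason.
    static String validateSpace(List<Domain> candidates, long requiredBytes) {
        for (Domain d : candidates) {
            if (d.active && d.freeBytes >= requiredBytes) {
                return null; // at least one domain can hold the new images
            }
        }
        return "low disk space on the target storage domain"; // illustrative message only
    }

    public static void main(String[] args) {
        // Mirrors the log excerpt above: the nfs2 domain has only 4 GB of free space.
        List<Domain> domains = Arrays.asList(new Domain("nfs2", 4L << 30, true));

        // Threshold filtering drops nfs2, so the caller is left with zero candidates.
        System.out.println("candidates after threshold filter: "
                + pickByThreshold(domains, 5L << 30).size());

        // Explicit validation instead reports the real reason for the failure.
        System.out.println("validation failure reason: "
                + validateSpace(domains, 5L << 30));
    }
}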