Bug 1053742 - Faulty storage check when adding a VM with disks
Summary: Faulty storage check when adding a VM with disks
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: oVirt
Classification: Retired
Component: ovirt-engine-core
Version: 3.4
Hardware: Unspecified
OS: All
Priority: medium
Severity: medium
Target Milestone: ---
Target Release: 3.5.0
Assignee: Vered Volansky
QA Contact: Ori Gofen
URL:
Whiteboard: storage
Depends On:
Blocks: 960934 1053744 1178011
 
Reported: 2014-01-15 17:17 UTC by Allon Mureinik
Modified: 2016-02-10 17:24 UTC (History)
16 users

Fixed In Version: ovirt-3.5.0-alpha1
Doc Type: Bug Fix
Doc Text:
Clone Of: 1053704
: 1053744
Environment:
Last Closed: 2014-10-17 12:19:57 UTC
oVirt Team: Storage
Embargoed:


Attachments
vdsm+engine logs (1.86 MB, application/gzip)
2014-08-04 15:46 UTC, Ori Gofen
vdsm+engine correct logs (1.61 MB, application/gzip)
2014-08-04 15:48 UTC, Ori Gofen


Links
System ID Private Priority Status Summary Last Updated
oVirt gerrit 26734 0 master MERGED core: Fix AddVm faulty storage allocation checks Never

Description Allon Mureinik 2014-01-15 17:17:41 UTC
+++ This bug was initially created as a clone of Bug #1053704 +++

+++ This bug was initially created as a clone of Bug #960934 +++

Description of problem:

For each new empty disk of the VM, the required space validations should be:
      | File Domain                             | Block Domain
 -----|-----------------------------------------|-------------
 qcow | 1M (header size)                        | 1G
 -----|-----------------------------------------|-------------
 raw  | preallocated: disk capacity (getSize()) | disk capacity
      | thin (sparse): 1M                       | (there is no raw sparse on 
                                                   block domains)
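For illustration, the table above can be sketched as a small helper. This is a hypothetical sketch of the check, not the actual ovirt-engine code; the function name and signature are invented for this example.

```python
# Hypothetical sketch of the required-space check for a NEW EMPTY disk,
# following the table above. Not the actual ovirt-engine implementation.

M = 1024 ** 2  # 1M: qcow header / sparse file metadata
G = 1024 ** 3  # 1G: initial qcow allocation on block domains

def required_space_new_disk(fmt, allocation, capacity, on_block_domain):
    """fmt: 'qcow' or 'raw'; allocation: 'preallocated' or 'sparse';
    capacity: the disk's virtual size in bytes (getSize())."""
    if fmt == 'qcow':
        # qcow needs only its header on file domains, 1G on block domains
        return G if on_block_domain else M
    # raw
    if allocation == 'preallocated':
        return capacity  # full disk capacity up front
    if on_block_domain:
        raise ValueError("there is no raw sparse on block domains")
    return M  # raw thin (sparse) on a file domain
```

For example, a 10G thin qcow disk on a file domain only needs 1M free, while the same disk on a block domain needs 1G.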

Comment 1 Vered Volansky 2014-04-08 08:31:35 UTC
Allon, assuming this relates to AddVmFromScratchCommand?
It's only allowed to do that through REST, BTW. Through the UI it's first add a VM, then add disks to it. Different commands.

Comment 2 Allon Mureinik 2014-04-10 12:22:46 UTC
(In reply to Vered Volansky from comment #1)
> Allon, assuming this relates to AddVmFromScratchCommand?
> It's only allowed to do that through REST, BTW. Through the UI it's first
> add a VM, then add disks to it. Different commands.
You can clone the BZ and have a different patch for REST and GUI.

Comment 3 Allon Mureinik 2014-04-13 12:14:13 UTC
(In reply to Allon Mureinik from comment #2)
> (In reply to Vered Volansky from comment #1)
> > Allon, assuming this relates to AddVmFromScratchCommand?
> > It's only allowed to do that through REST, BTW. Through the UI it's first
> > add a VM, then add disks to it. Different commands.
> You can clone the BZ and have a different patch for REST and GUI.

Misunderstood the question.
Yes, this BZ is specific to REST. The GUI flow was handled in bug 1053704.

Comment 4 Vered Volansky 2014-04-29 02:59:03 UTC
This bug was fixed as part of bz917682, which is targeted for 3.4.1.
Therefore targeting this to 3.4.1 as well. Will change to MODIFIED when backported.

Comment 5 Vered Volansky 2014-07-10 07:31:18 UTC
The following is the full data on storage allocation checks:

For new empty disks (unchanged from the original)
      | File Domain                             | Block Domain
 -----|-----------------------------------------|-------------
 qcow | 1M (header size)                        | 1G
 -----|-----------------------------------------|-------------
 raw  | preallocated: disk capacity (getSize()) | disk capacity
      | thin (sparse): 1M                       | (there is no raw sparse on 
                                                   block domains)


For cloned disks:
      | File Domain                             | Block Domain
 -----|-----------------------------------------|-------------
 qcow | preallocated: 1.1 * disk capacity       | 1.1 * min(used, capacity)
      | sparse: 1.1 * min(used, capacity)       |
 -----|-----------------------------------------|-------------
 raw  | preallocated: disk capacity             | disk capacity
      | sparse: min(used, capacity)             | (there is no raw sparse on
                                                   block domains)
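The cloned-disk table can likewise be sketched as a helper. Again, this is a hypothetical illustration of the formulas above, not the actual engine code.

```python
# Hypothetical sketch of the required-space check for a CLONED disk,
# following the table above. Not the actual ovirt-engine implementation.

def required_space_cloned_disk(fmt, allocation, capacity, used, on_block_domain):
    """capacity: the disk's virtual size; used: actual space consumed by
    the source image. Both in the same units (e.g. bytes)."""
    if fmt == 'qcow':
        if on_block_domain or allocation == 'sparse':
            # 10% safety margin over the actually used space
            return 1.1 * min(used, capacity)
        return 1.1 * capacity  # qcow preallocated on a file domain
    # raw
    if allocation == 'preallocated' or on_block_domain:
        return capacity  # there is no raw sparse on block domains
    return min(used, capacity)  # raw sparse on a file domain
```

So cloning a 100G qcow disk of which only 40G is used requires roughly 44G free on a block domain, while a raw preallocated clone always requires the full 100G.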


Some more info on how to reproduce:
Create one of each of the disks in the tables, with sufficient and with insufficient storage space, and verify success and failure respectively (insufficient space should fail with a CDA error).

Comment 6 Ori Gofen 2014-08-04 15:46:05 UTC
Created attachment 923925 [details]
vdsm+engine logs

Reproduced on beta.2; failed with several exceptions on creation of a raw preallocated file:

2014-08-04 18:25:44,613 ERROR [org.ovirt.engine.core.vdsbroker.vdsbroker.HSMGetAllTasksStatusesVDSCommand] (DefaultQuartzScheduler_Worker-75) Failed in HSMGetAllTasksStatusesVDS method
2014-08-04 18:25:44,614 INFO  [org.ovirt.engine.core.bll.tasks.SPMAsyncTask] (DefaultQuartzScheduler_Worker-75) SPMAsyncTask::PollTask: Polling task ec8bfc50-7f59-4a09-b059-dfef5d3f1427 (Parent Command AddDisk, Parameters Type org.ovirt.engine.core.common.asynctasks.AsyncTaskParameters) returned status finished, result 'cleanSuccess'.
2014-08-04 18:25:44,629 ERROR [org.ovirt.engine.core.bll.tasks.SPMAsyncTask] (DefaultQuartzScheduler_Worker-75) BaseAsyncTask::LogEndTaskFailure: Task ec8bfc50-7f59-4a09-b059-dfef5d3f1427 (Parent Command AddDisk, Parameters Type org.ovirt.engine.core.common.asynctasks.AsyncTaskParameters) ended with failure:
-- Result: cleanSuccess
-- Message: VDSGenericException: VDSErrorException: Failed to HSMGetAllTasksStatusesVDS, error = Cannot zero out volume, code = 374,
-- Exception: VDSGenericException: VDSErrorException: Failed to HSMGetAllTasksStatusesVDS, error = Cannot zero out volume, code = 374
2014-08-04 18:25:44,646 INFO  [org.ovirt.engine.core.bll.tasks.CommandAsyncTask] (DefaultQuartzScheduler_Worker-75) CommandAsyncTask::EndActionIfNecessary: All tasks of command 5c5a7ac1-ff29-4389-9059-020c44f43aaa has ended -> executing endAction
2014-08-04 18:25:44,654 INFO  [org.ovirt.engine.core.bll.tasks.CommandAsyncTask] (DefaultQuartzScheduler_Worker-75) CommandAsyncTask::endAction: Ending action for 1 tasks (command ID: 5c5a7ac1-ff29-4389-9059-020c44f43aaa): calling endAction .
2014-08-04 18:25:44,662 INFO  [org.ovirt.engine.core.bll.tasks.CommandAsyncTask] (org.ovirt.thread.pool-8-thread-23) CommandAsyncTask::EndCommandAction [within thread] context: Attempting to endAction AddDisk, executionIndex: 0
2014-08-04 18:25:44,672 ERROR [org.ovirt.engine.core.bll.AddDiskCommand] (org.ovirt.thread.pool-8-thread-23) [421c4c60] Ending command with failure: org.ovirt.engine.core.bll.AddDiskCommand

Comment 7 Ori Gofen 2014-08-04 15:48:24 UTC
Created attachment 923928 [details]
vdsm+engine correct logs

Comment 8 Vered Volansky 2014-08-06 10:04:49 UTC
After a discussion with Ori, it turns out the verification was too strict. This bz fix was intended to give a preliminary solution to non-existing allocation checks. I opened bz1127121 in order to handle the scenario used by Ori.

The verification needed on this bug is as follows:
1. Create a VM and add disks to it.
2. Create a template of the VM.
3. Add a thinly-provisioned VM over that template USING REST.

Verify step 3 against a storage domain with and without enough space, according to the table in comment 0. With enough space the operation should succeed; without it, the operation should fail with the appropriate CDA error.

Note this bz was opened to follow an issue specific to the REST API.

Changing back to MODIFIED.

Comment 9 Eyal Edri 2014-08-06 10:38:34 UTC
If no code change was done, it should go back to ON_QA and be re-verified.

Comment 10 Ori Gofen 2014-08-07 14:51:01 UTC
Vered, the bug fails QA on a file domain.

Comment 11 Gil Klein 2014-08-07 17:18:24 UTC
Moving back to assigned per Ori's comment #10

Comment 12 Allon Mureinik 2014-08-08 15:19:49 UTC
(In reply to Gil Klein from comment #11)
> Moving back to assigned per Ori's comment #10
Gil, I don't understand this decision.

As Vered explained in comment #8, Ori's verification was beyond the scope of this fix, and the issue he found will be tracked in a separate BZ (see bug 1127121 for details):

(In reply to Vered Volansky from comment #8)
> After a discussion with Ori, it turns out the verification was too strict.
> This bz fix was intended to give a preliminary solution to non-existing
> allocation checks. I Opened bz1127121 in order to handle the scenario used
> by Ori.

There's nothing else to do with /this/ BZ within its defined scope (unless further verification uncovers a different issue, of course).

Please advise.

Comment 13 Allon Mureinik 2014-08-10 15:50:32 UTC
(In reply to Allon Mureinik from comment #12)
> (In reply to Gil Klein from comment #11)
> > Moving back to assigned per Ori's comment #10
> Gil, I don't understand this decision.
> 
> As Vered explained in comment #8, Ori's verification was beyond the scope of
> this fix, and the issue he found will be tracked in a separate BZ (see bug
> 1127121 for details):

Completely missed this comment:
(In reply to Gil Klein from comment #11)
> Moving back to assigned per Ori's comment #10

Vered - we need to see what happens on a file SD.
Thanks!

Comment 14 Vered Volansky 2014-08-11 06:07:59 UTC
(In reply to Ori from comment #10)
> Vered,the bug fails qa on File domain

OK, please provide steps to reproduce, new logs, and a description of the issue, using the validation expected on this bug, not the strict one.

Thanks.

Comment 15 Ori Gofen 2014-08-11 10:46:59 UTC
Steps to reproduce:
Setup: have a File domain with 7G free space

1. Create a VM + disk (3G, preallocated)
2. Create a template of the Vm.
3. Add a thinly-provisioned VM over that template USING REST.

Results:
The operation succeeds, since preallocated file disks (created from a template) have an actual size of 24K rather than the virtual size.

The expected result is that this scenario be blocked with a CDA error.

Supplying logs for these steps will take some time; I'll attach them as soon as I can.

Comment 16 Vered Volansky 2014-08-11 11:52:06 UTC
After a discussion with Ori and with his agreement:
Comment #15 actually verifies this specific bug. Moving back to ON QA.

Comment 17 Ori Gofen 2014-08-11 13:44:00 UTC
(In reply to Vered Volansky from comment #16)
> After a discussion with Ori and with his agreement:
> Comment #15 actually verifies this specific bug. Moving back to ON QA.

Agreed. The problem with this bug is that the subject is too general, and hence covers too many scenarios, in some of which the actual result is far from the expected one.

After a discussion with Vered, we agreed that the steps to verify this bug are only those mentioned in comment #8, against all relevant disk types.

All other flows that contain bugs, such as cloning a VM (on file domains) or multi-process issues, will be opened separately.

Moving to VERIFIED on RC1.

Comment 18 Sandro Bonazzola 2014-10-17 12:19:57 UTC
oVirt 3.5 has been released and should include the fix for this issue.

